AI & Design Systems (Figma MCP, Carbon for AI)

Explore how AI tooling — from Figma's Model Context Protocol to IBM's Carbon for AI — is reshaping design systems in 2026: new components, new workflows, new governance challenges.

Design systems used to be static. AI is changing that. It is introducing a new class of UI components, reshaping how teams build systems, and creating fresh governance challenges. From IBM’s Carbon for AI library to Figma’s Model Context Protocol (MCP) server — which lets AI agents read and write Figma documents in code — the tooling landscape shifted fast between 2024 and 2026. Knowing where these tools genuinely help, and where they introduce new failure modes, is now a core skill for senior design-system practitioners.

What Changes When AI Enters the Stack

Most design systems were built around a stable assumption: components render static or event-driven state. AI interfaces break that assumption in two key ways.

First, outputs are non-deterministic. A streaming text response, a generated image, or an agentic status update can arrive at any time, at any length, with varying confidence. Components that worked fine for everyday CRUD apps — text fields, tables, modals — need entirely new states: generating, streaming, interrupted, uncertain, failed-gracefully.

Second, the interaction model is richer and more ambiguous. Putting everything through a text prompt (the outdated default) is too limiting. Modern AI-native design pairs structured and conversational UI together, with transparency, override capability, and graceful-failure affordances. Design systems must encode these patterns as first-class components with documented usage — not one-off solutions for each feature team.

New Component Categories for AI Interfaces

A mature AI design system needs at least four new component families beyond the standard interactive kit.

Streaming and Generation States

A static loading spinner communicates too little for AI generation, which can take anywhere from 0.5 to 30 seconds and may stream character by character or arrive in chunks. The modern pattern uses a skeleton-to-stream progression:

  1. A skeleton screen holds the layout space while the request is in-flight.
  2. A streaming cursor appears when tokens begin arriving.
  3. Content renders progressively — paragraphs lock in as they complete.
  4. A completion indicator (a subtle check mark or animation) signals the end of generation.

This follows the explicit-state-machine principle: each state on the happy path (idle → loading → streaming → complete), plus an error state reachable from any in-flight step, is a distinct, designed variant, not an afterthought.
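The state-machine idea above can be sketched as a small transition table. This is a minimal illustration, not a prescribed API: the event names (`submit`, `firstToken`, `done`, `interrupt`, `fail`, `reset`) are hypothetical, and the `interrupted` state is borrowed from the list of new states earlier in this lesson.

```typescript
// Hypothetical state and event names; adapt them to your own component API.
type GenerationState =
  | "idle" | "loading" | "streaming" | "complete" | "interrupted" | "error";
type GenerationEvent =
  | "submit" | "firstToken" | "done" | "interrupt" | "fail" | "reset";

// Allowed transitions. Any (state, event) pair not listed here is ignored.
const transitions: Record<
  GenerationState,
  Partial<Record<GenerationEvent, GenerationState>>
> = {
  idle: { submit: "loading" },
  loading: { firstToken: "streaming", interrupt: "interrupted", fail: "error" },
  streaming: { done: "complete", interrupt: "interrupted", fail: "error" },
  complete: { submit: "loading", reset: "idle" },
  interrupted: { submit: "loading", reset: "idle" },
  error: { submit: "loading", reset: "idle" },
};

// Returns the next state, or the current state if the event is not allowed.
function nextState(state: GenerationState, event: GenerationEvent): GenerationState {
  return transitions[state][event] ?? state;
}

const events: GenerationEvent[] = ["submit", "firstToken", "done"];
const finalState = events.reduce<GenerationState>(nextState, "idle");
console.log(finalState); // "complete"
```

Keeping the table explicit means each key maps to one designed variant, and an unexpected event (for example `done` while idle) cannot push the component into an undesigned state.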

Confidence and Uncertainty Indicators

AI outputs carry inherent uncertainty. A well-designed system makes that uncertainty legible without overwhelming users. Common patterns include:

  • Inline confidence badges — color-coded chips (using semantic tokens, not hardcoded hex values) that communicate high, medium, or low confidence on a per-claim basis.
  • Source attribution links — especially important for grounded RAG (retrieval-augmented generation) outputs, where the AI answers using specific documents.
  • Uncertainty tooltips — hovering or focusing reveals the model’s stated confidence or source list.

These components must meet WCAG 2.2 AA contrast requirements. They also need proper aria-describedby wiring so screen readers surface the uncertainty metadata, not just the visible text.
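As a rough sketch of how a confidence badge might be derived, the function below maps a numeric score to a level, a semantic token name, and the id that the claim's `aria-describedby` would point at. The thresholds (0.8 and 0.5), the token path `color.ai.confidence.*`, and the id scheme are all assumptions made for illustration.

```typescript
type ConfidenceLevel = "high" | "medium" | "low";

interface ConfidenceBadge {
  level: ConfidenceLevel;
  token: string;         // semantic token name (hypothetical naming scheme)
  label: string;         // text exposed to sighted users and screen readers
  describedById: string; // id for the badge; the claim references it via aria-describedby
}

const labels: Record<ConfidenceLevel, string> = {
  high: "High confidence",
  medium: "Medium confidence",
  low: "Low confidence",
};

// Thresholds are illustrative assumptions, not values from any standard.
function confidenceBadge(score: number, claimId: string): ConfidenceBadge {
  const clamped = Math.min(1, Math.max(0, score));
  const level: ConfidenceLevel =
    clamped >= 0.8 ? "high" : clamped >= 0.5 ? "medium" : "low";
  return {
    level,
    token: `color.ai.confidence.${level}`,
    label: labels[level],
    describedById: `${claimId}-confidence`,
  };
}

console.log(confidenceBadge(0.62, "claim-3"));
```

In markup, the claim element would carry `aria-describedby="claim-3-confidence"` and the badge would carry `id="claim-3-confidence"`, so assistive technology announces the label along with the claim text.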

Approval and Override Flows

Agentic AI takes actions — booking a meeting, sending a message, modifying data. Letting it execute silently without a confirmation step is an outdated default and a genuine safety risk. Design systems should provide:

  • A proposed-action card that shows what the agent intends to do, with approve, reject, and modify controls.
  • A reversibility indicator — a visually distinct treatment for destructive versus reversible actions.
  • An audit trail component for agentic logs.

Do

Provide approve/reject controls with clear reversibility cues. Design an “undo” affordance for any agentic action taken without explicit confirmation. Use semantic tokens so the danger state renders correctly in both light and dark themes.

Don't

Execute agentic actions silently or show chain-of-thought reasoning as the sole trust signal without giving users meaningful control. Do not use raw color values (e.g., hardcoded red hex) for danger states — they will break in dark mode and violate the token contract.
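The approve/reject/modify flow and the audit trail can be modeled together. The sketch below is one possible data shape, with hypothetical type and field names; it assumes that "modify" sends the proposal back for revision rather than executing it.

```typescript
type Reversibility = "reversible" | "destructive";
type Decision = "approve" | "reject" | "modify";

interface ProposedAction {
  id: string;
  summary: string;              // what the agent intends to do, in plain language
  reversibility: Reversibility; // drives the visual treatment of the card
}

interface AuditEntry {
  actionId: string;
  decision: Decision;
  executed: boolean;
  undoAvailable: boolean;
  at: string; // ISO 8601 timestamp
}

// Records the user's decision. Only "approve" executes the action;
// "modify" is assumed to return the proposal to the agent for revision.
function decide(
  action: ProposedAction,
  decision: Decision,
  now: Date = new Date()
): AuditEntry {
  const executed = decision === "approve";
  return {
    actionId: action.id,
    decision,
    executed,
    undoAvailable: executed && action.reversibility === "reversible",
    at: now.toISOString(),
  };
}

const booking: ProposedAction = {
  id: "act-1",
  summary: "Book a 30-minute meeting with the design team",
  reversibility: "reversible",
};
console.log(decide(booking, "approve"));
```

Because every decision produces an entry, including rejections, the audit trail component can render the full history rather than only the actions that ran.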

Prompt and Feedback Primitives

Prompt inputs are more than text areas. A well-specified AI input component includes: character and token count feedback, multi-modal attachment affordances (file, image, clipboard), a send/interrupt toggle that switches function during generation, and appropriate disabled states when the model is unavailable.
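The send/interrupt toggle and the disabled states can be derived from a small amount of input state. The following is a minimal sketch with hypothetical field names; it shows a character counter only, since token counts depend on the tokenizer of whichever model is in use.

```typescript
interface PromptInputState {
  text: string;
  isGenerating: boolean;
  modelAvailable: boolean;
  maxChars: number;
}

interface PrimaryControl {
  action: "send" | "interrupt";
  disabled: boolean;
  counter: string; // for example "42/2000"
}

// Derives what the primary button should do and whether it is enabled.
function primaryControl(s: PromptInputState): PrimaryControl {
  const counter = `${s.text.length}/${s.maxChars}`;
  if (s.isGenerating) {
    // During generation the same button interrupts and must stay enabled.
    return { action: "interrupt", disabled: false, counter };
  }
  const empty = s.text.trim().length === 0;
  const tooLong = s.text.length > s.maxChars;
  return {
    action: "send",
    disabled: !s.modelAvailable || empty || tooLong,
    counter,
  };
}

console.log(
  primaryControl({ text: "Hello", isGenerating: false, modelAvailable: true, maxChars: 2000 })
);
```

Deriving the control from state, rather than toggling it imperatively, keeps the button label, its handler, and its disabled flag from drifting apart.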

IBM Carbon for AI: A Reference Implementation

IBM’s Carbon Design System extended its core library with a dedicated Carbon for AI layer. It has been publicly available and documented since 2024. It is one of the most complete, production-grade implementations of AI UI patterns available — worth studying even if your team uses a different base system.

Key contributions from Carbon for AI:

| Component | What it solves |
| --- | --- |
| AILabel / slug | Tags AI-generated or AI-assisted content with a standardized disclosure badge |
| Chat components | A full chat-shell suite including ChatMessage, ChatTextInput, ChatFeedback, ChatAvatar |
| Skeleton variants | Extended loading skeletons mapped to AI content shapes (paragraphs, code, lists) |
| StatefulNotification | Persistent, dismissible alerts for AI error, uncertainty, and compliance states |
| Prompt pattern library | Documented patterns for prompt construction, suggestion chips, and history |

The architectural decision worth borrowing: Carbon for AI does not replace base components — it extends them. The AILabel is a compositional add-on, not a fork of the base Button. This mirrors the three-tier token architecture. Primitive tokens stay stable, and AI-specific semantic tokens (ai-background, ai-border-strong, ai-focus) are layered on top. Teams using this approach keep their existing system intact while adding AI affordances incrementally.

Figma MCP: AI-Driven Design System Tooling

The Model Context Protocol (MCP), introduced by Anthropic as an open standard in late 2024, gives AI agents a common interface for calling external tools. Figma shipped an official MCP server that exposes read and write access to Figma documents. This makes it possible to build agents that can:

  • Inspect component structures and extract token usage programmatically
  • Detect design-token drift — for example, components using hardcoded hex instead of library variables
  • Generate component scaffolding from a design brief
  • Sync Storybook stories against Figma component definitions for consistency auditing

What Figma MCP Actually Exposes

The Figma MCP server surfaces document structure, not just flattened CSS. An agent can walk the node tree, read fills, effects, typography, and constraints, and compare them against a token manifest. In practice, a CI step can now flag when a designer applies a hardcoded #E53935 to a button background instead of binding the color.danger.default token. That is the same class of drift that previously required manual design reviews.
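The drift check described above amounts to a tree walk that reports fills with no token binding. The sketch below uses a deliberately simplified, hypothetical node shape (`fillHex`, `fillVariable`); real Figma payloads carry many more fields and a different structure, so treat this as the shape of the algorithm rather than a client for any actual API.

```typescript
// Simplified, hypothetical node shape for illustration only.
interface DesignNode {
  name: string;
  fillHex?: string;      // resolved fill color, if the node has one
  fillVariable?: string; // name of the bound variable/token, if any
  children?: DesignNode[];
}

interface DriftFinding {
  path: string; // slash-separated path from the root node
  hex: string;  // the hardcoded value that was found
}

// Recursively collects nodes that have a fill but no bound token.
function findHardcodedFills(node: DesignNode, parentPath = ""): DriftFinding[] {
  const path = parentPath ? `${parentPath}/${node.name}` : node.name;
  const findings: DriftFinding[] = [];
  if (node.fillHex !== undefined && node.fillVariable === undefined) {
    findings.push({ path, hex: node.fillHex });
  }
  for (const child of node.children ?? []) {
    findings.push(...findHardcodedFills(child, path));
  }
  return findings;
}

const button: DesignNode = {
  name: "Button",
  children: [
    { name: "Background", fillHex: "#E53935" },
    { name: "Label", fillHex: "#FFFFFF", fillVariable: "color.text.on-danger" },
  ],
};
console.log(findHardcodedFills(button));
// [ { path: "Button/Background", hex: "#E53935" } ]
```

A CI step could fail the build, or post a review comment, whenever the returned list is non-empty.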

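Key operations relevant to design systems (the names below mirror Figma's REST API endpoints; exact tool names depend on the MCP server and version you run):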

  • get_file — fetch a full document tree
  • get_file_components — list all published components with metadata
  • get_file_styles — retrieve named styles (Figma’s equivalent of tokens)
  • get_component_sets — access component variants and their property schema
  • post_comment — leave automated review comments on frames

Integrating Figma MCP into a Design-System Workflow

A practical integration follows this pattern:

  1. Token audit on merge: on every Figma branch merge, an agent calls get_file_styles and diffs against the canonical DTCG token file in the repo. Discrepancies are posted as comments and block the branch from being published.

  2. Component documentation sync: an agent reads component descriptions from Figma and pushes them to the living-docs platform (for example, zeroheight or Storybook MDX). This keeps documentation up to date without the manual copy-paste step that inevitably causes drift.

  3. Accessibility pre-check: an agent inspects text layers and their backgrounds, runs WCAG 2.2 AA contrast checks (using APCA as a supplementary perceptual quality lens), and flags failures before the handoff ever reaches engineering.
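The contrast portion of the accessibility pre-check in step 3 can be sketched with the WCAG 2.x relative-luminance formula. This is a minimal version that assumes opaque six-digit hex colors; it does not handle alpha, gradients, or images, and it does not implement APCA.

```typescript
// Converts an 8-bit sRGB channel to linear light (WCAG 2.x definition).
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of an opaque color given as "#RRGGBB" or "RRGGBB".
function relativeLuminance(hex: string): number {
  const clean = hex.replace(/^#/, "");
  if (!/^[0-9a-fA-F]{6}$/.test(clean)) {
    throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  }
  const n = parseInt(clean, 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

// AA thresholds: 4.5:1 for normal text, 3:1 for large text.
function passesAA(foreground: string, background: string, largeText = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}

console.log(contrastRatio("#000000", "#FFFFFF").toFixed(2)); // "21.00"
console.log(passesAA("#777777", "#FFFFFF")); // false
```

An agent would run `passesAA` for each text layer against its resolved background and flag the failures before handoff.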

Token Architecture for AI Components

AI components are not exempt from the three-tier token discipline. The failure mode is building a rushed AI feature with hardcoded colors and ad-hoc spacing, then having to migrate it during the next design-system update cycle.

The correct approach uses the same primitive → semantic → component hierarchy:

{
  "color": {
    "blue": {
      "200": { "$value": "oklch(80% 0.08 264)", "$type": "color" },
      "600": { "$value": "oklch(52% 0.18 264)", "$type": "color" }
    },
    "ai": {
      "background": {
        "default": { "$value": "{color.blue.600}", "$type": "color" }
      },
      "border": {
        "subtle": { "$value": "{color.blue.200}", "$type": "color" }
      }
    }
  },
  "ai-label": {
    "background": { "$value": "{color.ai.background.default}", "$type": "color" },
    "border":     { "$value": "{color.ai.border.subtle}", "$type": "color" }
  }
}
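To see how the hierarchy resolves at build time, here is a minimal alias resolver for curly-brace references of the kind used above. It is a sketch under simplifying assumptions: values are plain strings, token names contain no dots, and the sample tree is a trimmed copy of the tokens in this section. Production tools such as Style Dictionary handle far more cases.

```typescript
// A token tree is nested objects whose leaves are strings ($value, $type).
type TokenTree = { [key: string]: TokenTree | string };

// Walks a dot-separated path through the tree.
function lookup(tree: TokenTree, path: string): TokenTree | string | undefined {
  let node: TokenTree | string | undefined = tree;
  for (const key of path.split(".")) {
    if (node === undefined || typeof node === "string") return undefined;
    node = node[key];
  }
  return node;
}

// Follows "{a.b.c}" aliases until a concrete value is reached.
function resolveToken(tree: TokenTree, path: string, seen: string[] = []): string {
  if (seen.includes(path)) {
    throw new Error(`Circular alias: ${[...seen, path].join(" -> ")}`);
  }
  const token = lookup(tree, path);
  if (token === undefined || typeof token === "string") {
    throw new Error(`Unknown token: ${path}`);
  }
  const value = token["$value"];
  if (typeof value !== "string") {
    throw new Error(`Token has no $value: ${path}`);
  }
  const isAlias = value.startsWith("{") && value.endsWith("}");
  return isAlias ? resolveToken(tree, value.slice(1, -1), [...seen, path]) : value;
}

const tokens: TokenTree = {
  color: {
    blue: { "600": { $value: "oklch(52% 0.18 264)", $type: "color" } },
    ai: { background: { default: { $value: "{color.blue.600}", $type: "color" } } },
  },
  "ai-label": {
    background: { $value: "{color.ai.background.default}", $type: "color" },
  },
};
console.log(resolveToken(tokens, "ai-label.background")); // "oklch(52% 0.18 264)"
```

The component token resolves through the semantic layer to the primitive, and a reference to a primitive that does not exist fails loudly instead of shipping an empty value.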

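Using OKLCH at the primitive layer means lightness steps in your AI theme are perceptually more uniform. A background at 52% lightness and a border at 80% lightness will differ by roughly the same perceived amount whatever the hue. This is a significant improvement over HSL or hand-picked hex palettes, where apparent contrast shifts unpredictably across hues. OKLCH lightness is not the relative luminance that WCAG contrast ratios are computed from, so each foreground and background pairing still needs to be checked.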

Dark mode for AI components follows the same rule as the rest of the system: first-class token values per theme, not color inversion. A streaming-text cursor that is blue in light mode should map to a lighter, lower-saturation blue in dark mode — not the same hex with reduced opacity.

Governance: New Risks, New Controls

AI introduces governance challenges that do not exist in purely static systems.

Model dependency documentation. When a component’s behavior depends on a specific model or API — for example, a SuggestionChip that calls a classification model — that dependency must be documented in the component spec alongside the usual anatomy, usage, and accessibility sections. Model deprecations and behavior changes are the design-system equivalent of breaking API changes.
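One way to make model dependencies reviewable is to keep them as structured data next to the component spec. The sketch below is an assumption about what such a record might contain; the field names, the model identifier, and the 90-day warning window are all hypothetical.

```typescript
interface ModelDependency {
  model: string;            // hypothetical identifier for the model or API
  purpose: string;          // what the component uses it for
  pinnedVersion: string;
  deprecationDate?: string; // ISO date announced by the provider, if known
}

interface ComponentSpec {
  name: string;
  modelDependencies: ModelDependency[];
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Lists dependencies whose deprecation date falls within the warning window.
function dependenciesAtRisk(
  spec: ComponentSpec,
  today: Date,
  warnDays = 90
): ModelDependency[] {
  const horizon = today.getTime() + warnDays * DAY_MS;
  return spec.modelDependencies.filter(
    (d) => d.deprecationDate !== undefined && Date.parse(d.deprecationDate) <= horizon
  );
}

const suggestionChip: ComponentSpec = {
  name: "SuggestionChip",
  modelDependencies: [
    {
      model: "intent-classifier", // hypothetical
      purpose: "Ranks suggested follow-up prompts",
      pinnedVersion: "2025-11",
      deprecationDate: "2026-06-01",
    },
  ],
};
console.log(dependenciesAtRisk(suggestionChip, new Date("2026-04-01T00:00:00Z")));
```

Running a check like this in the docs build treats an upcoming model deprecation the same way a linter treats a deprecated API call.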

Disclosure requirements. The EU AI Act's transparency obligations, along with laws in several US states, require that certain AI-generated content and AI interactions be disclosed to users. The design system must provide standardized disclosure components (like Carbon’s AILabel), and governance must mandate their use. This is not optional configuration.

Content moderation states. AI outputs can be filtered, blocked, or redacted by safety systems. Components that render AI content must have designed states for filtered, redacted, and unavailable — not just a generic error. Leaving this to product teams produces inconsistent and sometimes misleading experiences.

Contribution process for AI components. AI components touch both design-system concerns (tokens, accessibility, dark mode) and product-specific AI backend concerns (model version, confidence thresholds). Contribution reviews therefore need two tracks: a design-system review and a product-AI review. Document this in your governance model and assign clear ownership.

Practical Starting Points

If you are adding AI patterns to an existing design system today, this is a sensible order to follow:

  1. Audit your current loading and error states. Most systems have gaps here that AI exposes immediately. Skeleton screens, streaming states, and graceful-failure messages are the highest-leverage first step.
  2. Add a disclosure primitive. A single AIBadge or AILabel component with documented placement rules prevents 10 different homegrown implementations.
  3. Establish AI-specific semantic tokens. Even three tokens (ai.background, ai.border, ai.text) create a consistent visual identity before you build more complex components.
  4. Set up a Figma MCP audit step. Even a read-only audit that posts comments is far better than no automation. Start with token-drift detection before attempting write operations.
  5. Document governance rules. Decide who owns AI components, what model-dependency documentation looks like, and where disclosure is mandatory. Write it down before you have 20 teams building in parallel.