Generative UI: When AI Architecture Builds the Interface, Not Just the Text

Miguel González Herrera

Frontend Developer

June 16, 2026

For years, interacting with an AI assistant has followed the exact same pattern: you type a question, you receive text. Useful, yes. However, it is fundamentally limited. What happens when you do not need a sentence, but a comparative table, a trend chart, or a metrics dashboard? That is where Generative UI comes into play. This paradigm goes beyond traditional text. It turns the language model into an interface orchestrator. At Sngular, our development team analyzed this technology to transform real-time user experiences.

What Is the Generative UI Paradigm?

Generative UI is a design pattern where a language model does more than output text: it also decides which interface components to display and what data to inject. Instead of replying "January revenue was €12,400", the model directly renders a bar chart showing all twelve months of the year.

In other words, the model acts as an orchestration layer. It interprets user intent, selects the most appropriate visual representation, and injects real backend data into the correct component. As a result, the conversation shifts from a plain text exchange into an interactive, adaptive experience.

Two Key Strategies for Implementation

We built a demo application to explore two distinct approaches to this problem. Each strategy offers its own unique advantages and clear use cases.

Strategy 1: Predefined Components (Tool Calling)

The model chooses from a closed catalog of tools: displayTable, displayBarChart, displayLineChart, displayPieChart, and displayStats. Once a tool is selected, the application injects the real data and renders the corresponding React component.

Why does this perform so well? The model never serializes the entire dataset. Instead, it only declares what it wants to display and which fields it requires. The actual data is injected on the server side within the execute function of each tool. This directly reduces cost: you do not pay output tokens to re-serialize information that already lives in your backend.

JavaScript

// createTool and expandVizSpec are defined elsewhere in the demo;
// their imports are omitted in this snippet.
import { z } from 'zod';

// The model selects a tool and the required fields
export const displayBarChartTool = createTool({
  description: 'Displays a bar chart with sales data',
  inputSchema: z.object({
    xField: z.string().describe('Field for the X axis (e.g., month)'),
    yField: z.string().describe('Field for the Y axis (e.g., revenue)'),
    title: z.string().describe('Chart title'),
  }),
  execute: async ({ xField, yField, title }) => {
    // Real data is injected here; it is not generated by the model
    return expandVizSpec({ type: 'bar', xField, yField, title });
  },
});
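
The snippet above calls expandVizSpec without showing its body. As a rough sketch of what such a helper might do, the following assumes a small in-memory dataset (SALES_DATA, a hypothetical stand-in for the real backend, with illustrative figures) and simply validates the requested fields before attaching the matching columns. The demo's actual implementation is not shown in this article and may differ.

```javascript
// Hypothetical stand-in for the backend; values are illustrative only.
const SALES_DATA = [
  { month: 'Jan', revenue: 12400, profit: 3100 },
  { month: 'Feb', revenue: 13850, profit: 3600 },
  { month: 'Mar', revenue: 11920, profit: 2750 },
];

// Assumed behavior: check the fields the model asked for, then attach
// only those columns from the server-side dataset to the chart spec.
function expandVizSpec({ type, xField, yField, title }, rows = SALES_DATA) {
  const known = Object.keys(rows[0] ?? {});
  for (const field of [xField, yField]) {
    if (!known.includes(field)) {
      throw new Error(`Unknown field "${field}". Expected one of: ${known.join(', ')}`);
    }
  }
  const data = rows.map((row) => ({ [xField]: row[xField], [yField]: row[yField] }));
  return { type, title, xField, yField, data };
}

// The model's tool arguments stay the same size no matter how many rows exist.
const modelArgs = { type: 'bar', xField: 'month', yField: 'revenue', title: 'Monthly revenue' };
const spec = expandVizSpec(modelArgs);
console.log(JSON.stringify(modelArgs).length, 'characters produced by the model');
console.log(JSON.stringify(spec.data).length, 'characters injected by the server');
```

With three rows the difference is small, but the first number stays constant while the second grows with the dataset, which is where the output-token saving would come from.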

🎥 Video Demo — External API + Predefined Components

Watch on YouTube

Strategy 2: Free JSX Code Generation

In this mode, the model writes the entire React component from scratch. The generated code is compiled and rendered in a sandbox using [react-live](https://github.com/FormidableLabs/react-live). Recharts components and React hooks are available in the scope without requiring explicit imports.

JavaScript

// Example of free-form model output
function Visualization({ data }) {
  return (
    <ResponsiveContainer width="100%" height={300}>
      <AreaChart data={data}>
        <Area type="monotone" dataKey="profit" stroke="#6366f1" fill="#6366f120" />
        <XAxis dataKey="month" />
        <YAxis />
        <Tooltip />
      </AreaChart>
    </ResponsiveContainer>
  );
}

An error boundary catches invalid code execution. Thanks to this, a broken visualization will not crash the entire page. In practice, this strategy shines when a user requests something outside the predefined catalog. Examples include radar charts, composite dashboards, or stacked area comparisons.
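
One way to picture the split between the catalog and free generation is a small router that prefers a predefined tool and falls back to free JSX generation only for requests the catalog cannot serve. This is a hypothetical sketch: the tool names come from the catalog described earlier, but the "kind" labels and the chooseStrategy function are assumptions made for illustration, and in the demo the model itself may make this choice.

```javascript
// Catalog tools named in the article, keyed by a hypothetical "kind" label.
const TOOL_BY_KIND = new Map([
  ['table', 'displayTable'],
  ['bar', 'displayBarChart'],
  ['line', 'displayLineChart'],
  ['pie', 'displayPieChart'],
  ['stats', 'displayStats'],
]);

// Hypothetical router: use a predefined tool when one exists, otherwise
// hand the request to free code generation.
function chooseStrategy(kind) {
  const tool = TOOL_BY_KIND.get(kind);
  return tool
    ? { strategy: 'predefined', tool }
    : { strategy: 'free-generation', kind };
}

console.log(chooseStrategy('bar'));
console.log(chooseStrategy('radar'));
```

A Map is used instead of a plain object so that inherited keys such as "constructor" are not mistaken for catalog entries.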

🎥 Video Demo — External API + Free Component Generation

Watch on YouTube

Where to Execute the Model: Cloud Server vs. Local Browser

Our application adds a second axis of experimentation. The model can run either in the cloud or directly inside the user's browser.

External API (Server-Side)

We use Gemini and Claude (Anthropic) through the Vercel AI SDK. The advantage is straightforward: access to large models that reason exceptionally well about user intent. Furthermore, they reliably follow complex instructions and generate valid React code with enough consistency for production environments.

Model Comparison — External vs. Local

Criteria	External API (Gemini or Claude)	Local Browser (Gemini Nano)
Cost	Paid per token	Free
Latency	Fast and constant	Slower + initial model download
Quality & Reliability	High; follows instructions accurately	Low; prone to errors and invalid code
Privacy	Data is sent to the server	Everything stays on the client
Availability	Always available (with API key)	Only on compatible Chrome browsers

Local Browser AI via Chrome's Prompt API

Chrome's Prompt API is an experimental web API. This technology exposes Gemini Nano directly to the JavaScript environment of any web page. It is worth noting that Gemini Nano is the smallest version of the Gemini family. The model does not live on a remote server. Instead, it downloads once onto the user's device. From that point onward, all inference happens locally.

For our specific use case, the browser can receive a natural language request. The local system decides which visualization fits best and returns the chart specification. This happens without any data ever leaving the user's machine.
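
As a rough illustration of that flow, the sketch below asks the on-device model for a chart decision. It assumes the `LanguageModel` global that Chrome's Prompt API exposes at the time of writing, with its `availability()`, `create()`, `prompt()`, and `destroy()` methods; because the API is experimental, those names may change. The askLocalModel function, the prompt wording, and the reply shape are hypothetical, and the API object is passed in as a parameter so the function can be exercised with a stub outside Chrome.

```javascript
// Sketch only: `api` defaults to the experimental LanguageModel global.
async function askLocalModel(request, fields, api = globalThis.LanguageModel) {
  if (!api) return { ok: false, reason: 'prompt-api-missing' };

  // availability() reports whether the on-device model can be used.
  const availability = await api.availability();
  if (availability === 'unavailable') {
    return { ok: false, reason: 'model-unavailable' };
  }

  // create() starts a session; the model is downloaded first if needed.
  const session = await api.create();
  try {
    const prompt =
      `Available fields: ${fields.join(', ')}.\n` +
      `User request: ${request}\n` +
      'Reply with JSON only: {"type": "...", "xField": "...", "yField": "..."}';
    const text = await session.prompt(prompt);
    return { ok: true, text };
  } finally {
    session.destroy(); // release the on-device session
  }
}
```

In a compatible Chrome build the call would be `askLocalModel('Show revenue by month', ['month', 'revenue'])`. In environments without the API it resolves to `{ ok: false, reason: 'prompt-api-missing' }`, which gives the application a clear signal to fall back to the external API.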

A Powerful Combination: Browser Execution + Predefined Catalogs

The core idea here is that we do not need the model to write code. With the Prompt API in the browser and a closed catalog of components, the model only needs to choose between five options. Then, it indicates which dataset fields to map. This is a micro-decision wrapped inside a strict schema. Therefore, a model of Gemini Nano's size can execute it with high reliability.
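
To make the "micro-decision inside a strict schema" idea concrete, here is a minimal sketch of a validator that sits between the model's reply and the renderer. The five tool names come from the catalog described earlier; the reply shape (`tool` plus a `fields` array) and the parseDecision function are assumptions for illustration, not the demo's actual contract.

```javascript
// The five catalog entries named in the article.
const CATALOG_TOOLS = [
  'displayTable',
  'displayBarChart',
  'displayLineChart',
  'displayPieChart',
  'displayStats',
];

// Hypothetical validator: nothing is rendered unless the tool is in the
// catalog and every requested field exists in the dataset.
function parseDecision(rawReply, datasetFields) {
  let parsed;
  try {
    parsed = JSON.parse(rawReply);
  } catch {
    return { ok: false, reason: 'not-json' };
  }
  if (parsed === null || typeof parsed !== 'object') {
    return { ok: false, reason: 'not-an-object' };
  }
  if (!CATALOG_TOOLS.includes(parsed.tool)) {
    return { ok: false, reason: 'unknown-tool' };
  }
  const fields = Array.isArray(parsed.fields) ? parsed.fields : [];
  if (fields.length === 0 || !fields.every((f) => datasetFields.includes(f))) {
    return { ok: false, reason: 'unknown-field' };
  }
  return { ok: true, tool: parsed.tool, fields };
}

const columns = ['month', 'revenue', 'profit'];
console.log(parseDecision('{"tool":"displayBarChart","fields":["month","revenue"]}', columns));
console.log(parseDecision('Sure! Here is your chart.', columns));
```

Because the space of valid answers is this small, a rejected reply can simply be retried or replaced with a default table, which is far cheaper than recovering from broken generated code.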

The result is a dynamic interface generation workflow with four concrete advantages:

Zero Cost: No API keys, no token billing, and no usage quotas. Inference runs on the user's device, so there is no per-request charge.
Total Privacy: The prompt, dataset, and model output never leave the browser. This is critical for environments managing sensitive medical or financial records, where keeping data on the device can satisfy compliance requirements that are much harder to meet when prompts are sent to a cloud model.
Acceptable Latency: In predefined mode, the response is a tiny JSON string. Thus, response times remain reasonable on consumer hardware.
Offline Functionality: Once downloaded, network connectivity is no longer required.

🎥 Video Demo — Local Browser AI + Predefined Components

Watch on YouTube

Where Things Get Complex: Free Generation with Gemini Nano

Free mode is a completely different story. Asking a local model to write a full React component is highly complex. It must handle Recharts, hooks, and valid JSX syntax, which pushes the model far outside its comfort zone.

Let us remember the obvious: an LLM is a probabilistic model. Correct output is never fully guaranteed. The smaller the model, the more frequent and varied the errors, such as missing imports or malformed data. In practice, free mode with Gemini Nano surfaces two combined issues:

High Wait Times: The output is long, and local inference on consumer hardware is slow.
Inconsistent Outputs: The model frequently fails to output correct JSX syntax.

Conversely, powerful cloud models via external APIs minimize both challenges. Latency drops significantly while consistency rises, turning free generation into a viable asset.

The Four Matrix Combinations

Our interface combines both selectors into a clear four-option matrix:

	Predefined Components	Free Code Generation
External API	Fast and predictable. The ideal choice for production environments	Maximum flexibility. Open-ended catalog without sacrificing smoothness
Local Browser	Free and private, acceptable response times	Possible, but suffers from high latency and inconsistent responses

Key Lessons Learned

1. The model doesn’t need to see the data to visualize it: In predefined mode, the model only specifies the intent (“I want a bar chart of monthly revenue”). The actual data is fed in by the server. This separation is key to keeping costs under control and avoiding truncated responses.

2. Browser-based AI isn’t a “cheaper” version of the server: it’s a different tool. Gemini Nano in Chrome isn’t a substitute for large models. It’s free and private, but it’s also smaller, slower, and less accurate. It works very well for choosing from a closed catalog of components; for generating free-form React code, errors and wait times are common. The right decision isn’t “local vs. server” in the abstract, but rather understanding what each guarantees and choosing based on the use case.

Conclusion

Generative UI isn’t just a lab trend. It’s a natural evolution of how language models are integrated into real-world products. They act as orchestration layers that decide what to display and how to do it, not as mere text-based chatbots.

The two strategies we’ve explored—predefined catalog and free-form generation—aren’t mutually exclusive. In production, the most sensible approach is to combine them: use tool calling for known and frequent cases, and reserve free generation for when the user needs something that wasn’t anticipated.

Want to learn more about how we apply generative AI in real products? Contact us.

Hi! 👋 I'm Miguel, a software developer at Sngular with a strong focus on JavaScript and open source projects. I enjoy building scalable, maintainable applications and contributing to the developer community.