Mosaic AI is the GenAI and machine learning platform inside Databricks, built from the $1.4B MosaicML acquisition Databricks completed in June 2023. It bundles model serving, custom model training and fine-tuning, agent building, vector search for retrieval, and AI-judge evaluation into one platform, governed by MLflow and Unity Catalog. Databricks markets it as the only unified platform for agent systems, a vendor claim worth evaluating against your own stack.

What are the components of Mosaic AI?

Mosaic AI has six main components: Model Serving for deploying and monitoring GenAI, classical ML, and agents; Mosaic AI Training for pretraining custom LLMs and fine-tuning open-source models; Agent Bricks and the Agent Framework for building and evaluating agents grounded in enterprise data, including Genie Code; AI and Vector Search for retrieval-augmented generation; Agent Evaluation using AI judges to catch regressions and trace root causes; and the Unity AI Gateway that governs every LLM and MCP endpoint.

What governs Mosaic AI agents and models?

Mosaic AI is built on two governance foundations. MLflow, the open-source Apache-2.0 AI engineering platform, handles experiment tracking, model registry, evaluation with over 50 metrics and LLM judges, and deployment. Unity Catalog, open-sourced in June 2024 under Apache 2.0, provides access controls, AI guardrails, rate limits, and data lineage, and can govern models hosted outside Databricks too.

What ROI do Databricks customers report from Mosaic AI?

Databricks publishes several customer ROI figures for Mosaic AI, all of which are vendor-reported customer claims rather than independently audited results. The company cites FactSet at a 44% accuracy gain, Comcast at a 10x cost reduction, Block at $10M in productivity, and ICE at 96% answer accuracy. Treat these as directional marketing evidence and validate against your own workloads before budgeting.

What Is Mosaic AI? Databricks' GenAI and Agent Platform

Mosaic AI is the GenAI and machine learning platform that lives inside Databricks. If you already run data engineering, analytics, or BI on the Databricks lakehouse, Mosaic AI is the layer that lets the same teams serve models, fine-tune open-source LLMs, build agents, and run retrieval over enterprise data, all governed by the same access controls you already use for your tables. It is the part of Databricks aimed squarely at the move from dashboards to agents.

This breakdown is written for practitioners deciding whether Mosaic AI belongs in their stack. We cover where it came from, the six components you will actually touch, the governance foundation that holds it together, and the customer ROI numbers Databricks publishes, which we label plainly as vendor-reported so you can weigh them honestly.

MosaicML Acquisition: $1.4B, June 2023

Core Components: Serving, Training, Agents, Search, Eval, Gateway

Governance Foundations: MLflow + Unity Catalog

Free Trial Credits: $400, 14-day agent trial (vendor-reported)

What Is Mosaic AI?

Mosaic AI is Databricks' answer to a single question: how do you build, serve, and govern AI systems on the same platform that already holds your data? Rather than ship your data out to a separate model vendor and a separate vector database and a separate evaluation tool, Mosaic AI puts those capabilities next to your lakehouse tables. The model serving, the training jobs, the agents, and the retrieval index all read from and write to data your team already controls.

The pitch matters because the alternative is a sprawl of disconnected tools. Most teams that start building with LLMs end up stitching together a model API, an orchestration framework, a vector store, a tracing tool, and a governance layer bolted on after the fact. Mosaic AI collapses that into one platform where lineage and access control are not an afterthought. Databricks goes further and markets it as the only unified platform for agent systems, which is a vendor claim worth testing against your own requirements rather than taking at face value.

Practitioner note: Mosaic AI is not a model you call. It is the platform you build on. The value is consolidation: one governance perimeter, one place to trace a bad answer back through retrieval and prompts, and one set of credentials. The tradeoff is that you are committing to the Databricks ecosystem to get it.

From MosaicML to Mosaic AI

Mosaic AI did not start inside Databricks. It traces directly to the $1.4 billion acquisition of MosaicML in June 2023. MosaicML had built a reputation for efficient large-model training, the unglamorous engineering work of pretraining and fine-tuning models without burning a fortune in compute. Databricks bought that capability and folded it into its broader data platform.

The result is a platform that did not bolt generative AI onto a warehouse as a marketing exercise. The training muscle came from a team whose entire product was making model training cheaper and more reliable. That heritage shows up in the Mosaic AI Training component, where pretraining a custom LLM and fine-tuning an open-source one are first-class workflows rather than tutorials.

$1.4B: Databricks acquired MosaicML in June 2023 to bring efficient large-model training in-house. That acquisition is the foundation Mosaic AI is built on.

Understanding this origin helps you read the platform correctly. Mosaic AI leans toward teams that want to own their models, not just call someone else's API. If your goal is to fine-tune on proprietary data and keep that data inside your own governance perimeter, the MosaicML lineage is why Mosaic AI handles that path well.

The Six Components

Mosaic AI is best understood as six components that cover the full lifecycle of an AI system, from training a model to deploying an agent and proving it works. You rarely use all six at once, but most production agent projects touch four or five of them. Here is what each one does.

Model Serving
One endpoint to deploy, govern, query, and monitor
Serves: GenAI, ML, agents
Covers: Deploy + monitor
Scope: Unified

Mosaic AI Training
Pretrain custom LLMs, fine-tune open models
Pretrain: Custom LLMs
Fine-tune: Open-source
Classical ML: On your data

Agent Bricks
Build and deploy agents grounded in your data
Framework: Agent Framework
Includes: Genie Code
Grounding: Enterprise data

AI / Vector Search
Vector database with real-time sync for RAG
Type: Vector DB
Sync: Real-time
Use case: Retrieval

Agent Evaluation
AI judges for quality, regressions, and root cause
Method: AI judges
Catches: Regressions
Traces: Root cause

Unity AI Gateway
Governs every LLM and MCP endpoint
Governs: Every LLM
Also: MCP endpoints
Role: Control plane

The two pieces practitioners tend to undervalue are Agent Evaluation and the Unity AI Gateway. Evaluation with AI judges is what separates a demo from a system you can put in front of customers, because it catches the silent quality regressions that creep in when you change a prompt or swap a model. The Unity AI Gateway is an LLM-gateway-style control plane that routes and governs every model and MCP call, so policy and rate limits are enforced in one place instead of scattered across application code.
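
To make the regression point concrete, here is a minimal sketch of comparing judge scores before and after a change. It is not the Agent Evaluation API. It assumes you already have per-question judge scores between 0 and 1 for a baseline and a candidate version; the function name `find_regressions`, the data shape, and the 0.1 tolerance are all hypothetical choices for illustration.

```python
from statistics import mean


def find_regressions(baseline, candidate, tolerance=0.1):
    """Return questions whose judge score dropped by more than `tolerance`.

    `baseline` and `candidate` map question IDs to judge scores in [0, 1].
    The data shape and the threshold are illustrative assumptions.
    """
    regressions = {}
    for question_id, old_score in baseline.items():
        new_score = candidate.get(question_id)
        if new_score is None:
            continue  # question was not re-evaluated, so there is nothing to compare
        drop = old_score - new_score
        if drop > tolerance:
            regressions[question_id] = round(drop, 3)
    return regressions


# Hypothetical judge scores before and after a prompt change.
baseline_scores = {"q1": 0.9, "q2": 0.8, "q3": 0.7}
candidate_scores = {"q1": 0.9, "q2": 0.5, "q3": 0.75}

regressed = find_regressions(baseline_scores, candidate_scores)
average_change = mean(candidate_scores.values()) - mean(baseline_scores.values())
```

In this made-up data the average score moves by less than 0.1 while one question loses 0.3, which is the kind of silent regression a per-question comparison surfaces and an aggregate score can hide.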

Built on MLflow and Unity Catalog

Mosaic AI does not invent its governance from scratch. It rests on two open foundations that Databricks also maintains in the open-source community, which is part of why the platform feels coherent rather than bolted together.

MLflow is the open-source, Apache-2.0 AI engineering platform underneath Mosaic AI. It handles experiment tracking, the model registry, evaluation with over 50 metrics and LLM judges, prompt optimization, and deployment. Databricks puts it at more than 60M monthly downloads, a vendor-reported figure.

Unity Catalog is the unified governance layer, open-sourced in June 2024 under Apache 2.0. It provides access controls, AI guardrails, rate limits, and data lineage, and it can govern models hosted outside Databricks too, not just the ones you serve internally.

Why does this matter to a practitioner? Because the governance is the same whether you are querying a table, serving a fine-tuned model, or running an agent. The access control that protects your customer data also protects the model that reads it and the agent that acts on it. You do not configure three separate permission systems and hope they agree.
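
The idea of one permission system for tables, models, and agents can be sketched in a few lines. This is an illustrative model only, not the Unity Catalog API: the `Grant` record, the `is_allowed` helper, and the resource and privilege names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str   # user or group name
    resource: str    # table, model, or agent identifier
    privilege: str   # e.g. "SELECT" or "EXECUTE" (names assumed for illustration)


def is_allowed(grants, principal, resource, privilege):
    """Check a single grant set, whatever kind of resource is requested."""
    return Grant(principal, resource, privilege) in grants


# Hypothetical grants: one set covers a table, a model, and an agent.
grants = {
    Grant("analysts", "sales.customers", "SELECT"),
    Grant("analysts", "sales.churn_model", "EXECUTE"),
    Grant("support_bot", "sales.support_agent", "EXECUTE"),
}

can_read_table = is_allowed(grants, "analysts", "sales.customers", "SELECT")
can_call_model = is_allowed(grants, "analysts", "sales.churn_model", "EXECUTE")
can_run_agent = is_allowed(grants, "analysts", "sales.support_agent", "EXECUTE")
```

The design point is that all three checks go through the same function and the same grant set, so there is only one place where a missing permission can hide.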

MLflow is also the reason traceability is built in rather than retrofitted. When an agent gives a bad answer, the model registry and experiment tracking give you a path back to which model version, which prompt, and which retrieval results produced it. For anyone who has tried to debug an LLM system without that lineage, this is the difference between an afternoon and a week.
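
As a rough picture of what that lineage buys you, the sketch below stores one record per answer and looks it up by request ID. It does not use the MLflow API; the `AnswerTrace` fields, the `find_trace` helper, and the sample values are hypothetical stand-ins for whatever your tracing layer actually records.

```python
from dataclasses import dataclass, field


@dataclass
class AnswerTrace:
    """One answer plus everything needed to explain it (fields are assumed)."""
    request_id: str
    model_version: str
    prompt_version: str
    retrieved_doc_ids: list = field(default_factory=list)
    answer: str = ""


def find_trace(traces, request_id):
    """Return the trace for a request, or None if it was never recorded."""
    for trace in traces:
        if trace.request_id == request_id:
            return trace
    return None


# Hypothetical traces for two requests served by different model versions.
traces = [
    AnswerTrace("req-1", "model-v3", "prompt-v7", ["doc-12", "doc-40"], "Refunds take 5 days."),
    AnswerTrace("req-2", "model-v4", "prompt-v7", ["doc-99"], "Refunds take 30 days."),
]

bad_answer = find_trace(traces, "req-2")
```

With a record like this, a complaint about one answer narrows immediately to a model version, a prompt version, and a specific set of retrieved documents.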

Practitioner note: Both foundations are open source, which lowers lock-in at the governance layer specifically. Unity Catalog being able to govern external models means you are not forced to serve everything inside Databricks to keep a single audit trail. Treat that as a real architectural advantage when you compare platforms.

Customer ROI, Vendor-Reported

Databricks publishes several customer outcomes for Mosaic AI. We are repeating them here because they are part of how the platform is positioned, but read them with the right frame: these are vendor-reported customer claims, not independently audited results. They tell you what is possible for a well-resourced customer, not what you should budget for.

FactSet: +44% accuracy (vendor-reported accuracy gain)

Comcast: 10x lower cost (vendor-reported cost reduction)

Block: $10M gained (vendor-reported productivity)

ICE: 96% accuracy (vendor-reported answer accuracy)

The honest way to use numbers like these is as evidence that the platform can support serious production workloads, paired with healthy skepticism about whether your team, your data, and your use case will land in the same range. A 10x cost reduction usually compares a tuned Mosaic AI deployment against a previous architecture that had its own inefficiencies. Your baseline is different, so your result will be too. Treat these as directional, run your own pilot, and measure against the system you actually run today.
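
The baseline effect is simple arithmetic, sketched here with invented numbers. The function name `cost_reduction_factor` and every dollar figure are hypothetical; none of them come from Databricks or its customers.

```python
def cost_reduction_factor(baseline_monthly_cost, pilot_monthly_cost):
    """Return how many times cheaper the pilot is than the baseline."""
    if pilot_monthly_cost <= 0:
        raise ValueError("pilot cost must be positive")
    return baseline_monthly_cost / pilot_monthly_cost


# Hypothetical: the same pilot cost measured against two different baselines.
pilot_cost = 5_000

# An untuned legacy architecture makes the improvement look dramatic.
against_inefficient_baseline = cost_reduction_factor(50_000, pilot_cost)

# A baseline that was already reasonably tuned shows a smaller factor.
against_tuned_baseline = cost_reduction_factor(15_000, pilot_cost)
```

The pilot is identical in both cases; only the starting point changes, which is why the factor you measure in your own pilot is the one to budget against.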

Who Mosaic AI Is For

Mosaic AI is not the right starting point for everyone. It pays off when the data is already on Databricks and the team needs to govern AI the way it governs data. Here is who tends to get the most out of it.

Data platform teams

Groups already running pipelines and analytics on the lakehouse get AI serving, training, and retrieval without standing up a separate stack or moving data out of their governance perimeter.

ML engineers building agents

Agent Bricks, vector search, and AI-judge evaluation cover the build-test-ship loop, so engineers can ground agents in enterprise data and prove quality before release.

Governance and security leads

Unity Catalog and the Unity AI Gateway put access control, guardrails, rate limits, and lineage for models and agents under one audited perimeter, including externally hosted models.

Teams that fine-tune

The MosaicML training heritage makes custom pretraining and open-source fine-tuning first-class, which suits teams that want to own their models rather than only call external APIs.

What to Watch Out For

No platform is free of tradeoffs. Mosaic AI's strengths come with commitments you should weigh before you build on it.

The consolidation that makes Mosaic AI attractive depends on your data and workflows already living on the lakehouse. If they do not, you are buying into the broader platform, not just an AI tool. Factor the platform commitment into the decision.

"The only unified platform for agent systems" is a positioning statement, not an independent finding. Competing platforms make similar claims. Evaluate the specific capabilities you need rather than the superlative.

The 44%, 10x, $10M, and 96% figures are customer outcomes Databricks chose to publish. They are real for those customers under their conditions. Run a scoped pilot and measure against your current system before you forecast savings.

Frequently Asked Questions

Is Mosaic AI a separate product from Databricks?

No. Mosaic AI is the GenAI and machine learning platform inside Databricks, not a standalone purchase. It uses the same lakehouse data, the same Unity Catalog governance, and the same workspace as the rest of Databricks. You reach it through Databricks rather than as a separate service.

Where did Mosaic AI come from?

It originates from the $1.4 billion MosaicML acquisition Databricks completed in June 2023. MosaicML specialized in efficient large-model training, and that capability became the Mosaic AI Training component. The rest of the platform grew around it.

What is the difference between Agent Bricks and the Agent Framework?

Both are part of Mosaic AI's agent tooling. Agent Bricks and the Agent Framework let you build, deploy, and evaluate agents grounded in enterprise data, and they include Genie Code. Treat them as the agent-building surface of Mosaic AI rather than two competing products.

How does Mosaic AI handle retrieval-augmented generation?

Through AI and Vector Search, a vector database with real-time sync. You embed your enterprise data, keep the index synced as the underlying data changes, and let agents retrieve relevant context at query time. Because it is part of Mosaic AI, the same governance applies to the retrieval index as to your tables.
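
The embed, sync, and retrieve loop can be illustrated with a toy in-memory index. This is a sketch of the general technique, not the Databricks Vector Search API: `ToyVectorIndex` and its methods are hypothetical, and the three-dimensional embeddings are hand-written stand-ins for what an embedding model would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorIndex:
    """In-memory index; upsert and delete stand in for syncing with source data."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, doc_id, embedding):
        self._vectors[doc_id] = embedding

    def delete(self, doc_id):
        self._vectors.pop(doc_id, None)

    def search(self, query_embedding, top_k=2):
        scored = [
            (cosine_similarity(query_embedding, vector), doc_id)
            for doc_id, vector in self._vectors.items()
        ]
        scored.sort(reverse=True)  # highest similarity first
        return [doc_id for _, doc_id in scored[:top_k]]


index = ToyVectorIndex()
index.upsert("refund_policy", [1.0, 0.0, 0.0])
index.upsert("shipping_times", [0.0, 1.0, 0.0])
index.upsert("returns_faq", [0.9, 0.1, 0.0])

query = [1.0, 0.0, 0.0]  # assumed embedding of a refund question
before_sync = index.search(query, top_k=2)

# The source row is deleted, so the synced index drops it too.
index.delete("refund_policy")
after_sync = index.search(query, top_k=2)
```

The second search no longer returns the deleted document, which is the behavior a synced index exists to guarantee: agents retrieve from what the underlying data currently says, not from a stale copy.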

Can Mosaic AI govern models hosted outside Databricks?

Yes. Unity Catalog, which underpins Mosaic AI governance, can govern models hosted outside Databricks, and the Unity AI Gateway governs every LLM and MCP endpoint. That means you can keep a single audit trail and policy layer even when some models live elsewhere.
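
A single policy layer in front of internal and external models might look roughly like the sketch below. It is not the Unity AI Gateway API: `ToyGateway`, its `authorize` method, the endpoint names, and the fixed-window rate limit are hypothetical choices made to keep the example small.

```python
class ToyGateway:
    """Central check for every model call: registered endpoints plus a rate limit."""

    def __init__(self, allowed_endpoints, max_calls_per_window):
        self.allowed_endpoints = set(allowed_endpoints)
        self.max_calls_per_window = max_calls_per_window
        self._calls = {}  # (caller, window) -> number of calls already allowed

    def authorize(self, caller, endpoint, window):
        """Return a decision string; `window` is a caller-supplied time bucket."""
        if endpoint not in self.allowed_endpoints:
            return "denied: unknown endpoint"
        key = (caller, window)
        used = self._calls.get(key, 0)
        if used >= self.max_calls_per_window:
            return "denied: rate limit"
        self._calls[key] = used + 1
        return "allowed"


# Hypothetical setup: one internal and one externally hosted model, 2 calls per window.
gateway = ToyGateway({"internal-llm", "external-llm"}, max_calls_per_window=2)

decisions = [
    gateway.authorize("app", "internal-llm", window=0),
    gateway.authorize("app", "external-llm", window=0),
    gateway.authorize("app", "internal-llm", window=0),  # third call in the window
    gateway.authorize("app", "unregistered-llm", window=0),
    gateway.authorize("app", "internal-llm", window=1),  # new window, limit resets
]
```

Because the internal and external endpoints share one counter and one allow-list, the limit and the audit point stay in one place rather than being re-implemented in each application.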

Video Resources

Databricks Mosaic AI Overview

High-level walkthrough of the Mosaic AI platform components and how they fit together.

Building Agents with Agent Bricks

Hands-on look at building and deploying agents grounded in enterprise data.

Vector Search and RAG on Databricks

Tutorial on wiring AI and Vector Search into a retrieval-augmented generation pipeline.

Fact-checked against vendor documentation and official sources, June 2026

Databricks, Mosaic AI, MosaicML, Unity Catalog, and Agent Bricks are trademarks of Databricks, Inc. MLflow is a trademark of LF Projects, LLC. MCP refers to the Model Context Protocol. All other trademarks belong to their respective owners.
