Spice.ai OSS
Spice is an open-source SQL query and AI compute engine, written in Rust, for data-driven apps and agents.
Spice provides three industry-standard APIs in a lightweight, portable runtime (a single ~140 MB binary):
- SQL Query APIs: Arrow Flight, Arrow Flight SQL, ODBC, JDBC, and ADBC (a minimal query sketch follows this list).
- OpenAI-Compatible APIs: HTTP APIs compatible with the OpenAI SDK and AI SDK, with local model serving (CUDA/Metal accelerated) and a gateway to hosted models.
- Iceberg Catalog REST APIs: A unified Iceberg Catalog API.
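For example, here is a minimal sketch of the SQL HTTP API, assuming a local runtime on its default HTTP port 8090 and a `taxi_trips` dataset already defined in `spicepod.yaml` (both names are illustrative):

```python
# Query Spice's HTTP SQL endpoint using only the Python standard library.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:8090/v1/sql",          # default HTTP port (assumed)
    data=b"SELECT COUNT(*) AS trips FROM taxi_trips;",
    headers={"Content-Type": "text/plain"},  # the endpoint accepts plain-text SQL
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))           # results return as a JSON array of rows
```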
Developers can focus on building data apps and AI agents confidently, knowing they are grounded in data.
Spice is primarily used for:
- Data Federation: SQL query across any database, data warehouse, or data lake. Learn More.
- Data Materialization and Acceleration: Materialize, accelerate, and cache database queries (a spicepod sketch follows this list). Read the MaterializedView interview: Building a CDN for Databases.
- AI Apps and Agents: An AI-database powering retrieval-augmented generation (RAG) and intelligent agents. Learn More.
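To give a feel for how federation and acceleration are wired together, here is a sketch that writes a minimal `spicepod.yaml` from Python. The connector string, dataset name, engine, and refresh interval are illustrative assumptions; see the connector and acceleration docs for the exact parameters your source needs.

```python
# Write a minimal spicepod.yaml: one federated PostgreSQL table,
# materialized locally into DuckDB and refreshed on an interval.
from pathlib import Path

SPICEPOD = """\
version: v1beta1
kind: Spicepod
name: orders_app
datasets:
  - from: postgres:public.orders   # federated source of truth (assumed table)
    name: orders
    acceleration:
      enabled: true
      engine: duckdb               # local materialized working set
      refresh_check_interval: 10s  # re-materialize every 10 seconds
"""
Path("spicepod.yaml").write_text(SPICEPOD)
```

Running `spice run` in the same directory then serves `orders` over the SQL, Flight, and HTTP APIs.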
Spice is built on industry-leading technologies including Apache DataFusion, Apache Arrow, Arrow Flight, SQLite, and DuckDB. If you want to build with DataFusion or DuckDB, Spice provides a simple, flexible, and production-ready engine you can just use.
Why Spice?
Spice makes it fast and easy to query data from one or more sources using SQL. You can co-locate a managed dataset with your application or machine learning model, and accelerate it with Arrow in-memory, SQLite/DuckDB, or with attached PostgreSQL for fast, high-concurrency, low-latency queries. Accelerated engines give you flexibility and control over query cost and performance.
Spice simplifies building data-driven AI applications and agents by making it fast and easy to query, federate, and accelerate data from one or more sources using SQL, while grounding AI in real-time, reliable data. Co-locate datasets with apps and AI models to power AI feedback loops, enable RAG and search, and deliver fast, low-latency data query and AI inference with full control over cost and performance.
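For high-concurrency clients, an accelerated dataset can also be queried over Arrow Flight SQL via ADBC. A sketch, assuming the default Flight port 50051 and the illustrative `orders` dataset from the earlier example (`pip install adbc-driver-flightsql pyarrow`):

```python
# Query Spice over Arrow Flight SQL using the ADBC DBAPI driver.
from adbc_driver_flightsql import dbapi

conn = dbapi.connect("grpc://localhost:50051")  # default Flight port (assumed)
cur = conn.cursor()
cur.execute("SELECT order_id, total FROM orders LIMIT 5")  # columns illustrative
print(cur.fetch_arrow_table())  # results arrive as an Arrow table, not row-by-row
cur.close()
conn.close()
```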
How is Spice different?
- AI-Native Runtime: Spice combines data query and AI inference in a single engine for data-grounded, accurate AI.
- Application-Focused: Designed to run distributed at the application and agent level, often as a 1:1 or 1:N mapping between app and Spice instance, unlike traditional data systems built for many apps on one centralized database. It's common to spin up multiple Spice instances, even one per tenant or customer.
- Dual-Engine Acceleration: Supports both OLAP (Arrow/DuckDB) and OLTP (SQLite/PostgreSQL) engines at the dataset level, providing flexible performance across analytical and transactional workloads.
- Disaggregated Storage: Separates compute from disaggregated storage, co-locating local, materialized working sets of data with applications, dashboards, or ML pipelines while accessing source data in its original storage.
- Edge to Cloud Native: Deploy as a standalone instance, Kubernetes sidecar, microservice, or cluster, across edge/POP, on-prem, and public clouds. Chain multiple Spice instances for tier-optimized, distributed deployments.
How does Spice compare?
Data Query and Analytics
| Feature | Spice | Trino / Presto | Dremio | ClickHouse | Materialize |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps/agents | Big data analytics | Interactive analytics | Real-time analytics | Real-time analytics |
| Primary Deployment Model | Sidecar | Cluster | Cluster | Cluster | Cluster |
| Federated Query Support | ✅ | ✅ | ✅ | ❌ | ❌ |
| Acceleration/Materialization | ✅ (Arrow, SQLite, DuckDB, PostgreSQL) | Intermediate storage | Reflections (Iceberg) | Materialized views | ✅ (Real-time views) |
| Catalog Support | ✅ (Iceberg, Unity Catalog) | ✅ | ✅ | ❌ | ❌ |
| Query Result Caching | ✅ | ✅ | ✅ | ✅ | Limited |
| Multi-Modal Acceleration | ✅ (OLAP + OLTP) | ❌ | ❌ | ❌ | ❌ |
| Change Data Capture (CDC) | ✅ (Debezium) | ❌ | ❌ | ❌ | ✅ (Debezium) |
AI Apps and Agents
| Feature | Spice | LangChain | LlamaIndex | AgentOps.ai | Ollama |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps | Agentic workflows | RAG apps | Agent operations | LLM apps |
| Programming Language | Any language (HTTP interface) | JavaScript, Python | Python | Python | Any language (HTTP interface) |
| Unified Data + AI Runtime | ✅ | ❌ | ❌ | ❌ | ❌ |
| Federated Data Query | ✅ | ❌ | ❌ | ❌ | ❌ |
| Accelerated Data Access | ✅ | ❌ | ❌ | ❌ | ❌ |
| Tools/Functions | ✅ | ✅ | ✅ | Limited | Limited |
| LLM Memory | ✅ | ✅ | ❌ | ❌ | ❌ |
| Evaluations (Evals) | ✅ | Limited | ❌ | Limited | ❌ |
| Search | ✅ (VSS) | ✅ | ✅ | Limited | Limited |
| Caching | ✅ (Query and results caching) | Limited | ❌ | ❌ | ❌ |
| Embeddings | ✅ (Built-in & pluggable models/DBs) | ✅ | ✅ | Limited | ✅ |
✅ = Fully supported
❌ = Not supported
Limited = Partial or restricted support
Example Use-Cases
Data-grounded Agentic AI Applications
- OpenAI-compatible API: Connect to hosted models (OpenAI, Anthropic, xAI) or deploy locally (Llama, NVIDIA NIM), as sketched after this list. AI Gateway Recipe
- Federated Data Access: Query using SQL and NSQL (text-to-SQL) across databases, data warehouses, and data lakes with advanced query push-down for fast retrieval across disparate data sources. Federated SQL Query Recipe
- Search and RAG: Search and retrieve context with accelerated embeddings for retrieval-augmented generation (RAG) workflows. Vector Search over GitHub Files
- LLM Memory and Observability: Store and retrieve history and context for AI agents while gaining deep visibility into data flows, model performance, and traces. LLM Memory Recipe | Monitoring Features Documentation
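Because the API is OpenAI-compatible, the standard `openai` SDK works as-is. A sketch, assuming the default HTTP port and a model named `gpt-4o` in `spicepod.yaml` (the model name is whatever your spicepod declares):

```python
# Chat against Spice's OpenAI-compatible endpoint with the openai SDK.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8090/v1",  # point the SDK at the local runtime
    api_key="unused",                     # the SDK requires a key; local Spice may not
)
resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder: use the model name from your spicepod
    messages=[{"role": "user", "content": "Summarize yesterday's orders."}],
)
print(resp.choices[0].message.content)
```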
Database CDN and Query Mesh
- Data Acceleration: Co-locate materialized datasets in Arrow, SQLite, and DuckDB with applications for sub-second queries (a latency sketch follows this list). DuckDB Data Accelerator Recipe
- Resiliency and Local Dataset Replication: Maintain application availability with local replicas of critical datasets. Local Dataset Replication Recipe
- Responsive Dashboards: Enable fast, real-time analytics by accelerating data for frontends and BI tools. Sales BI Dashboard Demo
- Simplified Legacy Migration: Use a single endpoint to unify legacy systems with modern infrastructure, including federated SQL querying across multiple sources. Federated SQL Query Recipe
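A quick way to sanity-check the sub-second claim against your own accelerated dataset is to time the HTTP SQL endpoint directly (same assumed port and illustrative dataset name as the earlier sketches); latency is typically milliseconds once the working set is materialized locally:

```python
# Measure end-to-end latency of a query against a locally accelerated dataset.
import time
import urllib.request

req = urllib.request.Request(
    "http://localhost:8090/v1/sql",
    data=b"SELECT COUNT(*) FROM orders;",
    headers={"Content-Type": "text/plain"},
    method="POST",
)
start = time.perf_counter()
urllib.request.urlopen(req).read()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"query took {elapsed_ms:.1f} ms")
```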
Retrieval-Augmented Generation (RAG)
- Unified Search with Vector Similarity: Perform efficient vector similarity search across structured and unstructured data sources (an embeddings sketch follows this list). Vector Search over GitHub Files
- Semantic Knowledge Layer: Define a semantic context model to enrich data for AI. Semantic Model Feature Documentation
- Text-to-SQL: Convert natural language queries into SQL using built-in NSQL and sampling tools for accurate query generation. Text-to-SQL Recipe
- Model and Data Evaluations: Assess model performance and data quality with integrated evaluation tools. Language Model Evaluations Recipe
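Embeddings for these RAG workflows are served through the same OpenAI-compatible surface. A sketch, assuming an embeddings model has been named `embed` in `spicepod.yaml` (the name is a placeholder):

```python
# Generate an embedding via Spice's OpenAI-compatible embeddings endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8090/v1", api_key="unused")
emb = client.embeddings.create(
    model="embed",                # placeholder spicepod model name
    input="customer refund policy",
)
print(len(emb.data[0].embedding))  # vector dimensionality
```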
FAQ
- Is Spice a cache? Not specifically; you can think of Spice data acceleration as an active cache, materialization, or data prefetcher. A cache fetches data on a cache miss, while Spice prefetches and materializes filtered data on an interval, on a trigger, or as data changes using CDC (a refresh sketch follows this FAQ). In addition to acceleration, Spice supports results caching.
- Is Spice a CDN for databases? Yes, a common use-case for Spice is as a CDN for different data sources. Using CDN concepts, Spice enables you to ship (load) a working set of your database (or data lake, or data warehouse) to where it's most frequently accessed, such as a data-intensive application or AI context.
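The refresh behaviors described above map to per-dataset acceleration settings. Here is a sketch of those knobs as a `spicepod.yaml` fragment; the connector strings are illustrative, and the CDC example omits the connector-specific Kafka parameters a real Debezium source needs:

```python
# Illustrative spicepod.yaml fragment showing interval- and CDC-driven refresh.
REFRESH_FRAGMENT = """\
datasets:
  - from: postgres:public.customers    # assumed source table
    name: customers
    acceleration:
      enabled: true
      engine: sqlite
      refresh_mode: full               # re-fetch the filtered working set...
      refresh_check_interval: 60s      # ...on a fixed interval
  - from: debezium:cdc.public.orders   # assumed CDC topic via Debezium
    name: orders
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes            # apply change events as they arrive
"""
```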
Watch a 30-sec BI dashboard acceleration demo
See more demos on YouTube.
Intelligent Applications and Agents
Spice enables developers to build data-grounded AI applications and agents by co-locating data and ML models with applications. Read more about the vision to enable the development of intelligent AI-driven applications.
Connect with us
- Build an app with Spice and send us feedback and suggestions at hey@spice.ai or on Discord, X, or LinkedIn.
- File an issue if you see something not quite working correctly.
- Join our team (We're hiring!)
- Contribute code or documentation to the project (see CONTRIBUTING.md).