Spice.ai OSS
Spice is an open-source SQL query and AI compute engine, written in Rust, for data-driven apps and agents.
Spice provides three industry-standard APIs in a lightweight, portable runtime (a single ~140 MB binary):
- SQL Query APIs: Arrow Flight, Arrow Flight SQL, ODBC, JDBC, and ADBC (a minimal query sketch follows this list).
- OpenAI-Compatible APIs: HTTP APIs compatible with the OpenAI SDK and AI SDK, with local model serving (CUDA/Metal accelerated) and a gateway to hosted models.
- Iceberg Catalog REST APIs: A unified Iceberg Catalog API.
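For example, here is a minimal sketch of the SQL HTTP API, assuming a local runtime on its default HTTP port 8090 and a `taxi_trips` dataset already defined in `spicepod.yaml` (both names are illustrative):

```python
# Query Spice's HTTP SQL endpoint using only the Python standard library.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:8090/v1/sql",          # default HTTP port (assumed)
    data=b"SELECT COUNT(*) AS trips FROM taxi_trips;",
    headers={"Content-Type": "text/plain"},  # the endpoint accepts plain-text SQL
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))           # results return as a JSON array of rows
```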
Developers can focus on building data apps and AI agents confidently, knowing they are grounded in data.
Spice is primarily used for:
- Data Federation: SQL query across any database, data warehouse, or data lake. Learn More.
- Data Materialization and Acceleration: Materialize, accelerate, and cache database queries (a spicepod sketch follows this list). Read the MaterializedView interview: Building a CDN for Databases.
- AI Apps and Agents: An AI-database powering retrieval-augmented generation (RAG) and intelligent agents. Learn More.
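To give a feel for how federation and acceleration are wired together, here is a sketch that writes a minimal `spicepod.yaml` from Python. The connector string, dataset name, engine, and refresh interval are illustrative assumptions; see the connector and acceleration docs for the exact parameters your source needs.

```python
# Write a minimal spicepod.yaml: one federated PostgreSQL table,
# materialized locally into DuckDB and refreshed on an interval.
from pathlib import Path

SPICEPOD = """\
version: v1beta1
kind: Spicepod
name: orders_app
datasets:
  - from: postgres:public.orders   # federated source of truth (assumed table)
    name: orders
    acceleration:
      enabled: true
      engine: duckdb               # local materialized working set
      refresh_check_interval: 10s  # re-materialize every 10 seconds
"""
Path("spicepod.yaml").write_text(SPICEPOD)
```

Running `spice run` in the same directory then serves `orders` over the SQL, Flight, and HTTP APIs.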
Spice is built on industry-leading technologies including Apache DataFusion, Apache Arrow, Arrow Flight, SQLite, and DuckDB. If you want to build with DataFusion or DuckDB, Spice provides a simple, flexible, and production-ready engine you can just use.
Why Spice?
Spice makes it fast and easy to query data from one or more sources using SQL. You can co-locate a managed dataset with your application or machine learning model, and accelerate it with Arrow in-memory, SQLite/DuckDB, or with attached PostgreSQL for fast, high-concurrency, low-latency queries. Accelerated engines give you flexibility and control over query cost and performance.
Spice simplifies building data-driven AI applications and agents by making it fast and easy to query, federate, and accelerate data from one or more sources using SQL, while grounding AI in real-time, reliable data. Co-locate datasets with apps and AI models to power AI feedback loops, enable RAG and search, and deliver fast, low-latency data query and AI inference with full control over cost and performance.
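For high-concurrency clients, an accelerated dataset can also be queried over Arrow Flight SQL via ADBC. A sketch, assuming the default Flight port 50051 and the illustrative `orders` dataset from the earlier example (`pip install adbc-driver-flightsql pyarrow`):

```python
# Query Spice over Arrow Flight SQL using the ADBC DBAPI driver.
from adbc_driver_flightsql import dbapi

conn = dbapi.connect("grpc://localhost:50051")  # default Flight port (assumed)
cur = conn.cursor()
cur.execute("SELECT order_id, total FROM orders LIMIT 5")  # columns illustrative
print(cur.fetch_arrow_table())  # results arrive as an Arrow table, not row-by-row
cur.close()
conn.close()
```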
How is Spice different?
- AI-Native Runtime: Spice combines data query and AI inference in a single engine for data-grounded, accurate AI.
- Application-Focused: Designed to run distributed at the application and agent level, often as a 1:1 or 1:N mapping between app and Spice instance, unlike traditional data systems built for many apps on one centralized database. It's common to spin up multiple Spice instances, even one per tenant or customer.
- Dual-Engine Acceleration: Supports both OLAP (Arrow/DuckDB) and OLTP (SQLite/PostgreSQL) engines at the dataset level, providing flexible performance across analytical and transactional workloads.
- Disaggregated Storage: Separates compute from disaggregated storage, co-locating local, materialized working sets of data with applications, dashboards, or ML pipelines while accessing source data in its original storage.
- Edge to Cloud Native: Deploy as a standalone instance, Kubernetes sidecar, microservice, or cluster, across edge/POP, on-prem, and public clouds. Chain multiple Spice instances for tier-optimized, distributed deployments.
How does Spice compare?
Data Query and Analytics
| Feature | Spice | Trino / Presto | Dremio | ClickHouse | Materialize |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps/agents | Big data analytics | Interactive analytics | Real-time analytics | Real-time analytics |
| Primary Deployment Model | Sidecar | Cluster | Cluster | Cluster | Cluster |
| Federated Query Support | ✅ | ✅ | ✅ | ❌ | ❌ |
| Acceleration/Materialization | ✅ (Arrow, SQLite, DuckDB, PostgreSQL) | Intermediate storage | Reflections (Iceberg) | Materialized views | ✅ (Real-time views) |
| Catalog Support | ✅ (Iceberg, Unity Catalog) | ✅ | ✅ | ❌ | ❌ |
| Query Result Caching | ✅ | ✅ | ✅ | ✅ | Limited |
| Multi-Modal Acceleration | ✅ (OLAP + OLTP) | ❌ | ❌ | ❌ | ❌ |
| Change Data Capture (CDC) | ✅ (Debezium) | ❌ | ❌ | ❌ | ✅ (Debezium) |
AI Apps and Agents
| Feature | Spice | LangChain | LlamaIndex | AgentOps.ai | Ollama |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps | Agentic workflows | RAG apps | Agent operations | LLM apps |
| Programming Language | Any language (HTTP interface) | JavaScript, Python | Python | Python | Any language (HTTP interface) |
| Unified Data + AI Runtime | ✅ | ❌ | ❌ | ❌ | ❌ |
| Federated Data Query | ✅ | ❌ | ❌ | ❌ | ❌ |
| Accelerated Data Access | ✅ | ❌ | ❌ | ❌ | ❌ |
| Tools/Functions | ✅ | ✅ | ✅ | Limited | Limited |
| LLM Memory | ✅ | ✅ | ❌ | ❌ | ❌ |
| Evaluations (Evals) | ✅ | Limited | ❌ | Limited | ❌ |
| Search | ✅ (VSS) | ✅ | ✅ | Limited | Limited |
| Caching | ✅ (Query and results caching) | Limited | ❌ | ❌ | ❌ |
| Embeddings | ✅ (Built-in & pluggable models/DBs) | ✅ | ✅ | Limited | ✅ |
✅ = Fully supported
❌ = Not supported
Limited = Partial or restricted support
Example Use-Cases
Data-grounded Agentic AI Applications
- OpenAI-compatible API: Connect to hosted models (OpenAI, Anthropic, xAI) or deploy locally (Llama, NVIDIA NIM), as sketched after this list. AI Gateway Recipe
- Federated Data Access: Query using SQL and NSQL (text-to-SQL) across databases, data warehouses, and data lakes with advanced query push-down for fast retrieval across disparate data sources. Federated SQL Query Recipe
- Search and RAG: Search and retrieve context with accelerated embeddings for retrieval-augmented generation (RAG) workflows. Vector Search over GitHub Files
- LLM Memory and Observability: Store and retrieve history and context for AI agents while gaining deep visibility into data flows, model performance, and traces. LLM Memory Recipe | Monitoring Features Documentation
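Because the API is OpenAI-compatible, the standard `openai` SDK works as-is. A sketch, assuming the default HTTP port and a model named `gpt-4o` in `spicepod.yaml` (the model name is whatever your spicepod declares):

```python
# Chat against Spice's OpenAI-compatible endpoint with the openai SDK.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8090/v1",  # point the SDK at the local runtime
    api_key="unused",                     # the SDK requires a key; local Spice may not
)
resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder: use the model name from your spicepod
    messages=[{"role": "user", "content": "Summarize yesterday's orders."}],
)
print(resp.choices[0].message.content)
```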
Database CDN and Query Mesh
- Data Acceleration: Co-locate materialized datasets in Arrow, SQLite, and DuckDB with applications for sub-second queries (a latency sketch follows this list). DuckDB Data Accelerator Recipe
- Resiliency and Local Dataset Replication: Maintain application availability with local replicas of critical datasets. Local Dataset Replication Recipe
- Responsive Dashboards: Enable fast, real-time analytics by accelerating data for frontends and BI tools. Sales BI Dashboard Demo
- Simplified Legacy Migration: Use a single endpoint to unify legacy systems with modern infrastructure, including federated SQL querying across multiple sources. Federated SQL Query Recipe
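A quick way to sanity-check the sub-second claim against your own accelerated dataset is to time the HTTP SQL endpoint directly (same assumed port and illustrative dataset name as the earlier sketches); latency is typically milliseconds once the working set is materialized locally:

```python
# Measure end-to-end latency of a query against a locally accelerated dataset.
import time
import urllib.request

req = urllib.request.Request(
    "http://localhost:8090/v1/sql",
    data=b"SELECT COUNT(*) FROM orders;",
    headers={"Content-Type": "text/plain"},
    method="POST",
)
start = time.perf_counter()
urllib.request.urlopen(req).read()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"query took {elapsed_ms:.1f} ms")
```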
Retrieval-Augmented Generation (RAG)
- Unified Search with Vector Similarity: Perform efficient vector similarity search across structured and unstructured data sources (an embeddings sketch follows this list). Vector Search over GitHub Files
- Semantic Knowledge Layer: Define a semantic context model to enrich data for AI. Semantic Model Feature Documentation
- Text-to-SQL: Convert natural language queries into SQL using built-in NSQL and sampling tools for accurate query generation. Text-to-SQL Recipe
- Model and Data Evaluations: Assess model performance and data quality with integrated evaluation tools. Language Model Evaluations Recipe
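Embeddings for these RAG workflows are served through the same OpenAI-compatible surface. A sketch, assuming an embeddings model has been named `embed` in `spicepod.yaml` (the name is a placeholder):

```python
# Generate an embedding via Spice's OpenAI-compatible embeddings endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8090/v1", api_key="unused")
emb = client.embeddings.create(
    model="embed",                # placeholder spicepod model name
    input="customer refund policy",
)
print(len(emb.data[0].embedding))  # vector dimensionality
```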
FAQ
- Is Spice a cache? Not specifically; you can think of Spice data acceleration as an active cache, materialization, or data prefetcher. A cache fetches data on a cache miss, while Spice prefetches and materializes filtered data on an interval, on a trigger, or as data changes using CDC (a refresh sketch follows this FAQ). In addition to acceleration, Spice supports results caching.
- Is Spice a CDN for databases? Yes, a common use-case for Spice is as a CDN for different data sources. Using CDN concepts, Spice enables you to ship (load) a working set of your database (or data lake, or data warehouse) to where it's most frequently accessed, such as a data-intensive application or AI context.
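The refresh behaviors described above map to per-dataset acceleration settings. Here is a sketch of those knobs as a `spicepod.yaml` fragment; the connector strings are illustrative, and the CDC example omits the connector-specific Kafka parameters a real Debezium source needs:

```python
# Illustrative spicepod.yaml fragment showing interval- and CDC-driven refresh.
REFRESH_FRAGMENT = """\
datasets:
  - from: postgres:public.customers    # assumed source table
    name: customers
    acceleration:
      enabled: true
      engine: sqlite
      refresh_mode: full               # re-fetch the filtered working set...
      refresh_check_interval: 60s      # ...on a fixed interval
  - from: debezium:cdc.public.orders   # assumed CDC topic via Debezium
    name: orders
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes            # apply change events as they arrive
"""
```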
Watch a 30-sec BI dashboard acceleration demo
See more demos on YouTube.
Intelligent Applications and Agents
Spice enables developers to build data-grounded AI applications and agents by co-locating data and ML models with applications. Read more about the vision to enable the development of intelligent AI-driven applications.
Connect with us
- Build an app with Spice and send us feedback and suggestions at hey@spice.ai or on Discord, X, or LinkedIn.
- File an issue if you see something not quite working correctly.
- Join our team (We're hiring!)
- Contribute code or documentation to the project (see CONTRIBUTING.md).