DuckDB is the SQLite of analytics: an embedded columnar database that runs without a server. It handles gigabytes of data on a single machine, at speeds that can rival Spark.
DuckDB: Analytics without Infrastructure
DuckDB is an in-process OLAP engine: there is no server to operate, because it runs inside your application.
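import duckdb

# Aggregate orders from 2026 onward across every Parquet file matching the glob.
result = duckdb.sql("""
    SELECT region, COUNT(*) AS orders, SUM(total_czk) AS revenue
    FROM 'data/orders/*.parquet'
    WHERE order_date >= '2026-01-01'
    GROUP BY region
    ORDER BY revenue DESC
""").fetchdf()  # materialize the result as a pandas DataFrame

# Other formats can be queried in place, with no import step.
duckdb.sql("SELECT * FROM 'data.csv' LIMIT 10").show()
# Reading from S3 relies on the httpfs extension and configured credentials.
duckdb.sql("SELECT * FROM 's3://bucket/*.parquet'").show()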
When to Use DuckDB
- Local analysis: ad-hoc queries over local files
- Prototyping: trying out and testing SQL
- CI/CD: running dbt tests against DuckDB, locally or in a pipeline
- Data science: SQL in Jupyter notebooks
Summary
DuckDB removes most of the overhead from local analytics: zero setup, SQL directly over files, and integration with pandas.