New Feature
23 days ago

DuckDB SQL Engine: Lightweight Analytics with Full Lineage & Catalog Integration

Ilum now supports DuckDB as a third SQL engine alongside Spark and Trino, giving you a fast, in-process analytics engine that's fully wired into the platform, not a standalone toy.

What makes this different from standalone DuckDB
DuckDB in Ilum isn't just an embedded database running in isolation. Every query you run is tracked through data lineage, connected to the internal data catalog, and backed by the new DuckLake catalog (created automatically when you enable the module). Nothing lives only in memory: your tables, schemas, and metadata are persisted and visible across the platform, just like Spark and Trino workloads.

What you can do
  • Run ad-hoc analytical queries with sub-second response times - aggregations, joins, window functions - without spinning up a cluster.
  • Query existing tables from Ilum's data catalog directly. No need to re-register or recreate anything - if it's in the catalog, DuckDB can read it.
  • Track every DuckDB operation in data lineage, the same way Spark and Trino jobs are tracked. Full visibility into what was read, written, and transformed.
  • Use DuckDB as a query engine in Apache Superset and other BI tools connected to Ilum, dramatically speeding up dashboard queries on catalog data.
  • Work with the DuckLake catalog out of the box - it's provisioned automatically and integrated with the platform's metadata layer.

Three engines, one platform
With this release, Ilum offers a clear engine strategy: DuckDB for lightweight, interactive workloads, Trino for federated and medium-scale queries, and Spark for heavy batch processing and ETL.

Pick the right tool per query, all from the same SQL editor, same catalog, same lineage graph. In upcoming releases, DuckDB will become the default engine for smaller workloads, making everyday data exploration even faster.
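
As a rough illustration of the engine strategy above, the per-query choice could be sketched like this. It is not Ilum's API or its actual selection logic: the function name, the `federated` and `batch_etl` flags, and the size thresholds are all hypothetical, chosen only to make the three tiers concrete.

```python
from enum import Enum


class Engine(Enum):
    DUCKDB = "duckdb"
    TRINO = "trino"
    SPARK = "spark"


# Hypothetical cut-offs for illustration only; they are not Ilum defaults.
LIGHTWEIGHT_MAX_BYTES = 10 * 1024**3  # up to ~10 GiB scanned: lightweight
MEDIUM_MAX_BYTES = 1024**4            # up to ~1 TiB scanned: medium-scale


def pick_engine(estimated_scan_bytes: int,
                federated: bool = False,
                batch_etl: bool = False) -> Engine:
    """Map a workload description to one of the three engines."""
    # Heavy batch processing and ETL go to Spark.
    if batch_etl or estimated_scan_bytes > MEDIUM_MAX_BYTES:
        return Engine.SPARK
    # Federated or medium-scale queries go to Trino.
    if federated or estimated_scan_bytes > LIGHTWEIGHT_MAX_BYTES:
        return Engine.TRINO
    # Everything else is a lightweight, interactive workload.
    return Engine.DUCKDB
```

For example, under these assumed thresholds `pick_engine(50 * 1024**2)` selects DuckDB, while the same small query with `federated=True` is routed to Trino.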

Where to find it
Open the SQL Editor, select DuckDB from the engine picker, and start querying. Your existing catalog tables are already available.

Introduced in 6.7.0