New
Feature
Project Nessie catalog (versioned, Git-like data catalog)
A new, versioned data catalog you can use instead of (or alongside) Hive Metastore. Nessie adds branches/tags and Git-like operations for tables, so you can isolate changes, test safely, and roll back if needed.
What you can do
- Create branches (e.g., feature_x, qa) to develop pipelines without touching main.
- Switch active branch per workspace/job and run SQL against that branch.
- Merge a branch back to main once it is validated, and tag important points for reproducibility.
- Use it directly from the UI (branch selector/management) and from the SQL editor (run queries against the selected branch).
- Keep Hive Metastore for legacy jobs while moving new/changed tables to Nessie incrementally.
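
The branch, merge, and tag steps above can be sketched as SQL statements. This is a minimal sketch, not Ilum's documented workflow: it assumes the Nessie Spark SQL extensions are loaded and that the catalog is registered under the name `nessie`. The catalog name, the branch `feature_x`, and the tag `validated_snapshot` are placeholders chosen for illustration.

```python
from typing import List


def branch_workflow(catalog: str, branch: str, tag: str, base: str = "main") -> List[str]:
    """Build the statements for a create -> switch -> merge -> tag cycle.

    The statements follow the Nessie Spark SQL extension syntax; all names
    passed in are placeholders (assumptions), not values defined by Ilum.
    """
    return [
        # Create an isolated branch from the base branch.
        f"CREATE BRANCH IF NOT EXISTS {branch} IN {catalog} FROM {base}",
        # Point the current session at that branch; pipeline SQL runs after this.
        f"USE REFERENCE {branch} IN {catalog}",
        # Once validated, merge the branch back into the base branch.
        f"MERGE BRANCH {branch} INTO {base} IN {catalog}",
        # Tag the merged state so later runs can pin to it.
        f"CREATE TAG IF NOT EXISTS {tag} IN {catalog} FROM {base}",
    ]


statements = branch_workflow("nessie", "feature_x", "validated_snapshot")
for statement in statements:
    print(statement)
```

In a Spark session with the extensions enabled, each string could be passed to `spark.sql(...)` in order, with the pipeline's own queries run between the `USE REFERENCE` and `MERGE BRANCH` steps.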
Why it helps
- Safe, zero-copy experimentation on datasets.
- Repeatable runs (pin to a tag) and quick rollback if a bad job or regression slips through.
- Cleaner dev→test→prod promotion with explicit merges instead of ad-hoc table swaps.
Compatibility
- Recommended with Iceberg tables; support for other formats depends on the engine.
- Can run side-by-side with Hive Metastore; choose the catalog per job or workspace.
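
Running both catalogs side-by-side means a table is addressed by its catalog-qualified name. The sketch below shows one possible way to copy a single table into the Nessie catalog as an Iceberg table. It is an assumption, not Ilum's documented migration path: it presumes Hive Metastore backs Spark's default session catalog `spark_catalog`, that Nessie is registered as `nessie`, and that `sales.orders` is a hypothetical table.

```python
def copy_table_sql(source_catalog: str, target_catalog: str, table: str) -> str:
    """Build a CREATE TABLE ... AS SELECT that copies one table across catalogs.

    `table` is a `database.table` name. The target is written as an Iceberg
    table. Note that CTAS rewrites the data; it is not a zero-copy operation.
    """
    return (
        f"CREATE TABLE {target_catalog}.{table} USING iceberg "
        f"AS SELECT * FROM {source_catalog}.{table}"
    )


# Hypothetical names: adjust to the catalogs and tables in your environment.
migration_sql = copy_table_sql("spark_catalog", "nessie", "sales.orders")
print(migration_sql)
```

Legacy jobs can keep reading `spark_catalog.sales.orders` while new jobs reference `nessie.sales.orders`, which matches the incremental approach described above.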
How to start
- Enable the Nessie module and add a Nessie catalog entry in ILUM.
- In the UI, pick your active branch before running SQL or jobs.
- Update orchestrated jobs to reference the intended catalog/branch.
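
For jobs that set the catalog and branch through Spark properties, the settings might look like the sketch below. The property keys are those used by the Iceberg Nessie catalog for Spark; the catalog name `nessie`, the URI, and the warehouse path are placeholders, and Ilum's Nessie module may set some or all of these for you.

```python
from typing import Dict


def nessie_catalog_conf(name: str, uri: str, ref: str, warehouse: str) -> Dict[str, str]:
    """Return Spark properties that register an Iceberg catalog backed by Nessie.

    `ref` is the branch or tag the job reads and writes against.
    """
    prefix = f"spark.sql.catalog.{name}"
    return {
        # Enable Iceberg and Nessie SQL extensions (CREATE BRANCH, MERGE BRANCH, ...).
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,"
            "org.projectnessie.spark.extensions.NessieSparkSessionExtensions"
        ),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        f"{prefix}.uri": uri,
        f"{prefix}.ref": ref,
        f"{prefix}.warehouse": warehouse,
    }


# Placeholder endpoint and storage path (assumptions, not Ilum defaults).
conf = nessie_catalog_conf(
    name="nessie",
    uri="http://nessie.example.internal:19120/api/v1",
    ref="feature_x",
    warehouse="s3a://example-bucket/warehouse",
)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

Setting `ref` to a tag instead of a branch is one way to pin a run to a fixed state for repeatability.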
Operational notes
- Lineage & versions record the catalog + branch for every read/write.
- Include the Nessie metadata store in your backup/DR plan (same RPO/RTO targets as your tables).
- If you switch the default catalog from Hive to Nessie, review jobs that assume Hive paths or catalog names.
https://ilum.cloud/docs/features/catalogs/nessie
https://projectnessie.org/
Available in version: 6.6.0