The File Manager now covers the day-to-day work you do with objects without leaving ILUM. You can browse buckets and folders with clear breadcrumbs, quickly filter or sort by name, size, or modified time, and create new folders where you have write access. Uploads support multiple files at once with visible progress, and downloads work for single files or multi-select from the action bar.
Browse & context
Clear breadcrumbs for cluster → storage → bucket → folder.
Filter & Sort by name/extension, size, or modified time; quick search box.
Upload
Multi-file upload from the toolbar (Upload).
Per-file progress indicator, with server-side size checks.
Works in any bucket/folder you have write access to.
Create folder
Create new folders anywhere you have permissions (Create Folder).
Preview (no download needed)
Inline Preview badges next to files; opens a modal viewer.
View CSV / JSON / XML / Text content directly in the UI. Parquet/Delta support is planned for the next release.
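As a rough illustration of the filter and sort options listed above, here is a minimal Python sketch over an in-memory object listing. It is not Ilum's implementation or API: `ObjectEntry`, `SORT_KEYS`, `list_objects`, and the sample keys are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ObjectEntry:
    """Hypothetical listing row: one object in a bucket/folder."""
    key: str            # object key inside the bucket
    size: int           # size in bytes
    modified: datetime  # last-modified timestamp


# Hypothetical sort keys mirroring the UI options (name/extension, size, modified time).
SORT_KEYS = {
    "name": lambda e: PurePosixPath(e.key).name.lower(),
    "extension": lambda e: PurePosixPath(e.key).suffix.lower(),
    "size": lambda e: e.size,
    "modified": lambda e: e.modified,
}


def list_objects(entries, query="", sort_by="name", descending=False):
    """Keep entries whose file name contains `query`, then sort them."""
    needle = query.lower()
    matches = [e for e in entries if needle in PurePosixPath(e.key).name.lower()]
    return sorted(matches, key=SORT_KEYS[sort_by], reverse=descending)


entries = [
    ObjectEntry("raw/orders.csv", 2048, datetime(2024, 5, 1)),
    ObjectEntry("raw/accounts.json", 512, datetime(2024, 5, 3)),
    ObjectEntry("raw/orders_2023.csv", 4096, datetime(2024, 4, 20)),
]

# Quick search for "orders", largest files first.
largest_first = list_objects(entries, query="orders", sort_by="size", descending=True)
```

Keeping the sort keys in a small lookup table makes it easy to add another column later without touching the filtering logic.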
Data Lineage Search (find datasets, columns, and jobs instantly)
Navigating big lineage graphs is painful. We’ve added a search bar to Lineage (and unified it with Jobs / Datasets views) so you can jump straight to what you need.
What’s new
Global lineage search - type a table/dataset name, column (e.g., AccountID), job name/ID, or storage path (e.g., s3://ilum-data/...) and we’ll locate and highlight the matching node(s) on the lineage graph.
Namespace aware - toggle Search all namespaces or limit to the current namespace; results respect your scope.
Quick actions - from results: View Details, Open SQL, or Jump to Job.
Fresh index - we index the Hive Metastore, column metadata, and lineage edges after each successful run or schema change, so search stays current.
Where: open any dataset’s Lineage tab (search at the top).
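To make the namespace scoping concrete, here is a minimal Python sketch of a case-insensitive search over lineage nodes. It is an assumption-laden illustration, not Ilum's search index: `LineageNode`, `search_nodes`, the node IDs, and the `s3://example-bucket/...` path are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageNode:
    """Hypothetical searchable node on the lineage graph."""
    node_id: str
    kind: str        # "dataset", "column", or "job"
    name: str
    namespace: str
    path: str = ""   # storage path, when the node has one


def search_nodes(nodes, term, current_namespace, all_namespaces=False):
    """Return nodes whose name, ID, or path contains `term` (case-insensitive).

    Unless `all_namespaces` is set, only nodes in `current_namespace` match,
    mirroring the "Search all namespaces" toggle described above.
    """
    needle = term.lower()
    hits = []
    for node in nodes:
        if not all_namespaces and node.namespace != current_namespace:
            continue
        haystacks = (node.name, node.node_id, node.path)
        if any(needle in h.lower() for h in haystacks):
            hits.append(node)
    return hits


nodes = [
    LineageNode("ds-1", "dataset", "accounts", "finance", "s3://example-bucket/accounts"),
    LineageNode("col-1", "column", "AccountID", "finance"),
    LineageNode("col-2", "column", "AccountID", "marketing"),
    LineageNode("job-7", "job", "load_accounts", "finance"),
]

scoped = search_nodes(nodes, "accountid", current_namespace="finance")
everywhere = search_nodes(nodes, "accountid", current_namespace="finance", all_namespaces=True)
```

With the toggle off, only the `finance` column matches; with it on, the `marketing` column is returned as well.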
We’ve rebuilt the Job Statistics view in ILUM to expose the key Apache Spark runtime metrics, useful for debugging, capacity planning, and post-run reviews on Kubernetes.
What’s new
Top-line job status - completion %, total tasks, active executors, and allocated memory, with manual Refresh.
Resource gauges (driver/executor/total)
Total: memory utilization vs. cluster allocation (e.g., 14.78% of 24 GB), total cores, total executors.
Driver: memory utilization (e.g., 4.44% of 12 GB), driver cores.
Executors: memory utilization (e.g., 25.13% of 12 GB), memory/cores per executor, active/dead executor count, and aggregate cores/memory.
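The example figures above are mutually consistent if the total gauge is read as combined driver and executor usage over combined allocation. The short Python check below works through that arithmetic; treating the 12 GB executor figure as the aggregate across executors is an assumption, and `utilization_pct` is a hypothetical helper.

```python
GB = 1024 ** 3


def utilization_pct(used_bytes, allocated_bytes):
    """Memory utilization as a percentage of the allocation."""
    return 100.0 * used_bytes / allocated_bytes


# Allocations taken from the example gauges (executor figure assumed to be aggregate).
driver_alloc = 12 * GB
executors_alloc = 12 * GB

# Usage implied by the example percentages.
driver_used = 0.0444 * driver_alloc
executors_used = 0.2513 * executors_alloc

total_pct = utilization_pct(
    driver_used + executors_used,
    driver_alloc + executors_alloc,
)
# total_pct is approximately 14.785, in line with the 14.78% of 24 GB shown above
# (the small difference comes from rounding in the displayed percentages).
```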
Task Health Monitor - clear outcome summary with counts for Completed / Failed / Skipped tasks (e.g., 108 completed, 98 skipped (optimized)) and a visual split. Includes Avg Task Time and Total Task Time for quick SLA checks.
Shuffle operations panel - read/write records and bytes with a simple chart and a read↔write ratio indicator, handy for identifying skew and unnecessary shuffle I/O.
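For readers who want to reproduce a read↔write ratio from raw counters, here is a minimal Python sketch. How Ilum actually computes the indicator is not stated here, so the formulas (bytes read divided by bytes written, and average bytes per record) are assumptions, and the function names and sample numbers are hypothetical.

```python
def shuffle_ratio(read_bytes, write_bytes):
    """Shuffle bytes read divided by bytes written; None when nothing was written."""
    if write_bytes == 0:
        return None
    return read_bytes / write_bytes


def bytes_per_record(total_bytes, records):
    """Average shuffle record size; None when there are no records."""
    if records == 0:
        return None
    return total_bytes / records


# Made-up counters for illustration only.
read_bytes, read_records = 3_000_000, 60_000
write_bytes, write_records = 1_500_000, 30_000

ratio = shuffle_ratio(read_bytes, write_bytes)            # 2.0
avg_read = bytes_per_record(read_bytes, read_records)      # 50.0
avg_write = bytes_per_record(write_bytes, write_records)   # 50.0
```

A ratio far from what the job's logic would suggest, or very large average record sizes, is a prompt to look more closely at the stage in question rather than a diagnosis on its own.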
Why it helps
Fast detection of over-allocation (low memory/CPU utilization vs. requested resources).
Immediate visibility into optimizer effects (e.g., many skipped tasks/stages).
Quick sizing and tuning decisions without scraping Spark UI pages.
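As a sketch of the over-allocation check, the Python snippet below flags any gauge whose utilization falls under a threshold. The 10% cutoff is an arbitrary assumption for illustration, not an Ilum default, and `over_allocated` is a hypothetical helper; the percentages are the example values quoted above.

```python
def over_allocated(gauges, threshold_pct=10.0):
    """Return the names of gauges whose utilization is below `threshold_pct`."""
    return sorted(name for name, pct in gauges.items() if pct < threshold_pct)


# Example utilization percentages from the resource gauges.
gauges = {"driver": 4.44, "executors": 25.13, "total": 14.78}

flagged = over_allocated(gauges)  # ["driver"]
```

Here only the driver is flagged, which would suggest trying a smaller driver memory request on the next run.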
We’ve shipped a major upgrade to Ilum’s lineage experience, purpose-built to help data teams see how data moves across bronze → silver → gold layers, understand job health at a glance, and trace the blast radius of any change in seconds.
What’s new
Smart Job Clustering - No more spaghetti graphs. Ilum automatically groups similar jobs into compact clusters, so complex pipelines stay readable while preserving drill-down to the underlying runs.
Layer-aware lineage (Bronze / Silver / Gold) - Instantly track how entities evolve across refinement layers. Each table card shows key fields and types, with clear upstream/downstream edges.
Operational overlays on the graph - Every job node now surfaces last run, average duration, and success rate right on the canvas, which helps you spot hot paths and bottlenecks during incident review.
Version-aware datasets - Open any dataset and jump to the Versions tab to see schema changes and lifecycle events (e.g., OVERWRITE) before they surprise downstream consumers.
ERD ↔ Lineage toggle - Switch between an entity-relationship view for model design and a lineage view for runtime flow: two perspectives, one source of truth.
Faster navigation - Mini-map, zoom/pan, multi-select, and clean badges (e.g., OPERATIONAL) make large graphs easier to explore.
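Tracing the blast radius of a change amounts to walking the lineage graph downstream from the changed node. The Python sketch below does this with a breadth-first traversal over a plain edge list; it is not how Ilum stores or queries lineage, and the `downstream` function and the bronze/silver/gold node names are hypothetical.

```python
from collections import deque


def downstream(edges, start):
    """Return every node reachable from `start` by following upstream -> downstream edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # Skip nodes already visited (and the start node, in case of cycles).
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical pipeline: datasets and jobs alternate along each path.
edges = [
    ("bronze.orders", "job.clean_orders"),
    ("job.clean_orders", "silver.orders"),
    ("silver.orders", "job.daily_revenue"),
    ("job.daily_revenue", "gold.revenue"),
    ("bronze.customers", "job.clean_customers"),
    ("job.clean_customers", "silver.customers"),
]

impacted = downstream(edges, "silver.orders")
```

Changing `silver.orders` affects `job.daily_revenue` and `gold.revenue`, while the customers branch is untouched.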
Improved SQL Editor with Selective Query Execution! 🎯
We’ve upgraded the Ilum SQL Editor with a powerful new feature: select and run specific queries directly from a single view—no need to execute the entire notebook or clean up your workspace first.
Key Enhancements:
🧩 Query Picker – Quickly choose which query to execute from a list of all defined queries in your notebook.
🎯 Focused Execution – Run only the SQL block you need, saving time and avoiding unnecessary data scans.
🧭 Better Navigation – Easily jump between queries and manage your workflow more efficiently.
💡 Cleaner Debugging – Isolate and test specific parts of your SQL logic without affecting the rest of the notebook.
This improvement brings more control and flexibility to your data exploration and debugging workflows. Whether you're refining a single query or testing logic step-by-step, Ilum makes it faster and easier.
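The idea behind selective execution can be sketched outside the editor: split a script into individual statements, then run only the one you pick. The Python example below uses the standard-library `sqlite3` module as a stand-in engine, not Ilum's Spark SQL backend or its API; `split_statements`, the `orders` table, and the sample data are hypothetical.

```python
import sqlite3


def split_statements(script):
    """Split a SQL script into complete statements.

    Lines are accumulated until sqlite3.complete_statement() reports a
    complete, semicolon-terminated statement. This simple approach assumes
    each statement ends at the end of a line.
    """
    statements, buffer = [], ""
    for line in script.splitlines():
        buffer += line + "\n"
        if sqlite3.complete_statement(buffer):
            statements.append(buffer.strip())
            buffer = ""
    if buffer.strip():
        statements.append(buffer.strip())  # trailing statement without a semicolon
    return statements


script = """
CREATE TABLE orders (id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 10.0), (2, 32.5);
SELECT COUNT(*) FROM orders;
SELECT SUM(amount)
FROM orders;
"""

queries = split_statements(script)

conn = sqlite3.connect(":memory:")
for setup in queries[:2]:        # run the setup statements once
    conn.execute(setup)

selected = queries[3]            # pick a single query; the COUNT query is skipped
total = conn.execute(selected).fetchone()[0]
conn.close()
```

Only the chosen statement touches the data, which is the same benefit the Query Picker aims for: no need to re-run everything above the query you are debugging.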
We've upgraded Ilum’s SQL Editor to support notebook-style operations, making it easier to explore, query, and analyze big data at scale. Starting now, you can write and execute SQL or SQLite queries in structured cells, just like in a Jupyter Notebook, instead of running your Apache Spark SQL queries one by one.
Key Improvements:
- Notebook-style SQL execution – Run and organize queries in cells for a more interactive SQL analytics experience.
- Persistent query history – Save and revisit SQL jobs for improved workflow management.
- Multi-cell execution – Execute individual cells or entire SQL notebooks for fast, scalable data exploration.
- Integrated with BI & Data Platforms – Query data stored in cloud data lakes, data warehouses, and on-prem storage via JDBC integration.
With this update, it is easier to run SQL inside Ilum and to manage data lakehouse, business intelligence, and big data analysis workloads. Check it out today and discover a better way to manage your structured and semi-structured data! 🚀