Improved Feature
6 days ago

Improved Spark Job Statistics

We’ve rebuilt the Job Statistics view in ILUM to surface the key Apache Spark runtime metrics you need for debugging, capacity planning, and post-run reviews of jobs running on Kubernetes.

What’s new
  • Top-line job status
    Completion %, total tasks, active executors, and allocated memory, with manual Refresh.
  • Resource gauges (driver/executor/total)
    • Total: memory utilization vs. cluster allocation (e.g., 14.78% of 24 GB), total cores, total executors.
    • Driver: memory utilization (e.g., 4.44% of 12 GB), driver cores.
    • Executors: memory utilization (e.g., 25.13% of 12 GB), memory/cores per executor, active/dead executor count, and aggregate cores/memory.
  • Task Health Monitor
    A clear outcome summary with counts of Completed / Failed / Skipped tasks (e.g., 108 completed, 98 skipped (optimized)) and a visual split. Also shows Avg Task Time and Total Task Time for quick SLA checks.
  • Shuffle operations panel
    Read/Write records and bytes with a simple chart and a read↔write ratio indicator, handy for identifying skew and unnecessary shuffle I/O.
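
For readers who want to sanity-check the gauge numbers by hand, here is a minimal sketch of the arithmetic behind a utilization percentage and a read/write ratio. It is not ILUM's implementation: the `MemoryGauge` class, `combine`, and `shuffle_ratio` are hypothetical names, and the sketch assumes the Total gauge is simply the driver and executor figures summed before dividing.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

GIB = 1024 ** 3  # bytes per GiB


@dataclass
class MemoryGauge:
    """Hypothetical container for one gauge: bytes in use vs. bytes allocated."""
    used_bytes: int
    allocated_bytes: int

    def utilization_pct(self) -> float:
        # Guard against an empty allocation so the gauge never divides by zero.
        if self.allocated_bytes <= 0:
            return 0.0
        return round(100.0 * self.used_bytes / self.allocated_bytes, 2)


def combine(gauges: Iterable[MemoryGauge]) -> MemoryGauge:
    """Assumed 'Total' gauge: sum used and allocated bytes, then take the ratio."""
    gauges = list(gauges)
    return MemoryGauge(
        used_bytes=sum(g.used_bytes for g in gauges),
        allocated_bytes=sum(g.allocated_bytes for g in gauges),
    )


def shuffle_ratio(read_bytes: int, write_bytes: int) -> Optional[float]:
    """Shuffle read/write ratio; None when nothing was written."""
    if write_bytes == 0:
        return None
    return read_bytes / write_bytes
```

For example, a driver using 1 GiB of 12 GiB and executors using 3 GiB of 12 GiB combine into a total of 4 GiB of 24 GiB, which `combine(...).utilization_pct()` reports as 16.67.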
Why it helps
  • Fast detection of over-allocation (low memory/CPU utilization vs. requested resources).
  • Immediate visibility into optimizer effects (e.g., many skipped tasks/stages).
  • Quick sizing and tuning decisions without scraping Spark UI pages.
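
The checks above can be turned into a rough post-run rule of thumb. The sketch below is illustrative only: `review_hints` is a hypothetical helper, and the 30% memory and 40% skipped-share thresholds are arbitrary assumptions to tune for your own workloads, not values ILUM uses.

```python
from typing import List


def review_hints(
    memory_pct: float,
    completed: int,
    skipped: int,
    failed: int,
    low_memory_pct: float = 30.0,   # assumed threshold, not an ILUM default
    high_skip_share: float = 0.4,   # assumed threshold, not an ILUM default
) -> List[str]:
    """Return plain-language hints from the numbers shown in the statistics view."""
    hints: List[str] = []
    total = completed + skipped + failed

    # Low utilization against the requested memory suggests over-allocation.
    if memory_pct < low_memory_pct:
        hints.append("memory looks over-allocated")

    # A large skipped share means much of the planned work did not need to run.
    if total and skipped / total >= high_skip_share:
        hints.append("high share of skipped tasks")

    if failed:
        hints.append("failed tasks present")

    return hints
```

With the example figures from this entry (14.78% total memory utilization, 108 completed and 98 skipped tasks, no failures), the function returns both the over-allocation hint and the skipped-tasks hint.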
Where to find it