Improved Feature
26 days ago

Improved Spark Job Statistics

We’ve rebuilt the Job Statistics view in ILUM to expose the key Apache Spark runtime metrics needed for debugging, capacity planning, and post-run reviews on Kubernetes.

What’s new
  • Top-line job status
    Completion %, total tasks, active executors, and allocated memory, with manual Refresh.
  • Resource gauges (driver/executor/total)
    • Total: memory utilization vs. cluster allocation (e.g., 14.78% of 24 GB), total cores, total executors.
    • Driver: memory utilization (e.g., 4.44% of 12 GB), driver cores.
    • Executors: memory utilization (e.g., 25.13% of 12 GB), memory/cores per executor, active/dead executor count, and aggregate cores/memory.
  • Task Health Monitor
    Clear outcome summary with counts for Completed / Failed / Skipped tasks (e.g., 108 completed, 98 skipped (optimized)) and a visual split. Includes Avg Task Time and Total Task Time for quick SLA checks.
  • Shuffle operations panel
    Read/Write records and bytes with a simple chart and a read↔write ratio indicator, handy for identifying skew and unnecessary shuffle I/O.
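As a rough illustration of the arithmetic behind these panels, the Python sketch below derives a memory gauge percentage, a shuffle read/write ratio, and an average task time from raw counters. How ILUM computes these values internally is not stated here, so the function names, the binary-GB convention, and the rounding are all assumptions.

```python
from typing import Optional

GIB = 1024 ** 3  # assumption: "GB" in the gauges is treated as binary gigabytes


def utilization_pct(used_bytes: int, allocated_bytes: int) -> float:
    """Memory used as a percentage of the allocation, rounded to 2 decimals."""
    if allocated_bytes <= 0:
        return 0.0
    return round(100.0 * used_bytes / allocated_bytes, 2)


def shuffle_ratio(read_bytes: int, write_bytes: int) -> Optional[float]:
    """Shuffle bytes read per byte written; None when nothing was written."""
    if write_bytes == 0:
        return None
    return read_bytes / write_bytes


def avg_task_time_ms(total_task_time_ms: float, finished_tasks: int) -> float:
    """Mean task duration in milliseconds; 0.0 when no task has finished."""
    if finished_tasks <= 0:
        return 0.0
    return total_task_time_ms / finished_tasks


if __name__ == "__main__":
    # Illustrative inputs only, not values taken from a real job.
    print(utilization_pct(3 * GIB, 12 * GIB))
    print(shuffle_ratio(200, 100))
    print(avg_task_time_ms(1080, 108))
```

Returning `None` for an undefined ratio, rather than zero or infinity, keeps "no shuffle writes yet" distinguishable from a genuinely lopsided read/write split.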
Why it helps
  • Fast detection of over-allocation (low memory/CPU utilization vs. requested resources).
  • Immediate visibility into optimizer effects (e.g., many skipped tasks/stages).
  • Quick sizing and tuning decisions without scraping Spark UI pages.
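One way to act on these signals is a small triage check over numbers read off the view. The sketch below is only an illustration: the 30% utilization threshold and both function names are hypothetical and not part of ILUM.

```python
def skipped_share(completed: int, failed: int, skipped: int) -> float:
    """Fraction of all tasks that were skipped; 0.0 when there are no tasks."""
    total = completed + failed + skipped
    return skipped / total if total else 0.0


def looks_over_allocated(memory_pct: float, cpu_pct: float,
                         threshold_pct: float = 30.0) -> bool:
    """Flag a job whose memory AND CPU utilization both sit below the threshold.

    threshold_pct is a hypothetical cut-off; tune it for your own workloads.
    """
    return memory_pct < threshold_pct and cpu_pct < threshold_pct


if __name__ == "__main__":
    # 108 completed and 98 skipped mirror the example counts above;
    # the CPU figure (20.0) is made up for illustration.
    print(f"skipped share: {skipped_share(108, 0, 98):.1%}")
    print("over-allocated:", looks_over_allocated(14.78, 20.0))
```

Requiring both memory and CPU to be low avoids flagging jobs that are light on memory but genuinely compute-bound.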
Where to find it