New Feature

Resource Efficiency & Config Recommendations for Apache Spark on Kubernetes

We’ve added a new optimization suite in ILUM that converts runtime telemetry into actionable Spark configuration recommendations, built for data engineering teams tuning Spark on Kubernetes for performance and cost.

What’s new
  • Resource Efficiency Score (0–100)
    A radar chart across Memory / Executor / I/O / Tasks with per-dimension panels (e.g., memory utilization %, executor active %, I/O data-processing ratio, task balance); a toy sketch of the score and cost math follows this list.
  • Live cost signals
    • Accumulated Cost (runtime × cluster rate) with wall-clock duration.
    • Cost Efficiency score with waste breakdown by Memory / CPU / Executors to spot over-provisioning.
  • Configuration Recommendations (modes: Balanced, Performance, Cost)
    Each recommendation includes a rationale, expected impact (e.g., -15% execution time), and a confidence level. One-click copy of the exact Spark config.
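
To make the score and cost math concrete, here is a minimal sketch in Scala; the equal weighting and example numbers are illustrative assumptions, not ILUM's actual model:

// Toy model: the four dimensions mirror the radar (Memory / Executor / I/O / Tasks).
object EfficiencySketch {
  final case class Dims(memoryUtil: Double,     // used memory / allocated memory
                        executorActive: Double, // active executor time / total executor time
                        ioRatio: Double,        // data processed / data read
                        taskBalance: Double)    // 1 - (stddev / mean) of task durations

  // Clamp each dimension to [0, 1], average them, scale to 0-100.
  def score(d: Dims): Double = {
    val dims = Seq(d.memoryUtil, d.executorActive, d.ioRatio, d.taskBalance)
    100.0 * dims.map(v => v.max(0.0).min(1.0)).sum / dims.size
  }

  // Accumulated cost = wall-clock runtime (hours) × cluster hourly rate.
  def accumulatedCost(runtimeHours: Double, ratePerHour: Double): Double =
    runtimeHours * ratePerHour

  def main(args: Array[String]): Unit = {
    val d = Dims(memoryUtil = 0.62, executorActive = 0.81, ioRatio = 0.90, taskBalance = 0.70)
    println(f"Efficiency score: ${score(d)}%.0f/100")                // 76/100
    println(f"Accumulated cost: $$${accumulatedCost(1.5, 4.0)}%.2f") // 1.5 h at $4.00/h
  }
}
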
Example recommendations (auto-derived from workload patterns)
# Adaptive Query Execution
spark.sql.adaptive.enabled = true

# Adaptive partition coalescing
spark.sql.adaptive.coalescePartitions.enabled = true

# Executor memory sizing (value depends on observed usage)
spark.executor.memory = 5g
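
Once copied, the properties go in through Spark's standard configuration path. For example, via the SparkSession builder (standard Spark API; the values simply mirror the example above):

// Apply the recommended settings when building the session.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("tuned-job")
  .config("spark.sql.adaptive.enabled", "true")
  .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
  // Executor sizing must be fixed before executors launch, so set it in the
  // driver program as here, or as --conf flags to spark-submit.
  .config("spark.executor.memory", "5g")
  .getOrCreate()

The same keys also work as --conf flags on spark-submit, the usual entry point on Kubernetes.
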
Why it’s suggested
We correlate stage metrics, data skew, shuffle sizes, and executor idleness to pick safe defaults. AQE re-optimizes plans at runtime to cut expensive shuffles; coalescing prevents floods of tiny partitions; memory sizing balances spill risk against idle RAM.
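
For a flavor of what such a rule can look like, here is a simplified, hypothetical memory right-sizing heuristic (the 60% trigger and 30% headroom are illustrative assumptions, not ILUM's actual engine):

// Hypothetical heuristic: derive spark.executor.memory from observed peak usage.
final case class Recommendation(key: String, value: String, rationale: String)

def sizeExecutorMemory(peakUsedGb: Double, allocatedGb: Double): Option[Recommendation] = {
  val utilization = peakUsedGb / allocatedGb
  if (utilization < 0.6) { // sustained over-provisioning
    val targetGb = math.ceil(peakUsedGb * 1.3).toInt // keep ~30% spill-safety margin
    Some(Recommendation(
      "spark.executor.memory",
      s"${targetGb}g",
      f"peak ${peakUsedGb}%.1fG of ${allocatedGb}%.1fG allocated (${utilization * 100}%.0f%% utilized)"
    ))
  } else None // utilization looks healthy; no change suggested
}

For example, sizeExecutorMemory(peakUsedGb = 2.9, allocatedGb = 8.0) yields spark.executor.memory = 4g with a "36% utilized" rationale.
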
Where to find it
Jobs → Optimization tab: left side shows Resource/Cost scores; right side lists recommendations with rationale and impact estimates.

Why this helps (practical outcomes)
  • Faster iteration on Spark tuning without trawling logs.
  • Concrete levers for cost optimization (reduce idle executors, right-size memory, coalesce partitions).
  • Transparent reasoning you can review and commit via your normal workflow.