
CPU profiling with py-spy

The dashboard’s “Stack Trace” action on a task or actor invokes py-spy to capture a flame graph for that worker. CLI equivalent:
py-spy record -d 30 -p <pid> -o profile.svg
To capture on a remote node, combine with ray exec (py-spy's -p takes a single pid, so tighten the pgrep pattern if it matches more than one worker):
ray exec cluster.yaml "py-spy record -d 30 -p \$(pgrep -f ray::IDLE) -o /tmp/profile.svg"
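To cover several workers on one node, the same command can be fanned out from a small script. A sketch, assuming py-spy is installed and you have permission to trace the target processes; the helper names are illustrative, not Ray APIs:

```python
# Sketch: run py-spy once per Ray worker pid on the local node.
# Assumes py-spy is on PATH; pid discovery (e.g. via pgrep -f ray::)
# is left to the caller.
import subprocess


def py_spy_command(pid, duration=30, out_dir="/tmp"):
    """Build the py-spy invocation for a single worker pid."""
    return [
        "py-spy", "record",
        "-d", str(duration),
        "-p", str(pid),
        "-o", f"{out_dir}/profile-{pid}.svg",
    ]


def profile_workers(pids):
    """Record one flame graph per worker, sequentially."""
    for pid in pids:
        subprocess.run(py_spy_command(pid), check=True)


print(py_spy_command(1234))
# → ['py-spy', 'record', '-d', '30', '-p', '1234', '-o', '/tmp/profile-1234.svg']
```

Writing one SVG per pid keeps the flame graphs separable when many workers share a node.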

Memory profiling with memray

memray run -o /tmp/profile.bin python my_script.py
memray flamegraph /tmp/profile.bin
For per-actor memory, attach memray to a running worker:
memray attach <pid>
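The same capture can also be driven from inside the process with memray's Tracker context manager, the in-process equivalent of memray run. A minimal sketch, assuming memray is installed (pip install memray); without it, the workload below simply runs unprofiled:

```python
# Sketch: programmatic capture with memray.Tracker (assumes memray is
# installed). The workload is a stand-in allocation, not Ray code.
import tempfile


def workload():
    # Allocate enough to be visible in the flamegraph.
    return sum(len(bytes(4096)) for _ in range(1000))


try:
    from memray import Tracker
    out = tempfile.mktemp(suffix=".bin")  # Tracker needs a fresh path
    with Tracker(out):  # writes the file `memray flamegraph` reads
        result = workload()
except ImportError:  # memray not installed: run unprofiled
    result = workload()

print(result)  # → 4096000
```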

Chrome timeline

ray.timeline("/tmp/ray-trace.json")
Or from the CLI, which writes the trace to a default path and prints the filename:
ray timeline
Open the resulting JSON in chrome://tracing to see per-task and per-actor activity over time.
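Beyond eyeballing the timeline, the file is plain Chrome Trace Event JSON, so it can be post-processed with stdlib code. A sketch that sums time per event name; the sample events are made up, but the fields ("ph", "name", "dur" in microseconds) follow the trace event format:

```python
# Sketch: aggregate a Chrome-trace JSON (as written by ray.timeline).
# Complete events have ph == "X" and a "dur" field in microseconds.
from collections import defaultdict


def total_time_by_name(events):
    """Sum 'dur' per event name across complete ('X') events."""
    totals = defaultdict(int)
    for ev in events:
        if ev.get("ph") == "X":
            totals[ev["name"]] += ev.get("dur", 0)
    return dict(totals)


# Hypothetical events for illustration; a real trace is a JSON array
# loadable with json.load().
sample = [
    {"name": "task::f", "ph": "X", "ts": 0, "dur": 1500},
    {"name": "task::f", "ph": "X", "ts": 2000, "dur": 500},
    {"name": "task::g", "ph": "X", "ts": 0, "dur": 300},
]
print(total_time_by_name(sample))  # → {'task::f': 2000, 'task::g': 300}
```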

Deep dives

  • ray::IDLE workers waiting for work — usually fine, but a sea of them suggests over-provisioning.
  • High gcs_* time in profiles suggests the GCS is the bottleneck. Scale the head node up.
  • High pickling time suggests large task arguments; store them once with ray.put and pass the reference.
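The last bullet can be made concrete with a quick measurement. A stdlib-only sketch; the ray.put lines in the comment show the usual pattern, with illustrative names:

```python
# Why ray.put helps: Ray serializes task arguments, so a large argument
# passed directly is re-serialized per invocation. The usual fix
# (names illustrative):
#
#   big = ray.put(load_big_dataset())              # store once
#   futures = [f.remote(big) for _ in range(100)]  # pass a small reference
#
# Measure the per-call serialization cost with the stdlib:
import pickle

big_arg = list(range(100_000))
one_copy = len(pickle.dumps(big_arg))

print(one_copy > 100_000)  # → True: hundreds of KB pickled per call
```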

Next steps

  • Tracing: distributed tracing.
  • State API: cluster state queries.