Ray’s observability stack covers four layers:
Dashboard
Browser UI for live cluster state.
Metrics
Prometheus-compatible metrics for every Ray component.
Logs
Structured per-worker logs aggregated by node.
Profiling
py-spy, memray, and Chrome trace integrations.
When to reach for what
| Symptom | Tool |
|---|---|
| “Tasks aren’t running” | Dashboard → Tasks tab; ray status |
| “Cluster is slow” | Dashboard → Timeline; ray timeline |
| “Memory is climbing” | Profiling → memray; Metrics → object store |
| “Worker crashed” | Logs → worker stderr; State API |
| “Latency is high” | Tracing; per-deployment metrics |
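For the “Memory is climbing” row, it can help to check a metrics snapshot before reaching for a profiler. Because the metrics are Prometheus-compatible, a scrape is plain text that a short script can read. The following is a minimal sketch, not Ray's own tooling: the metric names, labels, and values in `SAMPLE` are hypothetical placeholders, and `parse_metrics` and `total` are illustrative helpers that handle only simple `name{labels} value` lines.

```python
import re

# Illustrative snapshot in the Prometheus text exposition format.
# Metric names, labels, and values are hypothetical, not real Ray metrics.
SAMPLE = """\
# HELP example_object_store_used_bytes Hypothetical gauge, for illustration only.
# TYPE example_object_store_used_bytes gauge
example_object_store_used_bytes{node="node-a"} 1048576
example_object_store_used_bytes{node="node-b"} 524288
example_tasks_running 12
"""

# name, optional {labels}, then the sample value (timestamps are ignored).
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def total(samples, name):
    """Sum one metric across all label sets (for example, across nodes)."""
    return sum(value for metric, _, value in samples if metric == name)


samples = parse_metrics(SAMPLE)
print(total(samples, "example_object_store_used_bytes"))
```

Taking two snapshots a few minutes apart and comparing the totals gives a rough sense of whether usage is actually growing. In practice, a Prometheus query over the scraped series does the same job with less code.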
Next steps
Key concepts
What Ray exports and where it ends up.
Getting started
Set up Prometheus, Grafana, and the dashboard.