

The autoscaler runs on the head node and provisions or terminates worker VMs based on resource demand.

Bounds

max_workers: 32
available_node_types:
  cpu-worker:
    min_workers: 0
    max_workers: 16
  gpu-worker:
    min_workers: 0
    max_workers: 8
min_workers is the floor for that node type; max_workers is the per-type cap. The cluster-wide max_workers is an absolute cap across all node types combined, so the effective ceiling is whichever limit is hit first.
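The interaction between the two caps can be checked with simple arithmetic. This is an illustrative sketch, not a Ray API; the numbers come from the config above:

```python
# Per-type caps from the example config above.
per_type_caps = {"cpu-worker": 16, "gpu-worker": 8}
cluster_max_workers = 32  # cluster-wide absolute cap

# The most workers the per-type caps alone would allow.
type_total = sum(per_type_caps.values())  # 16 + 8 = 24

# The effective ceiling is whichever cap binds first.
effective_max = min(type_total, cluster_max_workers)
print(effective_max)  # 24: here the per-type caps bind before max_workers
```

Here the cluster can never reach 32 workers, because the per-type caps sum to only 24; raising a per-type cap (or adding a node type) is needed before the cluster-wide cap matters.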

Idle timeout

idle_timeout_minutes: 5
Idle workers are terminated after this many minutes.

Upscaling speed

upscaling_speed: 1.0
How aggressively to scale up, expressed as the number of nodes allowed to be pending launch as a multiple of the current node count. At 1.0 the cluster can grow by up to 100% at a time (effectively doubling); at 0.5 it can grow by up to 50%. Higher values reduce time-to-capacity at the cost of bursty bills.
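The growth bound can be sketched as follows. This is an illustrative model of the setting, not Ray's exact internal formula; the minimum of one pending node is an assumption so a small cluster can still grow:

```python
# Illustrative sketch (not Ray's internals): upscaling_speed caps how many
# nodes may be pending launch relative to the current cluster size.
def max_pending(current_nodes: int, upscaling_speed: float) -> int:
    # Assume at least one pending node is allowed so an empty cluster can grow.
    return max(1, int(upscaling_speed * current_nodes))

print(max_pending(8, 1.0))  # 8 pending -> the cluster can double to 16
print(max_pending(8, 0.5))  # 4 pending -> the cluster can grow by half to 12
```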

Resource matching

The autoscaler picks the cheapest node type that satisfies a pending resource request. Use custom resources to differentiate node classes:
available_node_types:
  high-mem:
    node_config: { InstanceType: r5.4xlarge }
    resources: { high_memory: 1 }
Then in your code:
import ray

@ray.remote(resources={"high_memory": 1})
def big_task(*args):
    ...

Spot instances

For preemptible workers, configure the provider’s spot options (AWS InstanceMarketOptions, GCP preemptible: true, Azure priority: Spot).
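On AWS, the spot request goes under the node type's node_config, which is passed through to EC2. A minimal sketch, assuming an example node type name and instance type:

```yaml
available_node_types:
  spot-worker:            # example name; any node type can be made spot
    min_workers: 0
    max_workers: 8
    node_config:
      InstanceType: m5.2xlarge   # example instance type
      InstanceMarketOptions:
        MarketType: spot
```

Mix spot and on-demand node types in the same config to keep a baseline of on-demand capacity while bursting onto cheaper preemptible workers.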

Verifying

Tail the autoscaler's monitor log on the head node to watch scaling decisions as they happen:

ray exec cluster.yaml "tail -f /tmp/ray/session_latest/logs/monitor*"

Next steps

Best practices

Run >100 nodes reliably.

Logging

Aggregate logs across nodes.