
Ray’s scheduler matches tasks and actors to nodes based on requested resources and a scheduling strategy. For workloads that need precise placement — gang scheduling, locality, or anti-affinity — use placement groups to reserve resources up front.

Resource requests

Each task or actor declares its resource requirements with num_cpus, num_gpus, memory, and a resources={...} dictionary for custom resources.
import ray

@ray.remote(num_cpus=4, num_gpus=1, resources={"high_memory": 1})
def train():
    ...
Custom resources are advertised by nodes when they start:
ray start --head --resources='{"high_memory": 8}'
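
When starting a local cluster from Python instead of the CLI, the same dictionary can be passed to ray.init; a minimal sketch reusing the high_memory resource from above:

import ray

# Advertise 8 units of the custom "high_memory" resource on this node.
ray.init(resources={"high_memory": 8})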

Scheduling strategies

Pass scheduling_strategy to .options() to control placement.
Strategy | Behavior
"DEFAULT" | Spread across the cluster while honoring locality.
"SPREAD" | Spread tasks/actors across as many nodes as possible.
NodeAffinitySchedulingStrategy(node_id, soft) | Pin to a specific node (soft or hard).
PlacementGroupSchedulingStrategy(pg, bundle_index) | Schedule into a placement group bundle.
from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy

# soft=False hard-pins the task: it fails instead of running elsewhere
# if the target node is unavailable.
train.options(
    scheduling_strategy=NodeAffinitySchedulingStrategy(node_id="...", soft=False)
).remote()
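
The string strategies need no import. A minimal sketch that spreads many copies of the train task defined above across the cluster:

# Run 16 independent copies, spread across as many nodes as possible.
results = ray.get([
    train.options(scheduling_strategy="SPREAD").remote()
    for _ in range(16)
])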

Placement groups

A placement group reserves a set of resource bundles across the cluster up front. Tasks and actors can then be scheduled into specific bundles.
from ray.util.placement_group import placement_group

# Reserve four bundles of 4 CPUs + 1 GPU each, packed onto a single node.
pg = placement_group(
    bundles=[{"CPU": 4, "GPU": 1}] * 4,
    strategy="STRICT_PACK",
)
# ready() returns an ObjectRef that resolves once the reservation succeeds.
ray.get(pg.ready())

Strategies

Strategy | Effect
PACK | Try to pack bundles onto as few nodes as possible.
SPREAD | Spread bundles across as many nodes as possible.
STRICT_PACK | Like PACK, but fail if the bundles can't fit on one node.
STRICT_SPREAD | One bundle per node; fail if not possible.
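
As a sketch of the anti-affinity case from the introduction, a STRICT_SPREAD group forces each bundle onto a different node (the bundle shapes here are illustrative):

# One bundle per node; creation fails unless four separate nodes are available.
anti_affinity_pg = placement_group(
    bundles=[{"CPU": 1, "GPU": 1}] * 4,
    strategy="STRICT_SPREAD",
)
ray.get(anti_affinity_pg.ready())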

Schedule into a bundle

from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

@ray.remote
class Worker:
    ...

# Place one actor in each bundle of the group.
workers = [
    Worker.options(
        scheduling_strategy=PlacementGroupSchedulingStrategy(
            placement_group=pg,
            placement_group_bundle_index=i,
        )
    ).remote()
    for i in range(4)
]

Release resources

By default, a placement group holds its reservation for the lifetime of the job that created it (or indefinitely when created with lifetime="detached"). Call ray.util.remove_placement_group(pg) to release the reserved resources explicitly.
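
A minimal sketch, continuing with the pg created above:

import ray

# Explicitly free the reserved bundles across the cluster.
ray.util.remove_placement_group(pg)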

Locality-aware scheduling

When a task takes an ObjectRef argument, Ray prefers to schedule it on a node that already holds the object. This avoids unnecessary data movement.
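
A minimal sketch of the behavior (the task name mean_of is illustrative; numpy is used only to create a large object):

import numpy as np
import ray

@ray.remote
def mean_of(arr):
    return arr.mean()

# The array is stored in the object store of the node where ray.put ran;
# Ray prefers to schedule mean_of there to avoid copying it across the network.
big = ray.put(np.zeros(100_000_000))
print(ray.get(mean_of.remote(big)))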

Memory-aware scheduling

Set memory=<bytes> to request memory for a task or actor. Memory is a logical resource: Ray deducts the request from a node's available memory when scheduling and won't place work on a node without enough headroom, but the request isn't enforced as a hard cap at runtime.
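
A minimal sketch (the task name is illustrative; the value is in bytes):

# Reserve 2 GiB of logical memory per invocation for scheduling purposes.
@ray.remote(memory=2 * 1024 ** 3)
def preprocess():
    ...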

Next steps

Resources

Configure node resources at cluster start time.

Fault tolerance

Recovery semantics for tasks, actors, and placement groups.