ScalingConfig is the one object you touch to scale a training job. It controls how many workers run, what resources each worker gets, and how they’re placed across the cluster.

Basic usage

from ray.train import ScalingConfig

ScalingConfig(num_workers=8, use_gpu=True)
This launches eight workers, each on its own GPU.
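
In a real job you pass the config to a trainer. A minimal sketch, assuming a trivial training function (the function body here is illustrative, not part of ScalingConfig itself):

from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_func():
    # Runs once per worker; each of the eight workers sees one GPU.
    ...

trainer = TorchTrainer(
    train_loop_per_worker=train_func,
    scaling_config=ScalingConfig(num_workers=8, use_gpu=True),
)
trainer.fit()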

Resource customization

ScalingConfig(
    num_workers=4,
    use_gpu=True,
    resources_per_worker={
        "CPU": 4,
        "GPU": 1,
        "memory": 16 * 1024**3,
        "high_memory": 1,        # custom resource
    },
)
resources_per_worker follows the same shape as Ray’s @ray.remote(...) resource spec.
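
For comparison, roughly the same request expressed directly with Ray Core (a sketch; "high_memory" is a custom resource that must be defined on at least one node in your cluster):

import ray

@ray.remote(num_cpus=4, num_gpus=1, memory=16 * 1024**3, resources={"high_memory": 1})
def step():
    ...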

Placement strategies

Workers are scheduled into a placement group. Choose how to lay it out:
ScalingConfig(num_workers=4, placement_strategy="PACK")
Strategy        Effect
PACK (default)  Pack onto as few nodes as possible.
SPREAD          Spread across as many nodes as possible.
STRICT_PACK     All workers on one node, or fail.
STRICT_SPREAD   One worker per node, or fail.
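
For example, to guarantee one worker per node and fail fast if the cluster can’t do it, a minimal sketch:

from ray.train import ScalingConfig

# Workers never share a node; useful when each worker needs a node's full bandwidth.
ScalingConfig(num_workers=4, use_gpu=True, placement_strategy="STRICT_SPREAD")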

Trainer resources

A small “trainer” actor coordinates workers. Set its resources separately:
ScalingConfig(num_workers=4, trainer_resources={"CPU": 1})
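
If the coordinator’s default CPU reservation gets in the way on a tightly packed cluster, you can set it to zero; a minimal sketch:

from ray.train import ScalingConfig

# The coordinating actor does little work, so freeing its CPU leaves more room for workers.
ScalingConfig(num_workers=4, use_gpu=True, trainer_resources={"CPU": 0})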

Heterogeneous workers

Ray Train doesn’t yet support different resource specs per worker out of the box. For mixed workloads (e.g., one parameter server actor + N learners), build your own coordinator on top of Ray Core.
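
A minimal sketch of that pattern with Ray Core; the actor and function names are illustrative, and it assumes a cluster with at least two GPUs for the learners:

import ray

ray.init()

@ray.remote(num_cpus=4)          # CPU-heavy parameter server
class ParameterServer:
    def __init__(self):
        self.weights = 0.0

    def apply_gradient(self, grad):
        self.weights -= 0.1 * grad
        return self.weights

@ray.remote(num_gpus=1)          # each learner reserves one GPU
def learner(ps, steps):
    weights = None
    for _ in range(steps):
        grad = 1.0               # stand-in for a real gradient computation
        weights = ray.get(ps.apply_gradient.remote(grad))
    return weights

ps = ParameterServer.remote()
print(ray.get([learner.remote(ps, steps=10) for _ in range(2)]))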

Validate the config

Resource availability is checked when training starts: Ray warns and keeps waiting if the cluster can’t satisfy the request.

from ray.train.torch import TorchTrainer

trainer = TorchTrainer(..., scaling_config=ScalingConfig(num_workers=4, use_gpu=True))
trainer.fit()  # warns and waits if the cluster can't currently satisfy the request

Next steps

Run config

Storage, naming, callbacks.

Distributed PyTorch

See ScalingConfig in real training jobs.