
RLlib checkpoints capture the full algorithm state — RL modules, learners, replay buffers, and config — so training can resume from where it left off.

Save

checkpoint_path = algo.save("/tmp/ppo-cartpole/")
The directory is portable; copy it to S3 or any shared filesystem to resume elsewhere.
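
For context, a minimal end-to-end sketch that produces such a checkpoint, assuming PPO on CartPole-v1 (any registered environment works):

from ray.rllib.algorithms.ppo import PPOConfig

# Build a PPO algorithm on an example environment.
config = PPOConfig().environment("CartPole-v1")
algo = config.build()

# Train for a few iterations, then checkpoint the full state.
for _ in range(5):
    algo.train()
checkpoint_path = algo.save("/tmp/ppo-cartpole/")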

Restore

from ray.rllib.algorithms.algorithm import Algorithm

# Rebuilds the full algorithm, including learner, optimizer, and buffer state.
algo = Algorithm.from_checkpoint(checkpoint_path)
algo.train()
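
To confirm the run actually resumed, inspect the iteration counter in the next result; training_iteration is the standard result key, and it continues from the checkpointed count rather than restarting at 1:

# The counter picks up from the checkpointed value, not from 1.
result = algo.train()
print(result["training_iteration"])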

Load only the RL module

For deployment, reload just the network without the optimizer or buffer:
from ray.rllib.core.rl_module.rl_module import RLModule

# The module weights sit inside the algorithm checkpoint; in a single-agent
# setup the module id is "default_policy".
module = RLModule.from_checkpoint(
    f"{checkpoint_path}/learner_group/learner/rl_module/default_policy"
)
Use module.forward_inference(...) for low-latency inference.
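
A sketch of single-step inference with the reloaded module, assuming a Torch-based module, the CartPole-v1 environment from above, and a discrete action space (action_dist_inputs is RLlib's conventional output key for distribution parameters; exact keys and shapes depend on the module):

import gymnasium as gym
import numpy as np
import torch

env = gym.make("CartPole-v1")
obs, _ = env.reset()

# forward_inference expects a batched dict; add a batch dimension of 1.
batch = {"obs": torch.from_numpy(np.expand_dims(obs, 0)).float()}
with torch.no_grad():
    out = module.forward_inference(batch)

# For a discrete action space, the logits come back under "action_dist_inputs".
action = int(torch.argmax(out["action_dist_inputs"], dim=-1)[0])
obs, reward, terminated, truncated, _ = env.step(action)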

Auto-checkpointing in Tune

from ray import tune
import ray.train

tuner = tune.Tuner(
    "PPO",
    param_space=config.to_dict(),
    run_config=ray.train.RunConfig(
        checkpoint_config=ray.train.CheckpointConfig(
            checkpoint_frequency=10,  # Checkpoint every 10 training iterations.
            num_to_keep=3,  # Retain only the three most recent checkpoints.
        ),
    ),
)
Tune saves a checkpoint every 10 training iterations and keeps only the three most recent; set checkpoint_score_attribute in CheckpointConfig to rank kept checkpoints by a metric instead.
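
After the run finishes, the kept checkpoints are available on the result grid. A sketch of restoring the best one, assuming the run reported the env_runners/episode_return_mean metric (the exact metric name differs between RLlib's old and new API stacks):

from ray.rllib.algorithms.algorithm import Algorithm

results = tuner.fit()
best = results.get_best_result(
    metric="env_runners/episode_return_mean", mode="max"
)
algo = Algorithm.from_checkpoint(best.checkpoint.path)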

Next steps

Training

Inside the iteration loop.

Offline RL

Use logged data instead of rollouts.