
RLlib checkpoints capture the full algorithm state — RL modules, learners, replay buffers, and config — so training can resume from where it left off.

Save

checkpoint_path = algo.save("/tmp/ppo-cartpole/")
The directory is portable; copy it to S3 or any shared filesystem to resume elsewhere.
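
For context, a minimal end-to-end sketch that produces such a checkpoint, assuming PPO on CartPole-v1 (any registered environment works):

from ray.rllib.algorithms.ppo import PPOConfig

# Build a PPO algorithm on an example environment.
config = PPOConfig().environment("CartPole-v1")
algo = config.build()

# Train for a few iterations, then checkpoint the full state.
for _ in range(5):
    algo.train()
checkpoint_path = algo.save("/tmp/ppo-cartpole/")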

Restore

from ray.rllib.algorithms.algorithm import Algorithm

# Rebuilds the full algorithm, including learner, optimizer, and buffer state.
algo = Algorithm.from_checkpoint(checkpoint_path)
algo.train()
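
To confirm the run actually resumed, inspect the iteration counter in the next result; training_iteration is the standard result key, and it continues from the checkpointed count rather than restarting at 1:

# The counter picks up from the checkpointed value, not from 1.
result = algo.train()
print(result["training_iteration"])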

Load only the RL module

For deployment, reload just the network without the optimizer or buffer:
from ray.rllib.core.rl_module.rl_module import RLModule

# The module weights sit inside the algorithm checkpoint; in a single-agent
# setup the module id is "default_policy".
module = RLModule.from_checkpoint(
    f"{checkpoint_path}/learner_group/learner/rl_module/default_policy"
)
Use module.forward_inference(...) for low-latency inference.
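
A sketch of single-step inference with the reloaded module, assuming a Torch-based module, the CartPole-v1 environment from above, and a discrete action space (action_dist_inputs is RLlib's conventional output key for distribution parameters; exact keys and shapes depend on the module):

import gymnasium as gym
import numpy as np
import torch

env = gym.make("CartPole-v1")
obs, _ = env.reset()

# forward_inference expects a batched dict; add a batch dimension of 1.
batch = {"obs": torch.from_numpy(np.expand_dims(obs, 0)).float()}
with torch.no_grad():
    out = module.forward_inference(batch)

# For a discrete action space, the logits come back under "action_dist_inputs".
action = int(torch.argmax(out["action_dist_inputs"], dim=-1)[0])
obs, reward, terminated, truncated, _ = env.step(action)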

Auto-checkpointing in Tune

from ray import tune
import ray.train

tuner = tune.Tuner(
    "PPO",
    param_space=config.to_dict(),
    run_config=ray.train.RunConfig(
        checkpoint_config=ray.train.CheckpointConfig(
            checkpoint_frequency=10,  # Checkpoint every 10 training iterations.
            num_to_keep=3,  # Retain only the three most recent checkpoints.
        ),
    ),
)
Tune saves a checkpoint every 10 training iterations and keeps only the three most recent; set checkpoint_score_attribute in CheckpointConfig to rank kept checkpoints by a metric instead.
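
After the run finishes, the kept checkpoints are available on the result grid. A sketch of restoring the best one, assuming the run reported the env_runners/episode_return_mean metric (the exact metric name differs between RLlib's old and new API stacks):

from ray.rllib.algorithms.algorithm import Algorithm

results = tuner.fit()
best = results.get_best_result(
    metric="env_runners/episode_return_mean", mode="max"
)
algo = Algorithm.from_checkpoint(best.checkpoint.path)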

Next steps

Training

Inside the iteration loop.

Offline RL

Use logged data instead of rollouts.