A single call to algo.train() runs one iteration of the loop:
Rollouts
Each EnvRunner steps its environment(s), producing a batch of episodes. The exploration policy uses the latest weights from the central RLModule.
Postprocessing
Connectors compute returns, advantages, and any algorithm-specific batch-level fields.
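To make the postprocessing step concrete, here is a minimal plain-Python sketch of computing discounted returns and advantages for one episode. It is not RLlib's connector code. The function names, the default `gamma` and `lam` values, and the choice of generalized advantage estimation (GAE) as the advantage estimator are assumptions made for illustration.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns: List[float] = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def gae_advantages(
    rewards: List[float],
    values: List[float],
    gamma: float = 0.99,
    lam: float = 0.95,
    bootstrap_value: float = 0.0,
) -> List[float]:
    """Generalized advantage estimation over one episode.

    `values[t]` is the value estimate for step t; `bootstrap_value` is the
    value estimate after the last step (0.0 for a terminated episode).
    """
    advantages = [0.0] * len(rewards)
    next_value = bootstrap_value
    next_advantage = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error, then the exponentially weighted sum of TD errors.
        delta = rewards[t] + gamma * next_value - values[t]
        next_advantage = delta + gamma * lam * next_advantage
        advantages[t] = next_advantage
        next_value = values[t]
    return advantages
```

With `lam=1.0` and all-zero value estimates the advantages reduce to the discounted returns, and with `lam=0.0` they reduce to the one-step TD errors, which is a quick sanity check for either function.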
Configure the loop
Inspect a single iteration
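As a hedged sketch, one way to inspect an iteration is to call algo.train() once and pull a few entries out of the nested result dict it returns. `pick_metrics` and `run_one_iteration` are hypothetical helpers, not RLlib APIs. The PPO algorithm, the CartPole-v1 environment, and the metric key `env_runners/episode_return_mean` are assumptions that may differ across Ray versions, and newer releases may prefer `config.build_algo()` over `config.build()`.

```python
from typing import Any, Dict, Iterable


def pick_metrics(result: Dict[str, Any], keys: Iterable[str]) -> Dict[str, Any]:
    """Look up slash-separated keys in a nested dict; missing keys map to None."""
    picked: Dict[str, Any] = {}
    for key in keys:
        node: Any = result
        for part in key.split("/"):
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        picked[key] = node
    return picked


def run_one_iteration() -> Dict[str, Any]:
    """Build a small PPO algorithm, run one iteration, and summarize the result."""
    # Imported lazily so pick_metrics can be used without Ray installed.
    from ray.rllib.algorithms.ppo import PPOConfig

    config = PPOConfig().environment("CartPole-v1")  # assumed example setup
    algo = config.build()
    try:
        result = algo.train()  # one iteration of the loop
    finally:
        algo.stop()  # release EnvRunner and Learner resources
    # Metric names below are assumptions about the result layout.
    return pick_metrics(
        result, ["training_iteration", "env_runners/episode_return_mean"]
    )
```

Calling `run_one_iteration()` requires Ray with RLlib installed. Printing the full result with `pprint` is a reasonable first step for discovering which keys your version actually reports.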
Custom loops
For full control, drop down to algo.step (the legacy stack uses Algorithm.training_step). The new-stack equivalent is to subclass the algorithm and override the iteration logic.
Stop conditions
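Conceptually, a stop condition is a set of metric thresholds checked after every iteration. The sketch below illustrates that check in plain Python. `flatten` and `should_stop` are hypothetical helpers, not Ray APIs, and the metric names are assumptions about the result layout. Unlike a real scheduler, this sketch silently ignores thresholds whose metric is missing from the result.

```python
from typing import Any, Dict


def flatten(result: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested result dict into slash-separated keys."""
    flat: Dict[str, Any] = {}
    for key, value in result.items():
        name = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def should_stop(result: Dict[str, Any], stop: Dict[str, float]) -> bool:
    """Return True as soon as any reported metric reaches its threshold."""
    flat = flatten(result)
    return any(
        key in flat and flat[key] >= threshold for key, threshold in stop.items()
    )


# Assumed example criteria: stop after 100 iterations or at a mean return of 450.
stop_criteria = {
    "training_iteration": 100,
    "env_runners/episode_return_mean": 450.0,
}
```

A manual loop could call `should_stop(algo.train(), stop_criteria)` after each iteration and break when it returns True.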
Use Ray Tune’s stop config to end a run automatically once a metric threshold or iteration limit is reached.
Next steps
Checkpoints
Save and restore RLlib state.
Offline RL
Skip rollouts and train from logged data.