

This page collects patterns that show up repeatedly in production Ray applications, paired with the anti-patterns to avoid.

Pipelines: stage tasks with object refs

Pattern. Build pipelines by passing ObjectRefs between tasks. Ray transfers data between nodes as needed and can schedule downstream stages near upstream outputs.
import ray

@ray.remote
def load(uri): ...

@ray.remote
def transform(data): ...

@ray.remote
def score(data): ...

results = [score.remote(transform.remote(load.remote(u))) for u in uris]
ray.get(results)
Anti-pattern. Don't call ray.get between stages; blocking on each intermediate result forces the stages to run one at a time instead of as a pipeline.

Fan out with ray.put

Pattern. When many tasks read the same large object, write it once with ray.put and pass the resulting ref.
big = ray.put(load_dataset())
results = ray.get([process.remote(big, i) for i in range(1000)])
Anti-pattern. Passing a multi-megabyte argument directly into every .remote() call re-serializes the data each time and inflates worker memory.

Use actors for warm models

Pattern. Load a model once into an actor, then call its methods repeatedly.
@ray.remote(num_gpus=1)
class Predictor:
    def __init__(self):
        self.model = load_model()

    def predict(self, batch):
        return self.model(batch)

predictor = Predictor.remote()
ray.get([predictor.predict.remote(b) for b in batches])
Anti-pattern. Loading the model inside a task means every task pays the load cost.

Pool of actors for parallel inference

Pattern. Create a pool of actors and round-robin requests across them.
from ray.util import ActorPool

pool = ActorPool([Predictor.remote() for _ in range(8)])
for result in pool.map(lambda actor, item: actor.predict.remote(item), inputs):
    ...

Limit pending tasks with ray.wait

Pattern. When producing more tasks than memory can hold, gate submission with ray.wait.
pending = []
for item in stream:
    if len(pending) >= 100:
        done, pending = ray.wait(pending, num_returns=10)
    pending.append(process.remote(item))
ray.get(pending)
Anti-pattern. Submitting millions of tasks at once causes the scheduler and object store to back up.

Use placement groups for gang scheduling

Pattern. When you need N workers running together, reserve them with a placement group before submitting work. See placement groups.

Batch small tasks

Pattern. Many small tasks add scheduling overhead. Batch them when possible.
@ray.remote
def process_batch(items):
    return [process(i) for i in items]

batches = [items[i:i+128] for i in range(0, len(items), 128)]
results = ray.get([process_batch.remote(b) for b in batches])
Anti-pattern. Submitting one task per row of a dataset is rarely the right shape — use Ray Data for that.

Don’t ray.get inside tasks

Calling ray.get from inside a remote function blocks a worker process (and can deadlock when every worker ends up waiting on another task), defeating the point of distribution. If a task needs the output of another, pass the ObjectRef as an argument instead; Ray resolves it to a value before the task runs.

Avoid global state

Tasks and actors run in separate processes. Module-level globals modified inside a worker won’t be visible to the driver or to other workers.

Next steps

Walkthrough

See these patterns in a worked example.

Handling dependencies

Runtime environments for per-task and per-actor dependencies.