

Ray Core lets you turn any Python function or class into a distributed computation. This walkthrough covers the three primitives you’ll use every day.

Start Ray

import ray
ray.init()
ray.init() starts a local Ray runtime. To connect to an existing cluster, pass address="auto" or the head node address.

Remote functions (tasks)

Add @ray.remote to a function to make it a task. Calling .remote() schedules the task on the cluster and returns an ObjectRef that you resolve with ray.get.
@ray.remote
def f(x):
    return x * x

futures = [f.remote(i) for i in range(8)]
results = ray.get(futures)
Tasks run in parallel across all available CPUs (and across the cluster, when connected to one).

Specify resources

@ray.remote(num_cpus=2, num_gpus=1)
def train_model(data):
    ...
You can also override resources per call:
train_model.options(num_gpus=2).remote(...)

Pass objects between tasks

ObjectRef values are passed directly between tasks; Ray handles transferring data efficiently.
@ray.remote
def add(a, b):
    return a + b

@ray.remote
def double(x):
    return x * 2

a = add.remote(1, 2)
b = double.remote(a)
print(ray.get(b))  # 6

Remote classes (actors)

Actors are Python classes that run as long-lived workers and hold state across method calls.
@ray.remote
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

    def get(self):
        return self.value

counter = Counter.remote()
ray.get([counter.increment.remote() for _ in range(5)])
print(ray.get(counter.get.remote()))  # 5
Actors are useful for:
  • Loading a model into GPU memory once and reusing it across calls
  • Maintaining a shared parameter server, replay buffer, or accumulator
  • Wrapping any service that benefits from warm state

Objects

Use ray.put to place a Python object in the distributed object store, returning an ObjectRef. This avoids re-serializing the same data into every task call.
import numpy as np

big = np.ones((10_000, 10_000))
ref = ray.put(big)

@ray.remote
def total(arr):
    return arr.sum()

print(ray.get(total.remote(ref)))
Multiple tasks on the same node share the object via shared memory, with zero-copy reads.

Wait for partial results

Use ray.wait to process results as they become available.
refs = [f.remote(i) for i in range(20)]
ready, not_ready = ray.wait(refs, num_returns=5)
# ready contains 5 finished refs; the rest are still in flight

Cancellation

Use ray.cancel on an ObjectRef to stop a task whose result you no longer need.
ref = slow_task.remote()
ray.cancel(ref)

Putting it together

Here’s a full example that downloads URLs in parallel and summarizes their content with a stateful actor.
import requests
import ray

ray.init()

@ray.remote
def fetch(url: str) -> str:
    return requests.get(url, timeout=10).text

@ray.remote
class Summary:
    def __init__(self):
        self.total_chars = 0
        self.docs = 0

    def add(self, text: str):
        self.total_chars += len(text)
        self.docs += 1

    def report(self):
        return {"docs": self.docs, "avg_chars": self.total_chars // max(1, self.docs)}

urls = ["https://example.com" for _ in range(10)]
summary = Summary.remote()
texts = ray.get([fetch.remote(u) for u in urls])
ray.get([summary.add.remote(t) for t in texts])
print(ray.get(summary.report.remote()))

Next steps

Key concepts

Tasks, actors, objects, scheduling, and resources in depth.

Patterns and anti-patterns

Battle-tested guidance for structuring Ray applications.