The KubeRay operator runs Ray on Kubernetes. It manages three custom resources:
| CRD | Purpose |
|---|---|
| RayCluster | A long-running Ray cluster. |
| RayJob | Ephemeral cluster that runs a single Ray job and then tears down. |
| RayService | Long-running Ray Serve deployment with rolling upgrades and zero-downtime updates. |
Prerequisites
- Kubernetes 1.23+
- Helm 3.x
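- A cluster with at least 4 vCPUs and 8 GiB of memory free. The RayCluster example below requests 6 vCPUs and 12 GiB in total for the head plus one worker, so allow for that if you run it as written.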
Install KubeRay
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator -n kuberay-system --create-namespace
Create a RayCluster
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: hello
spec:
  rayVersion: "2.43.0"
  enableInTreeAutoscaling: true
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.43.0
            resources:
              requests: { cpu: "2", memory: "4Gi" }
              limits: { cpu: "2", memory: "4Gi" }
  workerGroupSpecs:
    - groupName: cpu
      replicas: 1
      minReplicas: 0
      maxReplicas: 10
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.43.0
              resources:
                requests: { cpu: "4", memory: "8Gi" }
                limits: { cpu: "4", memory: "8Gi" }
kubectl apply -f raycluster.yaml
kubectl get raycluster
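To see how much capacity the manifest above can ask for as the worker group scales, the requests can be added up by hand or with a short script. The following is a minimal sketch, not part of KubeRay: the helper names (`parse_cpu`, `parse_memory`, `total_requests`) are hypothetical, the quantity parsing covers only the suffixes used here, and the totals ignore sidecar containers such as the autoscaler that runs alongside the head when in-tree autoscaling is enabled.

```python
"""Estimate the total resource requests of the `hello` RayCluster (illustrative only)."""

# Binary suffixes used by Kubernetes memory quantities; only the ones this sketch needs.
_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "2" or "500m" to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "4Gi" to bytes (binary suffixes only)."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def total_requests(head: dict, worker: dict, replicas: int) -> tuple[float, float]:
    """Return (cpu_cores, memory_gib) requested by one head pod plus `replicas` workers."""
    cpu = parse_cpu(head["cpu"]) + replicas * parse_cpu(worker["cpu"])
    mem = parse_memory(head["memory"]) + replicas * parse_memory(worker["memory"])
    return cpu, mem / 2**30


# Requests copied from the manifest above.
HEAD = {"cpu": "2", "memory": "4Gi"}
WORKER = {"cpu": "4", "memory": "8Gi"}

for replicas in (0, 1, 10):  # minReplicas, replicas, maxReplicas
    cpu, mem_gib = total_requests(HEAD, WORKER, replicas)
    print(f"{replicas:>2} workers: {cpu:g} CPU, {mem_gib:g} GiB")
```

With the values above, the cluster requests 2 CPU and 4 GiB with no workers, 6 CPU and 12 GiB with one worker, and 42 CPU and 84 GiB at `maxReplicas`.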
Submit a job
RAY_ADDRESS=http://hello-head-svc.default.svc:8265 \
ray job submit -- python my_script.py
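The same submission can be made without the `ray` CLI by posting to the dashboard's Jobs REST endpoint. The sketch below uses only the Python standard library and assumes the `/api/jobs/` endpoint on port 8265, the `default` namespace, and the `<cluster>-head-svc` service name from the command above. The helper names (`head_address`, `build_submit_request`, `submit`) are hypothetical. Like the CLI call above, it uploads no local files, so `my_script.py` is assumed to exist in the container image already.

```python
"""Submit a Ray job through the dashboard's REST endpoint using only the standard library."""
import json
import urllib.request


def head_address(cluster: str, namespace: str = "default", port: int = 8265) -> str:
    """Build the in-cluster dashboard URL from the <cluster>-head-svc naming used above."""
    return f"http://{cluster}-head-svc.{namespace}.svc:{port}"


def build_submit_request(address: str, entrypoint: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /api/jobs/ carrying the job entrypoint."""
    body = json.dumps({"entrypoint": entrypoint}).encode("utf-8")
    return urllib.request.Request(
        url=f"{address}/api/jobs/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(address: str, entrypoint: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response from the dashboard."""
    request = build_submit_request(address, entrypoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request needs no network access; only submit() contacts the cluster.
request = build_submit_request(head_address("hello"), "python my_script.py")
print(request.get_method(), request.full_url)
```

Calling `submit(head_address("hello"), "python my_script.py")` only works from somewhere that can resolve the head service, such as a pod inside the cluster. From a workstation, one option is to forward the port with `kubectl port-forward svc/hello-head-svc 8265:8265` and pass `http://127.0.0.1:8265` as the address.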
Or use a RayJob resource:
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: my-job
spec:
  entrypoint: python my_script.py
  rayClusterSpec:
    ...
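Because `kubectl` accepts JSON manifests as well as YAML, a RayJob like the one above can also be generated from a script, for example to reuse one cluster definition across several jobs. This is a minimal sketch under stated assumptions: `ray_job_manifest` is a hypothetical helper, and the head-only `cluster_spec` is a placeholder for illustration, not the spec elided above.

```python
"""Generate a RayJob manifest as JSON (kubectl accepts JSON as well as YAML)."""
import json


def ray_job_manifest(name: str, entrypoint: str, ray_cluster_spec: dict) -> dict:
    """Wrap a RayCluster-style spec in a RayJob with the fields shown above."""
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayJob",
        "metadata": {"name": name},
        "spec": {"entrypoint": entrypoint, "rayClusterSpec": ray_cluster_spec},
    }


# Hypothetical head-only cluster spec, used purely for illustration.
cluster_spec = {
    "rayVersion": "2.43.0",
    "headGroupSpec": {
        "rayStartParams": {},
        "template": {
            "spec": {
                "containers": [
                    {"name": "ray-head", "image": "rayproject/ray:2.43.0"}
                ]
            }
        },
    },
}

manifest = ray_job_manifest("my-job", "python my_script.py", cluster_spec)
print(json.dumps(manifest, indent=2))
```

If the script is saved as, say, `make_rayjob.py` (a hypothetical file name), its output can be piped straight into the cluster with `python make_rayjob.py | kubectl apply -f -`.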
Deploy a Ray Serve app
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: my-service
spec:
  serveConfigV2: |
    applications:
      - name: app
        import_path: my_module:app
        deployments:
          - name: Service
            num_replicas: 4
  rayClusterConfig:
    ...
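KubeRay applies changes to serveConfigV2 in place on the running RayCluster. Changes to rayClusterConfig trigger a zero-downtime upgrade: KubeRay creates a new RayCluster, waits for it to become ready, and then switches traffic to it.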
Next steps
- Quickstart: RayCluster. Create your first RayCluster.
- User guides. Configuration, autoscaling, GPU, storage, observability.