A RayCluster is the simplest KubeRay resource: a head pod and one or more worker groups. Use it for interactive workloads, multi-tenant clusters, and any case where you want a Ray cluster to outlive a single job.
## Manifest

Save the following as `raycluster.yaml`:
```yaml
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: my-cluster
spec:
  rayVersion: "2.43.0"
  enableInTreeAutoscaling: true
  headGroupSpec:
    rayStartParams:
      dashboard-host: "0.0.0.0"   # expose the dashboard beyond the pod
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.43.0
            ports:
              - containerPort: 6379    # GCS
              - containerPort: 8265    # dashboard / Ray Jobs API
              - containerPort: 10001   # Ray client
            resources:
              requests: { cpu: "2", memory: "4Gi" }
              limits: { cpu: "2", memory: "4Gi" }
  workerGroupSpecs:
    - groupName: cpu
      replicas: 1
      minReplicas: 0
      maxReplicas: 10
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.43.0
              resources:
                requests: { cpu: "4", memory: "8Gi" }
                limits: { cpu: "4", memory: "8Gi" }
```
Apply it and watch the cluster come up:

```bash
kubectl apply -f raycluster.yaml
kubectl get raycluster my-cluster -w
```
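Once the cluster is ready you should see one head pod and one `cpu` worker pod. KubeRay labels the pods it creates with `ray.io/cluster` (assumed here; check your KubeRay version), which makes them easy to list:

```bash
# List this cluster's pods; assumes KubeRay's ray.io/cluster pod label.
kubectl get pods -l ray.io/cluster=my-cluster
```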
## Add a GPU worker group

Append a second entry to `workerGroupSpecs`:
```yaml
- groupName: gpu
  replicas: 0
  minReplicas: 0
  maxReplicas: 8
  rayStartParams: {}   # matches the cpu group; required by older KubeRay versions
  template:
    spec:
      containers:
        - name: ray-worker
          image: rayproject/ray:2.43.0-gpu
          resources:
            requests:
              cpu: "8"
              memory: "32Gi"
              nvidia.com/gpu: 1
            limits:
              cpu: "8"
              memory: "32Gi"
              nvidia.com/gpu: 1
```
The autoscaler launches GPU workers on demand. Idle GPU workers terminate after the configured timeout.
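The timeout is configured under `spec.autoscalerOptions`; a minimal sketch, assuming KubeRay's standard autoscaler fields (the default idle timeout is 60 seconds):

```yaml
spec:
  enableInTreeAutoscaling: true
  autoscalerOptions:
    # Seconds a worker may sit idle before it is scaled down (default: 60).
    idleTimeoutSeconds: 300
    # Conservative rate-limits upscaling; Default does not.
    upscalingMode: Conservative
```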
## Connect to the cluster
```bash
# Forward the dashboard and Ray Jobs API port (leave this running).
kubectl port-forward service/my-cluster-head-svc 8265:8265

# In a second terminal: --working-dir uploads the local script to the cluster.
ray job submit --address http://localhost:8265 --working-dir . -- python my_script.py
```
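`my_script.py` can be any Ray program; its contents are not part of this guide, but a minimal illustrative sketch looks like this:

```python
# my_script.py -- illustrative example; any Ray program works here.
import ray

ray.init()  # under `ray job submit`, this connects to the running cluster


@ray.remote
def square(x: int) -> int:
    return x * x


# Fan ten tasks out across the cluster and gather the results.
print(ray.get([square.remote(i) for i in range(10)]))
```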
## Update the cluster

Edit the manifest (for example, bump `maxReplicas`) and reapply it with `kubectl apply`. KubeRay reconciles the difference.
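Equivalently, you can patch the live resource. A sketch using a JSON patch; the index `0` assumes the `cpu` group is listed first:

```bash
# Raise maxReplicas on the first worker group from 10 to 20.
kubectl patch raycluster my-cluster --type json \
  -p '[{"op": "replace", "path": "/spec/workerGroupSpecs/0/maxReplicas", "value": 20}]'
```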
## Tear down

```bash
kubectl delete raycluster my-cluster
```
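Deleting the RayCluster removes the head and worker pods along with the head service. To confirm nothing is left (again assuming the `ray.io/cluster` label):

```bash
kubectl get raycluster my-cluster          # should report NotFound
kubectl get pods -l ray.io/cluster=my-cluster   # should list nothing
```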
## Next steps

- **RayJob**: run jobs on ephemeral clusters.
- **Autoscaling**: configure scaling behavior.