A single Ray cluster can host any number of Serve applications. Each application has its own deployment graph, route prefix, and configuration.
Define multiple apps
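As a rough illustration, a multi-app config file lists each application under a top-level applications key, with its own name, route prefix, and import path. The sketch below assumes two apps named search and classify (matching the route prefixes used later on this page). The import paths and deployment names are hypothetical placeholders, not values from a real project.

```yaml
# multi-app.yaml (illustrative sketch, not a drop-in config)
applications:
  - name: search
    route_prefix: /search
    import_path: search_app:app      # hypothetical module:variable
    deployments:
      - name: SearchModel            # hypothetical deployment name
        num_replicas: 2
  - name: classify
    route_prefix: /classify
    import_path: classify_app:app    # hypothetical module:variable
    deployments:
      - name: Classifier             # hypothetical deployment name
        num_replicas: 1
```

Each entry needs a unique name and a unique route prefix so that requests can be routed to the right application.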
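Deploy
Pass the config file to serve deploy (for example, serve deploy multi-app.yaml) to submit every application it defines to the running cluster.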
Update one app
Edit multi-app.yaml (e.g., bump num_replicas for search) and re-run serve deploy. Only the affected application is reconfigured.
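A minimal sketch of such an edit, assuming the file defines two apps named search and classify; the import paths, deployment names, and replica counts are hypothetical. Only the search entry changes. The classify entry stays in the file as it was, since the config describes the full desired state of the cluster.

```yaml
# multi-app.yaml after the edit (illustrative sketch)
applications:
  - name: search
    route_prefix: /search
    import_path: search_app:app      # hypothetical module:variable
    deployments:
      - name: SearchModel            # hypothetical deployment name
        num_replicas: 4              # bumped; the previous value is assumed to be lower
  - name: classify                   # unchanged entry, kept in the file
    route_prefix: /classify
    import_path: classify_app:app    # hypothetical module:variable
    deployments:
      - name: Classifier             # hypothetical deployment name
        num_replicas: 1
```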
Independent autoscaling
Each app’s deployments autoscale based on their own traffic. A burst on /search doesn’t pull replicas away from /classify.
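One way to express this is to give each deployment its own autoscaling_config block, as sketched below. The app names follow the route prefixes above. The import paths, deployment names, and numeric values are hypothetical, and the field names follow recent Ray Serve releases, so they may differ in older versions.

```yaml
# Per-deployment autoscaling (illustrative sketch)
applications:
  - name: search
    route_prefix: /search
    import_path: search_app:app        # hypothetical module:variable
    deployments:
      - name: SearchModel              # hypothetical deployment name
        autoscaling_config:
          min_replicas: 1
          max_replicas: 10             # search can grow under bursty traffic
          target_ongoing_requests: 2   # per-replica load target (assumed value)
  - name: classify
    route_prefix: /classify
    import_path: classify_app:app      # hypothetical module:variable
    deployments:
      - name: Classifier               # hypothetical deployment name
        autoscaling_config:
          min_replicas: 2              # floor that applies to classify only
          max_replicas: 4
          target_ongoing_requests: 5   # per-replica load target (assumed value)
```

Note that num_replicas is left out here: a fixed integer replica count and an autoscaling_config are alternative ways to size a deployment.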
Shared resources
Apps share the cluster’s resource pool. If both apps’ autoscalers want every available GPU at the same time, the controller arbitrates via the placement-group scheduler.
When to split
Use multi-app when:
- Different teams own different services on the same cluster.
- One application has different scaling characteristics than another.
- You want fault isolation between independent apps.
Next steps
Develop and deploy
Application lifecycle management.
Production guide
Multi-tenant production guidance.