task in an individual pod ❏ Steps with heavier workloads run their critical tasks on a Ray cluster ❏ Intermediate datasets and ML models are stored in S3
of operation of software where multiple independent instances of one or multiple applications operate in a shared environment. The instances (tenants) are logically isolated, but physically integrated.” Soft Multi-tenancy: easier to implement. Hard Multi-tenancy: harder to implement, stricter isolation of tenants, generally “better”.
Ray cluster creation for new tenants 2. Cost-effective We operate fewer Ray clusters overall, which saves both cost and cluster-creation time 3. Reduced Maintenance Overhead The number of clusters under maintenance is drastically lower
faster PoC process ❏ Not designed for a high number of concurrent connections ❏ Run Python code on a Ray cluster as if you were running it on your local machine ❏ Very minimal changes required in the codebase
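The "as if local" workflow above is Ray Client, which connects to a remote cluster through `ray.init("ray://<host>:<port>")`. A minimal sketch, assuming a reachable head node (the host name and task body are illustrative; 10001 is Ray Client's default port):

```python
def client_address(host: str, port: int = 10001) -> str:
    """Build a Ray Client address; 10001 is the default client port."""
    return f"ray://{host}:{port}"


def run_remotely(host: str) -> int:
    # Lazy import: the helper above stays usable without Ray installed.
    import ray

    ray.init(client_address(host))  # connect like a local ray.init()

    @ray.remote
    def square(x):
        return x * x

    # Existing code needs only the @ray.remote decorator and .remote() calls.
    return ray.get(square.remote(4))
```

The only changes to an existing script are the connection address and the `@ray.remote` annotations, which is why the PoC loop is fast.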
Ray Job submission is a mechanism to submit locally developed and tested applications to a remote Ray cluster. It simplifies the experience of packaging, deploying, and managing a Ray application.
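A sketch of submitting a job programmatically with Ray's `JobSubmissionClient`, assuming the cluster dashboard address is known (the entrypoint and paths are placeholders, not values from this talk):

```python
def build_submission(entrypoint: str, working_dir: str) -> dict:
    """Pure helper that assembles the arguments for submit_job()."""
    return {
        "entrypoint": entrypoint,
        "runtime_env": {"working_dir": working_dir},
    }


def submit(address: str, entrypoint: str, working_dir: str) -> str:
    # Lazy import so the helper above is testable without a live cluster.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)  # e.g. "http://<head-node>:8265"
    return client.submit_job(**build_submission(entrypoint, working_dir))
```

Packaging (uploading `working_dir`), deployment, and job-status tracking are all handled by the submission machinery, which is the simplification the slide refers to.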
scripts With RuntimeEnv Object ❏ working_dir: the working directory for the Ray workers (local or S3) ❏ py_modules: Python modules to be made available for import in the Ray workers ❏ pip: a list of pip requirements ❏ env_vars: environment variables to set … and more! With Dictionary
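In dictionary form, the options above look like this (the bucket, module path, and package names are illustrative placeholders):

```python
# A runtime environment expressed as a plain dictionary; Ray accepts this
# directly wherever a runtime_env is expected (ray.init, job submission, ...).
runtime_env = {
    "working_dir": "s3://example-bucket/tenant-a/job.zip",  # local path or S3
    "py_modules": ["./my_module"],                # importable on the workers
    "pip": ["pandas", "scikit-learn"],            # installed per runtime env
    "env_vars": {"TENANT_ID": "tenant-a"},        # set on every worker
}
```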
will be located in its own S3 path ❏ Each tenant’s data and models are logically isolated Each job has access to its own S3 path, specified by `working_dir`
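One way to realize the per-tenant layout is to derive each job's `working_dir` from the tenant and job identifiers; this helper is a hypothetical sketch of such a convention, not the talk's actual path scheme:

```python
def tenant_working_dir(bucket: str, tenant_id: str, job_id: str) -> str:
    """Each tenant's code and data live under their own S3 prefix,
    so jobs from different tenants never share a working_dir."""
    return f"s3://{bucket}/{tenant_id}/{job_id}/package.zip"


# Used when building the job's runtime environment:
runtime_env = {"working_dir": tenant_working_dir("ml-jobs", "tenant-a", "job-42")}
```

Because the isolation is only logical, it should be paired with S3 bucket policies that restrict each tenant's credentials to their own prefix.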
This tells the Ray scheduler to reserve the required resources, ensuring the tasks’ requirements do not exceed the nodes’ capacity ❏ Specifying resources does not enforce physical isolation of resources
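Declaring the requirements is done through `@ray.remote` options; a sketch, assuming Ray is available (the task body and the numbers are illustrative, and note that Ray's `memory` option is specified in bytes):

```python
GiB = 1024 ** 3


def task_options(num_cpus: int, memory_gib: int) -> dict:
    """Options for @ray.remote; memory must be given in bytes."""
    return {"num_cpus": num_cpus, "memory": memory_gib * GiB}


def make_task():
    # Lazy import so the helper above is testable without a cluster.
    import ray

    # The scheduler reserves 2 logical CPUs and 4 GiB before placing the task,
    # but this is an accounting reservation, not a cgroup/physical limit.
    @ray.remote(**task_options(num_cpus=2, memory_gib=4))
    def preprocess(batch):
        return [x * 2 for x in batch]

    return preprocess
```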
Submission with Ray Job submission 2. Tenant Isolation with datasource separation, Python dependency management, and environment variables 3. Resource Management by providing CPU and memory requirements, alongside task queue integration