Leveraging the Possibilities of Ray Serve in Implementing a Scalable, Fully Automated Digital Authentication Service (Tanja Bayer, Widas Technologie Services GmbH)

Leveraging the Possibilities of Ray Serve in Implementing a Scalable, Fully Automated Digital Authentication Service
Tanja Bayer, Machine Learning Engineer @ Widas Technologie Services GmbH
Ray Summit 2021

About Me
• Tanja Bayer
• M.Sc. Industrial Engineering and Management @ Karlsruhe Institute of Technology
• Machine Learning Engineer @ Widas Technologie Services GmbH
• Focus on Computer Vision and Deployment of Machine Learning Services

By the end of this talk, you will know:
• What a fully automated digital authentication service is
• Why we use Ray and Ray Serve to implement it
• How our initial PoC evolved into a scalable and stable architecture
• How we addressed some challenges with Ray-specific implementations

What Is a Fully Automated Digital Authentication Service?
• Online authentication for contracts
• Requires a valid identity card
• Completely automated process

What Does the Process Look Like?
1 Identity Card Scan (Front Side)
2 Identity Card Scan (Back Side)
3 Face Scan

Understanding the Data Processing
[Diagram: Input (steps 1, 2, 3) → Processing → Output]

How Did We Get Started?
• aiohttp + gunicorn for asynchronous requests
• Kafka for interprocess communication
• Redis for data sharing

Implementation Challenges
• Combination of different REST triggers
• Algorithms are very CPU-heavy
• Models are GPU-heavy
• Huge differences in computation time between tasks
• Heavy interconnection between the outcomes of different models

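Why Use Ray?
[Table: ratings of four options per criterion, in slide column order; the only surviving column label is "Microservices"]
• Distributed and Parallel Processing: ++ | ++ | ++ | ++
• Scalability: ++ | ++ | ++ | ++
• Support for Python + ML: ++ | • | + | +
• Handling large data: ++ | -- | ? | --
• Large Community and Adoption: ++ | ++ | + | Depends on framework used
Legend: ++ Exceeds requirements, • Requirements are met, -- Does not meet requirements, ? No rating possible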

Why Ray Serve?
• One framework for everything
• Native support for composing ML models
• Easy way to do shadow testing

Architecture – General
[Diagram: Ray-Head Node]

Architecture – CPU Cluster
[Diagram: Ray-Head Node]
• Resources for Serve backends are configured when scaling them up
• Assign fewer resources per backend
• Advertise more CPUs to Ray on this cluster than physical cores are available (x and y are placeholders):
    ray start --address=$HOST_VIP --num-cpus=$NO_CORES+x --resources='{"CPU_RESOURCES": y}'
• Limit library-level threading in each process: OMP_NUM_THREADS=1 and cv2.setNumThreads(0)
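The effect of advertising more CPUs than physical cores, combined with a smaller CPU request per backend replica, can be estimated with simple arithmetic, because Ray schedules against the declared (logical) resource quantities. The sketch below is illustrative only: the function name `max_replicas` and all numbers are hypothetical and not taken from the talk.

```python
from fractions import Fraction


def max_replicas(physical_cores: int, extra_cpus: int, cpus_per_replica: str) -> int:
    """Replicas that fit on one node under logical CPU accounting.

    physical_cores + extra_cpus mirrors --num-cpus=$NO_CORES+x on the slide;
    cpus_per_replica is the (possibly fractional) CPU request of one replica.
    """
    advertised = Fraction(physical_cores + extra_cpus)
    per_replica = Fraction(cpus_per_replica)
    if per_replica <= 0:
        raise ValueError("cpus_per_replica must be positive")
    # Exact rational arithmetic avoids float rounding for requests like 0.1.
    return int(advertised // per_replica)


# Hypothetical node: 8 physical cores, 4 extra logical CPUs advertised to Ray.
print(max_replicas(8, 4, "1"))    # 12
print(max_replicas(8, 4, "0.5"))  # 24
print(max_replicas(8, 0, "0.5"))  # 16
```

Oversubscribing in this way only pays off when replicas are frequently idle or single-threaded, which is presumably why the slide pairs it with the thread limits.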

Architecture – GPU Cluster
[Diagram: Ray-Head Node]
    ray start --head --num-gpus=x
• ML Models
• Head Node
• Starting the Ray Serve backends

Architecture – API Cluster
[Diagram: Ray-Head Node]
    ray start --address=$HOST_VIP --num-cpus=x --resources='{"API_RESOURCES": y, "ACTOR_RESOURCES": z}'
• High availability
• API Backends & Data Actors

Using Ray Actors for Syncing Data
How can tasks be synchronized?

    import asyncio
    import ray

    @ray.remote(num_cpus=0.1)
    class CaseActor:
        def __init__(self):
            # One event per upload step; set once that upload has arrived.
            self.ready_face = asyncio.Event()
            self.ready_card_front = asyncio.Event()
            self.ready_card_back = asyncio.Event()

        async def wait(self, step: str):
            # Block the caller until the given step has been triggered.
            if step == 'face':
                await self.ready_face.wait()
            elif step == 'front':
                await self.ready_card_front.wait()
            ...

        def trigger(self, step: str):
            # Called by the upload endpoint to release waiting tasks.
            if step == 'face':
                self.ready_face.set()
            elif step == 'front':
                self.ready_card_front.set()
            ...

        def is_initialized(self) -> bool:
            return True
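The wait/trigger pattern of the actor can be tried out without a Ray cluster. The following is a minimal sketch using only `asyncio`; the class `CaseEvents` and the helper `demo` are hypothetical stand-ins for the actor and its callers, not code from the talk.

```python
import asyncio


class CaseEvents:
    """Plain-asyncio stand-in for the CaseActor on the slide (no Ray involved)."""

    def __init__(self) -> None:
        self._events = {step: asyncio.Event() for step in ("face", "front", "back")}

    async def wait(self, step: str) -> str:
        await self._events[step].wait()
        return step

    def trigger(self, step: str) -> None:
        self._events[step].set()


async def demo() -> list:
    case = CaseEvents()
    log = []

    async def consumer(step: str) -> None:
        log.append(f"waiting:{step}")
        await case.wait(step)
        log.append(f"ready:{step}")

    task = asyncio.create_task(consumer("face"))
    await asyncio.sleep(0)   # let the consumer start and block on the event
    log.append("upload:face")
    case.trigger("face")     # simulates the upload endpoint firing
    await task
    return log


result = asyncio.run(demo())
print(result)  # ['waiting:face', 'upload:face', 'ready:face']
```

The consumer is suspended, not busy-waiting, until the matching step is triggered, which is the behaviour the actor provides across processes.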

Writing the Main Logic

    import asyncio
    import ray
    from ray import serve

    class Main:
        def __init__(self):
            self.model_1 = serve.get_handle('model_1')
            self.model_2 = serve.get_handle('model_2')

        async def main_thread(self, case_id: str):
            actor = ray.get_actor(case_id)
            # References that resolve once the corresponding upload arrived.
            trigger_face = actor.wait.remote('face')
            trigger_front = actor.wait.remote('front')
            # model_1 depends on the face scan, model_2 on face and card front.
            handle_1 = self.model_1.remote(trigger_face)
            handle_2 = self.model_2.remote(trigger_face, trigger_front)
            finished, pending = await asyncio.wait([handle_1, handle_2])
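The dependency structure of the main logic can be sketched with plain `asyncio`, assuming that a model call starts working only once all of its trigger arguments have resolved, which appears to be the intent on the slide. The coroutine `model` and the function `main_logic` are hypothetical stand-ins for the Serve handles and for `main_thread`.

```python
import asyncio


async def model(name: str, *triggers: asyncio.Event) -> str:
    """Hypothetical stand-in for a Serve model call: waits for its inputs first."""
    for event in triggers:
        await event.wait()
    return f"{name}:done"


async def main_logic() -> list:
    face, front = asyncio.Event(), asyncio.Event()
    # model_1 needs only the face scan; model_2 needs face and card front.
    task_1 = asyncio.create_task(model("model_1", face))
    task_2 = asyncio.create_task(model("model_2", face, front))

    face.set()  # the face upload arrives first
    done, pending = await asyncio.wait(
        {task_1, task_2}, return_when=asyncio.FIRST_COMPLETED
    )
    first = sorted(t.result() for t in done)  # only model_1 can be finished here

    front.set()  # the card front arrives later
    await asyncio.wait(pending)
    return first + [task_2.result()]


order = asyncio.run(main_logic())
print(order)  # ['model_1:done', 'model_2:done']
```

Each model proceeds as soon as its own inputs are available, so a slow upload only delays the models that actually depend on it.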

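Initializing the Actor and Triggering the Events

    import uuid
    import ray
    # Request and JSONResponse are assumed to be Starlette's classes.

    async def setup(request: Request) -> JSONResponse:
        # str(): actor names and JSON bodies need a string, not a UUID object.
        case_id = str(uuid.uuid4())
        actor = CaseActor.options(name=case_id, lifetime='detached').remote()
        await actor.is_initialized.remote()  # wait until the actor is up
        main_thread.remote(case_id)          # start the main logic without awaiting it
        return JSONResponse({'case_id': case_id})

    async def upload(request: Request) -> JSONResponse:
        # Look up the case's actor by name and release the matching step.
        actor = ray.get_actor(request.query_params.get('case_id'))
        await actor.trigger.remote(request.query_params.get('upload_type'))
        return JSONResponse({'data': 'SUCCESS'})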
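The request flow of the two endpoints (create a case, return its id, later look the case up by that id) can be illustrated without Ray or a web framework. In this sketch the module-level dict `cases` is a hypothetical replacement for Ray's named actors, and the functions return JSON strings in place of HTTP responses.

```python
import json
import uuid

# Hypothetical in-process registry standing in for Ray's named actors.
cases: dict = {}


def setup() -> str:
    """Create a case and return its id as a JSON body."""
    case_id = str(uuid.uuid4())  # a UUID object itself is not JSON-serializable
    cases[case_id] = {"face": False, "front": False, "back": False}
    return json.dumps({"case_id": case_id})


def upload(case_id: str, upload_type: str) -> str:
    """Mark one upload step as received for an existing case."""
    state = cases[case_id]  # raises KeyError for an unknown case id
    if upload_type not in state:
        raise ValueError(f"unknown upload type: {upload_type!r}")
    state[upload_type] = True
    return json.dumps({"data": "SUCCESS"})


body = json.loads(setup())
print(upload(body["case_id"], "face"))  # {"data": "SUCCESS"}
```

Using the case id as the lookup key is what lets any API replica serve a later upload, since the id travels with the client's request.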

Sum Up & Outlook
• Processing time sped up by approx. 4x compared to the initial implementation
• Fewer frameworks in use
• Easy scalability and the possibility to provide the same service with different configurations
• Really excited to see new features coming in Ray and Ray Serve, e.g. automated load-based autoscaling for deployments

Thank You for Listening
Special thanks to Anyscale and the Ray Team for hosting this Summit
We are hiring!
Mail: [email protected]
Phone: +49 7044 95103 100
www.widas.de
