Leveraging the Possibilities of Ray Serve in Implementing a Scalable, Fully Automated Digital Authentication Service (Tanja Bayer, Widas Technologie Services GmbH)

Implementing an online video authentication service involves a wide variety of algorithms, ranging from classic decision-making models to neural networks and computer vision techniques. To reach a decision as fast as possible, these algorithms need to be parallelized to the full extent and executed on both CPU and GPU cores. At the same time, large amounts of video data have to be shared efficiently between tasks. Considering all requirements, Ray, and especially Ray Serve, offer an optimal platform. In this talk, Tanja Bayer shares reasons to choose Ray Serve over other platforms like Kafka Streams or PySpark. She describes the process of implementing Widas Technologie Services GmbH's service with Ray and takes a closer look at how they tackled some of the challenges, like synchronizing the output of their models to combine their feedback into one final decision.

Anyscale

July 20, 2021
Transcript

  1. Leveraging the Possibilities of Ray Serve in Implementing a Scalable, Fully Automated Digital Authentication Service
    Tanja Bayer, Machine Learning Engineer @ Widas Technologie Services GmbH
    Ray Summit - 2021
  2. About Me
    • Tanja Bayer
    • M.Sc. Industrial Engineering and Management @ Karlsruhe Institute of Technology
    • Machine Learning Engineer @ Widas Technologie Services GmbH
    • Focus on Computer Vision and Deployment of Machine Learning Services
  3. By the end of this talk, you will know:
    • What a fully automated digital authentication service is
    • Why we use Ray and Ray Serve to implement it
    • How our initial PoC evolved into a scalable and stable architecture
    • How we addressed some challenges by using Ray-specific implementations
  4. What is a Fully Automated Digital Authentication Service?
    • Online authentication for contracts
    • Requires a valid identity card
    • Completely automated process
  5. What Does the Process Look Like?
    1 Identity Card Scan (Front Side)
    2 Identity Card Scan (Back Side)
    3 Face Scan
  6. How did we get Started?
    • aiohttp + gunicorn for asynchronous requests
    • Kafka for interprocess communication
    • Redis for data sharing
  7. Implementation Challenges
    • Combination of different REST triggers
    • Algorithms are really CPU heavy
    • Models are GPU heavy
    • Huge difference between computation time of different tasks
    • Heavy interconnection between different model outcomes
  8. Why Use Ray?
    Comparison table; "Microservices" is the only column label present in the transcript, and the ratings of each row are listed in slide order.
    Distributed and Parallel Processing: ++ ++ ++ ++
    Scalability: ++ ++ ++ ++
    Support for Python + ML: ++ • + +
    Handling large data: ++ -- ? --
    Large Community and Adoption: ++ ++ + Depends on Framework used
    Legend: ++ Exceeds requirements, • Requirements are met, -- Does not meet requirements, ? No rating possible
  9. Why Ray Serve?
    • One framework for everything
    • Native support of composing ML models
    • Easy possibility of Shadow-Testing
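Shadow testing, as named on slide 9, means that a candidate model receives a copy of live requests while only the current model's answer is returned to the caller. The following is a framework-free sketch of that idea, not Ray Serve's own mechanism; `serve_with_shadow`, `current_model`, and `candidate_model` are hypothetical names, and the score thresholds are made up for illustration.

```python
from typing import Callable, List, Tuple

Model = Callable[[dict], str]

def serve_with_shadow(request: dict, primary: Model, shadow: Model,
                      mismatches: List[Tuple[dict, str, str]]) -> str:
    """Answer with the primary model; run the shadow model only for comparison."""
    answer = primary(request)
    try:
        shadow_answer = shadow(request)
    except Exception as exc:  # a failing candidate must never affect callers
        shadow_answer = f"error: {exc!r}"
    if shadow_answer != answer:
        mismatches.append((request, answer, shadow_answer))
    return answer

# Toy stand-ins for a live model and a candidate under evaluation.
def current_model(request: dict) -> str:
    return "accept" if request["score"] >= 0.5 else "reject"

def candidate_model(request: dict) -> str:
    return "accept" if request["score"] >= 0.7 else "reject"

mismatches: List[Tuple[dict, str, str]] = []
answers = [serve_with_shadow({"score": s}, current_model, candidate_model, mismatches)
           for s in (0.4, 0.6, 0.9)]
print(answers, mismatches)
```

Callers only ever see the primary answers; the recorded mismatches are what one would review before promoting the candidate.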
  10. Architecture – CPU Cluster
    • Resources for serve backends are configured when scaling them up
    • Assign fewer resources per backend
    • Put more nodes (more cpu) on this cluster than cores available
      ray start --address=$HOST_VIP --num-cpus=$NO_CORES+x --resources='{"CPU_RESOURCES": y}'
    • OMP_NUM_THREADS=1
    • cv2.setNumThreads(0)
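The two thread settings on slide 10 can be applied when a worker process starts. This is a minimal sketch, assuming the goal is to keep each worker single-threaded so that parallelism is controlled by the number of backends rather than by library thread pools. The helper name `limit_native_threads` is hypothetical, and the OpenCV call is skipped when `cv2` is not installed.

```python
import os

def limit_native_threads() -> dict:
    """Cap library-level threading so each worker process stays on one core."""
    # OpenMP-backed libraries read this variable when they initialise,
    # so it has to be set before they are imported.
    os.environ["OMP_NUM_THREADS"] = "1"
    applied = {"OMP_NUM_THREADS": os.environ["OMP_NUM_THREADS"], "cv2": False}
    try:
        import cv2  # optional: only needed on the computer-vision workers
    except ImportError:
        return applied
    # 0 disables OpenCV's internal threading; its functions then run sequentially.
    cv2.setNumThreads(0)
    applied["cv2"] = True
    return applied

settings = limit_native_threads()
print(settings)
```

With more logical CPUs advertised to Ray than physical cores exist, a library that spawns its own thread pool per call would oversubscribe the machine even further, which is presumably why the slide pins both settings.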
  11. Architecture – GPU Cluster
    ray start --head --num-gpus=x
    Ray-Head Node
    • ML Models
    • Head Node
    • Starting the Ray Serve backends
  12. Architecture – API Cluster
    ray start --address=$HOST_VIP --num-cpus=x --resources='{"API_RESOURCES": y, "ACTOR_RESOURCES": z}'
    • High availability
    • API Backends & Data Actors
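  13. Using Ray Actors for Syncing Data

        import asyncio

        import ray

        @ray.remote(num_cpus=0.1)
        class CaseActor:
            def __init__(self):
                # One event per upload step of a case.
                self.ready_face = asyncio.Event()
                self.ready_card_front = asyncio.Event()
                self.ready_card_back = asyncio.Event()

            async def wait(self, step: str):
                # Suspends the caller until the matching upload has arrived.
                if step == 'face':
                    await self.ready_face.wait()
                elif step == 'front':
                    # The slide reads self.ready_front here; the attribute defined above is ready_card_front.
                    await self.ready_card_front.wait()
                ...

            def trigger(self, step: str):
                # Called by the upload endpoint once a step's data is available.
                if step == 'face':
                    self.ready_face.set()
                elif step == 'front':
                    self.ready_card_front.set()
                …

            def is_initialized(self) -> bool:
                return True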
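The synchronization idea on slide 13 can be tried without a Ray cluster. Below is a plain-asyncio stand-in for `CaseActor`, offered as a sketch under that assumption: `CaseSync` and `demo` are hypothetical names, and the dictionary of events replaces the slide's individual attributes.

```python
import asyncio

class CaseSync:
    """Plain-asyncio stand-in for the CaseActor on the slide (no Ray involved)."""

    def __init__(self) -> None:
        self._events = {step: asyncio.Event() for step in ("face", "front", "back")}

    async def wait(self, step: str) -> str:
        # Suspends until trigger(step) has been called.
        await self._events[step].wait()
        return step

    def trigger(self, step: str) -> None:
        self._events[step].set()


async def demo() -> list:
    case = CaseSync()
    log = []

    async def consumer(step: str) -> None:
        log.append(f"{await case.wait(step)} ready")

    tasks = [asyncio.create_task(consumer(s)) for s in ("face", "front")]
    await asyncio.sleep(0)      # let both consumers start waiting
    log.append("uploads arrive")
    case.trigger("front")       # uploads may arrive in any order
    case.trigger("face")
    await asyncio.gather(*tasks)
    return log

log = asyncio.run(demo())
print(log)
```

Neither consumer can log anything before its event is set, which is the property the actor provides across processes in the Ray version.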
  14. Writing the Main Logic

        import asyncio

        import ray
        from ray import serve

        class Main:
            def __init__(self):
                self.model_1 = serve.get_handle('model_1')
                self.model_2 = serve.get_handle('model_2')

            async def main_thread(self, case_id: str):
                actor = ray.get_actor(case_id)
                # These references resolve only once the matching upload was triggered.
                trigger_face = actor.wait.remote('face')
                trigger_front = actor.wait.remote('front')
                # Each model starts as soon as all of its inputs are ready.
                handle_1 = self.model_1.remote(trigger_face)
                handle_2 = self.model_2.remote(trigger_face, trigger_front)
                finished, pending = await asyncio.wait([handle_1, handle_2])
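How the model outputs from slide 14 might be combined into one final decision can be sketched with plain asyncio. The assumptions are loud ones: `model_1` and `model_2` here are toy coroutines rather than Ray Serve handles, the result dictionaries are invented, and the "every model must pass" rule is hypothetical, since the slides do not show the real decision logic.

```python
import asyncio

async def model_1(face_ready: asyncio.Event) -> dict:
    await face_ready.wait()                      # needs only the face scan
    return {"model": "model_1", "passed": True}

async def model_2(face_ready: asyncio.Event, front_ready: asyncio.Event) -> dict:
    # Needs both the face scan and the card front.
    await asyncio.gather(face_ready.wait(), front_ready.wait())
    return {"model": "model_2", "passed": True}

async def decide() -> bool:
    face_ready, front_ready = asyncio.Event(), asyncio.Event()
    # asyncio.wait needs Tasks/Futures, not bare coroutines, on Python 3.11+.
    tasks = [
        asyncio.create_task(model_1(face_ready)),
        asyncio.create_task(model_2(face_ready, front_ready)),
    ]
    face_ready.set()
    front_ready.set()
    finished, pending = await asyncio.wait(tasks)   # default: ALL_COMPLETED
    assert not pending
    # Hypothetical decision rule: every model has to pass.
    return all(task.result()["passed"] for task in finished)

approved = asyncio.run(decide())
print(approved)
```

Waiting with the default `ALL_COMPLETED` matches the slide's call; a service that wants to reject early on the first failing model could pass `return_when=asyncio.FIRST_COMPLETED` and inspect results as they arrive.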
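  15. Initializing the Actor and Triggering the Events

        import uuid

        import ray

        # Request and JSONResponse are assumed to be the Starlette classes; the slide does not show the import.

        async def setup(request: Request) -> JSONResponse:
            # str(...): actor names and JSON payloads need a string, not a UUID object.
            case_id = str(uuid.uuid4())
            actor = CaseActor.options(name=case_id, lifetime='detached').remote()
            # Make sure the actor is up before the main logic looks it up by name.
            await actor.is_initialized.remote()
            main_thread.remote(case_id)
            return JSONResponse({'case_id': case_id})

        async def upload(request: Request) -> JSONResponse:
            actor = ray.get_actor(request.query_params.get('case_id'))
            await actor.trigger.remote(request.query_params.get('upload_type'))
            return JSONResponse({'data': 'SUCCESS'})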
  16. Sum Up & Outlook
    • Speedup of processing time approx. x4 compared to initial implementation
    • Reduction of used frameworks
    • Easy scalability and possibility to provide the same service with different configurations
    • Really excited to see new features coming in Ray and Ray Serve, e.g. the automated load-based autoscaling feature for deployments
  17. Thank You for Listening
    Special thanks to Anyscale and the Ray Team for hosting this Summit
    We are hiring!
    Mail: [email protected]
    Phone: +49 7044 95103 100
    www.widas.de