Leveraging the Possibilities of Ray Serve in Implementing a Scalable, Fully Automated Digital Authentication Service (Tanja Bayer, Widas Technologie Services GmbH)

Implementing an online video authentication service involves a wide variety of algorithms, ranging from classic decision-making models to neural networks and computer vision techniques. To reach a decision as fast as possible, these algorithms need to be parallelized to the full extent and executed on both CPU and GPU cores. At the same time, large amounts of video data have to be shared efficiently between tasks. Considering all requirements, Ray, and especially Ray Serve, provide an optimal platform. In this talk, Tanja Bayer shares the reasons for choosing Ray Serve over other platforms such as Kafka Streams or PySpark. She describes the process of implementing Widas Technologie Services GmbH's service with Ray and takes a closer look at how the team tackled some of the challenges, such as synchronizing the output of their models to combine their feedback into one final decision.

Anyscale

July 20, 2021

Transcript

  1. Leveraging the Possibilities of Ray Serve in Implementing a
    Scalable, Fully Automated Digital Authentication Service
    Tanja Bayer
    Machine Learning Engineer @ Widas Technologie Services
    GmbH
    Ray Summit - 2021

  2. About Me
    •Tanja Bayer
    •M.Sc. Industrial Engineering
    and Management @ Karlsruhe Institute
    of Technology
    •Machine Learning Engineer @ Widas
    Technologie Services GmbH
    •Focus on Computer Vision
    and Deployment of Machine Learning
    Services

  3. By the end of this talk, you will know:
    • What a fully automated digital authentication service is
    • Why we use Ray and Ray Serve to implement it
    • How our initial PoC evolved to a scalable and stable
    architecture
    • How we addressed some challenges by using
    Ray-specific implementations

  4. What is a Fully
    Automated Digital
    Authentication Service?
    • Online authentication for contracts
    • Requires a valid identity card
    • Completely automated process

  5. What Does the Process Look Like?
    1. Identity Card Scan (Front Side)
    2. Identity Card Scan (Back Side)
    3. Face Scan

  6. Understanding the Data Processing
    [Diagram: Input (scans 1, 2, 3) → Processing → Output]

  7. How did we get Started?
    • aiohttp + gunicorn for
    asynchronous requests
    • Kafka for inter-process
    communication
    • Redis for data sharing

  8. Implementation Challenges
    • Combination of different REST
    triggers
    • Algorithms are really CPU heavy
    • Models are GPU heavy
    • Huge difference between
    computation time of different tasks
    • Heavy interconnection between
    different model outcomes

  9. Why Use Ray?
    [Comparison table with four rating columns; of the column headings, only "Microservices" is captured in the transcript]
    Distributed and Parallel Processing:  ++   ++   ++   ++
    Scalability:                          ++   ++   ++   ++
    Support for Python + ML:              ++   •    +    +
    Handling large data:                  ++   --   ?    --
    Large Community and Adoption:         ++   ++   +    Depends on framework used
    Legend:
    ++ Exceeds requirements
    •  Requirements are met
    -- Does not meet requirements
    ?  No rating possible

  10. Why Ray Serve?
    • One framework for everything
    • Native support for composing ML models
    • Easy shadow testing

  11. Architecture - General
    [Diagram: Ray head node]

  12. Architecture – CPU Cluster
    [Diagram: Ray head node]

  13. Architecture – CPU Cluster
    • Resources for Serve backends are configured when scaling them up
    • Assign fewer resources per backend
    • Register more CPUs on the nodes of this cluster than there are cores available
    ray start --address=$HOST_VIP --num-cpus=$NO_CORES+x --resources='{"CPU_RESOURCES": y}'
    • OMP_NUM_THREADS=1
    • cv2.setNumThreads(0)
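
The two thread settings above could be applied at the top of a worker module, before any numeric library is imported. The following is only a minimal sketch, assuming the goal is one native thread per backend replica so that many small replicas can share a node. The helper name `limit_native_threads` is hypothetical and not from the talk, and OpenCV is treated as optional.

```python
import os


def limit_native_threads() -> bool:
    """Pin native thread pools to one thread for this process.

    Hypothetical helper (not from the talk). Returns True if OpenCV
    was importable and configured too, False if it is not installed.
    """
    # OpenMP-backed libraries read this variable when they initialise,
    # so it has to be set before they are imported.
    os.environ["OMP_NUM_THREADS"] = "1"
    try:
        import cv2  # optional dependency in this sketch
    except ImportError:
        return False
    # 0 disables OpenCV's internal thread pool; routines run sequentially.
    cv2.setNumThreads(0)
    return True


if __name__ == "__main__":
    configured_cv2 = limit_native_threads()
    print("OMP_NUM_THREADS =", os.environ["OMP_NUM_THREADS"])
    print("OpenCV configured:", configured_cv2)
```

Keeping each replica single-threaded means parallelism comes from the number of replicas Ray schedules, not from thread pools inside each library competing for the same cores.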

  14. Architecture – GPU Cluster
    [Diagram: Ray head node]

  15. Architecture – GPU Cluster
    ray start --head --num-gpus=x
    [Diagram: Ray head node]
    • ML Models
    • Head Node
    • Starting the Ray Serve backends

  16. Architecture – API Cluster
    [Diagram: Ray head node]

  17. Architecture – API Cluster
    ray start --address=$HOST_VIP --num-cpus=x --resources='{"API_RESOURCES": y, "ACTOR_RESOURCES": z}'
    • High availability
    • API Backends & Data Actors

  18. Using Ray Actors for Syncing Data
    How can tasks be
    synchronized?

  19. Using Ray Actors for Syncing Data
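    import asyncio
    import ray

    @ray.remote(num_cpus=0.1)
    class CaseActor:
        def __init__(self):
            # One event per upload step of a case.
            self.ready_face = asyncio.Event()
            self.ready_card_front = asyncio.Event()
            self.ready_card_back = asyncio.Event()

        async def wait(self, step: str):
            # Blocks the caller until the matching upload has arrived.
            if step == 'face':
                await self.ready_face.wait()
            elif step == 'front':
                await self.ready_card_front.wait()
            ...

        def trigger(self, step: str):
            # Called by the upload endpoint to release waiting tasks.
            if step == 'face':
                self.ready_face.set()
            elif step == 'front':
                self.ready_card_front.set()

        def is_initialized(self) -> bool:
            # Lets callers confirm that the actor has been created.
            return True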
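
The synchronization idea on this slide can be tried without a Ray cluster. The following is a minimal sketch in plain `asyncio`, assuming the same pattern: one event per upload step, and model tasks that block until the events they depend on are set. `CaseSync`, `model`, `demo`, and the model names are hypothetical stand-ins, not the service's actual code.

```python
import asyncio


class CaseSync:
    """Plain-asyncio stand-in for the per-case actor (no Ray involved)."""

    def __init__(self):
        # One event per upload step, keyed by step name.
        self.events = {s: asyncio.Event() for s in ("face", "front", "back")}

    async def wait(self, step: str) -> str:
        await self.events[step].wait()
        return step

    def trigger(self, step: str) -> None:
        self.events[step].set()


async def demo():
    case = CaseSync()
    order = []  # records which model finished, in completion order

    async def model(name: str, *steps: str) -> None:
        # A model may only run once all uploads it needs have arrived.
        await asyncio.gather(*(case.wait(s) for s in steps))
        order.append(name)

    face_task = asyncio.create_task(model("face_model", "face"))
    match_task = asyncio.create_task(model("match_model", "face", "front"))

    case.trigger("face")      # face scan uploaded
    await face_task           # face_model can finish now
    snapshot = list(order)    # match_model still waits for the front side

    case.trigger("front")     # front side uploaded
    await match_task
    return snapshot, order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the Ray version the events live inside a named actor, so API replicas and model backends in different processes can reach the same events through the actor handle.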

  20. Writing the Main Logic
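    import asyncio
    import ray
    from ray import serve

    class Main:
        def __init__(self):
            # Handles to the deployed model backends.
            self.model_1 = serve.get_handle('model_1')
            self.model_2 = serve.get_handle('model_2')

        async def main_thread(self, case_id: str):
            # Look up the per-case actor by its name.
            actor = ray.get_actor(case_id)
            # These references resolve once the matching upload was triggered.
            trigger_face = actor.wait.remote('face')
            trigger_front = actor.wait.remote('front')
            # Each model starts as soon as its inputs are ready.
            handle_1 = self.model_1.remote(trigger_face)
            handle_2 = self.model_2.remote(trigger_face, trigger_front)
            finished, pending = await asyncio.wait([handle_1, handle_2])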
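
The last line above collects the model outputs with `asyncio.wait`. The following is a minimal sketch of that step in plain `asyncio`, assuming each model returns a name and a score and that a case is accepted only if every score reaches a threshold. The talk does not describe the actual decision rule; `fake_model`, `decide`, the scores, and the threshold are all hypothetical.

```python
import asyncio


async def fake_model(name: str, delay: float, score: float):
    # Hypothetical stand-in for a model call with its own latency.
    await asyncio.sleep(delay)
    return name, score


async def decide(threshold: float = 0.5) -> dict:
    # asyncio.wait expects tasks or futures, so wrap the coroutines first.
    tasks = [
        asyncio.create_task(fake_model("model_1", 0.01, 0.9)),
        asyncio.create_task(fake_model("model_2", 0.02, 0.7)),
    ]
    # Default return_when=ALL_COMPLETED: returns once every task is done.
    finished, pending = await asyncio.wait(tasks)
    scores = dict(task.result() for task in finished)
    # Assumed rule: nothing left pending and every score passes.
    accepted = not pending and all(s >= threshold for s in scores.values())
    return {"scores": scores, "accepted": accepted}


if __name__ == "__main__":
    print(asyncio.run(decide()))
```

Because the models run concurrently, the overall latency follows the slowest model, not the sum of all of them, which matters when computation times differ widely between tasks.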

  21. Initializing the Actor and Triggering the Events
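    import uuid
    import ray
    from starlette.requests import Request
    from starlette.responses import JSONResponse

    async def setup(request: Request) -> JSONResponse:
        # Actor names and JSON values must be strings, not UUID objects.
        case_id = str(uuid.uuid4())
        actor = CaseActor.options(name=case_id, lifetime='detached').remote()
        # Wait until the actor exists before starting the main logic.
        await actor.is_initialized.remote()
        main_thread.remote(case_id)
        return JSONResponse({'case_id': case_id})

    async def upload(request: Request) -> JSONResponse:
        actor = ray.get_actor(request.query_params.get('case_id'))
        # Setting the event releases the tasks waiting for this upload.
        await actor.trigger.remote(request.query_params.get('upload_type'))
        return JSONResponse({'data': 'SUCCESS'})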

  22. Sum Up & Outlook
    • Approx. 4x faster processing compared to the initial implementation
    • Fewer frameworks in use
    • Easy scalability and the possibility to provide the same service with different
    configurations
    • Really excited to see new features coming in Ray and Ray Serve, e.g. the
    automated load-based autoscaling feature for deployments

  23. Thank You for Listening
    Special Thanks to Anyscale and the Ray
    Team for hosting this Summit
    We are hiring!
    Mail: [email protected]
    Phone: +49 7044 95103 100
    www.widas.de
