Upgrade to Pro — share decks privately, control downloads, hide ads and more …

apidays Paris 2024 - AI In My API Gateway ... B...

apidays
December 22, 2024

apidays Paris 2024 - AI In My API Gateway ... But Why?, Mathieu Ancelin, APIM

AI In My API Gateway ... But Why?
Mathieu Ancelin, CTO - Cloud at APIM

apidays Paris 2024 - The Future API Stack for Mass Innovation
December 3 - 5, 2024

------

Check out our conferences at https://www.apidays.global/

Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8

Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io

Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/

apidays

December 22, 2024
Tweet

More Decks by apidays

Other Decks in Programming

Transcript

  1. APIDays 2024 Mathieu ANCELIN CTO & co-founder @ Cloud APIM

    Developer @ SERLI Creator & lead @ Otoroshi @TrevorReznik mathieu-ancelin
  2. APIDays 2024 Cloud APIM API Management & reverse proxy http

    as a service • managed otoroshi instances • serverless / gitops • authify / auth. enforcement • Webshield / WAF • LLM Endpoints https://www.cloud-apim.com @cloudapim cloud-apim
  3. APIDays 2024 API Gateways • Expose your internal/external APIs to

    their consumers ◦ API Management ? • Apply the same level of control across all your APIs ◦ regardless of the underlying technology • Monitoring and observability • etc Can be hard to configure !
  4. APIDays 2024 why this talk ? • I’m an API

    Gateway developer • I’m not an GenAI fanboy • I see more and more projects using AI libraries that tends to replicate API gateway behaviors • I’m convinced that consuming LLM APIs through an API Gateway is better for you ◦ ensure controlled use of AI/LLMs in your organization ◦ organization rules compliance ◦ legal compliance ◦ costs efficiency ◦ eventually can help you “configure” your gateway in fun ways ;) • Just focus on your product ◦ let us do the heavy lifting
  5. APIDays 2024 Unified interface • Multiple providers, same API ◦

    can be vendor specific ◦ can be OpenAI compatible • Allows Your Applications to Work with Any Provider/Model Without Rewriting Everything ◦ lots of existing products already works with the OpenAI API ◦ you just have to learn one API and focus on your product
  6. APIDays 2024 Provider supports • OpenAI • Azure OpenAI •

    Mistral • Anthropic • Ollama • Cloudflare • Scaleway • Cohere • Gemini • Groq • HuggingFace • OVH AI Endpoints • X.ai • etc lot of providers are supported without you worrying about it
  7. APIDays 2024 API Resilience • Automatic Retries ◦ timeout management

    ◦ network errors ◦ API errors ◦ circuit breaker • Fallback providers ◦ local (ollama) ◦ remote • Model Load Balancing ◦ multiple instances of the same “local” model ◦ different remote models ◦ hybrid • Parallel calls
  8. APIDays 2024 Observability and reporting Every request to an LLM

    provider generates data for further analysis • the provider used • the model used • the prompt • the response received • token consumption metrics • consumer identity ◦ apikey ◦ token ◦ connected user Who ? When ? What ?
  9. APIDays 2024 Costs optimization • Manage token consumption with quotas

    ◦ highly flexible solution ◦ quotas, grouping and time window based on whatever you want in the http request • Cache ◦ simple ◦ semantic ▪ based on embeddings and vector search database • Reporting based on observability data
  10. APIDays 2024 Prompts guardrails Optimize the Privacy of Your Organization's

    Data and the Security of Your Users Through a Set of "Guardrails" Applied to Prompts and/or Responses. Better legal and organization rules compliance • regex, characters count, words count, sentences count, contains, semantic contains • webhook • Generic LLM calls • pre-configured LLM ◦ no gibberish ◦ no secret leakage ◦ no PIF ◦ language moderation ◦ semantic match ◦ no hallucinations ◦ no gender bias ◦ no racial bias ◦ no personal health informations ◦ no toxic language ◦ no prompt jailbreak ◦ etc
  11. APIDays 2024 Enhanced security Leverage the Full Range of Authentication

    and Authorization Mechanisms Provided by the Gateway • apikeys, jwt tokens, biscuit tokens • OAuth2, OIDC, LDAP, SAMLv2 • OPA Rego, RBAC, ACLs, contextual authorizations Handling Secrets for LLM Provider Access • the gateway can do it on the fly • can use secret vaults support for that
  12. APIDays 2024 Prompts engineering • Prompt contexts ◦ Global or

    Local Model Configuration ◦ Adding Contextual Information ◦ Adapting the Model to Your Organization ◦ Standardizing Responses • Prompt templating ◦ Simplifying the Creation of LLM-Based APIs ◦ No-Code Approach
  13. APIDays 2024 Run tool_calls at the edge • Define your

    tool_calls function directly on the gateway ◦ using whatever language you want ◦ compiled to WASM • Run those functions directly on the gateway ◦ Avoid Complex Protocol Handling ◦ avoid code duplication across organization ◦ organization wide common functions usage ◦ optimizations across organization ◦ just focus on your product