API and Data Mocking for Python

When working in the embedded and observability domains, I’ve used Python scripts to retrieve and pre-process data from external sources. One recurring issue is how hard it is to reliably test data pipelines against external services: API rate limits, pay-per-use costs, service outages, and so on. So, can we model (aka “mock”) those services to reliably test our data ingestion pipelines? Sure we can!

In this talk I will show a few ways to build test services, databases, and API providers with the help of Testcontainers and WireMock, both available for Python thanks to container tech. Then we will extend the approach by generating fake data with Faker libraries or Synthesized, which can be used for both relational data and data sequences.

Oleg Nenashev

December 12, 2023

Transcript

  1. Outline • Intro to API integration testing • Testcontainers and

    WireMock • Using them together • And Python examples! (mostly) 3 Slides
  2. > whoami @oleg_nenashev oleg-nenashev Dr. Nenashev / Mr. Jenkins Developer

    tools hacker Community builder & DevRel consultant #RussiansAgainstPutin #StandWithUkraine
  3. • Build, Test and CI automation in Jenkins • Zenoss

    plugins for database monitoring • Jupyter playbooks • MkDocs • Finding excuses to not use Python :=( 6 censored censored censored /me and Python
  4. 8 wiremock.org IF (request_url) THEN (response) * * it gets

    MUCH more complex WireMock Config JSON: Client library response request Client App Mock API Server HTTP/2
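The “IF (request_url) THEN (response)” rule on this slide corresponds to WireMock's JSON stub mapping format. A minimal sketch that builds such a mapping with the standard library only; `build_stub` is a hypothetical helper (not part of any WireMock library), and the `/hello` URL and `hi` body are illustrative values:

```python
import json


def build_stub(method: str, url: str, status: int, body: str) -> dict:
    """Build a stub mapping: IF the request matches THEN send the response."""
    return {
        "request": {"method": method, "url": url},
        "response": {"status": status, "body": body},
    }


# IF a GET request hits /hello THEN answer 200 with the body "hi".
stub = build_stub("GET", "/hello", 200, "hi")
mapping_json = json.dumps(stub, indent=2)
print(mapping_json)
```

JSON of this shape can be placed in WireMock's `mappings` directory or sent with a POST request to the `/__admin/mappings` endpoint of a running server.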
  5. 9

  6. 10

  7. GROWTH IN WEB APIS SINCE 2005 Over 90% of

    developers use APIs. Skyrocketing growth of APIs [Chart: total API count by month, January 2006 to January 2018; the Programmable Web API directory grew to more than 22,000 entries. Source: Programmable Web] * Gartner Hype Cycle for APIs, 2022
  8. Most data comes from APIs • opendata.swiss • kidsfirstdrc.org/portal/portal-about-data

    • vs.inf.ethz.ch/res • github.com/public-apis/public-apis • www.kaggle.com/datasets • data.world 17
  9. 21 Build Unit tests Publish Reports Integration tests Publish Reports

    NOW “Shift Left” Fast integration tests are critical
  10. Ways to do integration testing 1. Testing against Production/Staging servers

    2. Testing against a simplified/containerized instance (e.g. Testcontainers) 3. Mocking at the API provider level (e.g. WireMock) 4. Mocking at the code level 22 Slow Fast Nope Accurate
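For comparison with the API-level options, option 4 (mocking at the code level) can be sketched with the standard library's `unittest.mock`. The `count_datasets` function, the `datasets` field, and the `api.example.com` URL are hypothetical stand-ins for a real ingestion step:

```python
import io
import json
import urllib.request
from unittest import mock


def count_datasets(url: str) -> int:
    """Toy ingestion step: count the datasets listed by a (hypothetical) API."""
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return len(payload["datasets"])


# Canned HTTP body that stands in for the real service's answer.
canned = io.BytesIO(json.dumps({"datasets": [{"id": 1}, {"id": 2}]}).encode())

# Patch urlopen so that no network traffic happens during the test.
with mock.patch("urllib.request.urlopen", return_value=canned) as fake_urlopen:
    result = count_datasets("https://api.example.com/datasets")

fake_urlopen.assert_called_once_with("https://api.example.com/datasets")
print(result)  # 2
```

The trade-off is that no real HTTP request is made, so serialization, headers, and timeouts go untested.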
  11. 25 wiremock.org IF (request_url) THEN (response) * * it gets

    MUCH more complex WireMock Config JSON: Client library response request Client App Mock API Server HTTP/2
  12. 27 WireMock 101 Open Source tool for building mock APIs

    Available beyond Java, on Python too You can: • Create stable development environments • Isolate from unstable 3rd party APIs • Simulate APIs that don't exist yet wiremock.org
  13. WireMock Cloud by WireMock Inc. 28 • WireMock creator is

    a co-founder of the Inc. • WireMock Cloud - SaaS for end-to-end API mocking • Private beta: K8s Edition for managed / on-premises wiremock.io
  14. WireMock for Python wiremock.org/docs/solutions/python wiremock.readthedocs.io • Python SDK • REST

    API Client library • Pytest and Robot Framework integrations • Testcontainers Module * Custom logo is approved by the Python Software Foundation 31 More: github.com/wiremock/python-wiremock
  15. WireMock Standalone client

        from wiremock.constants import Config
        from wiremock.client import *

        # Point the client at the admin API of a running WireMock server
        Config.base_url = 'https://mockserver.example.com/__admin/'

        # IF a GET request hits /hello THEN answer 200 with body 'hi'
        mapping = Mapping(
            priority=100,
            request=MappingRequest(method=HttpMethods.GET, url='/hello'),
            response=MappingResponse(status=200, body='hi'),
            persistent=False,
        )
        mapping = Mappings.create_mapping(mapping=mapping)
        all_mappings = Mappings.retrieve_all_mappings()

    More: github.com/wiremock/python-wiremock Client library response request Test Mock API Server HTTP/2
  16. Using WireMock in unittest

        class MyTestClassBase(TestCase):

            @classmethod
            def setUpClass(cls):
                # Class methods receive cls, not self
                wm = cls.wiremock_server = WireMockServer()
                wm.start()
                Config.base_url = 'http://localhost:{}/__admin'.format(wm.port)

            @classmethod
            def tearDownClass(cls):
                cls.wiremock_server.stop()

    More: github.com/wiremock/python-wiremock unittest
  17. 40

  18. Docker • Popular container engine • Developer-friendly • Huge ecosystem

    • DockerHub and Container Registries • Docker Compose – multi-container apps • Universal image format (OCI) https://www.docker.com/
  19. 45

  20. Testcontainers for Python 48 • Started by Sergey Pirogov, now maintained by

    Till Hoffmann • github.com/testcontainers/testcontainers-python • testcontainers-python.readthedocs.io testcontainers.com
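As a rough illustration of how Testcontainers for Python can host a mock API, the sketch below starts WireMock in a throwaway container and reads its admin API. It assumes a running Docker daemon and an installed `testcontainers` package; the image tag `wiremock/wiremock:3.3.1` is an assumption, and `wait_until_ready` and `run_against_wiremock_container` are hypothetical helper names:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 30.0) -> None:
    """Poll a URL until it answers with HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if response.status == 200:
                    return
        except (urllib.error.URLError, ConnectionError):
            pass  # server is not accepting connections yet
        if time.monotonic() > deadline:
            raise TimeoutError(f"{url} not ready after {timeout}s")
        time.sleep(0.5)


def run_against_wiremock_container() -> str:
    """Start WireMock in a container and return its stub mappings as JSON text."""
    # Imported lazily so this module loads even without testcontainers installed.
    from testcontainers.core.container import DockerContainer

    # The image tag is an assumption; pin the version you actually test with.
    with DockerContainer("wiremock/wiremock:3.3.1").with_exposed_ports(8080) as container:
        host = container.get_container_host_ip()
        port = container.get_exposed_port(8080)
        admin_url = f"http://{host}:{port}/__admin/mappings"
        wait_until_ready(admin_url)
        with urllib.request.urlopen(admin_url) as response:
            return response.read().decode()
```

Calling `run_against_wiremock_container()` from a test gives each run a fresh server, and the container is removed when the `with` block exits.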
  21. Containers are not always slow! • On-demand image build •

    Caching Docker image builds • Suspending containers between tests • Graceful termination ◦ github.com/testcontainers/moby-ryuk 50 testcontainers.com
  22. WireMock as a Proxy 55 Tests Real API Provider OR

    • Recording • Fault injection • Protocol Verification
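Proxying and fault injection can both be expressed as ordinary stub mappings. A minimal sketch of the two, written as Python dictionaries that serialize to WireMock's mapping JSON; the `api.example.com` target and the `/api/orders` path are hypothetical:

```python
import json

# Forward everything under /api to the real provider (hypothetical URL).
proxy_stub = {
    "priority": 10,
    "request": {"urlPattern": "/api/.*"},
    "response": {"proxyBaseUrl": "https://api.example.com"},
}

# Inject a fault for one endpoint. In WireMock a lower priority number wins,
# so this stub takes precedence over the proxy stub above.
fault_stub = {
    "priority": 1,
    "request": {"method": "GET", "url": "/api/orders"},
    "response": {"fault": "CONNECTION_RESET_BY_PEER"},
}

for stub in (proxy_stub, fault_stub):
    print(json.dumps(stub, indent=2))
```

With both stubs loaded, tests see real responses for most calls and a dropped connection for the one endpoint under test.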
  23. WireMock Record & Playback wiremock.org/docs/record-playback • Record communications • Capture

    data • Replay the requests • Reverse-engineer protocols to OpenAPI 56
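Recording can be driven from Python through WireMock's admin API. The sketch below only builds the two HTTP requests (start and stop) and does not send them; the `localhost:8080` admin URL and the `api.example.com` target are assumptions, and `recording_request` is a hypothetical helper:

```python
import json
import urllib.request
from typing import Optional


def recording_request(admin_base: str, action: str,
                      payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a POST request for the /recordings/start or /recordings/stop endpoint."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        f"{admin_base}/recordings/{action}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


ADMIN = "http://localhost:8080/__admin"  # assumed address of a running WireMock
start = recording_request(ADMIN, "start", {"targetBaseUrl": "https://api.example.com"})
stop = recording_request(ADMIN, "stop")

print(start.get_method(), start.full_url)
print(stop.get_method(), stop.full_url)
```

Passing `start` to `urllib.request.urlopen`, running the client traffic through WireMock, and then sending `stop` turns the captured exchanges into stub mappings that can be replayed later.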
  24. WireMock Faker Extension github.com/wiremock/wiremock-faker-extension • There are many Data::Faker library ports

    ◦ Python: joke2k/faker ◦ Ruby: faker-ruby/faker (logo source!) ◦ Java: datafaker-net/datafaker • We use the Java one • It can be included in a Testcontainer image 57
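On the Python side, the joke2k/faker port listed above can produce the same kind of record directly in test code. A minimal sketch, assuming the `faker` package is installed; `fake_person` and `generate_people` are hypothetical helpers, and the field names are illustrative:

```python
def fake_person(fake) -> dict:
    """Build one fake record from any object exposing Faker-style providers."""
    return {
        "name": fake.first_name(),
        "surname": fake.last_name(),
        "postcode": fake.postcode(),
    }


def generate_people(count: int, seed: int = 0) -> list:
    """Generate reproducible fake records with the Faker library."""
    # Imported lazily so this module loads even without faker installed.
    from faker import Faker

    Faker.seed(seed)  # a fixed seed keeps test data stable between runs
    fake = Faker()
    return [fake_person(fake) for _ in range(count)]
```

Calling `generate_people(3)` returns three dictionaries with random but repeatable values, which can then be served from a mock API or loaded into a test database.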
  25. WireMock Faker Extension wiremock.org/docs/response-templating/#fake-data-helpers

        "response": {
          "status": 200,
          "jsonBody": {
            "name": "{{random 'Name.first_name'}}",
            "surname": "{{random 'Name.last_name'}}",
            "postcode": "{{random 'Address.postcode_by_state.AL'}}"
          }
        }
  26. Synthesized’s Data offerings 60 Synthesized Scientific Data Kit (SDK) Synthesized

    Test Data Kit (TDK) Synthesized Fairlens (Data Bias) www.synthesized.io Data sets / Streams *.CSV Relational datasets (for databases)
  27. 61 Testcontainers Module for Synthesized TDK • Only TDK -

    Relational Data • Only Java Module No Runtime Use at the moment, But NOT a showstopper
  28. 62 TDK SDK Database Module for Testcontainers App Tests WireMock

    Module Build On-Demand Your App Tests • Random Fetch • CSV => JSON generator
  29. Python WireMock - My Wishlist wiremock/python-wiremock/issues • Feature parity with

    WireMock Java in the SDK • More features in the Testcontainers module • Removing the JVM process implementation • Integrations with FastAPI and Hug • Making Python a first class citizen in WireMock docs 67
  30. Takeaways 70 • Shift left your integration testing • Mock

    APIs and Data • There are tools for that, including WireMock and Testcontainers • WireMock and Testcontainers co-exist well • They are available in Python!
  31. Credits to • All WireMock contributors • WireMock Inc. Team

    • All Testcontainers contributors and AtomicJar folks • All FOSS contributors 74