Alternatives to XA 2PC Transactions

Paul Done
December 27, 2020

From experience, I believe XA 2PC (Two-Phase Commit) transactions have too many downsides to justify their upsides in most, if not all, situations. Here I explore why XA is problematic and then look at some alternative strategies.

Note, to be clear, an XA 2PC transaction is not the same as a [local] ACID transaction, although there is some overlap in concepts and principles.


  1. Alternatives to XA Two-Phase
    Commit Transactions
    Paul Done | Executive Solutions Architect | MongoDB Inc. | @TheDonester


  2. Agenda
    Global Transactions Spanning Multiple
    Heterogeneous Systems
    The Problem With XA
    Alternative Patterns For Updating
    Heterogeneous Systems
    Comparing Approaches


  3. Global Transactions Spanning
    Multiple Heterogeneous Systems


  4. Database ACID Transactions
    A single unit of logic composed of multiple different database operations,
    which exhibits the following properties
    ● Atomic: either completes in its entirety or has no effect whatsoever (rolls back and is not
    left only partially complete)
    ● Consistent: each transaction observes the latest current database state in the correct
    write ordering
    ● Isolated: the state of an inflight transaction is not visible to other concurrent inflight
    transactions (and vice versa)
    ● Durable: changes are persisted and cannot be lost if there is a system failure
    (Sidebar: achieving complete consistency & complete isolation, simultaneously, is effectively mutually exclusive for real world databases and workloads - the reasons why are described by me in a different presentation)


  5. Database ACID Transactions - Local or Global?
    ACID Transactions are commonly provided by all mainstream Relational DBs +
    just a very few NoSQL DBs (notably MongoDB)
    ● Where all of a single transaction’s operations are performed against the same
    database instance - known as ‘Local Transactions’
    ● Not exclusive to just database technologies - many implementations of other types
    of stateful resources, like message brokers, also support ACID transactions
    But things become much more challenging if a single transaction is composed
    of operations which update two or more different databases and/or message
    brokers - known as ‘Global Transactions’


  6. Two-Phase Commit (2PC)
    Pattern for enabling an application to perform a distributed global transaction
    across 2 or more [often heterogeneous] stateful resources
    ● Resources are usually databases or message brokers (hosting queues or topics)
    ● Controlled by a Transaction Manager (a.k.a. Transaction Monitor)
    ○ This might be a responsibility embedded within one of the participant resources or may be a separate
    standalone system
    When an application calls commit(), two key phases are then instigated by the
    Transaction Manager
    1. PREPARE phase: Each resource votes Yes or No to indicate if it can fulfil the transaction
    (ensuring all related intermediary state is first persisted, in case it fails in the meantime)
    (if all vote Yes, the Transaction Manager persists the decision to a durable log, in case of system failure)
    2. COMMIT phase: Each resource is then instructed to persist its changes & can’t renege on this
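The prepare/commit flow above can be sketched in a few lines. This is a toy coordinator, not a real XA implementation - the `Resource` and `TransactionManager` classes and the in-memory decision log are illustrative stand-ins:

```python
# Toy 2PC coordinator: illustrative names only, not a real XA API.
class Resource:
    """A participant (e.g. a database or message broker)."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # PREPARE phase: persist intermediary state, then vote Yes/No
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # COMMIT phase: make changes durable; a prepared resource can't renege
        assert self.state == "prepared"
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


class TransactionManager:
    def __init__(self, resources):
        self.resources = resources
        self.decision_log = []  # stands in for the durable decision log

    def commit(self):
        # Phase 1: ask every resource to vote
        if all(r.prepare() for r in self.resources):
            self.decision_log.append("COMMIT")  # persisted before phase 2
            for r in self.resources:
                r.commit()
            return "committed"
        # Any No vote aborts the whole global transaction
        self.decision_log.append("ROLLBACK")
        for r in self.resources:
            r.rollback()
        return "rolled back"
```

Note that logging the decision between the two phases is what lets a restarted coordinator finish the commit after a crash - and also why resources can be left blocked while it is down.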


  7. Two-Phase Commit (2PC) Examples
    There are various implementations of the 2PC pattern out there,
    with examples including:
    ● IBM CICS
    ○ “Customer Information Control System” - middleware for managing online transaction
    processing on a Mainframe, where a CICS transaction may span multiple datasources (e.g. IMS,
    DB2, MQ Series, VSAM) using the SNA “LU6.2” systems communications protocol
    ● Oracle DBLINKs
    ○ A multi-statement transaction initiated and managed via one Oracle DB instance, with some of
    the updates applied there, but the rest of the updates applied in a separate Oracle DB instance,
    which is linked into by the first DB, under the covers
    ○ To the client application, it just appears that it is updating a single database within the
    transaction and it just uses the normal database driver & API
    ● The X/Open XA standard
    ○ a.k.a. XA/2PC or XA
    ○ Described in later slides...


  8. Two-Phase Commit Flow
    Diagrams copied from: https://fizalihsan.github.io/technology/transaction.html
    [Diagrams: one flow where all resources committed, one where all resources rolled back; each resource is a different database instance or message broker]


  9. XA - a standard for 2PC
    A specification released in 1991 by the X/Open consortium (which later merged with The Open Group)
    Defines a protocol for coordinating transactions between heterogeneous technologies
    ● Defines global transactions which it calls XA Transactions
    ● Defines a Transaction Manager / Monitor which it calls the XA Coordinator
    ● Defines the one or more stateful Resource Managers (e.g. a DB) which it calls XA Resources
    Example XA Transaction Manager implementations:
    ● Tuxedo, CORBA servers, Java Enterprise Edition application servers (e.g. WebLogic, WebSphere, JBoss)
    Example XA Transaction Resource implementations:
    ● Databases: Oracle, IBM DB2, MS SQL Server, Postgres, MySQL (but NOT MongoDB or other NoSQL)
    ● Message Brokers: IBM MQSeries, ActiveMQ, Java EE app servers (but NOT Kafka or RabbitMQ)
    Resource vendors need to implement XA in their own Drivers which they provide to users
    ● Example: Oracle provides a version of its JDBC driver which supports the XA API for Java applications
    to use and a version of its ODBC driver which supports the XA API for C/C++ applications to use


  10. XA & Java Enterprise Edition (Java EE*)
    XA is programming language agnostic, but is most commonly seen in combination with Java
    * Formerly called J2EE
    If you used to build Java EE apps running on Java application servers, you probably came
    across XA
    ● Java Transaction API (JTA) was the way your Java code said to start, commit or abort a global XA transaction
    ● You might have integrated with different databases via their provided XA JDBC Drivers
    ● You might have configured message brokers via their provided XA Java Message Service (JMS) Drivers
    [Diagram: XA transactions coordinated by a Java EE Application Server; copied from https://fizalihsan.github.io/technology/transaction.html]


  11. The Problem With XA


  12. XA - The Challenges
    1. Eventual Consistency
    2. Reduced High Availability
    3. Poor Performance
    4. Operational Complexity
    5. Interoperability Issues


  13. XA is only Eventually Consistent (1 of 5)
    Real world example
    An enterprise Java app running on a WebLogic application server, placing a message on an IBM
    MQSeries queue and inserting a record in an Oracle database within a single distributed transaction
    [Diagram: XA Transaction #1, coordinated by WebLogic (XA Transaction Manager): 1. ENQUEUE Msg onto the MQSeries Message Queue (XA Resource Manager); 2. INSERT Record into the Oracle DB (XA Resource Manager)]


  14. XA is only Eventually Consistent (2 of 5)
    Real world example
    An enterprise Java application running on a WebLogic application server, receiving a message off an IBM
    MQSeries queue and then reading a record from an Oracle database within a single distributed transaction
    [Diagram: XA Transaction #2, coordinated by WebLogic (XA Transaction Manager): 1. DEQUEUE Msg from the MQSeries Message Queue (XA Resource Manager); 2. READ Record from the Oracle DB (XA Resource Manager)]


  15. XA is only Eventually Consistent (3 of 5)
    Real world example
    2 consecutive transactions - the second transaction receives the message that the first transaction put on the queue and reads the record that the first transaction put into the database
    [Diagram: XA Transaction #1 (1. ENQUEUE Msg; 2. INSERT Record) followed by XA Transaction #2 (1. DEQUEUE Msg; 2. READ Record), each coordinated by WebLogic (XA Transaction Manager) against the MQSeries Message Queue and Oracle DB (XA Resource Managers)]
    SO EVERYTHING WORKS FINE, RIGHT?


  16. XA is only Eventually Consistent (4 of 5)
    Real world example
    2 consecutive transactions - the second transaction receives the message from the queue and attempts to read the record from the database, but sometimes this record is missing
    SO EVERYTHING WORKS FINE, RIGHT? WRONG! XA IS EVENTUALLY CONSISTENT!
    [Diagram: XA Transaction #1 (1. ENQUEUE Msg; 2. INSERT Record) followed by XA Transaction #2 (1. DEQUEUE Msg; 2. READ Record FAILURE) - SOMETIMES THE RECORD DOES NOT EXIST YET]


  17. XA is only Eventually Consistent (5 of 5)
    Real world example - so what happened?
    Surely this should always work?
    1. 1st transaction puts the data in the DB as part of the same transaction that puts the message on the queue
    2. Only when the message was successfully committed to the queue, can the 2nd transaction be kicked off
    3. Yet, the subsequent listener code can’t always find the data in the DB, that was inserted by the previous
    transaction
    So why is the database record inserted by the 1st transaction sometimes
    missing for the 2nd transaction?
    ● The final commit action performed against each of the 2 resources is initiated in parallel
    ● This is asynchronous because a resource may have temporarily gone down - the system keeps retrying
    ● The 2 resources each take non-deterministic durations to make their change durable (time to persist to disk
    is variable)
    ● There's no way of guaranteeing that they both achieve this in exactly the same instant - THEY NEVER WILL
    ● So the solution you’ve just seen is subject to RACE CONDITION* failures!
    * Oracle even acknowledges this in the section “Avoiding the XA Race Condition” of its documentation at:
    https://www.oracle.com/technetwork/products/clustering/overview/distributed-transactions-and-xa-163941.pdf


  18. XA suffers from reduced High Availability
    The Transaction Manager is typically a single point of failure
    ● If it goes down, it will have to be recovered (which may need to be on a
    different host, if the host has irrevocably failed)
    ● Whilst down, the databases will invariably have pending transactions, stuck
    and holding locks, thus preventing other DB operations from proceeding if
    attempting to access the same locked data
    Risk of deadlocks grinding the overall system to a halt
    ● XA doesn’t allow the system as a whole to detect deadlocks to automatically
    & safely back these out
    ● When deadlock occurs, some of the data will be inaccessible indefinitely
    due to the held record locks
    ● Has a cascading effect on other transactions trying to take locks on the
    same records, which will in turn back up (this even happens for ‘livelocks’,
    where only one resource is temporarily down/blocked - cascading upwards)


  19. XA leads to poor Performance
    Chatty protocol between transaction coordinator and
    participants
    ● More network hops to negotiate preparedness, adding latency
    ● More writes to disk, adding latency (transaction manager commit log, each
    resource’s pending transactions log)
    Requires resources (e.g. DBs) to adopt a pessimistic locking
    approach
    ● Database locks held for longer, causing backup of work and
    decreasing throughput
    ● Messages held pending longer on queues before able to be delivered
    ● Each component moves in rigid lock step with every other component
    - the system is as fast as the slowest component


  20. XA results in increased Operational Complexity
    More moving parts
    ● More technologies to learn, install, configure, monitor, patch & build
    disaster recovery procedures for
    Harder root cause analysis
    ● Challenging for a single expert to hunt down the ‘rogue’ transaction
    issue without help from more domain experts
    ● Almost impossible to fix if heuristic exceptions occur
    More frequent emergencies to resolve
    ● Restore a failed Transaction Manager and its commit log
    Consistent backups are hard / impossible
    ● To take a consistent snapshot across multiple data & log stores,
    requires periodically taking the whole system down
    ● Also will face data loss between those snapshots


  21. XA requirement for Interoperability has compromises
    Interoperability matrix hell
    ● Very hard to determine a line of XA compatibility through a set of
    technologies & their different versions
    ● Hard to achieve zero compatibility issues due to complexity &
    ambiguity in the specification and/or implementation bugs
    ● Technology sprawl must be in lock-step: transaction manager,
    databases, message brokers, XA drivers
    A Microservices architecture becomes almost
    impossible to achieve
    ● When transaction boundaries cross domain boundaries
    ● Example:
    In eCommerce, how do you take a new customer order for a product
    using the Orders Microservice and then decrement the stock quantity
    using the Inventory Microservice, as one atomic operation?


  22. Alternative Patterns For Updating
    Heterogeneous Systems


  23. Sender Resubmissions + Receiver Dups-Detection: Via DBs
    Updating two resources as a “Single Unit” to achieve Exactly-Once delivery
    [Diagram: Sender Code (a pool of threads, each running continuously and taking new jobs when available - naturally performs resubmissions) with the Sender’s Local DB (Business Data + OPTIONAL Jobs Metadata), calling the Receiver Code (performs duplicates detection) with the Receiver’s Local DB (Business Data + OPTIONAL Receipts Metadata) via RPC (e.g. REST, GraphQL); if the request or response is lost in transit, that’s fine]
    Characteristics:
    ● Exactly-once delivery of messages
    (atomic)
    ● Eventually consistent
    ● Doesn’t require either DB to be
    compliant with a specific 2PC protocol


  24. Sender Resubmissions + Receiver Dups-Detection: Via DBs
    Updating two resources as a “Single Unit” to achieve Exactly-Once delivery
    [Diagram as on the previous slide: Sender Code + Sender’s Local DB, calling Receiver Code + Receiver’s Local DB via RPC]
    Sender steps:
    1. Regular code performs business data updates in the local DB & puts a job with a unique id in the local DB, as part of one normal local DB transaction
    2. Pool thread code reads an outstanding job from the local DB and makes an RPC call to the receiver component, passing the appropriate business info + job id
    3. For each RPC call which responds with success, mark the job as completed (or delete its record) in the local DB
    4. If no response is received, the call times out and the job is then eligible to be processed again by a subsequent thread, because it has not been marked as complete
    Receiver steps:
    1. For each received RPC call, check in the local DB whether the job’s id has already been processed - if it has, make no DB changes and just return success
    2. If not a duplicate, perform business data updates in the local DB & record the job id in the local DB as part of one normal local DB transaction, before returning success
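A minimal sketch of the sender and receiver steps above, using in-memory dicts as stand-ins for the two local databases and a direct method call as the RPC - all class and field names are illustrative, not from any real framework:

```python
import uuid

class Receiver:
    def __init__(self):
        self.business_data = {}   # receiver's local DB: business records
        self.receipts = set()     # receiver's local DB: processed job ids

    def handle_rpc(self, job_id, payload):
        # Dedup check and business write happen in one local transaction
        if job_id in self.receipts:
            return "success"          # duplicate: no changes, just ack again
        self.business_data[job_id] = payload
        self.receipts.add(job_id)
        return "success"


class Sender:
    def __init__(self, receiver):
        self.jobs = {}                # sender's local DB: jobs metadata
        self.receiver = receiver

    def submit(self, payload):
        # Business update + job record in one local DB transaction
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"payload": payload, "done": False}
        return job_id

    def run_worker(self, lose_response=False):
        # A pool thread: (re)send every job not yet marked complete
        for job_id, job in self.jobs.items():
            if job["done"]:
                continue
            response = self.receiver.handle_rpc(job_id, job["payload"])
            if lose_response:
                continue              # ack lost in transit: job stays open
            if response == "success":
                job["done"] = True
```

Even when the acknowledgement is lost, a later worker pass resubmits the job and the receiver’s receipts set filters the duplicate, giving the exactly-once effect described above.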


  25. Sender Resubmissions + Receiver Dups-Detection: Via DBs
    Updating two resources as a “Single Unit” to achieve Exactly-Once delivery
    [Diagram as on the previous slides: Sender Code + Sender’s Local DB, calling Receiver Code + Receiver’s Local DB via RPC]
    ● If business data records naturally track their own ‘state’ in one or more business fields (e.g. ‘order status’), then there is no need for jobs metadata to also be stored - the job id can just be the business record’s unique DB id
    ● If inserted business data records and their data model are naturally idempotent, then there is no need to record which jobs have been processed in the receiver’s local DB - duplicates will be naturally filtered out by the data model
    ● If updating more than 2 databases, see the Orchestration / Choreography patterns later
    ● Sound familiar? This is what mongod does for its Retryable Writes feature!


  26. Sender Resubmissions + Receiver Dup-Detection: Via Queues
    Variant using Queues from a Messaging Broker
    [Diagram: Application A’s Code + Application A’s DB dequeue a Msg from Message Queue A and enqueue a Msg onto Message Queue B (“Micro-transaction” 1); Application B’s Code + Application B’s DB dequeue from Message Queue B and enqueue onto Message Queue C (“Micro-transaction” 2); together the micro-transactions form one “Business-transaction”, with all queues hosted by a Message Broker]
    Characteristics:
    ● Exactly-once delivery of messages
    (atomic)
    ● Eventually consistent
    ● Doesn’t require DBs or Queues to be
    compliant with a specific 2PC protocol


  27. Sender Resubmissions + Receiver Dup-Detection: Via Queues
    Variant using Queues from a Messaging Broker
    [Diagram as on the previous slide: Application A ↔ Queues A/B, Application B ↔ Queues B/C, micro-transactions forming one “Business-transaction”]
    If the same message broker is used for both queues, a local messaging-native transaction can span both the enqueue & dequeue operations atomically - if inserting the message’s data into the database fails or times out, the local enqueue/dequeue operation will be rolled back and the message will automatically be re-delivered soon afterwards
    Like the previous pattern, the business data inserts are either naturally idempotent, or otherwise the code must also insert the unique msg id into receipts metadata records in the local DB, to filter out duplicate business data updates (if the application code detects a duplicate, it should just commit the local message broker enqueue/dequeue transaction without inserting a new record)
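The broker-local transaction behaviour described above can be sketched like this - an in-memory toy, not a real broker API; rollback is simulated by pushing the message back onto the queue for re-delivery:

```python
from collections import deque

class Queue:
    """Stand-in for one queue hosted by the message broker."""
    def __init__(self):
        self.messages = deque()

    def enqueue(self, msg):
        self.messages.append(msg)

    def dequeue_in_txn(self):
        # Dequeue inside a broker-local transaction
        return self.messages.popleft() if self.messages else None


class Consumer:
    def __init__(self, queue):
        self.queue = queue
        self.db = {}          # local DB: business data keyed by msg id
        self.receipts = set() # local DB: processed msg ids

    def consume_one(self, db_available=True):
        msg = self.queue.dequeue_in_txn()
        if msg is None:
            return "empty"
        msg_id, payload = msg
        if not db_available:
            # DB write failed/timed out: roll back the broker transaction,
            # so the message is re-delivered later
            self.queue.messages.appendleft(msg)
            return "rolled back"
        if msg_id not in self.receipts:   # duplicate detection via receipts
            self.db[msg_id] = payload
            self.receipts.add(msg_id)
        return "committed"                # commit the broker transaction
```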


  28. Formalised “Outbox” Pattern
    Essentially a combined variation of the previous two patterns
    Taken from “Pattern: Transactional outbox” @ Microservices.io
    1. An application service updates data in regular tables of the local database and encapsulates the remaining changes in a
    command it puts in a special Outbox table in the same local database, as part of one single local database transaction
    2. A separate Message Relay component/process reads commands from Outbox table
    3. The Message Relay process keeps trying to send the command to the target system via a message broker - this can produce
    duplicates so the command must be inherently idempotent from the receiver’s perspective
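A minimal sketch of the three steps above - a dict stands in for the local database, a plain callable for the message broker, and the table and command names are made up for illustration:

```python
class LocalDatabase:
    def __init__(self):
        self.orders = {}          # regular business table
        self.outbox = []          # special Outbox table in the SAME database

    def place_order(self, order_id, item):
        # Step 1: business write + outbox insert in one local DB transaction
        self.orders[order_id] = item
        self.outbox.append({"id": order_id, "command": f"reserve:{item}"})


class MessageRelay:
    """Steps 2-3: separate process that reads the outbox and keeps trying
    to publish; may deliver a command twice, so receivers must be idempotent."""
    def __init__(self, db, broker):
        self.db = db
        self.broker = broker      # callable that may raise on failure

    def run_once(self):
        delivered = []
        for cmd in self.db.outbox:
            try:
                self.broker(cmd)
                delivered.append(cmd)
            except ConnectionError:
                pass              # left in outbox; retried on the next run
        for cmd in delivered:
            self.db.outbox.remove(cmd)
```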


  29. Business Transaction Choreography Vs Orchestration
    Distributed changes can be implemented by either approach
    [Diagram - Choreography: each service (Code + its own DB) calls the next service directly in a chain. Orchestration: a central Orchestrator (Orchestrator Logic + its own DB) invokes each service (Code + its own DB), coordinating the steps (1, 2, 3a, 3b, 4a)]


  30. Business Transaction Choreography Vs Orchestration
    Both of the mentioned examples (DB oriented vs Queue oriented) for the retries +
    dups-detection pattern can be implemented by Choreography or by Orchestration
    For Orchestration, the Orchestrator performs the job management, tracking and workflow
    coordination
    ● In fact, this is essentially what Business Process Management (BPM) tools do (i.e. manage potentially
    long-lived business transactions - important: make sure the BPM tool can treat each process step as a local
    transaction)
    Both also allow for real world compensation workflows to be included in the business
    transaction - example:
    ● If an eCommerce order for a product cannot be fulfilled because it transpires, later on, that the inventory
    database didn’t accurately reflect the product’s stock quantity in the physical warehouse, a compensation
    action can then be executed, asynchronously, to cancel the order and reimburse the customer
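An orchestrator running local-transaction steps with compensating actions can be sketched as follows - the step names, in-memory log and failure mode are all illustrative:

```python
class Orchestrator:
    def __init__(self):
        self.log = []    # stands in for the orchestrator's own DB/tracking

    def run(self, steps):
        """Each step is (name, action, compensation); an action may raise."""
        completed = []
        for name, action, compensation in steps:
            try:
                action()                       # a local transaction
                self.log.append(f"done:{name}")
                completed.append((name, compensation))
            except Exception:
                self.log.append(f"failed:{name}")
                # Compensate the already-completed steps, in reverse order
                for done_name, comp in reversed(completed):
                    comp()
                    self.log.append(f"compensated:{done_name}")
                return "compensated"
        return "completed"
```

For the eCommerce example above: if placing the order succeeds but reserving stock later fails, the orchestrator runs the order step’s compensation (cancel + reimburse) rather than trying to hold a distributed lock.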


  31. There Is Precedent For This: “Sagas”
    A term and approach coined in an industry whitepaper from 1987
    English dictionary definition of the word ‘Saga’:
    ● “A very long story with dramatic events or parts”
    The whitepaper’s definition of a ‘Saga’:
    ● “Long lived transaction that can be broken up into transactions, but still executed as a unit”
    Documents approach for enabling long running transactions with
    compensating actions to revert state
    ● Was originally designed for use against a single database to break up a long running transaction that
    would otherwise hold database locks for too long
    ● However, many microservices related articles now discuss using a distributed adaption of the
    pattern, formalised for coordination where transactional boundaries span multiple microservices
    ● Potentially overkill if you only need a simple integration between 2 different DBs - if that’s the
    case, just follow the simpler application-specific patterns highlighted earlier


  32. Comparing Approaches


  33. Comparison: Transactional Patterns For Updating Multiple Resources
    Each row lists: XA Protocol | Choreography Patterns | Orchestration Patterns
    ● Throughput Performance: Low | High | Medium
    ● Mandates Support for a Native 2PC Protocol In All Resources: Yes | No | No
    ● Complexity of Tracing & Issue Diagnosis: Medium | Medium | Easy
    ● Ability to Fix State When Things Go Wrong: Very hard (nearly impossible) | OK (compensation) | OK (compensation)
    ● High Availability of Typical Implementations: Low | Medium | Medium
    ● Interoperability with Microservices HTTP APIs (e.g. REST/GraphQL): No | Yes | Yes
    ● Supports Long-running Business Transactions: No | Yes | Yes
    ● Supports Compensation Workflows: No | Yes | Yes
    ● Application Code Complexity to Manage ‘Transaction’ Lifecycle: Low | Medium | Medium
    ACID Properties:
    ● Atomic: Yes | Yes | Yes
    ● Consistency: Eventual | Eventual | Eventual
    ● Isolation: Strong (using locking) | Weak | Weak
    ● Durable: Yes | Yes | Yes


  34. Hang on, doesn’t MongoDB itself use 2PC internally?
    Yes, for distributed transactions across multiple shards in the same cluster, but...
    ● It’s not addressing the same use cases as XA
    ○ Transactions across multiple heterogeneous technologies Vs transactions across a single distributed DB
    ● Many issues are related to the XA protocol specifically rather than the 2PC pattern generally
    ● Many issues exacerbated when multiple vendor technologies involved in a distributed transaction
    MongoDB’s distributed transactions within single cluster (compared to XA’s issues):
    ● Eventual Consistency: MongoDB favours providing a “snapshot” read concern for a synchronized
    view of the data across shards (and without requiring locks for isolation)
    ● Reduced High Availability: automated replica failover for each shard & its set of inflight
    transactions
    ● Poor Performance: locking not used - higher concurrency (fail fast) & lower latency (no blocking)
    ● Operational Complexity: single cluster, so easy to manage
    ● Interoperability Issues: none - all distributed elements are part of the same technology stack &
    version


  35. Further Reading
    ● Designing Data-Intensive Applications book, Chapter 9, section ‘Distributed
    Transactions and Consensus’ by Martin Kleppmann (O'Reilly, 2017).
    ● Your Coffee Shop Doesn’t Use Two-Phase Commit by Gregor Hohpe
    ● Myth: Why Banks Are BASE Not Acid - Availability Is Revenue by Eric Brewer
    ● The Hardest Part About Microservices: Your Data by Christian Posta
    ● SHOCKER: XA Distributed Transactions are only Eventually Consistent! by Paul Done
    ● It’s Time to Move on from Two Phase Commit by Daniel Abadi
    ● Pattern: Transactional outbox by Chris Richardson
    ● Sagas by Hector Garcia-Molina and Kenneth Salem (1987 whitepaper)
    ● Pattern: Saga by Chris Richardson
    ● Patterns for distributed transactions within a microservices architecture by Keyang
    Xiang


  36. That’s all folks
    Paul Done
    @TheDonester
