Slide 1

Slide 1 text

High Frequency Trading Infrastructure and Quality Assurance
Iosif Itkin, Exactpro Systems
20th January 2014

Slide 2

Slide 2 text

Contents
• Quality Assurance and Risk Assessment
• High Frequency Trading Technology
• Fat Finger Problems
• Flash Crash
• Circuit Breakers
• Facebook IPO
• ABN and ATG on NASDAQ
• Knight Capital Events
• Monitoring Systems

Slide 3

Slide 3 text

Exactpro Systems Company Overview
• A specialist firm focused on functional and non-functional testing of securities data distribution, trading systems, risk management and post-trade infrastructures
• An independent company incorporated in 2009 with 10 people, now employing over 210 specialists
• A US company registered and headquartered in San Rafael, California, with four QA & development centres in Russia and sales support in the UK
• We build software to verify trading and back-office systems for exchanges, brokers and other companies in the securities industry
http://linkedin.com/in/iosifitkin

Slide 4

Slide 4 text

Quality Assurance and Risk Assessment
Minimal

Slide 5

Slide 5 text

Quality Assurance and Risk Assessment
Minimal
Life & Health

Slide 6

Slide 6 text

Quality Assurance and Risk Assessment
Financial Services: Money and Reputation
Minimal
Life & Health

Slide 7

Slide 7 text

Trading System Types

Slide 8

Slide 8 text

Achieving High Availability
• Minimize the number of faults and the effect/recovery time of faults in a system
• Avoid a single point of failure by utilizing redundant parts and rerouting (failover)
• Have a comprehensive monitoring system in place
• Reduce the impact of environmental faults by using UPS and off-site data mirroring and/or replication
• Provide for "hot" repair of failed components

Slide 9

Slide 9 text

Trading Architecture Sample from Cinnober

Slide 10

Slide 10 text

Ariane 5 First Launch
• 4 June 1996
• Maiden flight of the unmanned Ariane 5 rocket
• Loss of guidance and attitude information 37 seconds after ignition
• Explosion at 3,700 meters 3 seconds later
• Conversion of a 64-bit floating-point number into a 16-bit signed integer
• Ada language invalid operand error
• The same version of the software had been used for Ariane 4
• Horizontal velocity was much higher than on Ariane 4
• Testing used simulators without the SRI (inertial reference system) itself
• Both primary and secondary systems failed
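The 64-bit to 16-bit conversion problem can be illustrated with a minimal sketch. It is not the flight software (which was written in Ada and, per the slide, raised an operand error); the function names and sample readings below are hypothetical. It contrasts a conversion that silently keeps the low 16 bits with one that checks the range first.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def unchecked_to_int16(value: float) -> int:
    """Keep only the low 16 bits of the truncated value (silent wrap-around)."""
    low_bits = int(value) & 0xFFFF
    return struct.unpack("<h", struct.pack("<H", low_bits))[0]


def checked_to_int16(value: float) -> int:
    """Convert only if the truncated value fits into a signed 16-bit integer."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit into 16 bits")
    return truncated


# Hypothetical readings: the first fits into 16 bits, the second does not.
small_reading = 1234.7
large_reading = 40000.0
```

With these assumed inputs, `checked_to_int16(small_reading)` succeeds, while `checked_to_int16(large_reading)` raises `OverflowError` instead of returning a meaningless negative number as `unchecked_to_int16` does. Either way the caller still has to decide what to do with an out-of-range value; the check only makes the problem visible.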

Slide 11

Slide 11 text

High Frequency Trading System
• Hundreds of millions of orders per day
• Micro-bursts with thousands of transactions within milliseconds
• Latencies 3,000 times faster than the blink of an eye…
• …and equal to the time it takes a flying passenger jet to cover a distance of 2 cm, or light to get from here to Frasne

Slide 12

Slide 12 text

Brokerage System

Slide 13

Slide 13 text

Mizuho Securities
• 8 December 2005
• Attempted to sell a single J-Com share for 610,000 yen ($5,041)
• Price and quantity were mistakenly swapped
• Risk systems failed at both:
  – Mizuho Securities
  – Tokyo Stock Exchange
• Estimated loss of $225 million
• This type of error is called a fat-finger error
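A pre-trade check of the kind that failed here can be sketched as follows. This is an illustration under stated assumptions, not the logic of either firm's risk system: the `Order` type, the 30% price band and the quantity limit are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Order:
    side: str
    quantity: int
    price: float


def fat_finger_violations(order: Order, reference_price: float,
                          max_quantity: int, price_band: float = 0.30) -> List[str]:
    """Return the reasons (if any) why an order looks like an input error."""
    violations = []
    if order.quantity > max_quantity:
        violations.append("quantity exceeds limit")
    if abs(order.price - reference_price) > price_band * reference_price:
        violations.append("price outside band")
    return violations


# Intended order versus the order with price and quantity swapped.
intended = Order("SELL", 1, 610_000)
swapped = Order("SELL", 610_000, 1)

# Hypothetical limits for the instrument.
REFERENCE_PRICE = 610_000
MAX_QUANTITY = 10_000
```

Under these assumed limits the intended order passes with no violations, while the swapped order fails both the quantity and the price check, so either control alone would have been enough to stop it.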

Slide 14

Slide 14 text

USS Yorktown
• 21 September 1997
• CG-47 Aegis pilot version for the “Smart Ship” program
• Outage of all systems, including the propulsion system, for 2.5 hours
• Incorrect data entry into the Remote Data Base Manager caused an overflow in the database, LAN shutdown and disconnection of all controlling terminals
• Software defect – division by zero

Slide 15

Slide 15 text

Fat Finger Order on NASDAQ from ABN AMRO Client
• 18 September 2012, Stockholm. A trader intended to post a sell order for 5,000 SKF B shares. Due to an input error with the Client, the order volume field was populated with a negative value (-5,000)
• Instead of returning an error, the system converted the value into a random 9-digit figure: 294,962,296
• The sell order corresponded to approximately 71% of the total outstanding volume in the SKF B share and resulted in the execution of 813,442 shares
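One common way for a negative quantity to turn into a huge positive one is reinterpretation as an unsigned integer. The sketch below shows that effect and the validation that prevents it. Purely as an observation, the reported figure coincides with the last nine digits of -5,000 read as an unsigned 32-bit value; whether the system actually worked this way is an assumption, not something the slide states. The function names are hypothetical.

```python
def as_uint32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value (two's complement)."""
    return value & 0xFFFFFFFF


def validate_volume(volume: int) -> int:
    """Reject non-positive volumes instead of converting them."""
    if volume <= 0:
        raise ValueError(f"order volume must be positive, got {volume}")
    return volume


wrapped = as_uint32(-5000)          # 2**32 - 5000
last_nine_digits = wrapped % 10**9  # what remains in a 9-digit field
```

Here `wrapped` is 4,294,962,296 and `last_nine_digits` is 294,962,296, while `validate_volume(-5000)` raises `ValueError` before any order is built.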

Slide 16

Slide 16 text

Flash Crash
• 6 May 2010: Waddell & Reed hedges its exposure in equities
• Algo to sell 75,000 E-mini contracts (~$4.1b) with a 9% participation target
• No price or time constraints in the algo, only the volume traded during the previous minute
• Initial selling was absorbed by HFTs and arbitrageurs (buy E-mini, sell SPY or a basket of equities). Lack of liquidity and a hot-potato exchange between HFTs increased volumes and, with them, selling pressure from the algo
• Sharp decline in prices within 5 minutes, triggering across-the-board volatility interruptions
• Participants leave the market, causing a liquidity crisis in equities and executions against stub quotes
• Market recovers within minutes
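The feedback between traded volume and selling pressure can be sketched with a toy participation algorithm. Only the 75,000 contracts and the 9% target come from the slide; the function name and the per-minute volumes are hypothetical, and the sketch deliberately has no price or time limit, as described above.

```python
from typing import List


def participation_schedule(total_contracts: int, minute_volumes: List[int],
                           participation_pct: int = 9) -> List[int]:
    """Sell a fixed percentage of each previous minute's volume until done."""
    remaining = total_contracts
    child_orders = []
    for volume in minute_volumes:
        if remaining == 0:
            break
        # Integer arithmetic avoids floating-point rounding surprises.
        child = min(remaining, volume * participation_pct // 100)
        child_orders.append(child)
        remaining -= child
    return child_orders


# Hypothetical market volumes that double every minute as HFTs trade
# the same contracts back and forth.
volumes = [100_000, 200_000, 400_000, 800_000]
schedule = participation_schedule(75_000, volumes)
```

With these assumed volumes the schedule is 9,000, 18,000, 36,000 and 12,000 contracts: the more the market churns, the faster the algorithm sells, regardless of price.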

Slide 17

Slide 17 text

Limit Order Book

Slide 18

Slide 18 text

Limit Order Book

Slide 19

Slide 19 text

Limit Order Book

Slide 20

Slide 20 text

Limit Order Book

Slide 21

Slide 21 text

Price Boundaries

Slide 22

Slide 22 text

Price Boundaries

Slide 23

Slide 23 text

Price Boundaries

Slide 24

Slide 24 text

Price Boundaries

Slide 25

Slide 25 text

Circuit Breaker

Slide 26

Slide 26 text

Circuit Breaker

Slide 27

Slide 27 text

Circuit Breaker

Slide 28

Slide 28 text

Circuit Breaker

Slide 29

Slide 29 text

Circuit Breaker

Slide 30

Slide 30 text

Facebook IPO
• 18 May 2012, NASDAQ: one of the largest IPOs in history
• Secondary trading is preceded by a designated Display Only Period (DOP)
• Multi-component architecture that included the Matching Engine, the IPO Cross Application and the Execution Application
• At the end of the DOP, NASDAQ’s “IPO Cross Application” analyzes all of the buy and sell orders to determine the price at which the largest number of shares will trade, and then NASDAQ’s matching engine matches buy and sell orders at that price. This usually takes 1-2 ms
• NASDAQ allowed orders to be cancelled at any time up until the end of the DOP, including the very brief interval during which the IPO cross price is calculated. After the calculation was completed, the system performed an order validation check between the Matching Engine (ME) and the “IPO Cross Application”. If any of the orders had been cancelled after the start of the cross, the system had to repeat the calculation

Slide 31

Slide 31 text

Facebook IPO
• Over 496k orders participated in the cross, and its duration exceeded 20 ms
• An order cancellation arrived during this period, and the application had to repeat the calculation. Two more cancellations arrived during the second iteration and four more during the third
• The IPO Cross Application went into an infinite loop at 11:05
• The NASDAQ team switched off the validation check on the secondary system and performed a failover 25 minutes after the start of the loop
• Unknown at that moment, 38k orders submitted between 11:11 and 11:30 were stuck and had not participated in the uncross. This created another discrepancy, this time with the Execution App, and members were not able to receive confirmations for orders executed in the cross until 13:50
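The recalculation loop described on these two slides can be sketched as a retry loop that only terminates when one full pass sees no cancellations. This is a simplified model, not NASDAQ's code: the function name, the callback and the `max_passes` safeguard are hypothetical, and continuing the doubling of cancellations beyond the three iterations mentioned on the slide is an assumption.

```python
from typing import Callable, Optional


def run_cross(cancels_during_pass: Callable[[int], int],
              max_passes: Optional[int] = None) -> Optional[int]:
    """Repeat the cross until a pass completes with no cancellations.

    Returns the number of passes needed, or None if max_passes was reached.
    With max_passes=None the loop is unbounded, as in the incident.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        passes += 1
        if cancels_during_pass(passes) == 0:
            return passes  # validation check succeeds, cross is final
    return None  # gave up: needs escalation instead of spinning forever


def quiet_market(pass_number: int) -> int:
    """No cancellations arrive while the cross is calculated."""
    return 0


def busy_market(pass_number: int) -> int:
    """1, 2, 4, ... cancellations per pass (pattern extended by assumption)."""
    return 2 ** (pass_number - 1)
```

In a quiet market `run_cross(quiet_market)` finishes after one pass. With `busy_market` every pass is invalidated, so only a bound such as `max_passes=50` makes the call return; without it the loop never ends.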

Slide 32

Slide 32 text

ABN and ATG Auction Uncross Problem on NASDAQ
• On 28 August 2013, the SEB A share opened significantly lower than on the previous day. The opening price was 51.80, around 24% lower than the previous closing price
• A contributing factor was trading performed during the opening auction by Algorithmic Trading Group (ATG), using an algorithm that it had developed, through Sponsored Access arrangements with ABN Amro Clearing Bank
• Because of a shortcoming in the algorithm, it registered, amended and cancelled orders as soon as the limit price was equal to or crossed the equilibrium price of the order book

Slide 33

Slide 33 text

Knight Capital Events
• 1 August 2012, USA
• Knight Capital – one of the most successful HFT firms
• Implemented changes related to the Retail Liquidity Program at NYSE
• SMARS – ultra-fast order router
• Source code responsible for the legacy Power Peg functionality
• 212 parent orders, millions of child orders
• Accumulated loss – $460m, or $170k/sec
• Incorrectly configured risk systems
• Deployment on 7 servers instead of 8…
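A partial deployment like the one in the last bullet is something a simple post-release check can catch. The sketch below is an assumption-laden illustration, not Knight's tooling: the host names, version labels and function name are hypothetical.

```python
from typing import Dict, List


def find_stale_servers(deployed_versions: Dict[str, str],
                       expected_version: str) -> List[str]:
    """Return the hosts that are not running the expected release."""
    return sorted(
        host for host, version in deployed_versions.items()
        if version != expected_version
    )


# Hypothetical inventory: seven servers upgraded, one left on the old code.
inventory = {f"router-{n}": "rlp-release" for n in range(1, 8)}
inventory["router-8"] = "legacy-release"

stale = find_stale_servers(inventory, "rlp-release")
```

Here `stale` contains only `router-8`. Refusing to enable new functionality until this list is empty is one way to keep old and new code from running side by side.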

Slide 34

Slide 34 text

North American Blackout
• 14 August 2003, USA and Canada
• Cascading power outage
• Race conditions resulted in a buffer overflow in the alerting system
• Had operators disconnected 4% of the overall load, losses estimated at $10b could have been avoided

Slide 35

Slide 35 text

Market Surveillance and Monitoring
• Process all events
• Aggregate them
• Look for patterns using flexible rules
• Replay for investigation
• Store everything as evidence
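The steps above can be sketched as a tiny pipeline: store every event, aggregate, then run pluggable rules over the aggregates. The event shape, the cancel-ratio rule and its threshold are hypothetical and stand in for whatever patterns a real surveillance system would look for.

```python
from collections import Counter
from typing import Callable, List, Tuple

Event = Tuple[str, str]                 # (member, event type)
Rule = Callable[[Counter], List[str]]   # aggregated counts -> alerts


def aggregate(events: List[Event]) -> Counter:
    """Count events per (member, event type)."""
    return Counter(events)


def high_cancel_ratio(threshold: float) -> Rule:
    """Build a rule that flags members cancelling most of their new orders."""
    def rule(counts: Counter) -> List[str]:
        alerts = []
        for member in sorted({m for m, _ in counts}):
            new = counts[(member, "NEW")]
            cancels = counts[(member, "CANCEL")]
            if new and cancels / new > threshold:
                alerts.append(f"{member}: cancel ratio {cancels}/{new}")
        return alerts
    return rule


def surveil(events: List[Event], rules: List[Rule]) -> Tuple[List[Event], List[str]]:
    """Store all events, aggregate them and apply every rule."""
    stored = list(events)  # kept in full for replay and as evidence
    counts = aggregate(stored)
    alerts = [alert for rule in rules for alert in rule(counts)]
    return stored, alerts


sample_events = [("A", "NEW"), ("A", "CANCEL"), ("A", "NEW"), ("A", "CANCEL"),
                 ("B", "NEW"), ("B", "NEW")]
stored_events, alerts = surveil(sample_events, [high_cancel_ratio(0.9)])
```

Because rules are plain functions over the aggregated counts, new patterns can be added without touching the pipeline, and the stored event list can be replayed through a changed rule set during an investigation.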

Slide 36

Slide 36 text

Thank You
• Program analysis
• Software verification
• Financial models validation
• Load distribution modeling
• Research of risk controls and circuit breakers
http://tmpaconf.org

Slide 37

Slide 37 text

References
• A Cinnober white paper on: Latency, October 2009 update
• SEC Release No. 34-69655 / May 29, 2013
• SEC Release No. 34-70694 / October 16, 2013
• NASDAQ OMX Stockholm Disciplinary Committee DECISION 2013-02-21
• NASDAQ OMX Stockholm Disciplinary Committee DECISION 2014-01-13
• Findings Regarding the Market Events of May 6, 2010
• The Future of Computer Trading in Financial Markets
• TMPA-2013: Tools & Methods of Program Analysis
• Software Horror Stories
• SlideShare - http://www.slideshare.net/IosifItkin