ASE 2017 Keynote: Software Engineering without Borders

Software Engineering without Borders
Arie van Deursen, Delft University of Technology, @avandeursen
November 2, 2017, Urbana-Champaign, USA
ASE 2017 keynote address
Images: Wikipedia

Jeroen Castelein, Peter Evers, Mozhan Soltani, Annibale Panichella, Maurício Aniche, Joop Aué, Maikel Lobbezoo, Rick Wieman, Sicco Verwer, Pouria Derakhshanfar, Xavier Devroey, Felienne Hermans, Andy Zaidman, Georgios Gousios, Alberto Bacchelli

(Y)OUR SOFTWARE ENGINEERING CURIOSITY
Image: Mohammad Abdullah, https://flic.kr/p/pz5X9

DEV OPS
Image: xebia.com

Context: Payments
Payment Provider

Payment Provider
• # payment methods: 250+
• # currencies: 150+
• Revenue in 2016: $727 million
• Revenue growth 2016: 99%
• $$ processed in 2016: $90 billion
• Volume growth 2016: 80%
• # employees (end 2016): 500
• Revenue per employee: $1.5 million
Image: Commons.wikimedia, munttoren
Source: https://www.adyen.com/press-and-media/press-releases/press-release-detail/2017/adyen-discloses-2016-revenues-of-727-million-growing-99-year-over-year

Some of Adyen’s 4000+ Merchants

Merchant’s Single Point of Failure?
Payment Provider A

Merchant’s Solution: Competitive A/B Deployment
Payment Providers A and B
KPI to optimize: Conversion Rate

One Billion Log Lines a Day: Monitoring using the ELK Stack
• Logstash: Unify different logging sources
• Elasticsearch: Search and filter large log data
• Kibana: Visual interactive dashboard
Image credit: www.neteye-blog.com

Poll: Java Exceptions in a Payment System
Your payment system in production generates 1 billion log lines per day. How many errors / warnings with exceptions do you expect to see?
A. None. “We have a zero exception policy.”
B. 1 Thousand. “Some exceptions are unavoidable.”
C. 1 Million. “Most exceptions are harmless.”
D. 1 Billion. “We only log errors and exceptions.”
Adyen, Nov 2016: ~1,000,000 per day

Log Analysis in Research
1. Abstraction: Seeing the bigger picture
2. Detection: Finding errors and anomalies
3. Enhancing: More effective logging practices
4. Parsing: Extracting message templates
5. Modeling: Message ordering and protocols
6. Scaling: Dealing with many many logs
7. Visualizing: Put the eyes to use
Joop Aué, Maurício Aniche, Arie van Deursen. Log Analysis from A-Z: A Literature Survey. TU Delft, 2017, in preparation.
Identified 73 core papers. Venues: SIGOPS SOSP, ACM TOCS, Usenix WASL, Usenix OSDI, IEEE ISSRE, ICSE

Logness: Extract, Cluster, Tag
• Extract features:
  • application name, class name, exception
• Remove details:
  • literal numbers, (encryption) hashes
• Cluster:
  • Same payment identifier in 15 min window
  • Same features
  • Longest common substring above threshold
• Tag as severe, known (monitored, bug), and unknown
1,000,000 error log lines --> 250 exception clusters
Peter Evers, Maurício Aniche, Arie van Deursen, Maikel Lobbezoo. Finding Relevant Errors in Massive Payment Log Data. TU Delft, 2017, in preparation.
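
The extract / remove details / cluster steps can be sketched roughly as follows. This is a minimal illustration, not the Logness implementation: the log records, the regular expressions, and the function names are all assumptions, and the 15 minute payment window and the longest-common-substring threshold are left out.

```python
import re
from collections import defaultdict

# Hypothetical log records: (application name, class name, exception, message).
LOG_LINES = [
    ("checkout", "CardValidator", "NumberFormatException",
     "Cannot parse card 4111111111111111 for payment 8815"),
    ("checkout", "CardValidator", "NumberFormatException",
     "Cannot parse card 5500000000000004 for payment 9921"),
    ("terminal-api", "SyncService", "IOException",
     "Timeout after 3000 ms, token 9f86d081884c7d659a2feaa0c55ad015"),
]

HEX_HASH = re.compile(r"\b[0-9a-fA-F]{32,}\b")  # long hex strings, e.g. hashes
NUMBER = re.compile(r"\d+")                      # any literal number


def remove_details(message: str) -> str:
    """Replace hashes first, then literal numbers, with placeholders."""
    message = HEX_HASH.sub("<HASH>", message)
    return NUMBER.sub("<NUM>", message)


def cluster(lines):
    """Group lines that share the same features and the same masked message."""
    clusters = defaultdict(list)
    for app, cls, exc, msg in lines:
        clusters[(app, cls, exc, remove_details(msg))].append(msg)
    return dict(clusters)


if __name__ == "__main__":
    for key, members in cluster(LOG_LINES).items():
        print(len(members), key)
```

Masking before grouping is what lets two messages that differ only in card or payment number land in the same cluster.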

Logness Dashboard

Zoom in to individual exception cluster

Issues Found in Research Period
• First credit cards starting with 95 and with 19 digits: long overflow!
• Merchant configuration error. All payments stalled. Discovered before being noticed by merchant.
• Firewall configuration problem: Server unreachable. Discovered before merchants were assigned to this server.
• Server update incompatible with legacy point of sale terminals. Customer could buy, but merchant received no money. IOException triggered.
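
Why a 19-digit card number starting with 95 overflows can be checked with a little arithmetic. The sketch below assumes the "long" in question is a signed 64-bit integer (as in Java); the card numbers are made up for illustration.

```python
# Largest value of a signed 64-bit integer (Java's Long.MAX_VALUE).
SIGNED_64_BIT_MAX = 2**63 - 1  # 9223372036854775807, itself 19 digits long


def fits_in_signed_64_bit(card_number: str) -> bool:
    """True if the card number can be stored as a signed 64-bit integer."""
    return int(card_number) <= SIGNED_64_BIT_MAX


if __name__ == "__main__":
    # Hypothetical card numbers, not real ones.
    print(fits_in_signed_64_bit("4111111111111111"))     # 16 digits
    print(fits_in_signed_64_bit("9200000000000000000"))  # 19 digits, starts with 92
    print(fits_in_signed_64_bit("9500000000000000000"))  # 19 digits, starts with 95
```

Only 19-digit numbers above roughly 9.22 × 10^18 are affected, which is why the problem can stay hidden until the first such card shows up.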

Complex API Integration
• Payment APIs are complex
• Integration faults are easily made
• Merchant needs assistance with API usage
• Merchant may not notice mistakes
• 2.5M HTTP error responses per month
• What can we learn from them?

2.5M Errors to 69 Fault Cases
Diagram labels: API consumer, End user, API Provider, Third party
• FC12 Contract not found. Replication latency.
• FC24 iDEAL communication error
• FC42 Invalid paRes from issuer
• FC1 ApplePay token amount-mismatch
• FC5 Billing address problem (Country 0)
• FC62 Unable to decrypt data
• FC14 Could not read XML stream.
• FC15 Couldn’t parse expiry year
Joop Aué, Maurício Aniche, Arie van Deursen, Maikel Lobbezoo. An Exploratory Study on Faults in Web API Integration in a Large-Scale Payment Company. TU Delft, 2017. Submitted.

11 Common Causes for API Error Responses

What impact did you experience for each error cause?
[Stacked bar chart: percentage of responses (0 to 100) rating the impact as None, Low, Moderate, or High, per error cause]
Error causes: Invalid user input (n=18), Missing user input (n=15), Expired request data (n=14), Invalid request data (n=21), Missing request data (n=18), Insufficient permissions (n=19), Double processing (n=14), Configuration (n=21), Missing server data (n=18), Internal error (n=21), Third party error (n=14)

API Integration Recommendations
• API Consumer:
  • Actually handle all error codes returned by provider
• API Producer:
  • Document which error codes can be returned under what circumstances
  • Offer easy-to-use test harness for integrations created by consumers
  • Make explicit which error codes are ‘retriable’
  • Enrich returned error codes with actionable info (for consumer or end user)
  • Offer Error Dashboard for API consumer offering live insight in error handling
• API Researcher:
  • Rethink API usability in this context
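
One way a consumer could act on explicit ‘retriable’ flags and actionable info is sketched below. Everything here is hypothetical: the error codes, the catalogue, and the `decide` function are made up for illustration and do not describe any real payment API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    retriable: bool  # provider states explicitly whether a retry can help
    action: str      # actionable info for the consumer or end user


# Hypothetical error catalogue, as a provider might document it.
ERROR_CATALOGUE = {
    "THIRD_PARTY_TIMEOUT": ErrorInfo(True, "Retry after a short delay."),
    "INVALID_EXPIRY_YEAR": ErrorInfo(False, "Ask the shopper to correct the expiry year."),
}


def decide(code: str, attempt: int, max_attempts: int = 3) -> str:
    """Return 'retry', 'report', or 'escalate' for an error response."""
    info = ERROR_CATALOGUE.get(code)
    if info is None:
        return "escalate"  # unknown codes are never silently ignored
    if info.retriable and attempt < max_attempts:
        return "retry"
    return "report"        # show info.action to the consumer or end user


if __name__ == "__main__":
    print(decide("THIRD_PARTY_TIMEOUT", attempt=1))
    print(decide("INVALID_EXPIRY_YEAR", attempt=1))
    print(decide("SOMETHING_NEW", attempt=1))
```

The explicit fall-through for unknown codes reflects the first recommendation: every code the provider can return gets handled in some way.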

Payment Terminals
Payment Provider

Point of sale terminal variability
• Card brands
• Card entry modes (chip, swipe, contactless)
• Currency conversion
• Loyalty points
• Validation type (pin, signature)
• Issuer responses (declined, insufficient balance)
• Cancellations (shopper, merchant)

Passive learning
Identifying system behavior from observations, and representing it in the smallest possible model.
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
Rick Wieman, Maurício Aniche, Willem Lobbezoo, Sicco Verwer and Arie van Deursen. An Experience Report on Applying Passive Learning in a Large-Scale Payment Company. ICSME Industry Track, 2017
https://automatonlearning.net/
DFASAT / FlexFringe: Heule & Verwer, ICGI 2010
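
A common starting point for passive learning is a prefix tree built from observed traces, which state-merging algorithms then shrink into a smaller model. The sketch below only builds that prefix tree. It is not DFASAT or FlexFringe; the function names are assumptions, and the declined trace is invented for illustration (only the approved lines come from the log above).

```python
import re

TIMESTAMP = re.compile(r"^\d{14}\s+")


def to_event(line: str) -> str:
    """Drop the 14-digit timestamp and mask remaining numbers."""
    text = TIMESTAMP.sub("", line.strip())
    return re.sub(r"\d+", "N", text)


def build_prefix_tree(traces):
    """Return transitions {(state, event): next_state} and visit counts per state."""
    transitions = {}
    visits = {0: 0}  # state 0 is the root
    for trace in traces:
        state = 0
        visits[0] += 1
        for event in trace:
            key = (state, event)
            if key not in transitions:
                new_state = len(visits)  # next unused state number
                transitions[key] = new_state
                visits[new_state] = 0
            state = transitions[key]
            visits[state] += 1
    return transitions, visits


APPROVED = [
    "20170101160003 Starting EMV",
    "20170101160017 Authorizing online",
    "20170101160022 Final status: Approved",
]
# Hypothetical second trace, not taken from the slide.
DECLINED = [
    "20170101170003 Starting EMV",
    "20170101170017 Authorizing online",
    "20170101170022 Final status: Declined",
]

if __name__ == "__main__":
    traces = [[to_event(l) for l in APPROVED], [to_event(l) for l in DECLINED]]
    transitions, visits = build_prefix_tree(traces)
    for (state, event), nxt in sorted(transitions.items()):
        print(state, "--", event, "-->", nxt, f"(seen {visits[nxt]}x)")
```

Both traces share a path until the final status, so the tree branches only at the last event; the visit counts are the kind of frequency information a domain expert can inspect in the inferred model.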

Use Inferred Models to Analyze: Bugs in Test Phase
• Terminal asked for PIN
• AND asked for signature
• Domain expert noted this unwanted behavior in inferred model.
• Fixed before it went into production

Use Inferred Models to Analyze: Differences Between Card Brands
Twice as many chip errors. Informed merchant about issue.

Use Inferred Models to Analyze: Timeout Problems
Improved performance under network instability by adding a targeted retry mechanism.

What to Do with a (Logged) Exception?

The Anatomy of a Stack Trace
java.lang.IllegalArgumentException:
org.apache.commons.collections.map.AbstractHashedMap.<init> (AbstractHashedMap.java:142)
org.apache.commons.collections.map.AbstractHashedMap.<init> (AbstractHashedMap.java:127)
org.apache.commons.collections.map.AbstractLinkedMap.<init> (AbstractLinkedMap.java:95)
org.apache.commons.collections.map.LinkedMap.<init> (LinkedMap.java:78)
org.apache.commons.collections.map.TransformedMap.transformMap (TransformedMap.java:153)
org.apache.commons.collections.map.TransformedMap.putAll (TransformedMap.java:190)
Annotations: Exception name, Class name, Method name, Specific line, Potential Cause of the Exception
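
The parts of a frame named above (class name, method name, specific line) can be pulled apart mechanically. The following is a small sketch; the regular expression and the `Frame` type are assumptions made for illustration and cover only frames shaped like the ones on this slide.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    class_name: str
    method: str
    file: str
    line: int


# Optional leading "at", fully qualified class, method (including <init>),
# then "(File.java:line)". Whitespace before the parenthesis is tolerated.
FRAME = re.compile(
    r"^(?:at\s+)?(?P<cls>[\w.$]+)\.(?P<method>[\w$<>]+)\s*"
    r"\((?P<file>[\w$]+\.java):(?P<line>\d+)\)$"
)


def parse_frame(text: str) -> Optional[Frame]:
    """Split one stack frame into class, method, file, and line number."""
    match = FRAME.match(text.strip())
    if match is None:
        return None
    return Frame(match["cls"], match["method"], match["file"], int(match["line"]))


if __name__ == "__main__":
    print(parse_frame(
        "org.apache.commons.collections.map.TransformedMap.putAll (TransformedMap.java:190)"
    ))
```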

EvoCrash: Search-Based Crash Reproduction
• Leverage EvoSuite
• Devise fitness function rewarding
  1. Target offending method call reached
  2. Same exception thrown
  3. Stack trace similarity
• Produce initial random tests focused on stack trace
• Mutation and cross-over operators taking stack traces into account
Guided genetic algorithm (diagram): Initialize Population, Evaluate Fitness, Fit?, Selection, Guided Crossover, Guided Mutation, Re-insertion, Create next generation
Mozhan Soltani, Annibale Panichella, Arie van Deursen: A guided genetic algorithm for automated crash reproduction. ICSE 2017: 209-220
Search-Based Crash Reproduction and its Impact on Debugging. In preparation
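
The three fitness criteria can be combined into a single value to minimize, roughly as sketched here. This is an illustration under assumptions, not the EvoCrash fitness function itself: the weights 3, 2 and 1, the normalized `line_distance` input, and the position-wise trace comparison are choices made for this example.

```python
def trace_distance(expected, actual):
    """Fraction of expected frames that do not appear at the same position."""
    if not expected:
        return 0.0
    mismatches = sum(
        1 for i, frame in enumerate(expected)
        if i >= len(actual) or actual[i] != frame
    )
    return mismatches / len(expected)


def crash_fitness(line_distance, same_exception, expected, actual):
    """Lower is better; 0.0 means the crash is reproduced.

    line_distance: value in [0, 1], 0 when the target call is reached.
    Later criteria only start to count once the earlier ones are satisfied.
    """
    if line_distance > 0:       # 1. target method call not reached yet
        return 3.0 * line_distance + 2.0 + 1.0
    if not same_exception:      # 2. reached, but a different (or no) exception
        return 2.0 + 1.0
    return trace_distance(expected, actual)  # 3. compare stack traces


if __name__ == "__main__":
    target = ["TransformedMap.putAll", "TransformedMap.transformMap", "LinkedMap.<init>"]
    print(crash_fitness(0.5, False, target, []))
    print(crash_fitness(0.0, False, target, []))
    print(crash_fitness(0.0, True, target, target[:2]))
    print(crash_fitness(0.0, True, target, target))
```

Ordering the criteria this way gives the search a gradient: a test that reaches the target call always scores better than one that does not, even before the right exception is thrown.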

Application to 54 Open Source Crash Reports
TABLE 1: The 54 real-world bugs used in our study.

Project: ACC
  Bug IDs: 4, 28, 35, 48, 53, 68, 70, 77, 104, 331, 277, 411
  Versions: 2.0 - 4.0
  Exceptions: NullPointer (5), UnsupportedOperation (1), IndexOutOfBounds (1), IllegalArgument (1), ArrayIndexOutOfBounds (2), ConcurrentModification (1), IllegalState (1)
  Priority: Major (10), Minor (2)
  Ref.: [11], [63]

Project: ANT
  Bug IDs: 28820, 33446, 34722, 34734, 36733, 38458, 38622, 41422, 42179, 43292, 44689, 44790, 46747, 47306, 48715, 49137, 49755, 49803, 50894, 51035, 53626
  Versions: 1.6.1 - 1.8.2
  Exceptions: ArrayIndexOutOfBounds (3), NullPointer (17), StringIndexOutOfBounds (1)
  Priority: Critical (2), Major (5), Medium (14)
  Ref.: [11], [41]

Project: LOG
  Bug IDs: 29, 43, 509, 10528, 10706, 11570, 31003, 40212, 41186, 44032, 44899, 45335, 46144, 46271, 46404, 47547, 47912, 47957
  Versions: 1.0.2 - 1.2
  Exceptions: NullPointer (17), InInitializerError (1)
  Priority: Critical (1), Major (4), Medium (11), Enhanc. (1), Blocker (1)
  Ref.: [11], [41]

Project: ActiveMQ
  Bug IDs: 5035
  Versions: 5.9
  Exceptions: ClassCastException (1)
  Priority: Major (1)
  Ref.: [41]

Project: DnsJava
  Bug IDs: 38
  Versions: 2.1
  Exceptions: ClassCastException (1)
  Priority: N/A (1)
  Ref.: [41]

Project: JFreeChart
  Bug IDs: 434
  Versions: 1.0
  Exceptions: NullPointerException (1)
  Priority: N/A (1)
  Ref.: [41]

[11] = STAR, Chen & Kim, TSE 2015: Symbolic execution
[63] = muCrash, Xuan et al, ESEC/FSE 2015: Test suite mutation
[41] = JCHARMING, Nayrolles et al, JSEP 2016: Model checking

Search-Based Testing for … SQL Queries
• Monitored applications often have persistent state
• How can we support testing such data-intensive applications?
• How do we find the right data, to test (complex) database queries?
• Explore search-based techniques!

SELECT * FROM `account`
LEFT JOIN `user` AS `assignedUser` ON account.assigned_user_id = assigneduser.id
LEFT JOIN `user` AS `modifiedBy` ON account.modified_by_id = modifiedby.id
LEFT JOIN `user` AS `createdBy` ON account.created_by_id = createdby.id
LEFT JOIN `entity_email_address` AS `emailAddressesMiddle` ON account.id = emailaddressesmiddle.entity_id AND emailaddressesmiddle.deleted = '0' AND emailaddressesmiddle.primary = '1' AND emailaddressesmiddle.entity_type = 'Account'
LEFT JOIN `email_address` AS `emailAddresses` ON emailaddresses.id = emailaddressesmiddle.email_address_id AND emailaddresses.deleted = '0'
LEFT JOIN `entity_phone_number` AS `phoneNumbersMiddle` ON account.id = phonenumbersmiddle.entity_id AND phonenumbersmiddle.deleted = '0' AND phonenumbersmiddle.primary = '1' AND phonenumbersmiddle.entity_type = 'Account'
LEFT JOIN `phone_number` AS `phoneNumbers` ON phonenumbers.id = phonenumbersmiddle.phone_number_id AND phonenumbers.deleted = '0'
WHERE (( account.name LIKE 'Besha%' OR account.id IN (SELECT entity_id FROM entity_email_address JOIN email_address ON email_address.id = entity_email_address.email_address_id WHERE entity_email_address.deleted = 0 AND entity_email_address.entity_type = 'Account' AND email_address.deleted = 0 AND email_address.name LIKE 'Besha%') )) AND account.deleted = '0'

MC/DC Coverage on SQL Queries
Find data that lets each condition independently set outcome
• T1: items { id: 42 }, invoice { id: 42, amount: 1000, taxFree: true } ✅
• T2: items { id: 42 }, invoice { id: 42, amount: 1001, taxFree: false } ✅
• T3: items { id: 42 }, invoice { id: 42, amount: 1000, taxFree: false } ❌
• T4: items { id: 42 }, invoice { id: 41, amount: 1000, taxFree: true } ❌

... required to fully test a SQL query grows together with the complexity of the query itself. Consider a SQL query that joins two tables and contains two predicates:

SELECT items.* FROM invoice JOIN items ON invoice.id = items.invoiceid WHERE amount > 1000 OR taxFree = true

This SQL query returns all items of invoices that either have amount greater than 1000 or that are tax free. To test this query rigorously, the tester may want to exercise all “branches” that can be executed in this SQL query. Thus, the tester needs to target 1) the JOIN relation, 2) the left predicate (amount > 1000) to be evaluated to true, 3) the right predicate (taxFree = true) to be evaluated to true. For that to happen, the two tables should contain the right ...
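
The four data sets T1 to T4 can be tried against the example query with an in-memory database. This sketch uses Python's sqlite3 module and rests on a few assumptions: the table schemas are invented, `taxFree` is stored as 0/1 and compared with 1, and the 42 on the items side is read as the `invoiceid` the query joins on.

```python
import sqlite3

# The example query, with taxFree stored as an integer (assumption).
QUERY = """
SELECT items.* FROM invoice
JOIN items ON invoice.id = items.invoiceid
WHERE amount > 1000 OR taxFree = 1
"""


def rows_returned(invoice_id, amount, tax_free, item_invoice_id=42):
    """Populate a fresh database with one invoice and one item, run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoice (id INTEGER, amount INTEGER, taxFree INTEGER)")
    conn.execute("CREATE TABLE items (id INTEGER, invoiceid INTEGER)")
    conn.execute("INSERT INTO invoice VALUES (?, ?, ?)",
                 (invoice_id, amount, int(tax_free)))
    conn.execute("INSERT INTO items VALUES (?, ?)", (1, item_invoice_id))
    count = len(conn.execute(QUERY).fetchall())
    conn.close()
    return count


if __name__ == "__main__":
    tests = {
        "T1": (42, 1000, True),   # only taxFree makes the WHERE clause true
        "T2": (42, 1001, False),  # only amount makes the WHERE clause true
        "T3": (42, 1000, False),  # both predicates false
        "T4": (41, 1000, True),   # JOIN condition fails
    }
    for name, data in tests.items():
        print(name, "row returned" if rows_returned(*data) else "no row")
```

T1 and T3 differ only in `taxFree`, and T2 and T3 only in `amount`, which is what lets each condition independently decide the outcome.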

Tuya et al, 2010: Full Predicate Coverage
• MC/DC coverage on conditions used in SQL Queries.
• Coverage target = simplified query that should yield at least one row
• Establish coverage potential of given (hand-written) data sets
• Still needed: Actual test data!

EvoSQL: Test Data for SQL Queries
• Fitness of populated database to yield one row for given target query
  • Step level: Number of steps still to be executed in query
  • Step distance: How close we are to particular condition
  • Specialized for comparison operators, string operators, SQL-specific operators (IS NULL, EXISTS)
• Mutation: Delete, change, insert rows
• Crossover: Swap values between rows
• Seeding: All constants in query
Jeroen Castelein, Maurício Aniche, Mozhan Soltani, Annibale Panichella and Arie van Deursen. Search-Based Test Data Generation for SQL Queries. Submitted, 2017
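
Step level and step distance can be combined in the way search-based testing usually combines approach level and branch distance. The sketch below follows that standard scheme as an assumption; it is not taken from EvoSQL, and the function names and the normalization d / (d + 1) are choices made for this example.

```python
def distance_greater_than(value: float, bound: float) -> float:
    """Distance to satisfying `value > bound`: 0 when it already holds."""
    return 0.0 if value > bound else (bound - value) + 1.0


def distance_equals(value: float, target: float) -> float:
    """Distance to satisfying `value = target`."""
    return abs(value - target)


def normalize(distance: float) -> float:
    """Map a distance in [0, inf) to [0, 1) so it never outweighs a step."""
    return distance / (distance + 1.0)


def fitness(step_level: int, step_distance: float) -> float:
    """Lower is better: remaining query steps plus closeness to the current one."""
    return step_level + normalize(step_distance)


if __name__ == "__main__":
    # Condition `amount > 1000` with one query step still to be executed after it.
    for amount in (500, 1000, 1001):
        print(amount, fitness(1, distance_greater_than(amount, 1000)))
```

Because the normalized distance stays below 1, a database that gets one step further in the query always scores better than one stuck on an earlier step, however close that earlier condition is to being met.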

EvoSQL Evaluation Approach
• Collected all queries from 4 systems (ERP, CRM, e-learning)
• 2000 unique queries
• State of the art: Use SAT solving
  ☹️ Can’t handle 85% of our queries (nested queries, string manipulation, JOINs)
  ☹️ No implementation available
• Compared EvoSQL to pure random and biased (seeded) random
• Implemented on top of HSQLDB and SQLFpc-as-a-service

EvoSQL Evaluation Outcomes
• 100% of targets covered for 98% of the queries
• On average 86% covered for the remaining 2%
• Usually within seconds
• Outperforms biased and random alternatives:
  • Biased random can handle 90% of simple queries (< 10 rules)
  • Biased random often finds no solution for complex queries (10+ rules)
[Chart: coverage by number of rules per query]
Rules:     1-2  3-4  5-6  7-8  9-10  11-15  16-20  21+
# Queries: 656  382  408  346  114   107    51     71

(Y)OUR SOFTWARE ENGINEERING CURIOSITY
Image: Mohammad Abdullah, https://flic.kr/p/pz5X9

Vision: Informed, data-driven software development with continuous feedback between operations and development
Mission: to develop and advance the theory and technology to make this happen

Border I: Operations
“Operations” is under-represented in software engineering research (and ops is all about automation!)

Border II: Our Discipline
To make true progress, we software engineering researchers must embrace other disciplines
Image: Doc Searls, https://flic.kr/p/9o5AEY

Border III: Practice
True understanding in software engineering research comes from seeing it work in practice
Image: Commons.wikimedia, full orchestra

The Adyen – Delft Collaboration Model
• Mutual trust
• Long term (10+ years)
• Mutual understanding of needs: win-win
• Challenging and engaging environment for MSc / PhD thesis projects
• Mutual education: devs to students, researchers to industry
• Embrace openness
• Academics willing to get their hands dirty

Border IV: My Data, My Tools
Demonstrable progress in software engineering research requires shared data and tools
Image: https://en.wikipedia.org/wiki/File:3d10_fm_de_vilafranca.jpg

The GHTorrent Success Story
Scalable, query-able, offline mirror of data from GitHub REST API.
• Data since 2012
• 16 TB JSON in MongoDB
• 5 billion rows in MySQL
• 2 GB per hour collected
• 350 users in > 200 institutions
• 200+ papers
• 3 data mining challenges
• Confirmed commercial uses by Microsoft, Deloitte, Blackduck
• Support gifts from Microsoft
Georgios Gousios, Diomidis Spinellis, Martin Pinzger, Arie van Deursen, Andy Zaidman, Margaret-Anne Storey, Alberto Bacchelli. MSR 2012, MSR 2013, ICSE 2014, ICSE 2015, ICSE 2016. GHTorrent dataset, pull-based software development, integrator and developer perspectives.

Border V: The Public at Large
We must engage with the public at large to demonstrate our relevance
Image: http://fightingillini.com/

Exemplary Research Blogging: Adrian Colyer

Border VI: Our Publishing Culture
We cannot expect tax paying software engineers to enjoy our research if we make them pay twice for it. We need open access.
Image: Tony Armstrong, https://flic.kr/p/XhvQiF
