Testing AI Systems: Quality Characteristics and Cognitive Biases

Elena Treshcheva, Researcher, Exactpro
Build Software to Test Software
exactpro.com

Exactpro Overview
• A specialist firm focused on functional and non-functional testing of exchanges, clearing houses, depositories, trade repositories and other financial market infrastructures.
• Incorporated in 2009 with 10 people, our company has grown significantly and now employs over 550 specialists.
• We were part of the London Stock Exchange Group (LSEG) from May 2015 until January 2018, when the Exactpro management buyout from LSEG was successfully completed.
• We are headquartered in the UK and have operations in the US, Georgia and Russia.

Exactpro Client Network

AI-based Systems’ Quality Characteristics
- Ability to learn: The capacity of the system to learn from use for the system itself, or data and events it is exposed to.
- Ability to generalize: The ability of the system to apply to different and previously unseen scenarios.
- Trustworthiness: The degree to which the system is trusted by stakeholders, for example a health diagnostic ...
Source: A4Q AI and Software Testing Foundation Syllabus, https://www.gasq.org/en/exam-modules/a4q-ai-and-software-testing.html

Ability to Learn
• Training set: the data you run your learning algorithm on.
• Development set: the data you use to tune parameters, select features, and make other decisions regarding the learning algorithm. Sometimes also called the hold-out cross validation set.
• Test set: the data you use to evaluate the performance of the algorithm, but not to make any decisions regarding what learning algorithm or parameters to use.
Source: https://www.deeplearning.ai/
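The three-way split can be sketched in a few lines of Python. This is a minimal illustration, not something shown in the deck: the function name `split_dataset`, the 15% fractions, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_dataset(examples, dev_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle once, then cut the data into training, development, and test sets.

    The fractions and the seed are illustrative assumptions, not recommendations.
    """
    if dev_fraction + test_fraction >= 1:
        raise ValueError("dev and test fractions must leave room for training data")

    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)

    test_set = shuffled[:n_test]               # only for the final evaluation
    dev_set = shuffled[n_test:n_test + n_dev]  # for tuning and feature selection
    train_set = shuffled[n_test + n_dev:]      # what the learning algorithm sees
    return train_set, dev_set, test_set


train_set, dev_set, test_set = split_dataset(range(100))
print(len(train_set), len(dev_set), len(test_set))
```

Because the three sets never overlap, a decision tuned on the development set cannot leak into the test-set score.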

Trustworthiness
During the Defense Innovation Board’s (DIB) quarterly public meeting on October 31, 2019, the DIB members voted to approve the proposed AI Principles.
Source: https://innovation.defense.gov/ai/

Trustworthiness: https://www.mas.gov.sg/news/media-releases/2019/mas-partners-financial-industry-to-create-framework-for-responsible-use-of-ai

Trustworthiness: How can we persuade people to trust an algorithm?
Some important techniques are:
• Explainability
• Testing
• Boundary conditions
• Gradual rollout
• Auditing
• Monitors and alarms
Source: https://blog.deeplearning.ai/blog/google-ai-explains-itself-neural-net-fights-bias-ai-demoralizes-champions-solar-power-heats-up
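Two of these techniques, boundary conditions and monitors and alarms, lend themselves to a small sketch. The wrapper below is a hypothetical illustration: the class `GuardedModel`, its field names, and the 20% rejection threshold are assumptions made for the example, and the lambda stands in for a real model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GuardedModel:
    """Hypothetical wrapper: refuses out-of-range inputs and tracks how often it does."""
    predict: Callable[[float], float]
    lower: float                      # boundary condition: smallest supported input
    upper: float                      # boundary condition: largest supported input
    max_rejection_rate: float = 0.2   # assumed alarm threshold
    seen: int = 0
    rejected: int = 0

    def __call__(self, x: float) -> Optional[float]:
        self.seen += 1
        if not (self.lower <= x <= self.upper):
            self.rejected += 1        # monitor: count inputs outside the boundary
            return None               # decline to answer instead of guessing
        return self.predict(x)

    def alarm(self) -> bool:
        """True when too large a share of the traffic falls outside the boundary."""
        return self.seen > 0 and self.rejected / self.seen > self.max_rejection_rate


model = GuardedModel(predict=lambda x: 2 * x, lower=0.0, upper=100.0)
results = [model(x) for x in [10.0, 50.0, 250.0, -5.0]]
print(results, model.alarm())
```

In this run two of the four inputs fall outside the declared range, so the model declines them and the alarm condition is met.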

Ability to Generalize: Scope of End-to-End and Negative Testing

Cognitive Biases Affecting Software Testing of AI-based Systems
• Congruence bias
• Confirmation bias
• Law of triviality
• Zero-risk bias
• Anthropocentric thinking
• Illusion of control
• Automation bias

Mohanani, R., Salman, I., Turhan, B., Rodríguez, P., & Ralph, P. (2018). Cognitive Biases in Software Engineering: A Systematic Mapping Study. IEEE Transactions on Software Engineering.

Confirmation Bias

AI-based Systems: Machine-Readable News

Anthropocentric Bias: Why We Treat Robots Like Humans
Darling, K., Nandy, P., & Breazeal, C. (2015). “Empathic Concern and the Effect of Stories in Human-Robot Interaction”. Proceedings of the IEEE International Workshop on Robot and Human Communication (ROMAN), 2015. 6 p.
https://www.ted.com/talks/kate_darling_why_we_have_an_emotional_connection_to_robots

Anthropocentric Bias: Testing Chatbots

Anaphora / Context
Human: I bought 500 Company X shares two years ago. The stocks’ cost was 60,000 USD. What’s their today’s cost?
Chatbot: What currency would you like to have for the rate? X

Spelling / overall correctness
Human: What is the settlement date of the tradeId XXX??
Chatbot: ???
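The two chatbot examples can be turned into automated checks. The sketch below is an assumption-heavy illustration: `ToyChatbot` is a hypothetical stand-in for the system under test, and the two check functions show one possible way to probe anaphora resolution and tolerance to noisy punctuation. They are not the checks used in the deck.

```python
import re


class ToyChatbot:
    """Hypothetical stand-in for the chatbot under test."""

    def __init__(self):
        self.last_instrument = None

    def reply(self, message):
        text = re.sub(r"\?+", "?", message.strip())  # collapse repeated question marks
        match = re.search(r"Company \w+", text)
        if match:
            self.last_instrument = match.group(0)    # remember what "their" refers to
        lowered = text.lower()
        if "their" in lowered and "cost" in lowered and self.last_instrument:
            return f"Looking up today's value of your {self.last_instrument} shares."
        if "settlement date" in lowered:
            return "Looking up the settlement date for that trade."
        return "Sorry, I did not understand."


def check_anaphora(bot):
    """The reply should name the instrument that 'their' refers to."""
    message = ("I bought 500 Company X shares two years ago. "
               "The stocks' cost was 60,000 USD. What's their today's cost?")
    return "Company X" in bot.reply(message)


def check_noisy_input(bot):
    """A doubled question mark should not change the answer."""
    clean = bot.reply("What is the settlement date of the tradeId XXX?")
    noisy = bot.reply("What is the settlement date of the tradeId XXX??")
    return clean == noisy


anaphora_ok = check_anaphora(ToyChatbot())
noisy_ok = check_noisy_input(ToyChatbot())
print(anaphora_ok, noisy_ok)
```

The second check compares two replies rather than asserting a fixed answer, so it does not require knowing the correct settlement date in advance.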

AI-based Systems: Algo Trading

Congruence Bias
• Direct Testing
• Indirect Testing

Applications of the Proposed Approach
Iosif Itkin, Anna Gromova, Anton Sitnikov, Rostislav Yavorskiy, Evgenii Tsymbalov, Andrey Novikov and Kirill Rudakov. User-Assisted Log Analysis for Quality Control of Distributed Fintech Systems. The First IEEE International Conference on Artificial Intelligence Testing (IEEE AITest 2019), April 4-9 2019, San Francisco East Bay, CA, USA.
Image: https://unsplash.com/search/photos/san-francisco

Law of Triviality (the Bike-Shed Effect)

AI-based Systems: Pricing Calculator

Automation Bias

AI-based Systems: Fraud Detection and Market Surveillance


Zero-Risk Bias

Non-deterministic Systems: Financial Market Infrastructures

The Illusion of Control and Happiness
Sherman, G. D., Lee, J. J., Cuddy, A. J. C., Renshon, J., Oveis, C., Gross, J. J., & Lerner, J. S. (2012). Leadership is associated with lower levels of stress. Proceedings of the National Academy of Sciences, 109(44), 17903–17907.

The Illusion of Control and Performance
Fenton-O’Creevy, M., Nicholson, N., Soane, E., & Willman, P. (2003). “Trading on illusions: Unrealistic perceptions of control and trading performance”. Journal of Occupational and Organizational Psychology, 76(1), 53–68.

Thank you
CONTACT: [email protected]
London: +44 (0) 203 319 1644
https://exactpro.com/ideas/white-papers/testing-intelligence-your-ai
