How Can I Improve My App? Classifying User Reviews for Software Maintenance and Evolution

Sebastiano Panichella, Andrea Di Sorbo, Emitza Guzman, Corrado Visaggio, Gerardo Canfora, Harald Gall

Outline
Context: App Store Reviews
Case Study: User Reviews of 7 Popular Apps
Results: Automatic Classification Performed Using Different Sources of Information

App User Reviews

A Communication Channel Between Developers and Users

Reviews Include Useful Information for Developers
Pagano et al. – RE2013
Chen et al. – ICSE2014
Galvis Carreno et al. – ICSE2013

A Large Amount of Effort for Developers
Many reviews every day
Unstructured data

Users Submit Many Reviews Regularly
iOS apps receive on average 23 reviews per day
Facebook for iOS receives more than 4000 reviews per day
[Pagano et al. - RE 2013]

The Quality of Reviews Varies
[Pagano et al. - RE 2013]

Past Work
Chen et al. – ICSE 2014
ARMiner: an approach to help app developers discover the most informative user reviews
i. text analysis and machine learning to filter out non-informative reviews
ii. topic analysis to recognize topics treated in the reviews classified as informative

Identifying Useful Reviews
i. The awful button in the page doesn't work (BUG DESCRIPTION)
ii. A button in the page should be added

Available Sources for Identifying Useful Reviews
i. The awful button in the page doesn't work
ii. A button in the page should be added
Sources: sentiment, lexicon, structure

Employed Techniques
structure: Natural Language Parsing
sentiment: Sentiment Analysis
lexicon: Text Analysis

Employed Techniques
Natural Language Parsing, Sentiment Analysis, Text Analysis
Machine Learning
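
To make the three information sources concrete, here is a minimal sketch of how lexicon, sentiment and structure signals could be turned into one feature dictionary for a machine learning step. It is not the authors' implementation: the tiny sentiment lexicon, the two regular expressions and the feature names are all hypothetical stand-ins for a real sentiment tool and a real natural language parser.

```python
import re

# Hypothetical toy sentiment lexicon; a real setup would use a sentiment tool.
SENTIMENT = {"awful": -2, "bad": -1, "good": 1, "great": 2, "love": 2}

# Hypothetical structural cues standing in for parser-based patterns.
STRUCTURE = {
    "should_add": r"\bshould (add|be added)\b",
    "doesnt_work": r"\b(doesn['’]t|does not) work\b",
}

def extract_features(review):
    """Combine lexicon, sentiment and structure signals for one review."""
    text = review.lower()
    words = re.findall(r"[a-z'’]+", text)
    features = {}
    # Lexicon (text analysis): plain word counts.
    for word in words:
        key = "word=" + word
        features[key] = features.get(key, 0) + 1
    # Sentiment: sum of the scores of the words found in the toy lexicon.
    features["sentiment"] = sum(SENTIMENT.get(word, 0) for word in words)
    # Structure: 1 if the cue occurs in the review, 0 otherwise.
    for name, pattern in STRUCTURE.items():
        features["structure=" + name] = int(bool(re.search(pattern, text)))
    return features

bug = extract_features("The awful button in the page doesn't work")
request = extract_features("A button in the page should be added")
```

The two example sentences share most of their words ("button", "page"), so the lexicon features alone barely separate them; in this sketch it is the sentiment and structure features that differ.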

Case Study
Goal: Understanding to what extent NLP, Text Analysis and Sentiment Analysis could help in recognizing informative reviews from a software maintenance and evolution perspective.
Quality focus: Automatic detection of user reviews containing helpful information for developers.
Perspective: Guide developers in maintaining and evolving their apps.

Research Questions
RQ1: Are the language structure, content and sentiment information able to identify user reviews that could help developers in accomplishing software maintenance and evolution tasks?
RQ2: Does the combination of language structure, content and sentiment information produce better results than individual techniques used in isolation?

Context
All seven apps were in the list of the most popular apps in the year 2013

Automatic Classification of User Reviews

Taxonomy Definition

What Do Developers Look for?

Conversations between Developers
[Figure labels: 11-01-2014, 01-01-2015; 150, 150]

Sentences Occurring in Developers' Discussions

Topics in App User Reviews
Pagano et al. – RE2013

Mapping
(i) Taxonomy of discussions contained in informative reviews.
(ii) Taxonomy of discussions occurring in development communication.
(iii) Intersection

The Taxonomy

Examples

Natural Language Processing

Recurrent Linguistic Patterns
"You should add a new button"
Parse: add (verb); You: subject; should: auxiliary; button: direct object; a: determiner; new: attribute
Pattern: [someone] should add [something]

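As a rough illustration of how a parsed sentence could be checked against the template "[someone] should add [something]", the sketch below uses a hand-written parse of the example sentence. The dictionary layout and the function name `match_should_add` are assumptions made for illustration; a real pipeline would obtain the grammatical roles from an NLP parser rather than typing them in.

```python
# Hand-written parse of "You should add a new button", mirroring the roles
# named on the slide. This stands in for the output of a real NLP parser.
PARSE = {
    "verb": "add",
    "subject": "You",
    "auxiliary": "should",
    "direct object": "button",
    "determiner": "a",
    "attribute": "new",
}

def match_should_add(parse):
    """Return the filled slots if the parse fits
    '[someone] should add [something]', otherwise None."""
    if parse.get("auxiliary") != "should" or parse.get("verb") != "add":
        return None
    someone = parse.get("subject")
    something = parse.get("direct object")
    if someone is None or something is None:
        return None
    return {"someone": someone, "something": something}

slots = match_should_add(PARSE)
```

Matching on grammatical roles rather than on raw word order is what lets one template cover many surface variations of the same request.
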
NLP Heuristics
Examples:
[someone] should add [something]
[something] needs to be [verb]
[something] is required
[something] crashes
[something] is unable to [verb]
[something] doesn't work
please adjust/fix [something]
[someone] wants to know [something]
how to [verb]?
[something] provides/supports [something]
new changes include [something]
Categories: feature request, problem discovery, information seeking, information giving
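
The heuristics above can be approximated with plain regular expressions, as in the sketch below. This is only a stand-in: the heuristics in the talk operate on parser output, not on raw strings, and the assignment of each pattern to a category is an assumption based on the order in which the slide lists them.

```python
import re

# Hypothetical regex approximations of the listed heuristics.
# The pattern-to-category mapping is assumed from the slide order.
# The first matching pattern decides the category.
HEURISTICS = [
    (r"\bshould add\b", "feature request"),
    (r"\bneeds? to be \w+", "feature request"),
    (r"\bis required\b", "feature request"),
    (r"\bcrash(es|ed)?\b", "problem discovery"),
    (r"\bis unable to \w+", "problem discovery"),
    (r"\b(doesn['’]t|does not) work\b", "problem discovery"),
    (r"\bplease (adjust|fix)\b", "problem discovery"),
    (r"\bwants? to know\b", "information seeking"),
    (r"\bhow to \w+.*\?", "information seeking"),
    (r"\b(provides|supports)\b", "information giving"),
    (r"\bnew changes include\b", "information giving"),
]

def classify(sentence):
    """Return the category of the first matching heuristic, or OTHERS."""
    text = sentence.lower()
    for pattern, category in HEURISTICS:
        if re.search(pattern, text):
            return category
    return "OTHERS"

label = classify("You should add a new button")
```

A sentence that matches no heuristic falls into OTHERS, which keeps the heuristics conservative: they only label what they recognise.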

NLP Parser
raw text → NLP parser → NLP heuristics → feature request | problem discovery | information seeking | information giving | OTHERS

Features Extraction

Learning Phase and Evaluation

Learning and Evaluation
Classifying
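
The slides do not state which measures were used in the evaluation. Assuming the usual per-category precision, recall and F-measure, a minimal sketch of the computation could look like this; the labels in the usage lines are made up for illustration.

```python
def precision_recall_f1(gold, predicted, category):
    """Per-category precision, recall and F1 from two parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == category and p == category)
    fp = sum(1 for g, p in pairs if g != category and p == category)
    fn = sum(1 for g, p in pairs if g == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical gold labels and classifier output for four review sentences.
gold = ["feature request", "problem discovery", "problem discovery", "OTHERS"]
predicted = ["feature request", "problem discovery", "OTHERS", "problem discovery"]
scores = precision_recall_f1(gold, predicted, "problem discovery")
```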

RQ1: Are the language structure, content and sentiment information able to identify useful user reviews for software maintenance and evolution tasks?

Comparison between Structure, Content and Sentiment Information

RQ2: Does the combination of language structure, content and sentiment information produce better results than when used individually?

Results of the Combination of Structure, Content and Sentiment Information

Is It Possible to Improve the Results?

Varying the Size of the Training Set
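
One way to picture this experiment is a learning curve: train on increasingly large prefixes of the training set and measure accuracy on a fixed test set. The sketch below is an assumption-laden illustration, not the setup used in the talk: the slides do not name the classifier, so a tiny multinomial Naive Bayes with add-one smoothing is swapped in, and the reviews are invented.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Fit a tiny multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def learning_curve(train_set, test_set, sizes):
    """Accuracy on test_set after training on the first n examples."""
    results = []
    for n in sizes:
        model = train(train_set[:n])
        correct = sum(predict(model, t) == y for t, y in test_set)
        results.append((n, correct / len(test_set)))
    return results

# Hypothetical labelled reviews, invented for this sketch.
TRAIN = [
    ("please add a dark mode", "feature request"),
    ("the app crashes on startup", "problem discovery"),
    ("you should add an export option", "feature request"),
    ("login button does not work", "problem discovery"),
]
TEST = [
    ("add a search option", "feature request"),
    ("app crashes when i login", "problem discovery"),
]
curve = learning_curve(TRAIN, TEST, [1, 2, 4])
```

With only one training example the model knows a single category, so it cannot do better than always predicting it; the curve shows how that changes as more labelled reviews become available.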

FUTURE WORK
