Developers’ Sentiment and Issue Reopening

Jonathan Cheruvelil and Bruno C. da Silva <[email protected]>
@BrunoDaSilvaSE · github.com/bcdasilv/sentiment-analysis-on-issues
#SEmotion19 · ICSE 2019 Workshop · 28 May 2019 · Montréal, QC, Canada
https://www.abc.net.au/radionational/programs/allinthemind/our-emotional-brain-6th-may/3985474

https://www.builtinboston.com/2017/12/12/day-life-engineer-facebook-cambridge

SENTIMENT → BUILD STATUS
“This fixes a really nasty bug”
build broken
Commits with negative sentiment are slightly more likely to result in broken builds.
Rodrigo Souza and Bruno Silva. Sentiment analysis of Travis CI builds. In MSR '17. IEEE Press, Piscataway, NJ, USA, 459-462.

BUILD STATUS → SENTIMENT
build broken
Push commits: “Definitely hating [issue] #320”
Commits following a broken build tend to be slightly more negative.
Rodrigo Souza and Bruno Silva. Sentiment analysis of Travis CI builds. In MSR '17. IEEE Press, Piscataway, NJ, USA, 459-462.

Happier and more polite developers… fix issues faster and solve problems better

Rework

Research Questions
RQ1: Are comments with negative sentiments more likely to appear in issues that have been reopened?
RQ2: Does a larger comment size correlate with more extreme sentiment scores?
RQ3: Do different projects have different proportions in regards to sentiment scores and issue reopening status?

What we did…
8 projects · REST API · 35k+ issues (excluded issues < 1y old) · SentiStrength-SE

How SentiStrength-SE works…
“I love tests, but dislike the awful API”
love: +3 · dislike: -3 · awful: -4
sentiment score: [3, -4]
Md Rakibul Islam, Minhaz F. Zibran, SentiStrength-SE: Exploiting domain specificity for improved sentiment analysis in software engineering text, Journal of Systems and Software, V. 145, 2018, Pages 125-146.
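The scoring example above can be sketched in a few lines of Python. This is only a toy illustration, not the SentiStrength-SE implementation: `TOY_LEXICON` and `sentiment_tuple` are hypothetical names, the lexicon holds just the three words from the slide, and the real tool also handles negation, boosters, and domain-specific terms. The sketch assumes the text-level score is the strongest positive and the strongest negative word score, with 1 and -1 as the neutral defaults.

```python
import re

# Toy lexicon: only the three words scored on the slide (assumption, not the real dictionary).
TOY_LEXICON = {"love": 3, "dislike": -3, "awful": -4}


def sentiment_tuple(text, lexicon=TOY_LEXICON):
    """Return [strongest positive score, strongest negative score] for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    # Neutral defaults of 1 and -1 when no positive or negative word is found.
    positive = max([s for s in scores if s > 0], default=1)
    negative = min([s for s in scores if s < 0], default=-1)
    return [positive, negative]


print(sentiment_tuple("I love tests, but dislike the awful API"))  # [3, -4]
```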

What we did…
SentiStrength-SE · 35k+ issues (excluded issues < 1y old)
[3, -4] (sentiment tuple) · #comment size
Transition history → #reopenings for each issue
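Deriving the number of reopenings from an issue's transition history could look roughly like the sketch below. The data shape is an assumption: each transition is taken to be a `(from_status, to_status)` pair, and the status name `"Reopened"` is hypothetical; the actual fields returned by the REST API may differ.

```python
def count_reopenings(transitions, reopened_status="Reopened"):
    """Count transitions whose target status is the (assumed) reopened status."""
    return sum(1 for _from_status, to_status in transitions if to_status == reopened_status)


# Hypothetical transition history for a single issue.
history = [
    ("Open", "Resolved"),
    ("Resolved", "Reopened"),
    ("Reopened", "Resolved"),
    ("Resolved", "Reopened"),
    ("Reopened", "Closed"),
]
print(count_reopenings(history))  # 2
```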

RESULTS

RQ1: Are comments with negative sentiments more likely to appear in issues that have been reopened?
(-) scores. All issues.
Yes. Small effect size (based on Chi-squared test and Cramér’s V).
What about the opposite?

(+) scores. All issues.
Comments with positive sentiments are also more likely to appear in issues that have been reopened. Small effect size (based on Chi-squared test and Cramér’s V).
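For readers unfamiliar with the statistics named above, here is a minimal plain-Python sketch of the chi-squared statistic and Cramér's V for a contingency table. The table layout (rows as sentiment-score levels, columns as not reopened / reopened) and all counts are hypothetical and are not taken from the study; the function names are illustrative only.

```python
import math


def chi_squared(table):
    """Return (chi-squared statistic, total count) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, n = chi_squared(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


# Hypothetical counts: rows = two sentiment levels, columns = not reopened / reopened.
example = [[10, 20], [20, 10]]
print(round(cramers_v(example), 3))  # 0.333
```

Cramér's V ranges from 0 (no association) to 1 (perfect association), which is why it is a convenient effect-size companion to the chi-squared test.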

RQ2: Does a larger comment size correlate with more extreme sentiment scores?
Comment size groups: comment_size <= 500 words · 500 < comment_size < 1000 words · comment_size >= 1000 words
[Charts: (-) scores and (+) scores for each comment size group]
comment_size <= 500 words: 0.7%, 4.7%, 0.4%, 6.4%
500 < comment_size < 1000 words: 4.2%, 20%, 2%, 29%
comment_size >= 1000 words: 12%, 37%, 6%, 48%
Yes. … and a slight correlation with issue reopening
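The three comment size groups can be reproduced with a small helper like the one below. The thresholds come from the slides; counting words by splitting on whitespace is an assumption, and `size_bucket` is a hypothetical name.

```python
def size_bucket(comment):
    """Assign a comment to one of the three size groups used for RQ2."""
    n_words = len(comment.split())  # assumption: words are whitespace-separated tokens
    if n_words <= 500:
        return "comment_size <= 500"
    if n_words < 1000:
        return "500 < comment_size < 1000"
    return "comment_size >= 1000"


print(size_bucket("This fixes a really nasty bug"))  # comment_size <= 500
```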

RQ3: Do different projects have different proportions in regards to sentiment scores and issue reopening status?

Project Title   % of issues selected   Cramér’s V (-) scores   Cramér’s V (+) scores
Zookeeper       88%                    .303                    .288
Hadoop          30%                    .224                    .202
MNG             77%                    .164                    .152
Qpid            60%                    .112                    .109
Felix           94%                    .108                    .122
Groovy          69%                    .101                    .107
Zeppelin        51%                    .072                    .065
CloudStack      46%                    .034                    .029

[Charts: (-) scores, CloudStack issues; (-) scores, Zookeeper issues]
Yes.

Takeaways
• If using a lexicon-based approach such as SentiStrength-SE, consider comment size as a possible confounding factor.
• We noticed SentiStrength-SE improved on SentiStrength for SE text (compared to our previous experience with SentiStrength).

Future work
• Replicate with other sentiment analysis approaches (including ML models)
• Separate comments made before issue reopenings from those made after the reopenings

Material available: github.com/bcdasilv/sentiment-analysis-on-issues
