Machine Learning and Sentiment Classification in Ruby
Matt D.
May 30, 2011
Transcript
Machine Learning and Sentiment Classification in Ruby
by Matt Drozdzynski
@matid
Machine Learning … or how to teach your computer to do back flips for you.
Sentiment Classification … or how to quantify people’s opinions.
#euruko is definitely the most amazing Ruby conference ever!
I’ve been to many dreadful conferences, but #euruko is certainly
not one of them.
Ruby is a true delight compared to how horrendous Java
can be.
… in Ruby
Data Gathering
Language Accuracy (chart; axis from 0% to 100%; labels: 2007, English, Spanish, German, Italian, Polish)
Annotations … or I have the tweets, now what?
Data Cleaning … or how to separate the wheat from the chaff.
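The slide does not list the cleaning steps, so the following is only a rough sketch of what cleaning a tweet might involve. The `clean_tweet` helper and every rule in it (dropping links and @mentions, keeping hashtag words, stripping punctuation) are assumptions for illustration, not the steps used in the talk.

```ruby
# Hypothetical tweet-cleaning helper; the rules below are assumptions,
# not the steps used in the talk.
def clean_tweet(text)
  text.downcase
      .gsub(%r{https?://\S+}, " ") # drop links
      .gsub(/@\w+/, " ")           # drop @mentions
      .gsub(/#(\w+)/, '\1')        # keep the hashtag word, drop the '#'
      .gsub(/[^a-z0-9'\s]/, " ")   # drop remaining punctuation
      .split                       # collapse runs of whitespace
      .join(" ")
end

puts clean_tweet("#euruko is definitely the most amazing Ruby conference ever! http://t.co/x @matid")
```

Whether to keep or drop hashtags, emoticons and negations is a judgment call that can affect the classifier noticeably, so rules like these are usually tuned against the annotated data.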
Feature Reduction … or Matt’s crash course in selective ignorance.
Classification … and the ‘not so rocket’ science behind it all.
Naive Bayes
Simple and robust
Assumes independence of features
Scalable!
    require "ankusa"
    require "ankusa/memory_storage"

    # Keep the trained model in memory
    storage = Ankusa::MemoryStorage.new
    classifier = Ankusa::NaiveBayesClassifier.new(storage)

    # Train on every annotated tweet
    training.each do |tweet|
      classifier.train tweet.sentiment, tweet.to_s
    end

    # Classify a new tweet
    sentiment = classifier.classify tweet.to_s
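To make the "assumes independence of features" point concrete, here is a minimal pure-Ruby sketch of a multinomial Naive Bayes classifier with add-one smoothing. It mirrors the `train`/`classify` shape of the snippet above, but `TinyNaiveBayes` and its tokenizer are hypothetical names written for this illustration; it is not how Ankusa is implemented.

```ruby
# Minimal multinomial Naive Bayes with add-one (Laplace) smoothing.
# Illustrative only: TinyNaiveBayes is a made-up class, not Ankusa's code.
class TinyNaiveBayes
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # label => word => count
    @doc_counts  = Hash.new(0)                            # label => documents seen
    @vocabulary  = {}                                     # every word seen in training
  end

  def train(label, text)
    @doc_counts[label] += 1
    tokenize(text).each do |word|
      @word_counts[label][word] += 1
      @vocabulary[word] = true
    end
  end

  def classify(text)
    words = tokenize(text)
    total_docs = @doc_counts.values.sum.to_f
    scores = @doc_counts.keys.map do |label|
      total_words = @word_counts[label].values.sum
      score = Math.log(@doc_counts[label] / total_docs) # log prior
      # Independence assumption: each word contributes its own
      # log-likelihood, and the contributions are simply added up.
      words.each do |word|
        count = @word_counts[label].fetch(word, 0)
        score += Math.log((count + 1.0) / (total_words + @vocabulary.size))
      end
      [label, score]
    end
    scores.max_by { |_, score| score }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

nb = TinyNaiveBayes.new
nb.train :positive, "the most amazing conference ever"
nb.train :positive, "a true delight"
nb.train :negative, "dreadful and horrendous"
nb.train :negative, "a horrendous talk"

puts nb.classify("what an amazing delight")
```

Because training only increments counters, the model can be updated one tweet at a time, which is one reason the approach scales well.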
Maximum Entropy
No independence assumptions
Suffers from overfitting
Substantially slower than Naive Bayes
    require "maxent_string_classifier"

    # Train from the data directory, then classify a tweet
    classifier = MaxentStringClassifier::Loader.train(Classifier.root + "max_ent" + "data")
    classification = classifier.classify tweet.to_s
Support Vector Machines
Non-probabilistic binary linear classifier
Only directly applicable to two-class problems
“Works by constructing a set of hyperplanes in a high or infinite dimensional space.” What?
    require "eluka"

    classifier = Eluka::Model.new

    # Add every annotated tweet as a training example
    training.each do |tweet|
      classifier.add(tweet.features, tweet.sentiment)
    end
    classifier.build

    # Classify a new tweet by its features
    sentiment = classifier.classify tweet.features
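The "hyperplanes" remark can be unpacked with a small sketch. A linear classifier stores a weight per feature plus a bias, and decides by which side of the hyperplane a point falls on, that is, by the sign of w · x + b. The `features` and `linear_decision` helpers and the weights below are hypothetical and hand-picked for illustration; a real SVM learns the weights from training data, and the talk's `tweet.features` may be built differently.

```ruby
# Turn text into a sparse bag-of-words feature hash (word => count).
# `features` is a hypothetical stand-in for the talk's tweet.features.
def features(text)
  text.downcase.scan(/[a-z']+/).each_with_object(Hash.new(0)) { |w, h| h[w] += 1 }
end

# A hyperplane is defined by weights w and a bias b. A point x is
# classified by the side it falls on, i.e. the sign of w . x + b.
def linear_decision(weights, bias, x)
  score = x.sum { |feature, value| weights.fetch(feature, 0.0) * value } + bias
  score >= 0 ? :positive : :negative
end

# Hand-picked weights, purely for illustration. A real SVM would learn
# them by maximising the margin between the two classes.
weights = { "amazing" => 1.5, "delight" => 1.0, "dreadful" => -1.5, "horrendous" => -1.2 }
bias = 0.0

puts linear_decision(weights, bias, features("an amazing delight"))
puts linear_decision(weights, bias, features("dreadful and horrendous"))
```

The sign-based decision also shows why the method is only directly applicable to two-class problems: a single hyperplane has exactly two sides.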
Conclusions … or is the whole thing worth the hassle?
Questions
Thanks
@matid
spkr8.com/t/7678
bit.ly/matid-dissertation
bit.ly/matid-dissertation-pdf