Getting Started with Cassandra (はじめるCassandra)
The talk I gave at Cassandra Meetup in Tokyo, Summer 2015
kakerukaeru
September 08, 2015
Transcript
Getting Started with Cassandra
who are you
Kakeru Iwanaga (twitter: @kakerukaeru)
• Laid-back ("yutori") infrastructure engineer @CyberAgent
• Looks after Ameba's authentication, billing, and image delivery platforms
• Cassandra experience: a little under one year...
• Feeling nervous
First of all
This deck is for administrators.
So, in this talk I would like to focus on the following:
• What to watch while operating Cassandra
• Operations to perform routinely
• Impressions from operating Cassandra for the first time
agenda
• about CyberAgent & Service
• why Cassandra
• Operation
  • Build, Monitoring, Backup etc...
• about Troubleshooting
• Summary
about CyberAgent
blog
pigg
game
Entertainment
Entertainment
about Cassandra in CyberAgent
• Cassandra Version: 1.1.5, 1.2.13
• Production Clusters: 3
• Production nodes: about 150 nodes
• Total Read & Write: about 50,000 qps
• Total data size: about 15 TB
about Service
Used in the CyberAgent Smartphone Platform [1]
However, this talk is about a newly built cluster, not the existing ones.
[1] A platform for browsers.
For Native App iOS & Android auth payment logging
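For Native App
• Platform for native apps
• A platform a little under one year old
• Provides APIs for authentication, billing, and logging
• Cassandra is used mainly for the ID management part
• Requests are tied to an ID and connected to backend systems such as billing & logging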
why Cassandra
• It was already there when I took over
• No SPOF
• Scalability that can withstand rapid data growth
  • Scale-out by adding nodes
• Operational track record in our Smartphone Platform
about System
Cassandra setting
• Version: 2.0.8
• Replication Factor: 3
• Consistency Level: QUORUM
• use vnode: 256
• use CQL, with a custom driver implementation for nodejs
  • https://github.com/suguru/cql-client
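As a rough illustration of what Replication Factor 3 with Consistency Level QUORUM implies, the sketch below computes the quorum size using Cassandra's definition, floor(RF / 2) + 1. The function names `quorum` and `tolerated` are made up for this example and are not part of any Cassandra tooling.

```shell
#!/bin/sh
# quorum RF: replicas that must respond at consistency level QUORUM.
# Cassandra defines this as floor(RF / 2) + 1 (shell division is integer division).
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated RF: replicas that may be unavailable while QUORUM still succeeds.
# (Hypothetical helper for illustration only.)
tolerated() {
  echo $(( $1 - $(quorum "$1") ))
}

echo "RF=3: quorum=$(quorum 3), tolerated failures=$(tolerated 3)"
```

With the settings on this slide, reads and writes each need 2 of the 3 replicas, so one replica per token range can be down without failing requests.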
Request
• Peak Request
  • Read: about 9,000 qps
  • Write: about 3,000 qps
• Data size
  • Total: 600 GB
  • 1 node avg: 50 GB
Latency
• Read: avg 2 ms
• Write: avg 0.1 ms
HW Spec
• private Cloud Instance
• CPU: 24 cores
• Memory: 94 GB, heap 8 GB
• Disk: 1 TB
• 12 nodes
• 1 Cluster
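HW Spec
• Fairly luxurious servers
• Sized in anticipation of the platform growing from here on
• Still plenty of headroom in terms of resources
  • Reducing the number of nodes would probably be fine
• This spec follows from the private cloud's lineup of instance types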
about Operation
Cassandra overview
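Build
• Building the Cassandra servers
  • Jenkins & ansible
  • The only manual step is starting the Cassandra process when a node joins the cluster
  • Because we use vnodes (Cassandra 1.2 and later), tokens do not have to be calculated and assigned by hand; simply starting the process is enough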
Monitoring
• threshold
  • use sensu
  • how to check: Community & original sensu plugins
  • how to notify: mail & hipchat
Monitoring
• trend
  • use OpsCenter: data size & latency
  • use sensu & influxdb & grafana
  • how to check: Community & original sensu plugins
Monitoring
• check
  • OS resources: cpu, memory, disk & nw latency, fd
  • JVM: heap, gc
  • Cassandra: read & write qps, latency
  • thread pools: ReadStage, FlushWriter, Compaction, HintedHandoff...
Monitoring
How do you track what Cassandra is doing?
• For Cassandra status, track read & write behavior
  • Write & ReadStage, MutationStage, FlushWriter
  • Compaction status
• We also want to check the health of the Cassandra cluster
  • Gossip timeouts, hinted handoff
Operation
• repair & cleanup
  • about 20h / weekly
• backup & restore
  • snapshot & sstableloader
  • restore CI
  • ?? h / weekly
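Operation
• repair & cleanup
  • Run repair periodically to prevent inconsistency between replicas
  • Run cleanup at the same time to prevent data from coming back to life
  • Run interval: 7 days < gc grace seconds (default: 10 days) [2]
[2] gc grace seconds: the time until tombstones are garbage-collected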
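The rule on this slide (repair interval shorter than gc grace seconds) can be checked with simple arithmetic. The sketch below is only an illustration: `repair_interval_ok` is a hypothetical helper, and 864000 is Cassandra's default `gc_grace_seconds` (10 days).

```shell
#!/bin/sh
# repair_interval_ok INTERVAL_DAYS GC_GRACE_SECONDS
# Succeeds (exit status 0) when the repair interval is strictly shorter
# than gc_grace_seconds. Hypothetical helper for illustration only.
repair_interval_ok() {
  interval_seconds=$(( $1 * 24 * 60 * 60 ))
  [ "$interval_seconds" -lt "$2" ]
}

GC_GRACE_SECONDS=864000   # Cassandra default: 10 days

if repair_interval_ok 7 "$GC_GRACE_SECONDS"; then
  echo "weekly repair fits inside gc_grace_seconds"
else
  echo "repair interval too long: deleted data may reappear"
fi
```

A weekly schedule is 604800 seconds, which leaves roughly three days of slack before the 10-day default expires.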
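Operation
• backup
  • Every 2 hours, take a snapshot on each node and store it in Swift
• restore
  • Periodically verify on a test cluster that restore works
  • Use sstableloader to stream the data into an empty cluster [3]
[3] sstableloader takes a very long time, though, so a real restore would probably be done by placing the snapshot files directly on the nodes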
about Troubleshooting
nodetool commands I often use when something happens
• nodetool status
  • For a quick look at node state
• nodetool tpstats
  • Watch the threads currently running
• nodetool netstats
  • See streaming information
• nodetool cfstats
  • For per-cf information [4]
• nodetool disablegossip, disablethrift, disablebinary, flush
  • disable*: disable each protocol
  • flush: flush the memtables
[4] You want to know whether Cassandra as a whole is slowing down or whether one particular cf is clogged
So for a planned node restart, for example, I use something like the following:
$ nodetool disablegossip && \
    nodetool disablethrift && \
    nodetool disablebinary && \
    nodetool flush && \
    /etc/init.d/cassandra restart
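In practice, however, when something suddenly goes wrong on a node, nodetool almost never returns a result.
What do you do then?
Give up and restart [5]
/etc/init.d/cassandra restart
[5] Please follow the directions and use as prescribed. We do look at the logs and metrics properly before deciding...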
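One way to avoid waiting indefinitely on a nodetool command that never returns is to put a deadline on it. This is a minimal sketch, not the speaker's procedure: it assumes GNU coreutils `timeout` is installed, and `run_with_deadline` and the 30-second value in the usage comment are hypothetical.

```shell
#!/bin/sh
# run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND under coreutils `timeout`. The exit status is non-zero when
# the command fails or is killed for exceeding the deadline (`timeout`
# itself exits with 124 in that case).
run_with_deadline() {
  deadline=$1
  shift
  timeout "$deadline" "$@"
}

# Hypothetical usage on a Cassandra node (not executed here):
#   run_with_deadline 30 nodetool flush ||
#     echo "nodetool did not finish; check logs and metrics before restarting"
```

The wrapper only reports that the command did not finish; deciding whether to restart is still left to the operator, as the footnote above says.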
Other things that caused trouble
NW failure [6]
[6] Scary already, isn't it
Specifically?
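What happened and how we responded
• Repeated momentary interruptions turned into a complete disconnection at the L2 level
• From the cluster's point of view, every node was isolated
• The outage exceeded max hint window ms (default: 3h) (!!), so all hint [7] information was discarded and reset
• After the NW recovered, we ran repair on all nodes to restore the cluster
[7] hint: data that should have been written to a node that is down, held on its behalf by another replica
max_hint_window_in_ms: the time until those hints are discarded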
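The decision described above comes down to comparing the outage length with the hint window. The sketch below is a simplification for illustration: `recovery_action` is a hypothetical helper, 10800000 ms is the default `max_hint_window_in_ms` (3 hours), and the two outage lengths are invented examples, not figures from the incident.

```shell
#!/bin/sh
MAX_HINT_WINDOW_IN_MS=10800000   # Cassandra default: 3 hours

# recovery_action OUTAGE_MS
# Prints which mechanism can bring replicas back in sync after an outage
# of the given length. Hypothetical helper, simplified for illustration.
recovery_action() {
  if [ "$1" -le "$MAX_HINT_WINDOW_IN_MS" ]; then
    echo "hinted handoff"
  else
    echo "nodetool repair"
  fi
}

recovery_action $(( 10 * 60 * 1000 ))        # 10-minute interruption
recovery_action $(( 4 * 60 * 60 * 1000 ))    # 4-hour outage
```

The first call prints `hinted handoff` and the second prints `nodetool repair`, which matches the slide: once the window is exceeded, a repair on all nodes is what restores consistency.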
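Looking back
• No data loss
• Even with repeated momentary interruptions, replica consistency is restored automatically as long as the hints are retained
• Even without hints, the cluster can be recovered as long as no node is destroyed [8]
• We withstood a NW disconnection
[8] With no hints and a node down as well, some data cannot be saved. Obvious, I suppose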
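Other lessons from those who came before [9]
• Slow queries cannot be seen, so implement a slow log on the application side before you are in trouble
• Schema design matters
  • Avoid wide rows. If the data can be split up in advance, do it properly
  • Not that this is specific to Cassandra...
[9] In the footsteps of predecessors: http://www.slideshare.net/oranie/cassandra-summit-jpn-2014-100-node-cluster-admin-operation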
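Summary
• Cover the bare minimum and operations are easy
  • Cassandra: it's not scary
• Gratefully make use of what predecessors have learned
  • Then build up our own knowledge and share it
  • Contribute back to the Cassandra community, that sort of thing
• Let's say goodbye to the 1.x line. I'd like to, anyway (rolls eyes)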
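What comes next
• Future cluster design
  • Stay virtual? Go physical?
• I'd like to do something close to PITR...
  • Use the datacenter feature to build a cluster dedicated to backups
  • My daydream: stop replication between datacenters only while backing up, and take the backup with no writes coming in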
Thank you for listening.
If you have any questions, please come and find me at the social gathering afterwards!