4. reallllly BIG objects
5. 1 cluster, several AZs
6. No Firewall
7. No Capacity Planning
8. Wait too long to scale

Metric            Threshold
CPU               75% * num_cores
Memory            70% - buffers
Disk Space        75%
Disk IO           80% sustained
Network           70% sustained
File Descriptors  75% of ulimit
Swap              > 0KB
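The thresholds above can be turned into a simple alert check. The sketch below is only an illustration: the function name, the metric keys, and the units are all hypothetical, and it assumes the CPU threshold refers to system load average compared against 0.75 * num_cores and that the memory figure already excludes buffers. How the readings are collected is left out.

```python
# Hypothetical threshold check; metric names and units are assumptions,
# not part of the original slides.

def breached_thresholds(sample, num_cores, fd_ulimit):
    """Return the sorted names of metrics at or above their threshold.

    `sample` is a dict of readings using these (hypothetical) keys:
      load_avg       system load average (assumed meaning of the CPU row)
      mem_used_pct   memory in use, excluding buffers, in percent
      disk_used_pct  disk space in use, in percent
      disk_io_pct    sustained disk IO utilisation, in percent
      net_pct        sustained network utilisation, in percent
      open_fds       number of open file descriptors
      swap_used_kb   swap in use, in KB
    Missing keys are treated as zero.
    """
    limits = {
        "load_avg": 0.75 * num_cores,   # 75% * num_cores
        "mem_used_pct": 70.0,           # 70% - buffers
        "disk_used_pct": 75.0,          # 75%
        "disk_io_pct": 80.0,            # 80% sustained
        "net_pct": 70.0,                # 70% sustained
        "open_fds": 0.75 * fd_ulimit,   # 75% of ulimit
    }
    breached = [name for name, limit in limits.items()
                if sample.get(name, 0) >= limit]
    # Any swap usage at all counts as a breach ("> 0KB").
    if sample.get("swap_used_kb", 0) > 0:
        breached.append("swap_used_kb")
    return sorted(breached)


if __name__ == "__main__":
    reading = {"load_avg": 6.5, "mem_used_pct": 55, "disk_used_pct": 80,
               "disk_io_pct": 40, "net_pct": 10, "open_fds": 1000,
               "swap_used_kb": 0}
    print(breached_thresholds(reading, num_cores=8, fd_ulimit=65536))
```

The "sustained" rows imply averaging over a time window before comparing. That step is not shown here and would need to happen before the values reach this function.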
Riak Counters: http://<riakip>:<port>/stats
graph them with: <insert monitoring tool>
Or, if you don't want to run your own monitoring service, there's an aaS for that...
    import json
    import socket
    from time import sleep
    from urllib.request import urlopen

    UDP_ADDRESS = "carbon.hostedgraphite.com"
    UDP_PORT = 2003
    RIAK_STATS_URL = 'http://localhost:11098/stats'

    HG_API_KEY = 'Your Api Key from HostedGraphite.com'

    # Fetch this node's stats as JSON
    stats = json.load(urlopen(RIAK_STATS_URL))

    # Graphite treats '.' as a path separator, so replace dots in the node name
    nn = stats['nodename'].replace('.', '-')
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # Internet, UDP

    sent = 0
    for k in stats:
        # Only integer-valued stats are sent
        if type(stats[k]) is int:
            message = '%s.%s.%s %s' % (HG_API_KEY, nn, k, stats[k])
            sock.sendto(message.encode('utf-8'), (UDP_ADDRESS, UDP_PORT))
            # sleep(0.1)
            sent += 1
            print(message)
    print('Sent %s' % sent)
[Diagram: the Erlang VM runs SMP schedulers 1 to N (one per core), each with its own run queue of processes, on top of the OS + kernel and CPU cores 1 to N.]
http://aphyr.com/posts/224-do-not-expose-riak-to-the-internet
Basho Bench: https://github.com/basho/basho_bench
(actual footage of a cluster under load attempting handoff)