Analytics for the busy Ruby developer

Presentation given at RubyConf Taiwan on analytics for your Ruby application. http://www.polyglotprogramminginc.com

Lance Gleason

April 27, 2014

Transcript

  1. Sunday, April 27, 14

  2. Introductions

  4. Twitter @lgleasain GitHub lgleasain www.lancegleason.com www.polyglotprogramminginc.com lgleason@polyglotprogramminginc.com

  7. http://www.purrprogramming.com

  8. What Are Analytics?

  9. Data Science

  16. Gathering Data

  17. Database

  22. Logging (Papertrail/Loggly)

  23. Logging (Papertrail/Loggly) Amazon S3

  24. {"measure":"instance","instance":"stores","store_id":64696,"company_id":210,"store_name":"bebe","controller":"api/v1/stores","action":"index"}
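
A log line like the one on slide 24 could be produced from Ruby roughly as follows. This is a minimal sketch, not the speaker's code: the `analytics_line` helper is hypothetical, and writing to standard output is an assumption (a hosted log service would normally collect the application's log stream).

```ruby
require 'json'
require 'logger'

# Hypothetical helper: serialize one measured event as a single JSON line.
def analytics_line(fields)
  JSON.generate(fields)
end

line = analytics_line(
  measure: 'instance', instance: 'stores', store_id: 64696, company_id: 210,
  store_name: 'bebe', controller: 'api/v1/stores', action: 'index'
)

# Assumption: log to STDOUT with no prefix so each line stays valid JSON.
logger = Logger.new($stdout)
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }
logger.info(line)
```

Keeping each event on one line as plain JSON makes it possible to pull fields out later with a JSON function in the query layer.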

  26. Amazon Elastic Map Reduce

  28. DynamoDB

  29. CREATE EXTERNAL TABLE events_1 (
        id bigint,
        received_at string,
        generated_at string,
        source_id bigint,
        source_name string,
        source_ip string,
        facility string,
        severity string,
        program string,
        message string
      )
      PARTITIONED BY (dt string)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      STORED AS TEXTFILE
      LOCATION 's3://mybucket/papertrail/logs/production';

  30. ALTER TABLE events_1 RECOVER PARTITIONS;

  31. CREATE EXTERNAL TABLE promotions_1 (
        id string,
        received_at string,
        source_id string,
        source_ip string,
        source_name string,
        measure string,
        instance string,
        promotion_id string,
        company_id string,
        controller string,
        action string
      )
      stored by 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
      TBLPROPERTIES (
        "dynamodb.table.name" = "sh_promotions_latest",
        "dynamodb.column.mapping" = "id:id,received_at:received_at,source_id:source_id,source_ip:source_ip,source_name:source_name,measure:measure,instance:instance,promotion_id:promotion_id,company_id:company_id,controller:controller,action:action"
      );

  32. alter table promotions_1 recover partitions;

  33. insert overwrite table promotions_1
      select
        id,
        received_at,
        source_id,
        source_ip,
        source_name,
        get_json_object(message, '$.measure') as measure,
        get_json_object(message, '$.instance') as instance,
        get_json_object(message, '$.promotion_id') as promotion_id,
        get_json_object(message, '$.company_id') as company_id,
        get_json_object(message, '$.controller') as controller,
        get_json_object(message, '$.action') as action
      from events_1
      where message like '%"promotion"%';
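
To make the query on slide 33 easier to follow, here is a rough Ruby equivalent of what it does to a single archived log line: split the tab-delimited columns declared for `events_1`, keep only lines whose message mentions `"promotion"`, and pull the JSON fields out of the message. It is only a sketch; `extract_promotion` and the sample line are hypothetical, and unlike Hive's `get_json_object` (which returns strings) the values keep their JSON types.

```ruby
require 'json'

# Column order taken from the events_1 table definition on slide 29.
COLUMNS = %i[id received_at generated_at source_id source_name source_ip
             facility severity program message].freeze
JSON_FIELDS = %w[measure instance promotion_id company_id controller action].freeze

# Hypothetical helper: returns a flat hash for promotion events, nil otherwise.
def extract_promotion(line)
  row = COLUMNS.zip(line.chomp.split("\t", COLUMNS.size)).to_h
  return nil unless row[:message].to_s.include?('"promotion"')

  payload = JSON.parse(row[:message])
  row.slice(:id, :received_at, :source_id, :source_ip, :source_name)
     .merge(JSON_FIELDS.to_h { |key| [key.to_sym, payload[key]] })
rescue JSON::ParserError
  nil # message was not pure JSON; skip the line
end

# Made-up sample line for illustration only.
sample = [
  '1', '2014-04-27T10:00:00', '2014-04-27T10:00:00', '7', 'web1', '10.0.0.1',
  'User', 'Info', 'app',
  '{"measure":"instance","instance":"promotion","promotion_id":12,' \
  '"company_id":210,"controller":"api/v1/promotions","action":"show"}'
].join("\t")

p extract_promotion(sample)
```
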
  34. Hadoop

  35. Cassandra MongoDB

  36. Cleaning Data

  37. Segmentation

  38. Sparse Data

  39. Analysis

  40. Descriptive Statistics Stats Sample
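
As an illustration of the descriptive statistics mentioned on slide 40, the sketch below computes a few of them in plain Ruby. It deliberately avoids any statistics gem; the `describe` helper and the sample numbers are hypothetical, and it assumes at least two values.

```ruby
# Hypothetical helper: basic descriptive statistics for a list of numbers.
# Assumes values.size >= 2 (sample standard deviation divides by n - 1).
def describe(values)
  sorted = values.sort
  n = sorted.size
  mean = sorted.sum.to_f / n
  mid = n / 2
  median = n.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  variance = sorted.sum { |v| (v - mean)**2 } / (n - 1)
  { n: n, min: sorted.first, max: sorted.last,
    mean: mean, median: median, stddev: Math.sqrt(variance) }
end

# Made-up data, e.g. sessions per user.
stats = describe([2, 4, 4, 4, 5, 5, 7, 9])
puts stats
```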

  41. Visualization

  42. To Get Statistically Meaningful Results you will need thousands of data points
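
One way to see why sample size matters is the margin of error of a measured rate. The sketch below uses the usual normal approximation for a proportion at roughly 95% confidence; the helper name and the example rate of 10% are assumptions for illustration, not figures from the talk.

```ruby
# Approximate 95% margin of error for an observed proportion p over n samples
# (normal approximation; reasonable when n * p and n * (1 - p) are not tiny).
def margin_of_error(p, n, z = 1.96)
  z * Math.sqrt(p * (1 - p) / n)
end

[100, 1_000, 4_000].each do |n|
  puts format('n=%-5d margin=%.4f', n, margin_of_error(0.10, n))
end
```

With 100 data points a 10% rate is only known to within about six percentage points; with a few thousand it tightens to about one.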
  43. False Positives

  45. Nearly ALL sick people have eaten Rice (obviously then, the effects are cumulative).

  46. An estimated 99.9% of all people who die from cancer or heart attacks have eaten Rice.

  47. Another 99.9% of people involved in auto accidents ate Rice within 60 days before the accident.

  48. Among people born in 1839 who later dined on Rice, there has been a 100% mortality rate.

  49. Rice Will Kill You
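
The rice argument fails because it ignores the base rate: if nearly everyone eats rice, nearly every sick person will have eaten it too. A small sketch with entirely made-up counts shows the comparison that actually matters, the sickness rate among rice eaters versus everyone else.

```ruby
# Entirely hypothetical population counts, for illustration only.
population       = 100_000
rice_eaters      = 99_900
sick             = 1_000
sick_rice_eaters = 999

share_of_sick_who_ate_rice = Rational(sick_rice_eaters, sick)
sick_rate_rice    = Rational(sick_rice_eaters, rice_eaters)
sick_rate_no_rice = Rational(sick - sick_rice_eaters, population - rice_eaters)

puts format('%.1f%% of sick people ate rice', share_of_sick_who_ate_rice * 100)
puts format('sick rate with rice:    %.2f%%', sick_rate_rice * 100)
puts format('sick rate without rice: %.2f%%', sick_rate_no_rice * 100)
```

In this made-up population the headline number is 99.9%, yet the sickness rate is identical in both groups, so rice explains nothing.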

  50. We had 4000 app downloads this month. We are doing great....

  52. Most people use the app once and then uninstall it.
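
Slides 50 and 52 contrast a vanity metric (downloads) with actual usage. A rough sketch of a more telling number, the share of users who come back on a second day, might look like this; the `retention_rate` helper and the session records are hypothetical.

```ruby
require 'set'

# Hypothetical helper: fraction of users seen on more than one distinct day.
def retention_rate(sessions)
  days_by_user = Hash.new { |hash, user| hash[user] = Set.new }
  sessions.each { |s| days_by_user[s[:user_id]] << s[:day] }
  return 0.0 if days_by_user.empty?

  returning = days_by_user.count { |_user, days| days.size > 1 }
  returning.to_f / days_by_user.size
end

# Made-up session log: four users installed, only one came back.
sessions = [
  { user_id: 1, day: '2014-04-01' },
  { user_id: 2, day: '2014-04-01' },
  { user_id: 3, day: '2014-04-02' },
  { user_id: 4, day: '2014-04-02' },
  { user_id: 4, day: '2014-04-09' }
]

puts retention_rate(sessions)
```
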
  54. My shopping app just saw a spike in weekly usage after I made UI changes.

  55. That UI change led to more users!

  57. The change went live during the last week of November.

  59. Be Wary of N of 1 Experiments

  60. The Results Need to Pass the Smell Test
  61. http://www.kaggle.com/c/titanic-gettingStarted

  62. What Are Analytics?

  65. Visualization

  66. Sparse Data

  68. Insights

  69. Trends

  71. SVG

  72. Rubyvis

  73. require 'rubygems'
      require 'rubyvis'

      vis = Rubyvis::Panel.new do
        width 150
        height 150
        bar do
          data [1, 1.2, 1.7, 1.5, 0.7, 0.3]
          width 20
          height {|d| d * 80}
          bottom(0)
          left {index * 25}
        end
      end

      vis.render()
      puts vis.to_svg # Output final SVG
  74. <svg fill="none" font-family="sans-serif" font-size="10px" height="150.0" stroke="none" stroke-width="1.5" width="150.0">
        <g transform="translate(0.0,0.0)">
          <rect fill="rgb(31,119,180)" height="80" width="20" y="70"/>
          <rect fill="rgb(31,119,180)" height="96.0" width="20" x="25" y="54.0"/>
          <rect fill="rgb(31,119,180)" height="136.0" width="20" x="50" y="14.0"/>
          <rect fill="rgb(31,119,180)" height="120.0" width="20" x="75" y="30.0"/>
          <rect fill="rgb(31,119,180)" height="56.0" width="20" x="100" y="94.0"/>
          <rect fill="rgb(31,119,180)" height="24.0" width="20" x="125" y="126.0"/>
        </g>
      </svg>
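
To show how the numbers in that SVG relate to the data on slide 73, here is a plain-Ruby sketch that derives each bar's geometry by hand: height is the value times 80, x is the index times 25, and y is the panel height minus the bar height, because SVG measures y from the top. This is not how Rubyvis is implemented; `bar_rects` is a hypothetical helper and the attributes are simplified.

```ruby
# Hypothetical helper: compute <rect> elements for a simple bar chart.
def bar_rects(data, panel_height: 150, bar_width: 20, step: 25, scale: 80)
  data.each_with_index.map do |value, index|
    height = (value * scale).round(1)
    y = (panel_height - height).round(1) # SVG y grows downward from the top
    %(<rect height="#{height}" width="#{bar_width}" x="#{index * step}" y="#{y}"/>)
  end
end

rects = bar_rects([1, 1.2, 1.7, 1.5, 0.7, 0.3])
puts rects
```
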
  76. NVD3

  77. XCharts C3.js

  79. D3JS

  82. SVG

  83. SVG Canvas

  87. • Internet Explorer 9 and 10+
      • Chrome 24, 25, and 26+
      • Safari 5 and 6+
      • Firefox 19, 20, and 21+

  92. Rubyvis for a pure Ruby solution

  93. NVD3, C3 or XCharts For Easy Stuff

  94. D3JS for loads of flexibility

  95. Limit Your Dataset for performance
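
One simple way to limit a dataset before sending it to a chart is to average consecutive points into buckets. The sketch below is only one possible approach under that assumption; `downsample` is a hypothetical helper, and bucket averaging smooths out spikes, which may or may not be what a given chart needs.

```ruby
# Hypothetical helper: reduce a numeric series to at most max_points values
# by averaging consecutive buckets.
def downsample(points, max_points)
  return points if points.size <= max_points

  bucket_size = (points.size / max_points.to_f).ceil
  points.each_slice(bucket_size).map { |bucket| bucket.sum.to_f / bucket.size }
end

series = (1..10).to_a
p downsample(series, 5)
```
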

  96. Don’t overwhelm with too much information

  97. Be Wary of N of 1 Experiments

  98. Twitter @lgleasain GitHub lgleasain www.lancegleason.com www.polyglotprogramminginc.com lgleason@polyglotprogramminginc.com