Slide 1

Slide 1 text

Sunday, April 27, 14

Slide 2

Slide 2 text

Introductions

Slide 3

Slide 3 text

Slide 4

Slide 4 text

Twitter @lgleasain
GitHub lgleasain
www.lancegleason.com
www.polyglotprogrammincinc.com
[email protected]

Slide 5

Slide 5 text

Slide 6

Slide 6 text

Slide 7

Slide 7 text

http://www.purrprogramming.com

Slide 8

Slide 8 text

What Are Analytics?

Slide 9

Slide 9 text

Data Science

Slide 10

Slide 10 text

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Slide 13

Slide 13 text

Slide 14

Slide 14 text

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Gathering Data

Slide 17

Slide 17 text

Database

Slide 18

Slide 18 text

Database

Slide 19

Slide 19 text

Database

Slide 20

Slide 20 text

Database

Slide 21

Slide 21 text

Slide 22

Slide 22 text

Logging (Papertrail/Loggly)

Slide 23

Slide 23 text

Logging (Papertrail/Loggly)
Amazon S3

Slide 24

Slide 24 text

{"measure":"instance","instance":"stores","store_id":64696,"company_id":210,"store_name":"bebe","controller":"api/v1/stores","action":"index"}
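A minimal sketch of how an application might emit a log line like the one above. The field names come from the slide; the helper name `analytics_event` is hypothetical, and writing to standard output assumes the logging service collects the application's output.

```ruby
require 'json'
require 'logger'

# Build one analytics event as a single-line JSON string.
# Field names mirror the sample log line; the helper name is hypothetical.
def analytics_event(instance:, controller:, action:, **ids)
  { measure: 'instance', instance: instance }
    .merge(ids)
    .merge(controller: controller, action: action)
    .to_json
end

logger = Logger.new($stdout)
# Emit only the message so the log service receives clean JSON.
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }

logger.info(
  analytics_event(instance: 'stores', controller: 'api/v1/stores', action: 'index',
                  store_id: 64696, company_id: 210, store_name: 'bebe')
)
```

Keeping each event on one line as JSON is what lets a later query pull fields out of the message column by name.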

Slide 25

Slide 25 text

Slide 26

Slide 26 text

Amazon Elastic Map Reduce

Slide 27

Slide 27 text

Amazon Elastic Map Reduce

Slide 28

Slide 28 text

DynamoDB

Slide 29

Slide 29 text

CREATE EXTERNAL TABLE events_1 (
  id bigint,
  received_at string,
  generated_at string,
  source_id bigint,
  source_name string,
  source_ip string,
  facility string,
  severity string,
  program string,
  message string
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3://mybucket/papertrail/logs/production';

Slide 30

Slide 30 text

ALTER TABLE events_1 RECOVER PARTITIONS;

Slide 31

Slide 31 text

CREATE EXTERNAL TABLE promotions_1 (
  id string,
  received_at string,
  source_id string,
  source_ip string,
  source_name string,
  measure string,
  instance string,
  promotion_id string,
  company_id string,
  controller string,
  action string
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "sh_promotions_latest",
  "dynamodb.column.mapping" = "id:id,received_at:received_at,source_id:source_id,source_ip:source_ip,source_name:source_name,measure:measure,instance:instance,promotion_id:promotion_id,company_id:company_id,controller:controller,action:action"
);

Slide 32

Slide 32 text

alter table promotions_1 recover partitions;

Slide 33

Slide 33 text

insert overwrite table promotions_1
select
  id,
  received_at,
  source_id,
  source_ip,
  source_name,
  get_json_object(message, '$.measure') as measure,
  get_json_object(message, '$.instance') as instance,
  get_json_object(message, '$.promotion_id') as promotion_id,
  get_json_object(message, '$.company_id') as company_id,
  get_json_object(message, '$.controller') as controller,
  get_json_object(message, '$.action') as action
from events_1
where message like '%"promotion"%';
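For checking the extraction on a handful of lines before running it on a cluster, the same logic can be sketched in plain Ruby. This assumes each archived log line is tab-separated in the column order of the events_1 table and that the message column holds the whole JSON payload; the method name `extract_promotion` is hypothetical.

```ruby
require 'json'

# Column order follows the events_1 table definition.
COLUMNS = %i[id received_at generated_at source_id source_name
             source_ip facility severity program message].freeze
# JSON fields pulled out of the message, as in the Hive query.
FIELDS = %w[measure instance promotion_id company_id controller action].freeze

# Returns a hash for promotion events, or nil for any other line.
def extract_promotion(tsv_line)
  # Limit the split so tabs inside the message stay in the last column.
  row = COLUMNS.zip(tsv_line.chomp.split("\t", COLUMNS.size)).to_h
  message = row[:message]
  # Same filter as: where message like '%"promotion"%'
  return nil unless message&.include?('"promotion"')

  payload = JSON.parse(message)
  row.slice(:id, :received_at, :source_id, :source_ip, :source_name)
     .merge(FIELDS.to_h { |field| [field.to_sym, payload[field]] })
rescue JSON::ParserError
  nil # Skip lines whose message is not valid JSON.
end
```

Calling `File.foreach(path).filter_map { |line| extract_promotion(line) }` on a local copy of an archive file would give the rows the Hive query is expected to write.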

Slide 34

Slide 34 text

Hadoop

Slide 35

Slide 35 text

Cassandra
MongoDB

Slide 36

Slide 36 text

Cleaning Data

Slide 37

Slide 37 text

Segmentation

Slide 38

Slide 38 text

Sparse Data

Slide 39

Slide 39 text

Analysis

Slide 40

Slide 40 text

Descriptive Statistics
Stats Sample
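As an illustration of what descriptive statistics cover, here is a minimal sketch in plain Ruby. It deliberately avoids any gem so that no library interface is assumed; the method name `describe` is hypothetical.

```ruby
# Summarize a list of numbers with common descriptive statistics.
# Uses the sample standard deviation (divides by n - 1).
def describe(values)
  raise ArgumentError, 'need at least two values' if values.size < 2

  sorted = values.sort
  n = sorted.size
  mean = sorted.sum.to_f / n
  mid = n / 2
  median = n.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  variance = sorted.sum { |v| (v - mean)**2 } / (n - 1)

  { n: n, mean: mean, median: median,
    min: sorted.first, max: sorted.last, std_dev: Math.sqrt(variance) }
end

# Example input: made-up daily session counts.
p describe([2, 4, 4, 4, 5, 5, 7, 9])
```

A mean that sits far from the median is an early hint that a few outliers are skewing the data.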

Slide 41

Slide 41 text

Visualization

Slide 42

Slide 42 text

To get statistically meaningful results, you will need thousands of data points
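One way to see why sample size matters is the margin of error for an observed proportion, such as a conversion rate. The sketch below uses the standard normal approximation at 95% confidence; the method name and the sample sizes in the loop are illustrative assumptions.

```ruby
# 95% margin of error for an observed proportion (normal approximation):
#   z * sqrt(p * (1 - p) / n)
def margin_of_error(p_hat, n, z: 1.96)
  z * Math.sqrt(p_hat * (1 - p_hat) / n.to_f)
end

# Worst case is p = 0.5; print the margin for a few sample sizes.
[100, 1_000, 4_000, 10_000].each do |n|
  printf("n=%-6d  +/- %.1f%%\n", n, margin_of_error(0.5, n) * 100)
end
```

Because the margin shrinks with the square root of n, cutting it by a factor of ten takes a hundred times as many data points.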

Slide 43

Slide 43 text

False Positives

Slide 44

Slide 44 text

Slide 45

Slide 45 text

Nearly ALL sick people have eaten Rice (obviously then, the effects are cumulative).

Slide 46

Slide 46 text

An estimated 99.9% of all people who die from cancer or heart attacks have eaten Rice.

Slide 47

Slide 47 text

Another 99.9% of people involved in auto accidents ate Rice within 60 days before the accident.

Slide 48

Slide 48 text

Among people born in 1839 who later dined on Rice, there has been a 100% mortality rate.

Slide 49

Slide 49 text

Rice Will Kill You

Slide 50

Slide 50 text

We had 4000 app downloads this month. We are doing great...

Slide 51

Slide 51 text

Slide 52

Slide 52 text

Most people use the app once and then uninstall it.
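A download count alone hides this. A minimal sketch of a complementary measure, assuming a per-user session count is available; the method name and the sample data are hypothetical.

```ruby
# Share of users who opened the app at most once.
# session_counts: { user_id => number of sessions recorded after install }
def one_and_done_rate(session_counts)
  return 0.0 if session_counts.empty?

  single_use = session_counts.count { |_user, sessions| sessions <= 1 }
  single_use.to_f / session_counts.size
end

# Hypothetical data: four installs, only one returning user.
installs = { 'u1' => 1, 'u2' => 1, 'u3' => 5, 'u4' => 1 }
printf("%.0f%% of installs never came back\n", one_and_done_rate(installs) * 100)
```

Reporting this rate next to the download count shows whether growth in installs is turning into actual usage.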

Slide 53

Slide 53 text

Slide 54

Slide 54 text

My shopping app just saw a spike in weekly usage after I made UI changes.

Slide 55

Slide 55 text

That UI change led to more users!

Slide 56

Slide 56 text

Slide 57

Slide 57 text

The change went live during the last week of November.

Slide 58

Slide 58 text

Slide 59

Slide 59 text

Be Wary of N of 1 Experiments

Slide 60

Slide 60 text

The Results Need to Pass the Smell Test

Slide 61

Slide 61 text

http://www.kaggle.com/c/titanic-gettingStarted

Slide 62

Slide 62 text

What Are Analytics?

Slide 63

Slide 63 text

Slide 64

Slide 64 text

Slide 65

Slide 65 text

Visualization

Slide 66

Slide 66 text

Sparse Data

Slide 67

Slide 67 text

Slide 68

Slide 68 text

Insights

Slide 69

Slide 69 text

Trends

Slide 70

Slide 70 text

Slide 71

Slide 71 text

SVG

Slide 72

Slide 72 text

Rubyvis

Slide 73

Slide 73 text

require 'rubygems'
require 'rubyvis'

vis = Rubyvis::Panel.new do
  width 150
  height 150
  bar do
    data [1, 1.2, 1.7, 1.5, 0.7, 0.3]
    width 20
    height {|d| d * 80}
    bottom(0)
    left {index * 25}
  end
end

vis.render()
puts vis.to_svg # Output final SVG

Slide 74

Slide 74 text

Slide 75

Slide 75 text

Slide 76

Slide 76 text

NVD3

Slide 77

Slide 77 text

XCharts
C3.js

Slide 78

Slide 78 text

Slide 79

Slide 79 text

D3JS

Slide 80

Slide 80 text

Slide 81

Slide 81 text

Slide 82

Slide 82 text

SVG

Slide 83

Slide 83 text

SVG
Canvas

Slide 84

Slide 84 text

Slide 85

Slide 85 text

Slide 86

Slide 86 text

Slide 87

Slide 87 text

• Internet Explorer 9 and 10+
• Chrome 24, 25, and 26+
• Safari 5 and 6+
• Firefox 19, 20, and 21+

Slide 88

Slide 88 text

Slide 89

Slide 89 text

Slide 90

Slide 90 text

Slide 91

Slide 91 text

Slide 92

Slide 92 text

Rubyvis for pure Ruby

Slide 93

Slide 93 text

NVD3, C3 or XCharts For Easy Stuff

Slide 94

Slide 94 text

D3JS for loads of flexibility

Slide 95

Slide 95 text

Limit Your Dataset for performance
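One simple way to limit a dataset before handing it to a charting library is to average it into a fixed number of buckets. This is a sketch of that idea under the assumption that the series is an ordered array of numbers; the method name `downsample` is hypothetical, and other reduction strategies may suit other charts better.

```ruby
# Reduce a series to at most max_points values by averaging
# consecutive buckets, so the browser draws fewer elements.
def downsample(values, max_points)
  raise ArgumentError, 'max_points must be positive' unless max_points.positive?
  return values.dup if values.size <= max_points

  bucket_size = (values.size.to_f / max_points).ceil
  values.each_slice(bucket_size).map { |bucket| bucket.sum.to_f / bucket.size }
end

# Example: 10,000 made-up readings reduced to at most 200 points.
readings = Array.new(10_000) { |i| Math.sin(i / 500.0) }
puts downsample(readings, 200).size
```

Averaging smooths out spikes, so a chart meant to highlight outliers would need a different reduction, such as keeping each bucket's minimum and maximum.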

Slide 96

Slide 96 text

Don’t overwhelm with too much information

Slide 97

Slide 97 text

Be Wary of N of 1 Experiments

Slide 98

Slide 98 text

Twitter @lgleasain
GitHub lgleasain
www.lancegleason.com
www.polyglotprogrammincinc.com
[email protected]