Command line & Data Science
Yatish Mehta
October 28, 2014
Transcript
Yatish Mehta @yatish27
Command line & Data Science
1. csvkit
• pip install csvkit
• cat leads.csv | csvlook
• csvstat leads.csv
• csvgrep -c 6 -m samplecompany.com | csvlook
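The csvkit commands above can be tried end to end on a small file. This is a minimal sketch, assuming csvkit is installed. The deck's leads.csv is not shown, so the file name sample_leads.csv, its columns, and its rows are made up for illustration, and the filter runs on column 3 rather than column 6.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the real leads.csv from the deck is not available.
workdir=$(mktemp -d)
cat > "$workdir/sample_leads.csv" <<'EOF'
id,name,email
1,Ann,ann@samplecompany.com
2,Bob,bob@example.org
3,Cy,cy@samplecompany.com
EOF

if command -v csvgrep >/dev/null 2>&1; then
  # Keep rows whose 3rd column contains the string, then render them as a table.
  matches=$(csvgrep -c 3 -m samplecompany.com "$workdir/sample_leads.csv")
  printf '%s\n' "$matches" | csvlook
else
  matches=""
  echo "csvkit not found; install it with: pip install csvkit" >&2
fi
```

Passing the file name to csvgrep directly avoids the extra cat; csvgrep also reads standard input when no file is given.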
2. grep, sed, sort, uniq
• cat wiki.txt | grep -oE '\w+' | tee words
• < words grep '^a' | sort | uniq -c | sort -r
• sed 's/data/tata/g' wiki.txt > wiki2.txt
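The three pipelines above can be reproduced on a tiny input. This is a sketch only: the deck's wiki.txt is not shown, so the two lines of text written below are invented, and the final sort uses -rn (numeric) as a small, assumed tweak so counts order correctly regardless of padding.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the deck's wiki.txt.
workdir=$(mktemp -d)
printf 'data about apples and data about ants\nall data\n' > "$workdir/wiki.txt"

# Split the text into one word per line and save it for reuse.
grep -oE '\w+' "$workdir/wiki.txt" > "$workdir/words"

# Count the words that start with "a", most frequent first.
a_counts=$(grep '^a' "$workdir/words" | sort | uniq -c | sort -rn)
echo "$a_counts"

# Write a copy of the text with every "data" replaced by "tata".
sed 's/data/tata/g' "$workdir/wiki.txt" > "$workdir/wiki2.txt"
cat "$workdir/wiki2.txt"
```

The sort before uniq -c matters: uniq only collapses adjacent duplicate lines, so unsorted input would produce several partial counts for the same word.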
3. jq JSON processor
• brew install jq
• < data.json jq '.[]'
• < data.json jq '.[] | select(.age>22)'
• cat data.json | jq '.[] | {isActive: ._id, name: .name}'
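The jq filters above (iterate, select, reshape) can be checked against a small array. This is a minimal sketch, assuming jq is installed. The deck's data.json is not shown, so the two records below are invented, and the output key is named id here purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the deck's data.json.
workdir=$(mktemp -d)
cat > "$workdir/data.json" <<'EOF'
[
  {"_id": "a1", "name": "Ann", "age": 31},
  {"_id": "b2", "name": "Bob", "age": 19}
]
EOF

if command -v jq >/dev/null 2>&1; then
  # Names of entries older than 22; -r prints raw strings without JSON quotes.
  older=$(jq -r '.[] | select(.age > 22) | .name' "$workdir/data.json")
  # Reshape each element into a smaller object; -c prints one object per line.
  reshaped=$(jq -c '.[] | {id: ._id, name: .name}' "$workdir/data.json")
  echo "$older"
  echo "$reshaped"
else
  older=""
  reshaped=""
  echo "jq not found; on macOS it can be installed with: brew install jq" >&2
fi
```

Each stage of a jq filter receives the stream produced by the previous one, so select drops elements before the final stage ever sees them.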
4. qstats
• qstats one_hundred_milion.dat
    Min.      44.947
    1st Qu.   93.2553
    Median    100.001
    Mean      100.001
    3rd Qu.   106.747
    Max.      156.997
    Range     112.05
    Std Dev.  10.0002
    Length    100000000
• Faster than awk, sort, R
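For context on the awk comparison, a single-pass awk summary might look like the sketch below. It is not the benchmark from the talk and does not call qstats; it only illustrates the kind of awk baseline such a tool replaces. The six sample values are made up, and the input is assumed to be non-empty with one number per line.

```shell
#!/usr/bin/env bash
# Hypothetical sample input: one number per line.
workdir=$(mktemp -d)
printf '%s\n' 4 8 15 16 23 42 > "$workdir/sample.dat"

# One pass over the file: track min, max, and the running sum.
summary=$(awk '
  { x = $1 + 0 }                       # force numeric interpretation
  NR == 1 { min = x; max = x }
  { if (x < min) min = x; if (x > max) max = x; sum += x }
  END { printf "Min. %g\nMax. %g\nMean %g\nLength %d\n", min, max, sum / NR, NR }
' "$workdir/sample.dat")
echo "$summary"
```

Quartiles and the median are left out on purpose: they need the values in sorted order, which is where a plain awk and sort pipeline becomes expensive on very large files.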
5. parallel
• iterative
• shell parallel.sh, each action as a job
• parallel keyword
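The contrast between an iterative loop and one job per input can be sketched as follows. The deck's parallel.sh is not shown, so this is a guess at the idea rather than the author's script: the three text files are invented, GNU parallel is used when it is detected, and xargs -P is swapped in as a fallback otherwise.

```shell
#!/usr/bin/env bash
# Hypothetical inputs: three small text files.
workdir=$(mktemp -d)
printf 'alpha beta\n' > "$workdir/a.txt"
printf 'gamma\n' > "$workdir/b.txt"
printf 'delta epsilon zeta\n' > "$workdir/c.txt"

# Iterative version: process one file after another.
for f in "$workdir"/*.txt; do
  wc -w < "$f" > "$f.count"
done

# Parallel version: each file becomes its own job.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # {} is replaced by each argument listed after :::
  parallel 'wc -w < {} > {}.pcount' ::: "$workdir"/*.txt
else
  # Fallback (not GNU parallel): up to 4 concurrent jobs via xargs -P.
  printf '%s\0' "$workdir"/*.txt |
    xargs -0 -P 4 -I{} sh -c 'wc -w < "$1" > "$1.pcount"' _ {}
fi

# Both versions should produce identical word counts.
for f in "$workdir"/*.txt; do
  cmp -s "$f.count" "$f.pcount" && echo "match: $(basename "$f")"
done
```

Each job writes to its own output file, so the jobs never compete for a shared file and the results do not depend on completion order.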
Thank You