Command line & Data Science
Yatish Mehta
October 28, 2014
Transcript
Yatish Mehta @yatish27
Command line & Data Science
1. csvkit
• pip install csvkit
• cat leads.csv | csvlook
• csvstat leads.csv
• csvgrep -c 6 -m samplecompany.com | csvlook
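The csvkit commands above can be tried on a small sample file. This is a minimal sketch, assuming csvkit is installed. The file contents and column layout are made up for illustration (the leads.csv used in the talk is not shown), so the domain sits in column 3 here rather than column 6.

```shell
# Hypothetical sample data; the real leads.csv from the talk is not available.
workdir=$(mktemp -d)
cat > "$workdir/leads.csv" <<'EOF'
name,age,domain
Asha,31,samplecompany.com
Ben,24,example.org
Chen,28,samplecompany.com
EOF

# Pretty-print the file as a table.
csvlook "$workdir/leads.csv"

# Summary statistics for every column.
csvstat "$workdir/leads.csv"

# Keep only rows whose 3rd column contains the string,
# drop the header line, and count what is left.
matches=$(csvgrep -c 3 -m samplecompany.com "$workdir/leads.csv" | tail -n +2 | wc -l | tr -d ' ')
echo "matching rows: $matches"
```

Passing the file name to csvgrep directly, as here, avoids relying on standard input.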
2. grep, sed, sort, uniq
• cat wiki.txt | grep -oE '\w+' | tee words
• < words grep '^a' | sort | uniq -c | sort -r
• sed 's/data/tata/g' wiki.txt > wiki2.txt
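The pipeline above can be run end to end on a tiny input. This is a sketch under assumptions: the two-line wiki.txt is invented for illustration, and the final sort uses -rn (numeric) instead of the slide's -r so that counts compare as numbers.

```shell
# Hypothetical stand-in for the wiki.txt used in the talk.
workdir=$(mktemp -d)
printf 'data about apples and data about ants\nan apple a day\n' > "$workdir/wiki.txt"

# Split the text into one word per line; tee saves a copy to the file "words".
grep -oE '\w+' "$workdir/wiki.txt" | tee "$workdir/words" > /dev/null

# Words starting with "a", counted and ordered by frequency (highest first).
grep '^a' "$workdir/words" | sort | uniq -c | sort -rn

# The most frequent such word.
top=$(grep '^a' "$workdir/words" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "most frequent: $top"

# Replace every occurrence of "data" with "tata" and write a new file.
sed 's/data/tata/g' "$workdir/wiki.txt" > "$workdir/wiki2.txt"
```

uniq -c only merges adjacent duplicates, which is why the sort in front of it is needed.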
3. jq JSON processor
• brew install jq
• < data.json jq '.[]'
• < data.json jq '.[] | select(.age>22)'
• cat data.json | jq '.[] | {isActive: ._id, name: .name}'
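The three jq filters above can be checked against a small file. This is a minimal sketch, assuming jq is installed. The two records are invented for illustration, and the output key is named id here (the slide maps ._id to isActive).

```shell
# Hypothetical sample data; the data.json from the talk is not shown.
workdir=$(mktemp -d)
cat > "$workdir/data.json" <<'EOF'
[
  {"_id": "a1", "name": "Asha", "age": 31},
  {"_id": "b2", "name": "Ben", "age": 21}
]
EOF

# .[] streams each element of the top-level array.
jq '.[]' "$workdir/data.json"

# select() keeps only the elements for which the condition holds.
older=$(jq -r '.[] | select(.age > 22) | .name' "$workdir/data.json")
echo "older than 22: $older"

# Build a new, smaller object from each element (-c prints one per line).
reshaped=$(jq -c '.[] | {id: ._id, name: .name}' "$workdir/data.json")
echo "$reshaped"
```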
4. qstats
• qstats one_hundred_milion.dat
  Min.      44.947
  1st Qu.   93.2553
  Median    100.001
  Mean      100.001
  3rd Qu.   106.747
  Max.      156.997
  Range     112.05
  Std Dev.  10.0002
  Length    100000000
• Faster than awk, sort, R
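For comparison, the awk approach that the slide measures qstats against might look like the sketch below. It is not qstats itself and covers only min, max, mean, and length. The five-value sample.dat is a hypothetical stand-in for the large data file.

```shell
# Hypothetical small input: one number per line.
workdir=$(mktemp -d)
printf '%s\n' 98 102 100 95 105 > "$workdir/sample.dat"

# Single pass over the file: track min and max, accumulate the sum.
summary=$(awk '
  NR == 1 { min = $1; max = $1 }
  { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
  END { printf "min=%g max=%g mean=%g n=%d", min, max, sum / NR, NR }
' "$workdir/sample.dat")
echo "$summary"
```

Quartiles and the median would additionally need the values sorted, which is part of why a dedicated tool can be faster on very large files.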
5. parallel
• iterative
• shell parallel.sh, each action as a job
• parallel keyword
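The three approaches listed above can be sketched side by side. This is an illustration under assumptions: the square function is a hypothetical unit of work, the parallel.sh script from the talk is not shown, and the last step runs only if GNU parallel is installed.

```shell
workdir=$(mktemp -d)

# Hypothetical unit of work: write the square of a number to a file.
square() { echo $(( $1 * $1 )) > "$workdir/$1.out"; }

# 1. Iterative: process one item after another.
for i in 1 2 3 4; do square "$i"; done

# 2. Plain shell: start each action as a background job, then wait for all.
for i in 5 6 7 8; do square "$i" & done
wait

# 3. GNU parallel: one job per argument listed after ":::".
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  parallel "echo \$(( {} * {} )) > $workdir/{}.out" ::: 9 10 < /dev/null
fi

# Collect the results of steps 1 and 2 on one line.
results=$(cat "$workdir"/[1-8].out | sort -n | paste -s -d ' ' -)
echo "$results"
```

Background jobs with wait start everything at once, whereas GNU parallel by default limits the number of simultaneous jobs to the number of CPU cores.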
Thank You