How the Web Works: Lecture 9
Abhinav Sharma
January 09, 2014
Education
This talk was designed for a class (98-135) taught at Carnegie Mellon University in Spring 2010.
Transcript
Lecture 9: Distributed Computing & Scaling
Homeworks: Overall, I failed =( Should’ve done it in winter. Get 3 points to pass. Hopefully, 10 by the end. I don’t want to fail anyone.
Zeliveau Please Start Soon
Rankmaniac http://scienceoftheweb.org/15-396/assignments/hw6.pdf
“Essentially, using nofollow causes us to drop the target links
from our overall graph of the web”
SSL/TLS That HTTPS business...
Visible to Wireless Network, ISP, Server LAN
Let’s encrypt with a key! ENC(K, “MES”) = “NFT” | DEC(K, “NFT”) = “MES” | K = “Shift One Alphabet” | But how do we share the key?
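The shift-by-one key on this slide can be sketched in a few lines. This is only an illustration of the ENC/DEC idea, assuming capital letters only; the function names `shiftLetters`, `enc` and `dec` are made up here and are not from the lecture.

```javascript
// Shift every capital letter by `shift` positions, wrapping Z back to A.
// Characters outside A-Z are left untouched.
function shiftLetters(text, shift) {
  const A = "A".charCodeAt(0);
  return text.replace(/[A-Z]/g, (ch) => {
    const offset = (((ch.charCodeAt(0) - A + shift) % 26) + 26) % 26;
    return String.fromCharCode(A + offset);
  });
}

// K = "Shift One Alphabet" is modelled as the number 1.
const K = 1;

function enc(key, message) {
  return shiftLetters(message, key);
}

function dec(key, cipher) {
  return shiftLetters(cipher, -key);
}

console.log(enc(K, "MES")); // NFT
console.log(dec(K, "NFT")); // MES
```

Both sides need the same `K`, which is exactly the key-sharing problem the slide raises.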
Public Key Encryption Insanely Awesomely Brilliant
n = p * q: Given p and q, n is easy to compute. Given only n, finding p and q is possible but...
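A rough way to feel the asymmetry: multiplying two primes is one operation, while recovering them by trial division takes thousands of steps even for small numbers. The primes 7907 and 7919 and the helper names below are chosen for illustration only; real keys use primes hundreds of digits long, where this loop would never finish.

```javascript
// Easy direction: one multiplication.
function multiply(p, q) {
  return p * q;
}

// Hard direction (naive): try every divisor until one fits.
// Returns the two factors and how many candidates were tried.
function factor(n) {
  let steps = 0;
  for (let d = 2; d * d <= n; d++) {
    steps++;
    if (n % d === 0) {
      return { p: d, q: n / d, steps };
    }
  }
  return { p: 1, q: n, steps }; // n itself is prime
}

const n = multiply(7907, 7919);
const found = factor(n);
console.log(n, found.p, found.q, "steps:", found.steps);
```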
RSA: Rivest, Shamir, Adleman
Public Key Encryption: Create an algorithm that... uses n to encrypt but needs p & q to decrypt. Publish n as public key. Keep p & q private. Heard of PGP?
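A toy version of the idea, using textbook RSA with tiny primes. The slide only mentions n, p and q; the public exponent `e` and private exponent `d` below are the extra pieces textbook RSA needs, and all values (p = 61, q = 53, e = 17) are illustrative assumptions. This is a sketch, not secure code: real RSA uses huge primes and padding.

```javascript
// Fast modular exponentiation on BigInt values.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Modular inverse of a mod m via the extended Euclidean algorithm.
function modInverse(a, m) {
  let [oldR, r] = [a, m];
  let [oldS, s] = [1n, 0n];
  while (r !== 0n) {
    const q = oldR / r;
    [oldR, r] = [r, oldR - q * r];
    [oldS, s] = [s, oldS - q * s];
  }
  return ((oldS % m) + m) % m;
}

// Private: the two primes. Public: their product n and the exponent e.
const p = 61n;
const q = 53n;
const n = p * q;
const e = 17n;

// Computing d requires (p - 1) * (q - 1), i.e. knowledge of p and q.
const d = modInverse(e, (p - 1n) * (q - 1n));

function encrypt(message) {
  return modPow(message, e, n); // uses only public values
}

function decrypt(cipher) {
  return modPow(cipher, d, n); // uses the private exponent
}

console.log(decrypt(encrypt(65n)) === 65n); // true
```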
Who are you anyway? Aha, an Imposter! The Verification Problem
Browsers Preinstalled with some CAs
Install if you trust CMU
P2P ... and the indexing problem
Client-Server Model: I can haz music!
P2P Model: Who has my file?
More Generally: Distributed Hash Table. Given a key, get the value. Stored across computers. Google’s Index (GFS). So, how do you find a file?
Ask Everyone: What Not to Do!
Computers (N), Files (K): choose m so that 2^m > max{N, K}
Example: N = 8, K = 12, 2^m = 16
Label Nodes between 0 and 2^m: 1 3 4 5 7 9 12 15
Label Keys between 0 and 2^m: 1 2 4 5 6 8 9 11 12 13 15 16
Assignment: Assign Key K to Node K. If Node K doesn’t exist ... assign to next node
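The assignment rule can be sketched with the node and key labels from the example slides. The function name `ownerOf` is hypothetical, and wrapping past the last node back to the first is an assumption based on the ring picture.

```javascript
// Node and key labels taken from the example slides.
const nodes = [1, 3, 4, 5, 7, 9, 12, 15];
const keys = [1, 2, 4, 5, 6, 8, 9, 11, 12, 13, 15, 16];

// A key goes to the node with the same label, or else to the next
// node around the ring (wrapping to the first node after the last).
function ownerOf(key, sortedNodes) {
  const next = sortedNodes.find((node) => node >= key);
  return next !== undefined ? next : sortedNodes[0];
}

for (const key of keys) {
  console.log("key " + key + " -> node " + ownerOf(key, nodes));
}
```

Key 6 has no node of its own, so it lands on node 7; key 16 wraps around to node 1.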
Searching: For key K ~ For Node K. Linear Search: Start at Machine 1, go to next ... so on until found!
But Wait They’re sorted, seems familiar?
Binary Search: Is 8 in the list? What position is it? [1 3 5 6 8 9] -> [6 8 9] -> [8 9]
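For reference, a standard binary search over the list on the slide. The exact halving steps drawn on the slide may differ slightly from the midpoints this version picks; the idea of discarding half the list each time is the same.

```javascript
// Return the index of `target` in a sorted array, or -1 if absent.
function binarySearch(sorted, target) {
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) {
      low = mid + 1; // target can only be in the right half
    } else {
      high = mid - 1; // target can only be in the left half
    }
  }
  return -1;
}

console.log(binarySearch([1, 3, 5, 6, 8, 9], 8)); // 4 (zero-based position)
```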
Each machine stores the addresses of some others!
Finger Table: Total Machines (2^m) = 8. Machine N7 stores: addr(N7 + 1), addr(N7 + 2), addr(N7 + 4) = addr(N7 + 2^(m-1)). Can take short-cuts!
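A sketch of how such a table could be built. It assumes the 16-slot ring (m = 4) and node labels from the earlier example rather than the 2^m = 8 on this slide, so node 7 gets four entries (+1, +2, +4, +8) instead of three. The names `successor` and `fingerTable` are illustrative.

```javascript
// Assumed ring: 2^4 = 16 slots, with the eight machines from the example.
const M = 4;
const RING = 2 ** M;
const NODES = [1, 3, 4, 5, 7, 9, 12, 15];

// First machine at or after `id`, wrapping around the ring.
function successor(id) {
  const wrapped = ((id % RING) + RING) % RING;
  const found = NODES.find((node) => node >= wrapped);
  return found !== undefined ? found : NODES[0];
}

// Entry i points at the machine responsible for slot (node + 2^i).
function fingerTable(node) {
  const table = [];
  for (let i = 0; i < M; i++) {
    const slot = (node + 2 ** i) % RING;
    table.push({ slot: slot, machine: successor(slot) });
  }
  return table;
}

console.log(fingerTable(7));
```

For node 7 the slots are 8, 9, 11 and 15, which resolve to machines 9, 9, 12 and 15.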
Make Biggest Jump | Too Low | Use N7’s table
Halve the remaining ring
Done!
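The walkthrough above can be sketched as a loop: at each machine, take the biggest finger-table jump that does not pass the key, and stop once the key falls between the current machine and its immediate neighbour. This is a simplified, single-process illustration, assuming the 16-slot ring and node labels from the earlier example; the names `successor`, `between`, `fingers` and `lookup` are made up here, and a real Chord node would ask other machines over the network instead of calling local functions.

```javascript
// Assumed ring: 2^4 = 16 slots, with the eight machines from the example.
const M = 4;
const RING = 2 ** M;
const NODES = [1, 3, 4, 5, 7, 9, 12, 15];

// First machine at or after `id`, wrapping around the ring.
function successor(id) {
  const wrapped = ((id % RING) + RING) % RING;
  const found = NODES.find((node) => node >= wrapped);
  return found !== undefined ? found : NODES[0];
}

// True when x lies strictly between a and b going clockwise.
function between(x, a, b) {
  return a < b ? x > a && x < b : x > a || x < b;
}

// Finger i of a machine points at successor(node + 2^i).
function fingers(node) {
  return Array.from({ length: M }, (_, i) => successor(node + 2 ** i));
}

// Find the machine that owns `key`, starting the search at `start`.
function lookup(start, key) {
  const target = key % RING;
  let node = start;
  const path = [node];
  for (;;) {
    const next = fingers(node)[0];
    // Key falls in (node, next]: next is the owner.
    if (target === next || between(target, node, next)) {
      path.push(next);
      return { owner: next, path: path };
    }
    // Otherwise make the biggest jump that does not pass the key.
    const jump = fingers(node)
      .reverse()
      .find((finger) => between(finger, node, target));
    node = jump !== undefined ? jump : next;
    path.push(node);
  }
}

console.log(lookup(1, 13)); // owner 15, reached via machines 1, 9, 12, 15
```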
Chord Protocol: log(N) Performance. Fully Decentralizes P2P. Napster was Centralized ... hence closed down! RIAA/MPAA: Oh Noes! http://en.wikipedia.org/wiki/Chord_%28peer-to-peer%29
MapReduce But first, an OCD Programmer
alert("get the lobster");
PutInPot("lobster");
PutInPot("water");
alert("get the chicken");
BoomBoom("chicken");
BoomBoom("coconut");
function Cook(i1, i2, f) {
  alert("get the " + i1);
  f(i1);
  f(i2);
}
Cook("lobster", "water", PutInPot);
Cook("chicken", "coconut", BoomBoom);
Map
var a = [1, 2, 3];
// Double every element in place.
for (var i = 0; i < a.length; i++) {
  a[i] = a[i] * 2;
}
// Show every element.
for (var i = 0; i < a.length; i++) {
  alert(a[i]);
}
// Apply fmap to every element, overwriting a[i] with the result.
function map(fmap, a) {
  for (var i = 0; i < a.length; i++) {
    a[i] = fmap(a[i]);
  }
}
map(function (x) { return x * 2; }, a);
map(alert, a);
Reduce
function sum(a) {
  var s = 0;
  for (var i = 0; i < a.length; i++)
    s += a[i];
  return s;
}
function join(a) {
  var s = "";
  for (var i = 0; i < a.length; i++)
    s += a[i];
  return s;
}
// Fold the array into one value, starting from init.
function reduce(fred, a, init) {
  var s = init;
  for (var i = 0; i < a.length; i++)
    s = fred(s, a[i]);
  return s;
}
function sum(a) {
  return reduce(function (a, b) { return a + b; }, a, 0);
}
function join(a) {
  return reduce(function (a, b) { return a + b; }, a, "");
}
Map: [1, 2, 3, 4, 5] -> [2, 4, 6, 8, 10] | [1, 2, 3, 4, 5] -> [One, Two, Three, Four, Five]
Reduce: [1, 1, 1, 1, 1] -> 5 | [1, 1, 1, 1, 1] -> “11111”
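The same examples can be reproduced with JavaScript's built-in `Array.prototype.map` and `Array.prototype.reduce`, which, unlike the hand-written `map` in the slides, return new values instead of overwriting the input. The `names` lookup array is an assumption added to produce the word list.

```javascript
const numbers = [1, 2, 3, 4, 5];
const names = ["One", "Two", "Three", "Four", "Five"];

// Map: one output element per input element.
const doubled = numbers.map((x) => x * 2);
const words = numbers.map((x) => names[x - 1]);
const ones = numbers.map(() => 1);

// Reduce: fold a whole array into a single value.
const total = ones.reduce((acc, x) => acc + x, 0);
const joined = ones.reduce((acc, x) => acc + x, "");

console.log(doubled); // [ 2, 4, 6, 8, 10 ]
console.log(words);   // [ 'One', 'Two', 'Three', 'Four', 'Five' ]
console.log(total);   // 5
console.log(joined);  // 11111
```

The only difference between `total` and `joined` is the starting value: `0` makes `+` add numbers, `""` makes it concatenate strings.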
“Without understanding functional programming, you can't invent MapReduce. The very
fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up” - Joel Spolsky
Pop Quiz: [1, 2, 3, 4, 5] -> [“odd”, “even”, “odd”, “even”, “odd”] -> “oddevenoddevenodd”
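One possible answer to the quiz, as a sketch using the built-in array methods: a map labels each number by parity and a reduce joins the labels. The variable names are illustrative.

```javascript
const quizInput = [1, 2, 3, 4, 5];

// Map step: label each number.
const parity = quizInput.map((x) => (x % 2 === 0 ? "even" : "odd"));

// Reduce step: concatenate the labels into one string.
const answer = parity.reduce((acc, label) => acc + label, "");

console.log(parity);
console.log(answer);
```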
How is that useful?
Word Count: Given a document, count the # occurrences of each word
Let’s try the intuitive way...
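The slide that followed is not captured in the transcript, so this is only a guess at what an intuitive, single-machine version might look like: read the whole text and keep a running tally. The name `bigFile` echoes the next slide; the sample text and the function name `wordCount` are assumptions.

```javascript
// Count occurrences of each whitespace-separated word in one pass.
function wordCount(text) {
  const counts = new Map();
  for (const word of text.split(/\s+/).filter(Boolean)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Stand-in for the contents of a large document.
const bigFile = "foo foo baz bar gor baz goo bar";
console.log(wordCount(bigFile));
```

This works until the text no longer fits on one machine, which is the problem the following slides address.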
bigFile is too big? Have two files!
Let’s See That Again
BoomBoom("chicken"); BoomBoom("coconut"); Map Reduce function reduce(union, [d1,d2], [])
Input: foo foo baz bar gor baz goo bar
Split: [foo foo baz bar] [gor baz goo bar]
Mapper, Mapper: foo 1, foo 1, baz 1, bar 1 | gor 1, baz 1, goo 1, bar 1
Bucket by Key: [foo, foo] [baz, baz] [bar, bar] [goo] [gor]
Reducers (one per bucket): 2 2 2 1 1
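The pipeline on this slide can be sketched in one process: two mappers emit (word, 1) pairs, the pairs are bucketed by key, and one reducer call per bucket adds them up. In a real framework such as Hadoop the splitting and bucketing happen across machines; the function names here are illustrative and not a Hadoop API.

```javascript
// Mapper: turn one chunk of text into [word, 1] pairs.
function mapper(chunk) {
  return chunk.split(/\s+/).filter(Boolean).map((word) => [word, 1]);
}

// Bucketing: group all values that share a key.
function bucketByKey(pairs) {
  const buckets = new Map();
  for (const [key, value] of pairs) {
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(value);
  }
  return buckets;
}

// Reducer: collapse one bucket into a single count.
function reducer(key, values) {
  return [key, values.reduce((acc, v) => acc + v, 0)];
}

function wordCountMapReduce(chunks) {
  const pairs = chunks.flatMap(mapper); // each chunk could run on its own machine
  const buckets = bucketByKey(pairs);
  return new Map([...buckets].map(([key, values]) => reducer(key, values)));
}

// The two halves of the input from the slide.
const result = wordCountMapReduce(["foo foo baz bar", "gor baz goo bar"]);
console.log(result);
```

The user-written parts are `mapper` and `reducer`; everything else is what the framework would take care of.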
Who Does What? User: Write Mapper and Reducer. Hadoop: Splitting, Bucketing. Cons: Restricted Paradigm. Pros: Generalized, Safe. Implementing can be tricky!
Abstract, Complex? But you know what...
The Point People talk about scaling ... but now you
know it! Distributing Files Distributing Computation
Higher Level Point: This isn’t a CS class... ... but I’m a CS major =P It’s not all HTML/CSS. There’s some serious CS here!
http://hadoop.apache.org/ http://www.cloudera.com/resources/?type=Training
Poor Man’s Scaling
Redundancy Replicate across computers Main server balances load Other servers
serve content Also useful for data backups Usually Host Managed
Caching: PHP is dynamic ... usually unnecessarily. Calculate, cache, re-serve
Memoization: PHP/memcached http://en.wikipedia.org/wiki/Memcached
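The calculate, cache, serve-again pattern can be sketched with a small memoizing wrapper. The slide names PHP and memcached; here an in-process `Map` stands in for the cache server, and `renderPage` is a made-up placeholder for an expensive page build.

```javascript
// Wrap a one-argument function so each distinct input is computed once.
function memoize(fn) {
  const cache = new Map(); // stand-in for an external cache such as memcached
  return function (arg) {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg)); // calculate and cache
    }
    return cache.get(arg); // serve the cached copy
  };
}

// Hypothetical expensive page build; `calls` counts real computations.
let calls = 0;
function renderPage(id) {
  calls++;
  return "<h1>Page " + id + "</h1>";
}

const cachedRenderPage = memoize(renderPage);
cachedRenderPage(1);
cachedRenderPage(1);
console.log(calls); // 1: the second request came from the cache
```

A real cache also needs an expiry or invalidation rule so that stale pages are not served forever; that part is left out here.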
Bottlenecks Content Bandwidth Databases External APIs Script busy computing etc...
S3 EC2 http://www.youtube.com/watch?v=Iaxu-NLecm4 http://www.youtube.com/watch?v=bBajLxeKqoY
Homework 7 is out No Class Next Week
Photo Credits http://mi9.com/datawallpapers/data/12/993/1217993797/eye-with-black-background_1280x1024.jpg http://www.aemmp.org/site/wp-content/uploads/2009/10/imgname-riaa_training_video_leaked_more_stupid_than_expected-50226711-RIAA.jpg http://jasonjeffrey.files.wordpress.com/2007/09/drm.jpg http://www.sdtimes.com/blog/post/2009/image.axd?picture=2009%2F7%2Fhadoopephant.jpg