How the Web Works: Lecture 9
Abhinav Sharma
January 09, 2014
Education
This talk was designed for a class (98-135) taught at Carnegie Mellon University in Spring 2010.
Transcript
Lecture 9 Distributed Computing & Scaling
Homeworks: Overall, I failed =( Should’ve done it in winter. Get 3 Points to Pass. Hopefully, 10 by the end. I don’t want to fail anyone.
Zeliveau Please Start Soon
Rankmaniac http://scienceoftheweb.org/15-396/assignments/hw6.pdf
“Essentially, using nofollow causes us to drop the target links
from our overall graph of the web”
SSL/TLS That HTTPS business...
Visible to Wireless Network, ISP, Server LAN
Let’s encrypt with a key! ENC(K, “MES”) = “NFT” | DEC(K, “NFT”) = “MES”, where K = “Shift One Alphabet”. But how do we share the key?
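The ENC/DEC pair on this slide can be sketched as a shift cipher. This is only an illustration: the function names `shift`, `enc`, and `dec` are made up here, the key is assumed to be a number of positions to shift, and only uppercase A-Z is handled.

```javascript
// Shift cipher sketch: the key k is how many letters to shift by.
// Assumption: only uppercase A-Z is shifted; other characters pass through.
function shift(text, k) {
  const A = "A".charCodeAt(0);
  return text.replace(/[A-Z]/g, function (ch) {
    // Double modulo keeps the offset in 0..25 even for negative k.
    const offset = (((ch.charCodeAt(0) - A + k) % 26) + 26) % 26;
    return String.fromCharCode(A + offset);
  });
}

function enc(k, message) { return shift(message, k); }
function dec(k, cipher) { return shift(cipher, -k); }

console.log(enc(1, "MES")); // NFT
console.log(dec(1, "NFT")); // MES
```

Both sides need the same k, which is exactly the key-sharing problem the slide raises.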
Public Key Encryption Insanely Awesomely Brilliant
n = p * q. Given these (p and q): n is easy to compute. Given this (n): finding p and q is possible but...
RSA: Rivest, Shamir, Adleman
Public Key Encryption: Create an algorithm that uses n to encrypt but needs p & q to decrypt. Publish n as the public key. Keep p & q private. Heard of PGP?
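A toy version of this idea, as a rough sketch rather than real RSA practice. The primes 61 and 53, the public exponent `e = 17`, and all function names are assumptions chosen for illustration. Real RSA publishes the pair (n, e), uses enormous primes, and adds padding.

```javascript
// Toy RSA with tiny primes. Illustration only, never use for real security.

// Square-and-multiply modular exponentiation on BigInt values.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Modular inverse of a modulo m via the extended Euclidean algorithm.
function modInverse(a, m) {
  let [oldR, r] = [a, m];
  let [oldS, s] = [1n, 0n];
  while (r !== 0n) {
    const quotient = oldR / r;
    [oldR, r] = [r, oldR - quotient * r];
    [oldS, s] = [s, oldS - quotient * s];
  }
  return ((oldS % m) + m) % m;
}

const p = 61n, q = 53n;                        // private: the two primes
const n = p * q;                               // public: 3233
const e = 17n;                                 // public exponent (assumed)
const d = modInverse(e, (p - 1n) * (q - 1n));  // private: needs p and q

function encrypt(m) { return modPow(m, e, n); }
function decrypt(c) { return modPow(c, d, n); }

const c = encrypt(65n);
console.log(c, decrypt(c)); // 2790n 65n
```

Anyone can run `encrypt` knowing only n and e, but computing d requires (p - 1)(q - 1), which in turn requires factoring n.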
Who are you anyway? Aha, an Imposter! The Verification Problem
Browsers Preinstalled with some CAs
Install if you trust CMU
P2P ... and the indexing problem
Client-Server Model: I can haz music!
P2P Model: Who has my file?
More Generally: Distributed Hash Table. Given a key, get the value. Stored across computers. Google’s Index (GFS). So, how do you find a file?
Ask Everyone: What Not to Do!
Computers (N), Files (K), 2^m > max{N, K}
Example: N = 8, K = 12, 2^m = 16
Label nodes between 0 and 2^m: 1 3 4 5 7 9 12 15
Label keys between 0 and 2^m: 1 2 4 5 6 8 9 11 12 13 15 16
Assignment: Assign key K to node K. If node K doesn’t exist ... assign to the next node.
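The assignment rule can be sketched as follows, using the node and key labels from the example. The helper name `successor` and the choice to wrap id 16 around to 0 on a ring of 16 positions are assumptions made for this sketch.

```javascript
// Key k goes to node k if it exists, otherwise to the next node clockwise,
// wrapping around the ring.
// Assumption: ids live on a ring of size 2^m, so id 16 wraps to 0 when m = 4.
function successor(sortedNodes, id, ringSize) {
  const k = ((id % ringSize) + ringSize) % ringSize;
  for (const node of sortedNodes) {
    if (node >= k) return node;
  }
  return sortedNodes[0]; // went past the highest node: wrap to the first
}

const nodes = [1, 3, 4, 5, 7, 9, 12, 15];
const keys = [1, 2, 4, 5, 6, 8, 9, 11, 12, 13, 15, 16];

const assignment = {};
for (const key of keys) {
  assignment[key] = successor(nodes, key, 16);
}
console.log(assignment);
```

For instance, key 6 lands on node 7 and key 13 lands on node 15, while key 16 wraps around to node 1.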
Searching: For key K ~ for node K. Linear search: start at Machine 1, go to next ... and so on until found!
But Wait They’re sorted, seems familiar?
Binary Search: Is 8 in the list? What position is it? 1 3 5 6 8 9 → 6 8 9 → 8 9
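A minimal binary search over the slide's list, as a sketch. The function name `binarySearch` is illustrative, and the exact halves it inspects may differ slightly from the picture depending on how the midpoint is chosen.

```javascript
// Returns the index of target in a sorted array, or -1 if it is absent.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) lo = mid + 1; // discard the lower half
    else hi = mid - 1;                      // discard the upper half
  }
  return -1;
}

console.log(binarySearch([1, 3, 5, 6, 8, 9], 8)); // 4
```

Each step throws away half of what remains, so a list of N items needs about log2(N) comparisons.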
Each machine stores the address of some others!
Finger Table: Total Machines (2^m) = 8. Machine N7 stores: addr(N7 + 1), addr(N7 + 2), addr(N7 + 4) = addr(N7 + 2^(m-1)). Can take short-cuts!
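One way to compute such a table, sketched under an assumption: ids are taken to live on the 16-position ring of the earlier node example (m = 4), so the table has one more entry (N7 + 8) than the three listed on the slide. The names `successor` and `fingerTable` are illustrative.

```javascript
// Assumption: ids live on a ring of 2^m positions with m = 4.
// Entry i points at the node responsible for (n + 2^i) mod 2^m.
function successor(sortedNodes, id) {
  for (const node of sortedNodes) {
    if (node >= id) return node;
  }
  return sortedNodes[0]; // wrap around the ring
}

function fingerTable(n, sortedNodes, m) {
  const ringSize = 2 ** m;
  const table = [];
  for (let i = 0; i < m; i++) {
    const target = (n + 2 ** i) % ringSize;
    table.push({ target: target, node: successor(sortedNodes, target) });
  }
  return table;
}

// Targets 8, 9, 11, 15 resolve to nodes 9, 9, 12, 15.
console.log(fingerTable(7, [1, 3, 4, 5, 7, 9, 12, 15], 4));
```

Because the offsets double each time, a table of only m entries reaches halfway around the ring in one hop.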
Make Biggest Jump | Too Low | Use N7’s table
Halve the remaining ring
Done!
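The jump-and-halve search can be simulated roughly like this. It is a sketch under loud assumptions: every finger table is computed from a global node list (a real network learns them by exchanging messages), ids live on a ring of 16 positions, and all names (`lookup`, `fingers`, `inOpen`, `inHalfOpen`) are made up for this example.

```javascript
// Simulated Chord-style lookup on a ring of 2^M ids (assumed M = 4).
const M = 4;
const RING = 2 ** M;
const NODES = [1, 3, 4, 5, 7, 9, 12, 15];

function successor(id) {
  for (const node of NODES) {
    if (node >= id) return node;
  }
  return NODES[0]; // wrap around the ring
}

// True if x lies after a and up to (and including) b, going clockwise.
function inHalfOpen(x, a, b) {
  return a < b ? (x > a && x <= b) : (x > a || x <= b);
}

// True if x lies strictly between a and b, going clockwise.
function inOpen(x, a, b) {
  return a < b ? (x > a && x < b) : (x > a || x < b);
}

function fingers(n) {
  const table = [];
  for (let i = 0; i < M; i++) table.push(successor((n + 2 ** i) % RING));
  return table;
}

function lookup(start, key) {
  const k = key % RING;
  const path = [start];
  let n = start;
  for (;;) {
    const next = successor((n + 1) % RING);
    if (inHalfOpen(k, n, next)) return { owner: next, path: path };
    // Make the biggest jump that does not overshoot the key.
    let jump = next;
    const table = fingers(n);
    for (let i = M - 1; i >= 0; i--) {
      if (inOpen(table[i], n, k)) { jump = table[i]; break; }
    }
    n = jump;
    path.push(n);
  }
}

console.log(lookup(1, 13)); // owner 15, reached via nodes 1, 9, 12
```

Starting at node 1 and looking for key 13, the search hops to node 9, then node 12, and finds that node 15 owns the key.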
Chord Protocol: log(N) Performance. Fully Decentralizes P2P. Napster was Centralized ... hence closed down! RIAA/MPAA: Oh Noes! http://en.wikipedia.org/wiki/Chord_%28peer-to-peer%29
MapReduce But first, an OCD Programmer
alert("get the lobster");
PutInPot("lobster");
PutInPot("water");

alert("get the chicken");
BoomBoom("chicken");
BoomBoom("coconut");
function Cook(i1, i2, f) {
    alert("get the " + i1);
    f(i1);
    f(i2);
}

Cook("lobster", "water", PutInPot);
Cook("chicken", "coconut", BoomBoom);
Map
var a = [1, 2, 3];

// Double every element in place.
for (var i = 0; i < a.length; i++) {
    a[i] = a[i] * 2;
}

// Show every element.
for (var i = 0; i < a.length; i++) {
    alert(a[i]);
}
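// Apply fmap to every element of a, storing the results back into a.
function map(fmap, a) {
    for (var i = 0; i < a.length; i++) {
        a[i] = fmap(a[i]);
    }
}

map(function (x) { return x * 2; }, a);
// Because map writes results back, this shows each value
// but leaves a filled with alert's return value (undefined).
map(alert, a);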
Reduce
function sum(a) {
    var s = 0;
    for (var i = 0; i < a.length; i++)
        s += a[i];
    return s;
}

function join(a) {
    var s = "";
    for (var i = 0; i < a.length; i++)
        s += a[i];
    return s;
}
// Fold the array a into a single value, starting from init.
function reduce(fred, a, init) {
    var s = init;
    for (var i = 0; i < a.length; i++)
        s = fred(s, a[i]);
    return s;
}
function sum(a) {
    return reduce(function (a, b) { return a + b; }, a, 0);
}

function join(a) {
    return reduce(function (a, b) { return a + b; }, a, "");
}
Map: [1, 2, 3, 4, 5] → [2, 4, 6, 8, 10] or [One, Two, Three, Four, Five]
Reduce: [1, 1, 1, 1, 1] → 5 or “11111”
“Without understanding functional programming, you can't invent MapReduce. The very
fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up” - Joel Spolsky
Pop Quiz: [1, 2, 3, 4, 5] → [“odd”, “even”, “odd”, “even”, “odd”] → “oddevenoddevenodd”
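One possible answer to the quiz, sketched with JavaScript's built-in `Array.prototype.map` and `Array.prototype.reduce` rather than the hand-written helpers from the earlier slides. The variable names are illustrative.

```javascript
// Map each number to a label, then reduce the labels by concatenation.
const labels = [1, 2, 3, 4, 5].map(function (x) {
  return x % 2 === 0 ? "even" : "odd";
});

const joined = labels.reduce(function (acc, s) {
  return acc + s;
}, "");

console.log(labels); // [ 'odd', 'even', 'odd', 'even', 'odd' ]
console.log(joined); // oddevenoddevenodd
```

The map step never looks at more than one element at a time, which is what makes it easy to spread across machines.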
How is that useful?
Word Count Given a document # occurrences of each word
Let’s try the intuitive way...
bigFile is too big? Have two files!
Let’s See That Again
Map: BoomBoom("chicken"); BoomBoom("coconut"); Reduce: reduce(union, [d1, d2], [])
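The two-file word count might look roughly like this. The code shown on the original slides did not survive in the transcript, so everything here is an assumption: `countWords` plays the role of the map step, `union` merges two count dictionaries, and an empty dictionary stands in for the slide's `[]` initial value.

```javascript
// Map step: count the words in one chunk of text.
function countWords(text) {
  const counts = Object.create(null); // no inherited keys to collide with
  for (const word of text.split(/\s+/).filter(Boolean)) {
    counts[word] = (counts[word] || 0) + 1;
  }
  return counts;
}

// Reduce step: merge two dictionaries by adding the counts per word.
function union(a, b) {
  const merged = Object.assign(Object.create(null), a);
  for (const word of Object.keys(b)) {
    merged[word] = (merged[word] || 0) + b[word];
  }
  return merged;
}

const files = ["foo foo baz bar", "gor baz goo bar"];
const total = files.map(countWords).reduce(union, Object.create(null));
console.log(total);
```

Each file is counted independently, so the map calls could run on different machines before a single reduce combines them.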
Input: foo foo baz bar gor baz goo bar
Split: foo foo baz bar | gor baz goo bar
Mapper | Mapper: foo 1, foo 1, baz 1, bar 1 | gor 1, baz 1, goo 1, bar 1
Bucket by Key: [foo, foo] [baz, baz] [bar, bar] [goo] [gor]
Reducers (one per bucket): 2 2 2 1 1
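The mapper, bucketing, and reducer stages can be sketched in a single process as below. This is not Hadoop's API: the names `mapper`, `bucketByKey`, `reducer`, and `wordCount` are illustrative, and a real framework would run the stages on separate machines.

```javascript
// Mapper: emit a [word, 1] pair for every word in a chunk.
function mapper(chunk) {
  return chunk.split(/\s+/).filter(Boolean).map(function (word) {
    return [word, 1];
  });
}

// Bucketing: group the emitted values by key.
function bucketByKey(pairs) {
  const buckets = new Map();
  for (const [key, value] of pairs) {
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(value);
  }
  return buckets;
}

// Reducer: sum the values collected for one key.
function reducer(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

function wordCount(chunks) {
  const pairs = chunks.flatMap(mapper);
  const result = new Map();
  for (const [key, values] of bucketByKey(pairs)) {
    result.set(key, reducer(values));
  }
  return result;
}

console.log(wordCount(["foo foo baz bar", "gor baz goo bar"]));
```

The user-written parts are only `mapper` and `reducer`; the splitting and bucketing in between are the pieces a framework would take over.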
Who Does What? User: Write Mapper and Reducer. Hadoop: Splitting, Bucketing. Cons: Restricted Paradigm. Pros: Generalized, Safe. Implementing can be tricky!
Abstract, Complex? But you know what...
The Point: People talk about scaling ... but now you know it! Distributing Files, Distributing Computation
Higher Level Point: This isn’t a CS class... but I’m a CS major =P It’s not all HTML/CSS. There’s some serious CS here!
http://hadoop.apache.org/ http://www.cloudera.com/resources/?type=Training
Poor Man’s Scaling
Redundancy: Replicate across computers. Main server balances load. Other servers serve content. Also useful for data backups. Usually host managed.
Caching: PHP is dynamic ... usually unnecessarily. Calculate, cache, re-serve. Memoization. PHP/memcached http://en.wikipedia.org/wiki/Memcached
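The calculate, cache, re-serve idea can be sketched as in-process memoization. This is a stand-in written in JavaScript, not PHP or the memcached API: `memoize`, `slowSquare`, and `fastSquare` are hypothetical names, and the cache is a plain `Map` keyed on a single argument.

```javascript
// Wrap a one-argument function so each distinct input is computed once.
function memoize(fn) {
  const cache = new Map();
  return function (arg) {
    if (!cache.has(arg)) cache.set(arg, fn(arg)); // calculate, then cache
    return cache.get(arg);                        // re-serve
  };
}

let calls = 0;
function slowSquare(x) {
  calls++; // count how often the real work runs
  return x * x;
}

const fastSquare = memoize(slowSquare);
fastSquare(12);
fastSquare(12);
console.log(calls); // 1
```

A shared cache such as memcached applies the same pattern across many servers, with the added problem of deciding when a cached value has gone stale.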
Bottlenecks: Content, Bandwidth, Databases, External APIs, Script busy computing, etc...
S3 EC2 http://www.youtube.com/watch?v=Iaxu-NLecm4 http://www.youtube.com/watch?v=bBajLxeKqoY
Homework 7 is out No Class Next Week
Photo Credits http://mi9.com/datawallpapers/data/12/993/1217993797/eye-with-black-background_1280x1024.jpg http://www.aemmp.org/site/wp-content/uploads/2009/10/imgname-riaa_training_video_leaked_more_stupid_than_expected-50226711-RIAA.jpg http://jasonjeffrey.files.wordpress.com/2007/09/drm.jpg http://www.sdtimes.com/blog/post/2009/image.axd?picture=2009%2F7%2Fhadoopephant.jpg