Error Correction Over Noisy Channels

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Error Correction Over Noisy Channels and the Noisy Channel Coding Theorem Charles Julian Knight Berry College March 20, 2012

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Introduction How do you... • Talk to space shuttles? • Avoid cross-talk over telephone lines? • Make a CD that still plays if you scratch it?

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 History Claude Shannon “The Father of Information Theory” • 1916 - 2001 • MIT • Diﬀerential Analyzer • Bell Labs • ENIGMA Papers • A Symbolic Analysis of Relay and Switching Circuits “Possibly the most important, and also the most famous, master’s thesis of the century.” -Howard Gardner • A Mathematical Theory of Communication http://www.bell-labs.com/news/2001/february/26/1.html Copyright Bell Labs

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Deﬁnitions and Notation • Shannon bit – “The amount of information gained (or entropy removed) upon learning the answer to a question whose two possible answers were equally likely, a priori.”

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Deﬁnitions and Notation • Shannon bit – “The amount of information gained (or entropy removed) upon learning the answer to a question whose two possible answers were equally likely, a priori.” • Target Space – {x1, x2, ..., xn}

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Deﬁnitions and Notation • Shannon bit – “The amount of information gained (or entropy removed) upon learning the answer to a question whose two possible answers were equally likely, a priori.” • Target Space – {x1, x2, ..., xn} • P(X = x1) or simply P(x1)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Deﬁnitions and Notation • Shannon bit – “The amount of information gained (or entropy removed) upon learning the answer to a question whose two possible answers were equally likely, a priori.” • Target Space – {x1, x2, ..., xn} • P(X = x1) or simply P(x1) • Expected Value – E(X) = n i=1 xiP(xi)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Bayes’ Theorem Conditional probability: P(X|Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Bayes’ Theorem Conditional probability: P(X|Y ) P(U|V ) = P(V |U)P(U) P(V )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Bayes’ Theorem Conditional probability: P(X|Y ) P(U|V ) = P(V |U)P(U) P(V ) = P(UandV ) P(V )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Shannon Entropy A measure of the amount of uncertainty in the value of a random variable, measured in Shannon bits. Example: Coin Toss

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Shannon Entropy A measure of the amount of uncertainty in the value of a random variable, measured in Shannon bits. Example: Coin Toss Two notes: • Entropy in Thermodynamics

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Shannon Entropy A measure of the amount of uncertainty in the value of a random variable, measured in Shannon bits. Example: Coin Toss Two notes: • Entropy in Thermodynamics • Shannon bits vs. digital bits

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Entropy Function For random variable X with target space x1, x2, ..., xn H(X) = n i=0 P(xi) log2 1 P(xi)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Entropy Function For random variable X with target space x1, x2, ..., xn H(X) = n i=0 P(xi) log2 1 P(xi) or, if we pop a sign out of the log, = − n i=0 P(xi) log2 (P(xi))

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Entropy Function For random variable X with target space x1, x2, ..., xn H(X) = n i=0 P(xi) log2 1 P(xi) or, if we pop a sign out of the log, = − n i=0 P(xi) log2 (P(xi)) compare this to statistical mechanics: S = −k i pi log(pi)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Weighted coin, 2 3 heads, 1 3 tails: H(heads) = 2 3 log 3 2 ≈ .39bits H(tails) = 1 3 log (3) ≈ .53bits

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Weighted coin, 2 3 heads, 1 3 tails: H(heads) = 2 3 log 3 2 ≈ .39bits H(tails) = 1 3 log (3) ≈ .53bits H(cointoss) = H(heads)+H(tails) = 2 3 log 3 2 + 1 3 log (3) ≈ .92bits

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Weighted coin, 2 3 heads, 1 3 tails: H(heads) = 2 3 log 3 2 ≈ .39bits H(tails) = 1 3 log (3) ≈ .53bits H(cointoss) = H(heads)+H(tails) = 2 3 log 3 2 + 1 3 log (3) ≈ .92bits 6-sided die 1 6 log(6) + 1 6 log(6) + ... + 1 6 log(6) = log(6) ≈ 2.59bits

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Weighted coin, 2 3 heads, 1 3 tails: H(heads) = 2 3 log 3 2 ≈ .39bits H(tails) = 1 3 log (3) ≈ .53bits H(cointoss) = H(heads)+H(tails) = 2 3 log 3 2 + 1 3 log (3) ≈ .92bits 6-sided die 1 6 log(6) + 1 6 log(6) + ... + 1 6 log(6) = log(6) ≈ 2.59bits Weighted 6-sided die {1 2 , 1 10 , 1 10 , 1 10 , 1 10 } 1 2 log(2) + 1 2 log(10)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Weighted coin, 2 3 heads, 1 3 tails: H(heads) = 2 3 log 3 2 ≈ .39bits H(tails) = 1 3 log (3) ≈ .53bits H(cointoss) = H(heads)+H(tails) = 2 3 log 3 2 + 1 3 log (3) ≈ .92bits 6-sided die 1 6 log(6) + 1 6 log(6) + ... + 1 6 log(6) = log(6) ≈ 2.59bits Weighted 6-sided die {1 2 , 1 10 , 1 10 , 1 10 , 1 10 } 1 2 log(2) + 1 2 log(10) ≈ 2.16bits

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Binary Entropy Function http://en.wikipedia.org/wiki/File:Binary entropy plot.svg Creative Commons 3.0 BY-SA

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Binomial Experiments A ﬁxed number n trials of a system with two outcomes, with a probability of success p. Then the probability of k successes is P(k) = n k pk(1 − p)n−k

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Binomial Experiments A ﬁxed number n trials of a system with two outcomes, with a probability of success p. Then the probability of k successes is P(k) = n k pk(1 − p)n−k Binomial coeﬃcient: n! k!(n − k)!

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Noisy Channels • Channel – A medium (physical or logical) over which information is sent.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Noisy Channels • Channel – A medium (physical or logical) over which information is sent. • Noisy channel – Channel which has some probability (noise parameter f) that a bit will swap values (be “ﬂipped”).

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Noise Simulation By Randall Monroe, xkcd.com/171, Creative Commons 2.5 BY-NC

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Distance Simply the amount of bits two messages diﬀer by. X Y Hamming Distance 0000 1011 3 1000 1110 2 11010 11010 0

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’ 11010

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’ 11010 11010 11010 11010

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’ 11010 11010 11010 11010 10010 11011 10000

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’ 11010 11010 11010 11010 10010 11011 10000 10010

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Let’s send a message! Noise frequency f = 25% s t r s’ 11010 11010 11010 11010 10010 11011 10000 10010 Hamming distance from s to s’ = 1. Error of 20%. How much, in general, do we reduce the error by?

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Error Correction in General case:

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Error Correction in General case: P(errorRN ) = N n=N+1 2 N n fn(1 − f)N−n This is the sum of Binomial probabilities of getting more than half the bits ﬂipped.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Error Correction in General case: P(errorRN ) = N n=N+1 2 N n fn(1 − f)N−n This is the sum of Binomial probabilities of getting more than half the bits ﬂipped. Error probabilities for f = .25: • P(errorR3) ≈ .156 • P(errorR7) ≈ .071

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Repetition Codes Error Correction in General case: P(errorRN ) = N n=N+1 2 N n fn(1 − f)N−n This is the sum of Binomial probabilities of getting more than half the bits ﬂipped. Error probabilities for f = .25: • P(errorR3) ≈ .156 • P(errorR7) ≈ .071 Rate reduction. Is there a better way?

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Joint Entropy and Mutual Information For random variables X = {x1, ..xn} and Y = {y1, ..., ym}:

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Joint Entropy and Mutual Information For random variables X = {x1, ..xn} and Y = {y1, ..., ym}: • Joint Entropy – H(X, Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Joint Entropy and Mutual Information For random variables X = {x1, ..xn} and Y = {y1, ..., ym}: • Joint Entropy – H(X, Y ) = n i=1 m j=1 H(X = xiandY = yj)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Joint Entropy and Mutual Information For random variables X = {x1, ..xn} and Y = {y1, ..., ym}: • Joint Entropy – H(X, Y ) = n i=1 m j=1 H(X = xiandY = yj) • Mutual Information – I(X : Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Joint Entropy and Mutual Information For random variables X = {x1, ..xn} and Y = {y1, ..., ym}: • Joint Entropy – H(X, Y ) = n i=1 m j=1 H(X = xiandY = yj) • Mutual Information – I(X : Y ) = H(X) + H(Y ) − H(X, Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 −0log0? lim x→0+ −xlog2(x) L’Hˆ opital’s Rule If: 1. limx→c f(x) = limx→c = 0 or ±∞ 2. limx→c f (x) g (x) exists 3. g (x) = 0 Then: lim x→c f(x) g(x) = lim x→c f (x) g (x)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 −0log0? lim x→0+ −xlog2(x) L’Hˆ opital’s Rule If: 1. limx→c f(x) = limx→c = 0 or ±∞ 2. limx→c f (x) g (x) exists 3. g (x) = 0 Then: lim x→c f(x) g(x) = lim x→c f (x) g (x) • f(x) = log2(x) → f (x) = 1 x • g(x) = −1 x → g (x) = 1 x2

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 −0log0? lim x→0+ −xlog2(x) L’Hˆ opital’s Rule If: 1. limx→c f(x) = limx→c = 0 or ±∞ 2. limx→c f (x) g (x) exists 3. g (x) = 0 Then: lim x→c f(x) g(x) = lim x→c f (x) g (x) • f(x) = log2(x) → f (x) = 1 x • g(x) = −1 x → g (x) = 1 x2 • limx→0+ f(x) g(x) = limx→0+ f (x) g (x) = limx→0+ x2 x = limx→c x = 0

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Take two coins, one fair, one two-headed. Choose one randomly (X) and ﬂip it twice. Random variable Y corresponds to the number of heads. X = {0, 1} (fair, unfair), Y = {0, 1, 2} (1 head, 2 heads, 3 heads)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Take two coins, one fair, one two-headed. Choose one randomly (X) and ﬂip it twice. Random variable Y corresponds to the number of heads. X = {0, 1} (fair, unfair), Y = {0, 1, 2} (1 head, 2 heads, 3 heads) X Unfair X=0 H H F=2 H F=2 H H F=2 H F=2 Fair X=1 H H F=2 T F=1 T H F=1 T F=0

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Take two coins, one fair, one two-headed. Choose one randomly (X) and ﬂip it twice. Random variable Y corresponds to the number of heads. X = {0, 1} (fair, unfair), Y = {0, 1, 2} (1 head, 2 heads, 3 heads) X Unfair X=0 H H F=2 H F=2 H H F=2 H F=2 Fair X=1 H H F=2 T F=1 T H F=1 T F=0 H(X) H(Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Take two coins, one fair, one two-headed. Choose one randomly (X) and ﬂip it twice. Random variable Y corresponds to the number of heads. X = {0, 1} (fair, unfair), Y = {0, 1, 2} (1 head, 2 heads, 3 heads) X Unfair X=0 H H F=2 H F=2 H H F=2 H F=2 Fair X=1 H H F=2 T F=1 T H F=1 T F=0 H(X) H(Y ) H(X, Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Example Take two coins, one fair, one two-headed. Choose one randomly (X) and ﬂip it twice. Random variable Y corresponds to the number of heads. X = {0, 1} (fair, unfair), Y = {0, 1, 2} (1 head, 2 heads, 3 heads) X Unfair X=0 H H F=2 H F=2 H H F=2 H F=2 Fair X=1 H H F=2 T F=1 T H F=1 T F=0 H(X) H(Y ) H(X, Y ) I(X : Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 As Related to Sets

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have ﬁnite target spaces.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have ﬁnite target spaces. • Memoryless – Probabilities are independent and don’t change.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have ﬁnite target spaces. • Memoryless – Probabilities are independent and don’t change. • Binary Symmetric Channel – Channel with binary input and binary output and error parameter f < .5

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have ﬁnite target spaces. • Memoryless – Probabilities are independent and don’t change. • Binary Symmetric Channel – Channel with binary input and binary output and error parameter f < .5 • Redundancy – Extra information added to a message to reduce error.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Deﬁnitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have ﬁnite target spaces. • Memoryless – Probabilities are independent and don’t change. • Binary Symmetric Channel – Channel with binary input and binary output and error parameter f < .5 • Redundancy – Extra information added to a message to reduce error. • Capacity – Maximum concentration of information for a given channel. Γ = maxP(X) I(X : Y )

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 More Definitions Assuming channel Λ is a discrete, memoryless, binary symmetric channel (BSC) with error parameter f • Discrete – Messages can be divided into separate symbols. Furthermore, X and Y have finite target spaces. • Memoryless – Probabilities are independent and don’t change. • Binary Symmetric Channel – Channel with binary input and binary output and error parameter f < .5 • Redundancy – Extra information added to a message to reduce error. • Capacity – Maximum concentration of information for a given channel. Γ = maxP(X) I(X : Y ) Simplifies to Γ = 1 − H(f)

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Fundamental Theorem Goal: to minimize error probability, while also minimizing redundancy (and thereby maximize rate).

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Fundamental Theorem Goal: to minimize error probability, while also minimizing redundancy (and thereby maximize rate). Let Λ be a BSC with parameter f < 1/2 and resulting capacity Γ = 1 − H(f). Let R be any information rate with R < Γ. Let > 0 be an arbitrarily small positive quantity. Then, there exists a code C of length N and a rate ≥ R and a decoding algorithm such that the maximum probability of error is ≤ .

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Fundamental Theorem Goal: to minimize error probability, while also minimizing redundancy (and thereby maximize rate). Let Λ be a BSC with parameter f < 1/2 and resulting capacity Γ = 1 − H(f). Let R be any information rate with R < Γ. Let > 0 be an arbitrarily small positive quantity. Then, there exists a code C of length N and a rate ≥ R and a decoding algorithm such that the maximum probability of error is ≤ . • Baby analogy

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Fundamental Theorem Goal: to minimize error probability, while also minimizing redundancy (and thereby maximize rate). Let Λ be a BSC with parameter f < 1/2 and resulting capacity Γ = 1 − H(f). Let R be any information rate with R < Γ. Let > 0 be an arbitrarily small positive quantity. Then, there exists a code C of length N and a rate ≥ R and a decoding algorithm such that the maximum probability of error is ≤ . • Baby analogy • R > Γ

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Code and Parity • Richard Hamming

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Code and Parity • Richard Hamming – (7,4)Hamming Code

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Code and Parity • Richard Hamming – (7,4)Hamming Code • SECDED

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Code and Parity • Richard Hamming – (7,4)Hamming Code • SECDED • General – (n,k)Hamming Codes: n message length, r parity bits, and k data bits, where n = 2r − 1, k = 2r − r − 1 = n − r, and r ≥ 2.

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Hamming Code and Parity • Richard Hamming – (7,4)Hamming Code • SECDED • General – (n,k)Hamming Codes: n message length, r parity bits, and k data bits, where n = 2r − 1, k = 2r − r − 1 = n − r, and r ≥ 2. r n k rate = k n 2 3 1 .3333 3 7 4 .5714 4 15 11 .7333 5 31 26 .8387 . . . . . . 10 1023 1013 .9902 . . . . . . 16 65535 65519 .9996

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Application 1 – QR Codes • Use Reed-Solomon error correction with 8 bit codewords. • Block errors Level Approx. codewords restored L 7% M 15% Q 32% H 30%

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Application 2 – DNA • DNA as code: 4 base pairs: (A, C, G, T) • Quaternary system - easy to make Binary

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Application 2 – DNA • DNA as code: 4 base pairs: (A, C, G, T) • Quaternary system - easy to make Binary • About 3.2 Billion base pairs: 6.4 Billion bits or about 800 Megabytes. Testament?

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Application 2 – DNA • DNA as code: 4 base pairs: (A, C, G, T) • Quaternary system - easy to make Binary • About 3.2 Billion base pairs: 6.4 Billion bits or about 800 Megabytes. Testament? • Competing goals: prokaryotes vs. eukaryotes

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Application 2 – DNA • DNA as code: 4 base pairs: (A, C, G, T) • Quaternary system - easy to make Binary • About 3.2 Billion base pairs: 6.4 Billion bits or about 800 Megabytes. Testament? • Competing goals: prokaryotes vs. eukaryotes • Field of Bioinformatics

0 1 0 1 1 1 1 0 0 0
1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 0 0 Sources Primary Sources 1. Aiden A. Bruen and Mario A. Forcintino. Cryptography, Information Theory, and Error-Correction John Wiley Sons, 2005. 2. David J. C. MacKay. Information Theory, Inference, and Learning Algorithms Cambridge University Press, 2003. Papers 1. Claude E. Shannon, Warren Weaver. The Mathematical Theory of Communication Univ of Illinois Press, 1949. 2. Thomas A. Kunkel. DNA Replication Fidelity JBC Papers 2004.

Error Correction Over Noisy Channels

Error Correction Over Noisy Channels

More Decks by Charles Julian Knight

Other Decks in Education

Featured

Transcript