In information theory, the Shannon–Hartley theorem tells the maximum rate at which information can be transmitted over a communications channel of a specified bandwidth in the presence of noise. It is an application of the noisy-channel coding theorem to the archetypal case of a continuous-time analog communications channel subject to Gaussian noise. The theorem establishes Shannon's channel capacity for such a communication link, a bound on the maximum amount of error-free information per time unit that can be transmitted with a specified bandwidth in the presence of the noise interference, assuming that the signal power is bounded, and that the Gaussian noise process is characterized by a known power or power spectral density. The law is named after Claude Shannon and Ralph Hartley.
The Shannon–Hartley theorem states the channel capacity C, meaning the theoretical tightest upper bound on the information rate of data that can be communicated at an arbitrarily low error rate using an average received signal power S through an analog communication channel subject to additive white Gaussian noise (AWGN) of power N:

{\displaystyle C=B\log _{2}\left(1+{\frac {S}{N}}\right)}

where C is the channel capacity in bits per second, B is the bandwidth of the channel in hertz, S is the average received signal power over the bandwidth (in watts), and N is the average power of the noise over the bandwidth (in watts).
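A minimal numerical sketch (the bandwidth and signal-to-noise values below are illustrative assumptions, not taken from the article) evaluates this bound directly, including the conversion of a signal-to-noise ratio quoted in decibels to a linear power ratio:

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley bound C = B * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# Hypothetical example: a 3 kHz channel with a 30 dB signal-to-noise ratio.
snr_db = 30.0
snr_linear = 10.0 ** (snr_db / 10.0)   # 30 dB -> 10^3 = 1000 (linear power ratio)
c = shannon_capacity(3000.0, snr_linear)
print(f"{c:.0f} bit/s")                # about 29,900 bit/s
```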
a basic tool for measurement and computation in many areas of science and engineering; in these contexts log x still often means the base-ten logarithm. In mathematics, log x usually means the natural logarithm (base e). In computer science and information theory, log often refers to the binary logarithm (base 2). The following table lists common notations for logarithms to these bases. The "ISO notation" column lists designations suggested by
321-526: A capacity. But such an errorless channel is an idealization, and if M is chosen small enough to make the noisy channel nearly errorless, the result is necessarily less than the Shannon capacity of the noisy channel of bandwidth B {\displaystyle B} , which is the Hartley–Shannon result that followed later. Claude Shannon 's development of information theory during World War II provided
a channel of bandwidth B hertz was 2B pulses per second, to arrive at his quantitative measure for achievable line rate. Hartley's law is sometimes quoted as just a proportionality between the analog bandwidth, B, in hertz and what today is called the digital bandwidth, R, in bit/s. Other times it
535-497: A channel remains open, but it can be upper bounded by another important graph invariant, the Lovász number . The noisy-channel coding theorem states that for any error probability ε > 0 and for any transmission rate R less than the channel capacity C , there is an encoding and decoding scheme transmitting data at rate R whose error probability is less than ε, for a sufficiently large block length. Also, for any rate greater than
642-461: A coding technique which allows the probability of error at the receiver to be made arbitrarily small. This means that theoretically, it is possible to transmit information nearly without error up to nearly a limit of C {\displaystyle C} bits per second. The converse is also important. If the probability of error at the receiver increases without bound as the rate is increased, so no useful information can be transmitted beyond
749-647: A combined manner provides the same theoretical capacity as using them independently. More formally, let p 1 {\displaystyle p_{1}} and p 2 {\displaystyle p_{2}} be two independent channels modelled as above; p 1 {\displaystyle p_{1}} having an input alphabet X 1 {\displaystyle {\mathcal {X}}_{1}} and an output alphabet Y 1 {\displaystyle {\mathcal {Y}}_{1}} . Idem for p 2 {\displaystyle p_{2}} . We define
a cooperative framework inspired by generative adversarial networks. CORTICAL consists of two cooperative networks: a generator with the objective of learning to sample from the capacity-achieving input distribution, and a discriminator with the objective of learning to distinguish between paired and unpaired channel input-output samples and of estimating I(X;Y). This section focuses on
963-1575: A distribution p X 1 , X 2 {\displaystyle p_{X_{1},X_{2}}} such that I ( X 1 , X 2 : Y 1 , Y 2 ) ≥ I ( X 1 : Y 1 ) + I ( X 2 : Y 2 ) {\displaystyle I(X_{1},X_{2}:Y_{1},Y_{2})\geq I(X_{1}:Y_{1})+I(X_{2}:Y_{2})} . In fact, π 1 {\displaystyle \pi _{1}} and π 2 {\displaystyle \pi _{2}} , two probability distributions for X 1 {\displaystyle X_{1}} and X 2 {\displaystyle X_{2}} achieving C ( p 1 ) {\displaystyle C(p_{1})} and C ( p 2 ) {\displaystyle C(p_{2})} , suffice: ie. C ( p 1 × p 2 ) ≥ C ( p 1 ) + C ( p 2 ) {\displaystyle C(p_{1}\times p_{2})\geq C(p_{1})+C(p_{2})} Now let us show that C ( p 1 × p 2 ) ≤ C ( p 1 ) + C ( p 2 ) {\displaystyle C(p_{1}\times p_{2})\leq C(p_{1})+C(p_{2})} . Let π 12 {\displaystyle \pi _{12}} be some distribution for
1070-3816: A given pair ( x 1 , x 2 ) {\displaystyle (x_{1},x_{2})} , we can rewrite H ( Y 1 , Y 2 | X 1 , X 2 = x 1 , x 2 ) {\displaystyle H(Y_{1},Y_{2}|X_{1},X_{2}=x_{1},x_{2})} as: H ( Y 1 , Y 2 | X 1 , X 2 = x 1 , x 2 ) = ∑ ( y 1 , y 2 ) ∈ Y 1 × Y 2 P ( Y 1 , Y 2 = y 1 , y 2 | X 1 , X 2 = x 1 , x 2 ) log ( P ( Y 1 , Y 2 = y 1 , y 2 | X 1 , X 2 = x 1 , x 2 ) ) = ∑ ( y 1 , y 2 ) ∈ Y 1 × Y 2 P ( Y 1 , Y 2 = y 1 , y 2 | X 1 , X 2 = x 1 , x 2 ) [ log ( P ( Y 1 = y 1 | X 1 = x 1 ) ) + log ( P ( Y 2 = y 2 | X 2 = x 2 ) ) ] = H ( Y 1 | X 1 = x 1 ) + H ( Y 2 | X 2 = x 2 ) {\displaystyle {\begin{aligned}H(Y_{1},Y_{2}|X_{1},X_{2}=x_{1},x_{2})&=\sum _{(y_{1},y_{2})\in {\mathcal {Y}}_{1}\times {\mathcal {Y}}_{2}}\mathbb {P} (Y_{1},Y_{2}=y_{1},y_{2}|X_{1},X_{2}=x_{1},x_{2})\log(\mathbb {P} (Y_{1},Y_{2}=y_{1},y_{2}|X_{1},X_{2}=x_{1},x_{2}))\\&=\sum _{(y_{1},y_{2})\in {\mathcal {Y}}_{1}\times {\mathcal {Y}}_{2}}\mathbb {P} (Y_{1},Y_{2}=y_{1},y_{2}|X_{1},X_{2}=x_{1},x_{2})[\log(\mathbb {P} (Y_{1}=y_{1}|X_{1}=x_{1}))+\log(\mathbb {P} (Y_{2}=y_{2}|X_{2}=x_{2}))]\\&=H(Y_{1}|X_{1}=x_{1})+H(Y_{2}|X_{2}=x_{2})\end{aligned}}} By summing this equality over all ( x 1 , x 2 ) {\displaystyle (x_{1},x_{2})} , we obtain H ( Y 1 , Y 2 | X 1 , X 2 ) = H ( Y 1 | X 1 ) + H ( Y 2 | X 2 ) {\displaystyle H(Y_{1},Y_{2}|X_{1},X_{2})=H(Y_{1}|X_{1})+H(Y_{2}|X_{2})} . We can now give an upper bound over mutual information: I ( X 1 , X 2 : Y 1 , Y 2 ) ≤ H ( Y 1 ) + H ( Y 2 ) − H ( Y 1 | X 1 ) − H ( Y 2 | X 2 ) = I ( X 1 : Y 1 ) + I ( X 2 : Y 2 ) {\displaystyle {\begin{aligned}I(X_{1},X_{2}:Y_{1},Y_{2})&\leq H(Y_{1})+H(Y_{2})-H(Y_{1}|X_{1})-H(Y_{2}|X_{2})\\&=I(X_{1}:Y_{1})+I(X_{2}:Y_{2})\end{aligned}}} This relation
1177-447: A great aid to calculations before the invention of computers. Given a positive real number b such that b ≠ 1 , the logarithm of a positive real number x with respect to base b is the exponent by which b must be raised to yield x . In other words, the logarithm of x to base b is the unique real number y such that b y = x {\displaystyle b^{y}=x} . The logarithm
a known variance. Since the variance of a Gaussian process is equivalent to its power, it is conventional to call this variance the noise power. Such a channel is called the Additive White Gaussian Noise channel, because Gaussian noise is added to the signal; "white" means equal amounts of noise at all frequencies within the channel bandwidth. Such noise can arise both from random sources of energy and also from coding and measurement error at
1391-492: A noise process consisting of adding a random wave whose amplitude is 1 or −1 at any point in time, and a channel that adds such a wave to the source signal. Such a wave's frequency components are highly dependent. Though such a noise may have a high power, it is fairly easy to transmit a continuous signal with much less power than one would need if the underlying noise was a sum of independent noises in each frequency band. For large or small and constant signal-to-noise ratios,
a non-zero probability that the channel is in deep fade, the capacity of the slow-fading channel in the strict sense is zero. However, it is possible to determine the largest value of R such that the outage probability {\displaystyle p_{out}} is less than ϵ. This value
1605-446: A product is the sum of the logarithms of the numbers being multiplied; the logarithm of the ratio of two numbers is the difference of the logarithms. The logarithm of the p -th power of a number is p times the logarithm of the number itself; the logarithm of a p -th root is the logarithm of the number divided by p . The following table lists these identities with examples. Each of the identities can be derived after substitution of
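These identities are easy to check numerically; a small sketch (with arbitrarily chosen example values) is:

```python
import math

b = 10.0                      # an arbitrary base for the check
x, y, p = 12.5, 3.2, 4.0

log = lambda v: math.log(v, b)

assert math.isclose(log(x * y), log(x) + log(y))      # product rule
assert math.isclose(log(x / y), log(x) - log(y))      # quotient rule
assert math.isclose(log(x ** p), p * log(x))          # power rule
assert math.isclose(log(x ** (1 / p)), log(x) / p)    # root rule
print("all four identities hold numerically")
```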
1712-1174: A random variable corresponding to the output of X 1 {\displaystyle X_{1}} through the channel p 1 {\displaystyle p_{1}} , and Y 2 {\displaystyle Y_{2}} for X 2 {\displaystyle X_{2}} through p 2 {\displaystyle p_{2}} . By definition C ( p 1 × p 2 ) = sup p X 1 , X 2 ( I ( X 1 , X 2 : Y 1 , Y 2 ) ) {\displaystyle C(p_{1}\times p_{2})=\sup _{p_{X_{1},X_{2}}}(I(X_{1},X_{2}:Y_{1},Y_{2}))} . Since X 1 {\displaystyle X_{1}} and X 2 {\displaystyle X_{2}} are independent, as well as p 1 {\displaystyle p_{1}} and p 2 {\displaystyle p_{2}} , ( X 1 , Y 1 ) {\displaystyle (X_{1},Y_{1})}
1819-550: A signal-to-noise ratio, but achieving reliability through error-correction coding rather than through reliably distinguishable pulse levels. If there were such a thing as a noise-free analog channel, one could transmit unlimited amounts of error-free data over it per unit of time (Note that an infinite-bandwidth analog channel could not transmit unlimited amounts of error-free data absent infinite signal power). Real channels, however, are subject to limitations imposed by both finite bandwidth and nonzero noise. Bandwidth and noise affect
1926-505: A very conservative value of M {\displaystyle M} to achieve a low error rate. The concept of an error-free capacity awaited Claude Shannon, who built on Hartley's observations about a logarithmic measure of information and Nyquist's observations about the effect of bandwidth limitations. Hartley's rate result can be viewed as the capacity of an errorless M -ary channel of 2 B {\displaystyle 2B} symbols per second. Some authors refer to it as
2033-406: A way to quantify information and its line rate (also known as data signalling rate R bits per second). This method, later known as Hartley's law, became an important precursor for Shannon's more sophisticated notion of channel capacity. Hartley argued that the maximum number of distinguishable pulse levels that can be transmitted and received reliably over a communications channel is limited by
is log b y. Roughly, a continuous function is differentiable if its graph has no sharp "corners". Moreover, as the derivative of f(x) evaluates to ln(b) b^{x} by the properties of the exponential function, the chain rule implies that the derivative of log b x is given by {\displaystyle {\frac {d}{dx}}\log _{b}x={\frac {1}{x\ln b}}.} That is,
2247-483: Is a limit to the amount of information that can be transferred by a signal of a bounded power, even when sophisticated multi-level encoding techniques are used. In the channel considered by the Shannon–Hartley theorem, noise and signal are combined by addition. That is, the receiver measures a signal that is equal to the sum of the signal encoding the desired information and a continuous random variable that represents
is a positive real number. (If b is not a positive real number, both exponentiation and logarithm can be defined but may take several values, which makes definitions much more complicated.) One of the main historical motivations of introducing logarithms is the formula {\displaystyle \log _{b}(xy)=\log _{b}x+\log _{b}y,} by which tables of logarithms allow multiplication and division to be reduced to addition and subtraction,
2461-426: Is called the base- b logarithm function or logarithmic function (or just logarithm ). The function log b x can also be essentially characterized by the product formula log b ( x y ) = log b x + log b y . {\displaystyle \log _{b}(xy)=\log _{b}x+\log _{b}y.} More precisely,
2568-660: Is commonly used in science and engineering. The natural logarithm has the number e ≈ 2.718 as its base; its use is widespread in mathematics and physics because of its very simple derivative . The binary logarithm uses base 2 and is frequently used in computer science . Logarithms were introduced by John Napier in 1614 as a means of simplifying calculations. They were rapidly adopted by navigators , scientists, engineers, surveyors , and others to perform high-accuracy computations more easily. Using logarithm tables , tedious multi-digit multiplication steps can be replaced by table look-ups and simpler addition. This
2675-539: Is denoted " log b x " (pronounced as "the logarithm of x to base b ", "the base- b logarithm of x ", or most commonly "the log, base b , of x "). An equivalent and more succinct definition is that the function log b is the inverse function to the function x ↦ b x {\displaystyle x\mapsto b^{x}} . Several important formulas, sometimes called logarithmic identities or logarithmic laws , relate logarithms to one another. The logarithm of
2782-439: Is essentially as good as the best possible code; the theorem is proved through the statistics of such random codes. Shannon's theorem shows how to compute a channel capacity from a statistical description of a channel, and establishes that given a noisy channel with capacity C {\displaystyle C} and information transmitted at a line rate R {\displaystyle R} , then if there exists
2889-491: Is exactly one real number x such that b x = y {\displaystyle b^{x}=y} . We let log b : R > 0 → R {\displaystyle \log _{b}\colon \mathbb {R} _{>0}\to \mathbb {R} } denote the inverse of f . That is, log b y is the unique real number x such that b x = y {\displaystyle b^{x}=y} . This function
is greater than one. In that case, log b (x) is an increasing function. For b < 1, log b (x) tends to minus infinity instead. When x approaches zero, log b x goes to minus infinity for b > 1 (plus infinity for b < 1, respectively). Analytic properties of functions pass to their inverses. Thus, as f(x) = b^{x} is a continuous and differentiable function, so
3103-534: Is independent of ( X 2 , Y 2 ) {\displaystyle (X_{2},Y_{2})} . We can apply the following property of mutual information : I ( X 1 , X 2 : Y 1 , Y 2 ) = I ( X 1 : Y 1 ) + I ( X 2 : Y 2 ) {\displaystyle I(X_{1},X_{2}:Y_{1},Y_{2})=I(X_{1}:Y_{1})+I(X_{2}:Y_{2})} For now we only need to find
is known as the ϵ-outage capacity. In a fast-fading channel, where the latency requirement is greater than the coherence time and the codeword length spans many coherence periods, one can average over many independent channel fades by coding over a large number of coherence time intervals. Thus, it is possible to achieve a reliable rate of communication of {\displaystyle \mathbb {E} (\log _{2}(1+|h|^{2}SNR))} [bits/s/Hz] and it
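For a Rayleigh-fading channel model (an assumption made here for illustration; the article does not fix a particular fading distribution), both the ergodic capacity E[log2(1 + |h|² SNR)] and the ϵ-outage capacity can be estimated by a simple Monte Carlo simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
snr = 10.0                      # average linear SNR (10 dB), example value
n = 100_000

# |h|^2 for Rayleigh fading is exponentially distributed with unit mean.
h2 = rng.exponential(scale=1.0, size=n)
rates = np.log2(1.0 + h2 * snr)            # instantaneous rates, bit/s/Hz

ergodic_capacity = rates.mean()            # fast-fading (ergodic) capacity
eps = 0.01
outage_capacity = np.quantile(rates, eps)  # largest R with P(rate < R) <= eps

print(ergodic_capacity, outage_capacity)
```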
is logarithmic in power and approximately linear in bandwidth. This is called the bandwidth-limited regime. When the SNR is small (SNR ≪ 0 dB), the capacity {\displaystyle C\approx {\frac {\bar {P}}{N_{0}\ln 2}}} is linear in power but insensitive to bandwidth. This is called the
3424-463: Is meaningful to speak of this value as the capacity of the fast-fading channel. Feedback capacity is the greatest rate at which information can be reliably transmitted, per unit time, over a point-to-point communication channel in which the receiver feeds back the channel outputs to the transmitter. Information-theoretic analysis of communication systems that incorporate feedback is more complicated and challenging than without feedback. Possibly, this
3531-575: Is possible because the logarithm of a product is the sum of the logarithms of the factors: log b ( x y ) = log b x + log b y , {\displaystyle \log _{b}(xy)=\log _{b}x+\log _{b}y,} provided that b , x and y are all positive and b ≠ 1 . The slide rule , also based on logarithms, allows quick calculations without tables, but at lower precision. The present-day notion of logarithms comes from Leonhard Euler , who connected them to
3638-426: Is preserved at the supremum. Therefore Combining the two inequalities we proved, we obtain the result of the theorem: If G is an undirected graph , it can be used to define a communications channel in which the symbols are the graph vertices, and two codewords may be confused with each other if their symbols in each position are equal or adjacent. The computational complexity of finding the Shannon capacity of such
3745-439: Is quoted in this more quantitative form, as an achievable line rate of R {\displaystyle R} bits per second: Hartley did not work out exactly how the number M should depend on the noise statistics of the channel, or how the communication could be made reliable even when individual symbol pulses could not be reliably distinguished to M levels; with Gaussian noise statistics, system designers had to choose
3852-403: Is related to the number of decimal digits of a positive integer x : The number of digits is the smallest integer strictly bigger than log 10 ( x ) . For example, log 10 (5986) is approximately 3.78 . The next integer above it is 4, which is the number of digits of 5986. Both the natural logarithm and the binary logarithm are used in information theory , corresponding to
3959-607: Is the bandwidth (in hertz). The quantity 2 B {\displaystyle 2B} later came to be called the Nyquist rate , and transmitting at the limiting pulse rate of 2 B {\displaystyle 2B} pulses per second as signalling at the Nyquist rate . Nyquist published his results in 1928 as part of his paper "Certain topics in Telegraph Transmission Theory". During 1928, Hartley formulated
is the gain of subchannel n, with λ chosen to meet the power constraint. In a slow-fading channel, where the coherence time is greater than the latency requirement, there is no definite capacity as the maximum rate of reliable communications supported by the channel, {\displaystyle \log _{2}(1+|h|^{2}SNR)}, depends on
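A water-filling allocation of this form can be computed numerically, for example by bisecting on the water level 1/λ until the power constraint is met (a sketch under assumed subchannel gains, not taken from the article):

```python
import numpy as np

def water_filling(gains_sq, noise_psd, total_power, iters=100):
    """Powers P_n = max(1/lambda - N0/|h_n|^2, 0) with sum(P_n) = total_power."""
    floors = noise_psd / gains_sq           # N0 / |h_n|^2 for each subchannel
    lo, hi = 0.0, floors.max() + total_power
    for _ in range(iters):                  # bisect on the water level 1/lambda
        level = 0.5 * (lo + hi)
        if np.maximum(level - floors, 0.0).sum() > total_power:
            hi = level
        else:
            lo = level
    return np.maximum(lo - floors, 0.0)

gains_sq = np.array([1.0, 0.5, 0.1])        # example subchannel gains |h_n|^2 (assumed)
n0, p_total = 1.0, 4.0
powers = water_filling(gains_sq, n0, p_total)
capacity = np.log2(1.0 + powers * gains_sq / n0).sum()   # bits per channel use
print(powers, capacity)
```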
4173-430: Is the theoretical maximum rate at which information can be reliably transmitted over a communication channel . Following the terms of the noisy-channel coding theorem , the channel capacity of a given channel is the highest information rate (in units of information per unit time) that can be achieved with arbitrarily small error probability. Information theory , developed by Claude E. Shannon in 1948, defines
is written as f(x) = b^{x}. When b is positive and unequal to 1, we show below that f is invertible when considered as a function from the reals to the positive reals. Let b be a positive real number not equal to 1 and let f(x) = b^{x}. It is a standard result in real analysis that any continuous strictly monotonic function is bijective between its domain and range. This fact follows from
4387-633: The International Organization for Standardization . The history of logarithms in seventeenth-century Europe saw the discovery of a new function that extended the realm of analysis beyond the scope of algebraic methods. The method of logarithms was publicly propounded by John Napier in 1614, in a book titled Mirifici Logarithmorum Canonis Descriptio ( Description of the Wonderful Canon of Logarithms ). Prior to Napier's invention, there had been other techniques of similar scopes, such as
4494-439: The acidity of an aqueous solution . Logarithms are commonplace in scientific formulae , and in measurements of the complexity of algorithms and of geometric objects called fractals . They help to describe frequency ratios of musical intervals , appear in formulas counting prime numbers or approximating factorials , inform some models in psychophysics , and can aid in forensic accounting . The concept of logarithm as
4601-506: The conditional probability distribution function of Y {\displaystyle Y} given X {\displaystyle X} , which is an inherent fixed property of the communication channel. Then the choice of the marginal distribution p X ( x ) {\displaystyle p_{X}(x)} completely determines the joint distribution p X , Y ( x , y ) {\displaystyle p_{X,Y}(x,y)} due to
4708-431: The decimal number system: log 10 ( 10 x ) = log 10 10 + log 10 x = 1 + log 10 x . {\displaystyle \log _{10}\,(\,10\,x\,)\ =\;\log _{10}10\ +\;\log _{10}x\ =\ 1\,+\,\log _{10}x\,.} Thus, log 10 ( x )
4815-413: The exponential function in the 18th century, and who also introduced the letter e as the base of natural logarithms. Logarithmic scales reduce wide-ranging quantities to smaller scopes. For example, the decibel (dB) is a unit used to express ratio as logarithms , mostly for signal power and amplitude (of which sound pressure is a common example). In chemistry, pH is a logarithmic measure for
4922-501: The function now known as the natural logarithm began as an attempt to perform a quadrature of a rectangular hyperbola by Grégoire de Saint-Vincent , a Belgian Jesuit residing in Prague. Archimedes had written The Quadrature of the Parabola in the third century BC, but a quadrature for the hyperbola eluded all efforts until Saint-Vincent published his results in 1647. The relation that
5029-579: The intermediate value theorem . Now, f is strictly increasing (for b > 1 ), or strictly decreasing (for 0 < b < 1 ), is continuous, has domain R {\displaystyle \mathbb {R} } , and has range R > 0 {\displaystyle \mathbb {R} _{>0}} . Therefore, f is a bijection from R {\displaystyle \mathbb {R} } to R > 0 {\displaystyle \mathbb {R} _{>0}} . In other words, for each positive real number y , there
5136-428: The logarithm base 10 {\displaystyle 10} of 1000 is 3 , or log 10 (1000) = 3 . The logarithm of x to base b is denoted as log b ( x ) , or without parentheses, log b x . When the base is clear from the context or is irrelevant it is sometimes written log x . The logarithm base 10 is called the decimal or common logarithm and
the natural logarithm is used, assuming B is in hertz; the signal and noise powers S and N are expressed in a linear power unit (like watts or volts squared). Since S/N figures are often cited in dB, a conversion may be needed. For example, a signal-to-noise ratio of 30 dB corresponds to a linear power ratio of {\displaystyle 10^{30/10}=10^{3}=1000}. To determine
the power-limited regime. The bandwidth-limited regime and power-limited regime are illustrated in the figure. The capacity of the frequency-selective channel is given by so-called water filling power allocation, where {\displaystyle P_{n}^{*}=\max \left\{\left({\frac {1}{\lambda }}-{\frac {N_{0}}{|{\bar {h}}_{n}|^{2}}}\right),0\right\}} and {\displaystyle |{\bar {h}}_{n}|^{2}}
5457-510: The prosthaphaeresis or the use of tables of progressions, extensively developed by Jost Bürgi around 1600. Napier coined the term for logarithm in Middle Latin, logarithmus , literally meaning ' ratio-number ' , derived from the Greek logos ' proportion, ratio, word ' + arithmos ' number ' . The common logarithm of a number is the index of that power of ten which equals
the slope of the tangent touching the graph of the base-b logarithm at the point (x, log b (x)) equals 1/(x ln(b)). The derivative of ln(x) is 1/x; this implies that ln(x) is the unique antiderivative of 1/x that has the value 0 for x = 1. It is this very simple formula that motivated calling this logarithm "natural"; this is also one of
5671-400: The x - and the y -coordinates (or upon reflection at the diagonal line x = y ), as shown at the right: a point ( t , u = b ) on the graph of f yields a point ( u , t = log b u ) on the graph of the logarithm and vice versa. As a consequence, log b ( x ) diverges to infinity (gets bigger than any given number) if x grows to infinity, provided that b
5778-407: The 1970s, because it allows, at the expense of precision, much faster computation than techniques based on tables. A deeper study of logarithms requires the concept of a function . A function is a rule that, given one number, produces another number. An example is the function producing the x -th power of b from any real number x , where the base b is a fixed number. This function
the AWGN channel capacity is

{\displaystyle C_{\text{AWGN}}=W\log _{2}\left(1+{\frac {\bar {P}}{N_{0}W}}\right)}

where {\displaystyle {\frac {\bar {P}}{N_{0}W}}} is the received signal-to-noise ratio (SNR). This result is known as the Shannon–Hartley theorem. When the SNR is large (SNR ≫ 0 dB), the capacity {\displaystyle C\approx W\log _{2}{\frac {\bar {P}}{N_{0}W}}}
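The two regimes can be illustrated numerically; the sketch below (the power and bandwidth values are assumptions chosen for this example) compares the exact AWGN capacity with the high-SNR and low-SNR approximations:

```python
import math

def c_exact(p_bar, n0, w):
    return w * math.log2(1.0 + p_bar / (n0 * w))

def c_high_snr(p_bar, n0, w):          # bandwidth-limited regime, SNR >> 0 dB
    return w * math.log2(p_bar / (n0 * w))

def c_low_snr(p_bar, n0):              # power-limited regime, SNR << 0 dB
    return p_bar / (n0 * math.log(2))

n0 = 1e-9                              # noise PSD in W/Hz (assumed)
p_bar = 1e-3                           # average received power in W (assumed)
for w in (1e3, 1e6, 1e9):              # bandwidths in Hz
    # each approximation is only meaningful in its own SNR regime
    print(w, c_exact(p_bar, n0, w), c_high_snr(p_bar, n0, w), c_low_snr(p_bar, n0))
```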
the Trapdoor channel, Ising channel, the binary erasure channel with a no-consecutive-ones input constraint, NOST channels.

Logarithm

In mathematics, the logarithm to base b is the inverse function of exponentiation with base b. That means that the logarithm of a number x to the base b is the exponent to which b must be raised to produce x. For example, since 1000 = 10^3,
the additive noise is not white (or that the S/N is not constant with frequency over the bandwidth) is obtained by treating the channel as many narrow, independent Gaussian channels in parallel:

{\displaystyle C=\int _{0}^{B}\log _{2}\left(1+{\frac {S(f)}{N(f)}}\right)df}

where S(f) is the signal power spectrum and N(f) is the noise power spectrum, as functions of the frequency f. Note: the theorem only applies to Gaussian stationary process noise. This formula's way of introducing frequency-dependent noise cannot describe all continuous-time noise processes. For example, consider
6206-407: The advance of science, especially astronomy . They were critical to advances in surveying , celestial navigation , and other domains. Pierre-Simon Laplace called logarithms As the function f ( x ) = b is the inverse function of log b x , it has been called an antilogarithm . Nowadays, this function is more commonly called an exponential function . A key tool that enabled
6313-507: The advent of novel error correction coding mechanisms that have resulted in achieving performance very close to the limits promised by channel capacity. The basic mathematical model for a communication system is the following: where: Let X {\displaystyle X} and Y {\displaystyle Y} be modeled as random variables. Furthermore, let p Y | X ( y | x ) {\displaystyle p_{Y|X}(y|x)} be
6420-1208: The alphabet of X 1 {\displaystyle X_{1}} , Y 1 {\displaystyle {\mathcal {Y}}_{1}} for Y 1 {\displaystyle Y_{1}} , and analogously X 2 {\displaystyle {\mathcal {X}}_{2}} and Y 2 {\displaystyle {\mathcal {Y}}_{2}} . By definition of mutual information, we have I ( X 1 , X 2 : Y 1 , Y 2 ) = H ( Y 1 , Y 2 ) − H ( Y 1 , Y 2 | X 1 , X 2 ) ≤ H ( Y 1 ) + H ( Y 2 ) − H ( Y 1 , Y 2 | X 1 , X 2 ) {\displaystyle {\begin{aligned}I(X_{1},X_{2}:Y_{1},Y_{2})&=H(Y_{1},Y_{2})-H(Y_{1},Y_{2}|X_{1},X_{2})\\&\leq H(Y_{1})+H(Y_{2})-H(Y_{1},Y_{2}|X_{1},X_{2})\end{aligned}}} Let us rewrite
the approximation to the logarithm: {\displaystyle \log _{2}\left(1+{\frac {S}{N}}\right)\approx {\frac {S}{N}}\cdot \log _{2}e;} then the capacity is linear in power. This is called the power-limited regime. In this low-SNR approximation, capacity is independent of bandwidth if the noise is white, of spectral density {\displaystyle N_{0}} watts per hertz, in which case the total noise power is {\displaystyle N=B\cdot N_{0}}.

Channel capacity

Channel capacity, in electrical engineering, computer science, and information theory,
the base is given by: {\displaystyle b=x^{\frac {1}{y}},} which can be seen from taking the defining equation {\displaystyle x=b^{\,\log _{b}x}=b^{y}} to the power of {\displaystyle {\tfrac {1}{y}}.} Among all choices for
6741-405: The base, three are particularly common. These are b = 10 , b = e (the irrational mathematical constant e ≈ 2.71828183 ), and b = 2 (the binary logarithm ). In mathematical analysis , the logarithm base e is widespread because of analytical properties explained below. On the other hand, base 10 logarithms (the common logarithm ) are easy to use for manual calculations in
the capacity formula can be approximated: When the SNR is large (S/N ≫ 1), the logarithm is approximated by {\displaystyle \log _{2}\left(1+{\frac {S}{N}}\right)\approx \log _{2}{\frac {S}{N}},} in which case the capacity is logarithmic in power and approximately linear in bandwidth (not quite linear, since N increases with bandwidth, imparting a logarithmic effect). This is called the bandwidth-limited regime. Similarly, when the SNR is small (if {\displaystyle S/N\ll 1}), applying
6955-455: The channel p 1 × p 2 {\displaystyle p_{1}\times p_{2}} defining ( X 1 , X 2 ) {\displaystyle (X_{1},X_{2})} and the corresponding output ( Y 1 , Y 2 ) {\displaystyle (Y_{1},Y_{2})} . Let X 1 {\displaystyle {\mathcal {X}}_{1}} be
7062-440: The channel capacity, it is necessary to find the capacity-achieving distribution p X ( x ) {\displaystyle p_{X}(x)} and evaluate the mutual information I ( X ; Y ) {\displaystyle I(X;Y)} . Research has mostly focused on studying additive noise channels under certain power constraints and noise distributions, as analytical methods are not feasible in
the channel capacity, the probability of error at the receiver goes to 0.5 as the block length goes to infinity. An application of the channel capacity concept to an additive white Gaussian noise (AWGN) channel with B Hz bandwidth and signal-to-noise ratio S/N is the Shannon–Hartley theorem:

{\displaystyle C=B\log _{2}\left(1+{\frac {S}{N}}\right)}

C is measured in bits per second if the logarithm is taken in base 2, or nats per second if
7276-468: The channel capacity. The theorem does not address the rare situation in which rate and capacity are equal. The Shannon–Hartley theorem establishes what that channel capacity is for a finite-bandwidth continuous-time channel subject to Gaussian noise. It connects Hartley's result with Shannon's channel capacity theorem in a form that is equivalent to specifying the M in Hartley's line rate formula in terms of
7383-453: The common logarithms of trigonometric functions . Another critical application was the slide rule , a pair of logarithmically divided scales used for calculation. The non-sliding logarithmic scale, Gunter's rule , was invented shortly after Napier's invention. William Oughtred enhanced it to create the slide rule—a pair of logarithmic scales movable with respect to each other. Numbers are placed on sliding scales at distances proportional to
7490-400: The differences between their logarithms. Sliding the upper scale appropriately amounts to mechanically adding logarithms, as illustrated here: For example, adding the distance from 1 to 2 on the lower scale to the distance from 1 to 3 on the upper scale yields a product of 6, which is read off at the lower part. The slide rule was an essential calculating tool for engineers and scientists until
the dynamic range of the signal amplitude and the precision with which the receiver can distinguish amplitude levels. Specifically, if the amplitude of the transmitted signal is restricted to the range of [−A ... +A] volts, and the precision of the receiver is ±ΔV volts, then the maximum number of distinct pulses M is given by

{\displaystyle M=1+{\frac {A}{\Delta V}}.}

By taking information per pulse in bit/pulse to be the base-2 logarithm of
the following formula: {\displaystyle \log _{b}x={\frac {\log _{k}x}{\log _{k}b}}.} Typical scientific calculators calculate the logarithms to bases 10 and e. Logarithms with respect to any base b can be determined using either of these two logarithms by
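For instance, a logarithm to a base that a calculator or standard library does not provide directly can be obtained this way (a small sketch; the choice of base 7 is arbitrary):

```python
import math

def log_base(x: float, b: float) -> float:
    """Change of base: log_b(x) = ln(x) / ln(b)."""
    return math.log(x) / math.log(b)

print(log_base(343.0, 7.0))        # approximately 3.0, since 7**3 == 343
print(math.log(343.0, 7.0))        # math.log also accepts a base argument
```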
7811-469: The ideas of Nyquist and Hartley, and then formulated a complete theory of information and its transmission. In 1927, Nyquist determined that the number of independent pulses that could be put through a telegraph channel per unit time is limited to twice the bandwidth of the channel. In symbolic notation, where f p {\displaystyle f_{p}} is the pulse frequency (in pulses per second) and B {\displaystyle B}
7918-425: The identity which, in turn, induces a mutual information I ( X ; Y ) {\displaystyle I(X;Y)} . The channel capacity is defined as where the supremum is taken over all possible choices of p X ( x ) {\displaystyle p_{X}(x)} . Channel capacity is additive over independent channels. It means that using two independent channels in
8025-482: The inverse of exponentiation extends to other mathematical structures as well. However, in general settings, the logarithm tends to be a multi-valued function. For example, the complex logarithm is the multi-valued inverse of the complex exponential function. Similarly, the discrete logarithm is the multi-valued inverse of the exponential function in finite groups; it has uses in public-key cryptography . Addition , multiplication , and exponentiation are three of
8132-777: The last term of entropy . H ( Y 1 , Y 2 | X 1 , X 2 ) = ∑ ( x 1 , x 2 ) ∈ X 1 × X 2 P ( X 1 , X 2 = x 1 , x 2 ) H ( Y 1 , Y 2 | X 1 , X 2 = x 1 , x 2 ) {\displaystyle H(Y_{1},Y_{2}|X_{1},X_{2})=\sum _{(x_{1},x_{2})\in {\mathcal {X}}_{1}\times {\mathcal {X}}_{2}}\mathbb {P} (X_{1},X_{2}=x_{1},x_{2})H(Y_{1},Y_{2}|X_{1},X_{2}=x_{1},x_{2})} By definition of
During the late 1920s, Harry Nyquist and Ralph Hartley developed a handful of fundamental ideas related to the transmission of information, particularly in the context of the telegraph as a communications system. At the time, these concepts were powerful breakthroughs individually, but they were not part of a comprehensive theory. In the 1940s, Claude Shannon developed the concept of channel capacity, based in part on
8346-462: The log base 2 ; and in photography rescaled base 2 logarithms are used to measure exposure values , light levels , exposure times , lens apertures , and film speeds in "stops". The abbreviation log x is often used when the intended base can be inferred based on the context or discipline, or when the base is indeterminate or immaterial. Common logarithms (base 10), historically used in logarithm tables and slide rules, are
8453-433: The logarithm definitions x = b log b x {\displaystyle x=b^{\,\log _{b}x}} or y = b log b y {\displaystyle y=b^{\,\log _{b}y}} in the left hand sides. The logarithm log b x can be computed from the logarithms of x and b with respect to an arbitrary base k using
8560-434: The logarithm provides between a geometric progression in its argument and an arithmetic progression of values, prompted A. A. de Sarasa to make the connection of Saint-Vincent's quadrature and the tradition of logarithms in prosthaphaeresis , leading to the term "hyperbolic logarithm", a synonym for natural logarithm. Soon the new function was appreciated by Christiaan Huygens , and James Gregory . The notation Log y
8667-528: The logarithm to any base b > 1 is the only increasing function f from the positive reals to the reals satisfying f ( b ) = 1 and f ( x y ) = f ( x ) + f ( y ) . {\displaystyle f(xy)=f(x)+f(y).} As discussed above, the function log b is the inverse to the exponential function x ↦ b x {\displaystyle x\mapsto b^{x}} . Therefore, their graphs correspond to each other upon exchanging
8774-926: The lookups of the two logarithms, calculating their sum or difference, and looking up the antilogarithm is much faster than performing the multiplication by earlier methods such as prosthaphaeresis , which relies on trigonometric identities . Calculations of powers and roots are reduced to multiplications or divisions and lookups by c d = ( 10 log 10 c ) d = 10 d log 10 c {\displaystyle c^{d}=\left(10^{\,\log _{10}c}\right)^{d}=10^{\,d\log _{10}c}} and c d = c 1 d = 10 1 d log 10 c . {\displaystyle {\sqrt[{d}]{c}}=c^{\frac {1}{d}}=10^{{\frac {1}{d}}\log _{10}c}.} Trigonometric calculations were facilitated by tables that contained
the majority of other scenarios. Hence, alternative approaches such as investigation of the input support, relaxations, and capacity bounds have been proposed in the literature. The capacity of a discrete memoryless channel can be computed using the Blahut–Arimoto algorithm. Deep learning can be used to estimate the channel capacity. In fact, the channel capacity and the capacity-achieving distribution of any discrete-time continuous memoryless vector channel can be obtained using CORTICAL,
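As a concrete sketch of the Blahut–Arimoto iteration (the binary symmetric channel below is just an example, not taken from the article), the capacity of a discrete memoryless channel can be estimated from its transition matrix p(y|x) as follows:

```python
import numpy as np

def blahut_arimoto(p_y_given_x, iters=200):
    """Estimate the capacity (in bits) of a discrete memoryless channel.

    p_y_given_x: matrix whose rows are p(y|x); a sketch of the Blahut-Arimoto iteration.
    """
    p = np.asarray(p_y_given_x, dtype=float)
    n_x = p.shape[0]
    r = np.full(n_x, 1.0 / n_x)                 # input distribution, start uniform
    for _ in range(iters):
        q = r[:, None] * p                      # joint r(x) p(y|x)
        q /= q.sum(axis=0, keepdims=True)       # posterior q(x|y)
        # update r(x) proportional to prod_y q(x|y)^p(y|x)
        log_r = np.sum(p * np.log(q + 1e-300), axis=1)
        r = np.exp(log_r - log_r.max())
        r /= r.sum()
    p_y = r @ p                                  # output distribution
    mi = np.sum(r[:, None] * p * np.log2((p + 1e-300) / p_y[None, :]))
    return mi, r

# Binary symmetric channel with crossover probability 0.1 (example channel):
bsc = [[0.9, 0.1], [0.1, 0.9]]
c, r = blahut_arimoto(bsc)
print(c)   # close to 1 - H2(0.1), about 0.531 bits per channel use
```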
the mantissa, as the characteristic can be easily determined by counting digits from the decimal point. The characteristic of 10 · x is one plus the characteristic of x, and their mantissas are the same. Thus using a three-digit log table, the logarithm of 3542 is approximated by

{\displaystyle {\begin{aligned}\log _{10}3542&=\log _{10}(1000\cdot 3.542)\\&=3+\log _{10}3.542\\&\approx 3+\log _{10}3.54\end{aligned}}}

Greater accuracy can be obtained by interpolation:

{\displaystyle \log _{10}3542\approx {}3+\log _{10}3.54+0.2(\log _{10}3.55-\log _{10}3.54)}

The value of 10^{x} can be determined by reverse look up in
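The decomposition into characteristic and mantissa, and the linear interpolation step, can be checked numerically (a small sketch reproducing the worked example above):

```python
import math

# Characteristic and mantissa: 3542 = 10**3 * 3.542
characteristic = 3
log_354 = math.log10(3.54)        # entries a three-digit log table would provide
log_355 = math.log10(3.55)

approx = characteristic + log_354
interpolated = characteristic + log_354 + 0.2 * (log_355 - log_354)
print(approx, interpolated, math.log10(3542))   # 3.5490..., 3.54925..., 3.54925...
```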
9095-447: The most fundamental arithmetic operations. The inverse of addition is subtraction , and the inverse of multiplication is division . Similarly, a logarithm is the inverse operation of exponentiation . Exponentiation is when a number b , the base , is raised to a certain power y , the exponent , to give a value x ; this is denoted b y = x . {\displaystyle b^{y}=x.} For example, raising 2 to
9202-412: The net data rate that can be approached with coding is equivalent to using that M {\displaystyle M} in Hartley's law. In the simple version above, the signal and noise are fully uncorrelated, in which case S + N {\displaystyle S+N} is the total power of the received signal and noise together. A generalization of the above equation for the case where
9309-402: The next big step in understanding how much information could be reliably communicated through noisy channels. Building on Hartley's foundation, Shannon's noisy channel coding theorem (1948) describes the maximum possible efficiency of error-correcting methods versus levels of noise interference and data corruption. The proof of the theorem shows that a randomly constructed error-correcting code
9416-400: The noise. This addition creates uncertainty as to the original signal's value. If the receiver has some information about the random process that generates the noise, one can in principle recover the information in the original signal by considering all possible states of the noise process. In the case of the Shannon–Hartley theorem, the noise is assumed to be generated by a Gaussian process with
9523-460: The notion of channel capacity and provides a mathematical model by which it may be computed. The key result states that the capacity of the channel, as defined above, is given by the maximum of the mutual information between the input and output of the channel, where the maximization is with respect to the input distribution. The notion of channel capacity has been central to the development of modern wireline and wireless communication systems, with
the number of distinct messages M that could be sent, Hartley constructed a measure of the line rate R as

{\displaystyle R=f_{p}\log _{2}(M),}

where {\displaystyle f_{p}} is the pulse rate, also known as the symbol rate, in symbols/second or baud. Hartley then combined the above quantification with Nyquist's observation that the number of independent pulses that could be put through
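Putting the pieces together (a sketch with assumed example numbers): with amplitude range ±A, receiver precision ±ΔV, and Nyquist signalling at 2B pulses per second, Hartley's achievable line rate follows directly:

```python
import math

A = 1.0          # amplitude range [-A, +A] in volts (assumed)
delta_v = 0.125  # receiver precision +/- delta_v volts (assumed)
B = 3000.0       # bandwidth in hertz (assumed)

M = 1 + A / delta_v            # distinguishable pulse levels, here 9
f_p = 2 * B                    # Nyquist pulse rate, pulses per second
R = f_p * math.log2(M)         # Hartley's line rate in bit/s
print(M, R)                    # 9 levels, about 19,000 bit/s
```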
9737-425: The number. Speaking of a number as requiring so many figures is a rough allusion to common logarithm, and was referred to by Archimedes as the "order of a number". The first real logarithms were heuristic methods to turn multiplication into addition, thus facilitating rapid computation. Some of these methods used tables derived from trigonometric identities. Such methods are called prosthaphaeresis . Invention of
the output. The directed information was coined by James Massey in 1990, who showed that it is an upper bound on feedback capacity. For memoryless channels, Shannon showed that feedback does not increase the capacity, and the feedback capacity coincides with the channel capacity characterized by the mutual information between the input and the output. The feedback capacity is known as a closed-form expression only for several examples such as:
9951-414: The power of 3 gives 8 : 2 3 = 8. {\displaystyle 2^{3}=8.} The logarithm of base b is the inverse operation, that provides the output y from the input x . That is, y = log b x {\displaystyle y=\log _{b}x} is equivalent to x = b y {\displaystyle x=b^{y}} if b
10058-455: The power ratio back to a voltage ratio, so the number of levels is approximately proportional to the ratio of signal RMS amplitude to noise standard deviation. This similarity in form between Shannon's capacity and Hartley's law should not be interpreted to mean that M {\displaystyle M} pulse levels can be literally sent without any confusion. More levels are needed to allow for redundant coding and error correction, but
10165-424: The practical use of logarithms was the table of logarithms . The first such table was compiled by Henry Briggs in 1617, immediately after Napier's invention but with the innovation of using 10 as the base. Briggs' first table contained the common logarithms of all integers in the range from 1 to 1000, with a precision of 14 digits. Subsequently, tables with increasing scope were written. These tables listed
10272-475: The previous formula: log b x = log 10 x log 10 b = log e x log e b . {\displaystyle \log _{b}x={\frac {\log _{10}x}{\log _{10}b}}={\frac {\log _{e}x}{\log _{e}b}}.} Given a number x and its logarithm y = log b x to an unknown base b ,
10379-1734: The product channel p 1 × p 2 {\displaystyle p_{1}\times p_{2}} as ∀ ( x 1 , x 2 ) ∈ ( X 1 , X 2 ) , ( y 1 , y 2 ) ∈ ( Y 1 , Y 2 ) , ( p 1 × p 2 ) ( ( y 1 , y 2 ) | ( x 1 , x 2 ) ) = p 1 ( y 1 | x 1 ) p 2 ( y 2 | x 2 ) {\displaystyle \forall (x_{1},x_{2})\in ({\mathcal {X}}_{1},{\mathcal {X}}_{2}),\;(y_{1},y_{2})\in ({\mathcal {Y}}_{1},{\mathcal {Y}}_{2}),\;(p_{1}\times p_{2})((y_{1},y_{2})|(x_{1},x_{2}))=p_{1}(y_{1}|x_{1})p_{2}(y_{2}|x_{2})} This theorem states: C ( p 1 × p 2 ) = C ( p 1 ) + C ( p 2 ) {\displaystyle C(p_{1}\times p_{2})=C(p_{1})+C(p_{2})} We first show that C ( p 1 × p 2 ) ≥ C ( p 1 ) + C ( p 2 ) {\displaystyle C(p_{1}\times p_{2})\geq C(p_{1})+C(p_{2})} . Let X 1 {\displaystyle X_{1}} and X 2 {\displaystyle X_{2}} be two independent random variables. Let Y 1 {\displaystyle Y_{1}} be
10486-629: The product channel, P ( Y 1 , Y 2 = y 1 , y 2 | X 1 , X 2 = x 1 , x 2 ) = P ( Y 1 = y 1 | X 1 = x 1 ) P ( Y 2 = y 2 | X 2 = x 2 ) {\displaystyle \mathbb {P} (Y_{1},Y_{2}=y_{1},y_{2}|X_{1},X_{2}=x_{1},x_{2})=\mathbb {P} (Y_{1}=y_{1}|X_{1}=x_{1})\mathbb {P} (Y_{2}=y_{2}|X_{2}=x_{2})} . For
10593-402: The random channel gain | h | 2 {\displaystyle |h|^{2}} , which is unknown to the transmitter. If the transmitter encodes data at rate R {\displaystyle R} [bits/s/Hz], there is a non-zero probability that the decoding error probability cannot be made arbitrarily small, in which case the system is said to be in outage. With
10700-446: The rate at which information can be transmitted over an analog channel. Bandwidth limitations alone do not impose a cap on the maximum information rate because it is still possible for the signal to take on an indefinitely large number of different voltage levels on each symbol pulse, with each slightly different level being assigned a different meaning or bit sequence. Taking into account both noise and bandwidth limitations, however, there
10807-1046: The same table, since the logarithm is a monotonic function . The product and quotient of two positive numbers c and d were routinely calculated as the sum and difference of their logarithms. The product cd or quotient c / d came from looking up the antilogarithm of the sum or difference, via the same table: c d = 10 log 10 c 10 log 10 d = 10 log 10 c + log 10 d {\displaystyle cd=10^{\,\log _{10}c}\,10^{\,\log _{10}d}=10^{\,\log _{10}c\,+\,\log _{10}d}} and c d = c d − 1 = 10 log 10 c − log 10 d . {\displaystyle {\frac {c}{d}}=cd^{-1}=10^{\,\log _{10}c\,-\,\log _{10}d}.} For manual calculations that demand any appreciable precision, performing
the sender and receiver respectively. Since sums of independent Gaussian random variables are themselves Gaussian random variables, this conveniently simplifies analysis, if one assumes that such error sources are also Gaussian and independent. Comparing the channel capacity to the information rate from Hartley's law, we can find the effective number of distinguishable levels M:

{\displaystyle 2B\log _{2}(M)=B\log _{2}\left(1+{\frac {S}{N}}\right)}

{\displaystyle M={\sqrt {1+{\frac {S}{N}}}}.}

The square root effectively converts
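For example (a quick numerical check with assumed values): at a signal-to-noise ratio of 30 dB the effective number of distinguishable levels is about 32, and Hartley's law with that M reproduces the Shannon capacity:

```python
import math

snr = 1000.0                         # 30 dB as a linear power ratio (assumed)
B = 3000.0                           # bandwidth in hertz (assumed)

M = math.sqrt(1.0 + snr)             # effective number of levels, about 31.6
hartley_rate = 2 * B * math.log2(M)  # bit/s
shannon_capacity = B * math.log2(1.0 + snr)
print(M, hartley_rate, shannon_capacity)   # the two rates coincide
```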
the single-antenna, point-to-point scenario. For channel capacity in systems with multiple antennas, see the article on MIMO. If the average received power is {\displaystyle {\bar {P}}} [W], the total bandwidth is W in hertz, and the noise power spectral density is {\displaystyle N_{0}} [W/Hz],
11128-488: The use of nats or bits as the fundamental units of information, respectively. Binary logarithms are also used in computer science , where the binary system is ubiquitous; in music theory , where a pitch ratio of two (the octave ) is ubiquitous and the number of cents between any two pitches is a scaled version of the binary logarithm, or log 2 times 1200, of the pitch ratio (that is, 100 cents per semitone in conventional equal temperament ), or equivalently
11235-455: The values of log 10 x for any number x in a certain range, at a certain precision. Base-10 logarithms were universally used for computation, hence the name common logarithm, since numbers that differ by factors of 10 have logarithms that differ by integers. The common logarithm of x can be separated into an integer part and a fractional part , known as the characteristic and mantissa . Tables of logarithms need only include
11342-673: Was adopted by Leibniz in 1675, and the next year he connected it to the integral ∫ d y y . {\textstyle \int {\frac {dy}{y}}.} Before Euler developed his modern conception of complex natural logarithms, Roger Cotes had a nearly equivalent result when he showed in 1714 that log ( cos θ + i sin θ ) = i θ . {\displaystyle \log(\cos \theta +i\sin \theta )=i\theta .} By simplifying difficult calculations before calculators and computers became available, logarithms contributed to
11449-457: Was the reason C.E. Shannon chose feedback as the subject of the first Shannon Lecture, delivered at the 1973 IEEE International Symposium on Information Theory in Ashkelon, Israel. The feedback capacity is characterized by the maximum of the directed information between the channel inputs and the channel outputs, where the maximization is with respect to the causal conditioning of the input given