Quantities of information

Information theory is based on probability theory and statistics. The most important quantities of information are entropy, the information in a random variable, and mutual information, the amount of information in common between two random variables. The former quantity indicates how easily message data can be compressed while the latter can be used to find the communication rate across a channel.
The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, based on the binary logarithm. Other units include the nat, which is based on the natural logarithm, and the hartley, which is based on the common logarithm.
In what follows, an expression of the form p \log p is considered by convention to be equal to zero whenever p = 0. This is justified because \lim_{p \rightarrow 0^+} p \log p = 0 for any logarithmic base.
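As a brief illustration of these conventions (a sketch added here, not part of the original text), the following Python snippet shows that the p \log p terms shrink to zero as p approaches 0, and that changing the logarithmic base only rescales a quantity by a constant factor; the helper name plogp and the probability values are illustrative assumptions.

import math

def plogp(p, base=2):
    # By convention p log p = 0 when p = 0, matching lim_{p -> 0+} p log p = 0.
    return p * math.log(p, base) if p > 0 else 0.0

for p in (0.5, 1e-3, 1e-9, 0.0):
    print(p, plogp(p))            # the term shrinks toward 0 as p -> 0+

# One bit expressed in the other units: 1 bit = ln(2) nats = log10(2) hartleys.
print(plogp(0.5, base=math.e) / plogp(0.5, base=2))   # ln 2, about 0.693 nats per bit
print(plogp(0.5, base=10) / plogp(0.5, base=2))       # log10(2), about 0.301 hartleys per bit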
Entropy


Figure: Entropy of a Bernoulli trial as a function of success probability, often called the binary entropy function, H_{\mathrm{b}}(p). The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.
The entropy, H, of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X.
Suppose one transmits 1000 bits (0s and 1s). If these bits are known ahead of transmission (to be a certain value with absolute probability), logic dictates that no information has been transmitted. If, however, each is equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been transmitted. Between these two extremes, information can be quantified as follows. If \mathbb{X} is the set of all messages \{x_1, ..., x_n\} that X could be, and p(x) is the probability of some x \in \mathbb X, then the entropy, H, of X is defined:[8]
 H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).
(Here, I(x) is the self-information, which is the entropy contribution of an individual message, and \mathbb{E}_{X} is the expected value.) An important property of entropy is that it is maximized when all the messages in the message space are equiprobable, p(x) = 1/n, i.e., most unpredictable, in which case H(X) = \log n.
The special case of information entropy for a random variable with two outcomes is the binary entropy function, usually taken to the logarithmic base 2:
H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p).
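The binary entropy function and the maximization property are easy to check numerically. The sketch below is an illustrative Python example (the function names and sample probabilities are assumptions, not part of the source): it evaluates H_b(p) at a few points and confirms that a uniform distribution over n messages gives H(X) = \log_2 n.

import math

def binary_entropy(p):
    # H_b(p) = -p log2 p - (1 - p) log2 (1 - p), with 0 log 0 taken as 0.
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)

print(binary_entropy(0.5))   # 1.0 bit: a fair coin toss is maximally unpredictable
print(binary_entropy(0.11))  # about 0.5 bits: a biased coin carries less information per toss
print(binary_entropy(1.0))   # 0.0 bits: a certain outcome carries no information

# For n equiprobable messages, entropy is maximized at H(X) = log2(n).
n = 8
print(-sum((1 / n) * math.log2(1 / n) for _ in range(n)), math.log2(n))  # both 3.0 bits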
Joint entropy
The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing: (X, Y). This implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies.
For example, if (X, Y) represents the position of a chess piece, with X the row and Y the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.
H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y).
Despite similar notation, joint entropy should not be confused with cross entropy.
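To make the chess example concrete, here is an illustrative Python sketch assuming an 8 by 8 board with independent, uniformly distributed row X and column Y (a hypothetical setup chosen for this example); the joint entropy of the position then equals H(X) + H(Y) because of the independence.

import itertools
import math

def joint_entropy(pxy):
    # pxy maps (x, y) pairs to probabilities; zero-probability pairs contribute 0.
    return -sum(p * math.log2(p) for p in pxy.values() if p > 0)

# Hypothetical chess-piece position: row X and column Y independent and uniform on 8 values,
# so p(x, y) = p(x) * p(y) = 1/64 for every square.
rows, cols = range(8), range(8)
pxy = {(x, y): (1 / 8) * (1 / 8) for x, y in itertools.product(rows, cols)}

print(joint_entropy(pxy))            # 6.0 bits: the entropy of the piece's position
print(math.log2(8) + math.log2(8))   # H(X) + H(Y) = 3 + 3 bits, equal because X and Y are independent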
Conditional entropy (equivocation)
The conditional entropy or conditional uncertainty of X given random variable Y (also called the equivocation of X about Y) is the average conditional entropy over Y:[9]
 H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(y)}.
Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:
H(X|Y) = H(X,Y) - H(Y).
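This identity can be verified on a toy example. The Python sketch below uses a hypothetical joint distribution over two binary variables; the numbers and helper names are assumptions made for illustration.

import math
from collections import defaultdict

# Hypothetical joint distribution over X in {0, 1} and Y in {0, 1}.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Marginal p(y), needed both for the direct formula and for H(Y).
py = defaultdict(float)
for (x, y), p in pxy.items():
    py[y] += p

# Direct formula: H(X|Y) = -sum_{x,y} p(x, y) log2( p(x, y) / p(y) ).
H_X_given_Y = -sum(p * math.log2(p / py[y]) for (x, y), p in pxy.items() if p > 0)

print(H_X_given_Y)                        # about 0.722 bits
print(H(pxy.values()) - H(py.values()))   # same value via H(X, Y) - H(Y)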
Mutual information (transinformation)
Mutual information measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of X relative to Y is given by:
I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}
where SI (specific mutual information) is the pointwise mutual information.
A basic property of the mutual information is that
I(X;Y) = H(X) - H(X|Y).
That is, knowing Y, we can save an average of I(X; Y) bits in encoding X compared to not knowing Y.
Mutual information is symmetric:
I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y).
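The following Python sketch (illustrative; the joint distribution is the same hypothetical one used in the conditional-entropy sketch above) computes I(X; Y) both directly from the definition and via H(X) + H(Y) - H(X, Y); the two values agree, and the second form makes the symmetry in X and Y explicit.

import math
from collections import defaultdict

# The same hypothetical joint distribution as in the conditional-entropy sketch.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px, py = defaultdict(float), defaultdict(float)
for (x, y), p in pxy.items():
    px[x] += p
    py[y] += p

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Definition: I(X; Y) = sum_{x,y} p(x, y) log2( p(x, y) / (p(x) p(y)) ).
I_def = sum(p * math.log2(p / (px[x] * py[y])) for (x, y), p in pxy.items() if p > 0)

# Identity: I(X; Y) = H(X) + H(Y) - H(X, Y), which is manifestly symmetric in X and Y.
I_ent = H(px.values()) + H(py.values()) - H(pxy.values())

print(I_def, I_ent)   # both about 0.278 bits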
Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the posterior probability distribution of X given the value of Y and the prior distribution on X:
I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )].
In other words, this is a measure of how much, on average, the probability distribution of X will change if we are given the value of Y. This is often rewritten as the divergence from the product of the marginal distributions to the actual joint distribution:
I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y)).
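Both Kullback-Leibler forms can be checked numerically as well. In the illustrative Python sketch below (same hypothetical distribution as above; the helper kl is an assumption made for the example), the average of D_KL(p(X|Y=y) \| p(X)) over p(y) matches the single divergence D_KL(p(X,Y) \| p(X)p(Y)).

import math
from collections import defaultdict

# The same hypothetical joint distribution as above.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px, py = defaultdict(float), defaultdict(float)
for (x, y), p in pxy.items():
    px[x] += p
    py[y] += p

def kl(p, q):
    # D_KL(p || q) = sum_k p(k) log2( p(k) / q(k) ), skipping terms with p(k) = 0.
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p if p[k] > 0)

# Average over p(y) of the divergence between the posterior p(X | Y = y) and the prior p(X).
I_posterior = sum(
    py[y] * kl({x: pxy[(x, y)] / py[y] for x in px}, px)
    for y in py
)

# Single divergence from the product of the marginals to the joint distribution.
product = {(x, y): px[x] * py[y] for x in px for y in py}
I_joint = kl(pxy, product)

print(I_posterior, I_joint)   # both about 0.278 bits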
Mutual information is closely related to the log-likelihood ratio test in the context of contingency tables and the multinomial distribution, and to Pearson's χ² test: mutual information can be considered a statistic for assessing independence between a pair of variables, and it has a well-specified asymptotic distribution.
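One standard way to make this connection concrete is the identity G = 2N I(X; Y), where G is the log-likelihood-ratio statistic of an N-observation contingency table and I is the mutual information of the empirical joint distribution measured in nats. The Python sketch below checks this on a hypothetical 2 by 2 table; the counts are invented for illustration.

import math

# Hypothetical 2x2 contingency table of observed counts.
counts = [[30, 10],
          [10, 50]]
N = sum(sum(row) for row in counts)
row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]

# Log-likelihood-ratio statistic: G = 2 * sum O * ln(O / E), with E the expected
# count under independence, E_ij = row_i * col_j / N.
G = 2 * sum(
    o * math.log(o / (row_tot[i] * col_tot[j] / N))
    for i, row in enumerate(counts) for j, o in enumerate(row) if o > 0
)

# Mutual information (in nats) of the empirical joint distribution p(x, y) = O / N.
I_nats = sum(
    (o / N) * math.log((o / N) / ((row_tot[i] / N) * (col_tot[j] / N)))
    for i, row in enumerate(counts) for j, o in enumerate(row) if o > 0
)

print(G, 2 * N * I_nats)   # equal up to floating-point rounding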
