<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://nemenmanlab.org/~ilya/index.php?action=history&amp;feed=atom&amp;title=Physics_380%2C_2011%3A_Lecture_9</id>
	<title>Physics 380, 2011: Lecture 9 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://nemenmanlab.org/~ilya/index.php?action=history&amp;feed=atom&amp;title=Physics_380%2C_2011%3A_Lecture_9"/>
	<link rel="alternate" type="text/html" href="https://nemenmanlab.org/~ilya/index.php?title=Physics_380,_2011:_Lecture_9&amp;action=history"/>
	<updated>2026-05-17T09:40:06Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.31.0</generator>
	<entry>
		<id>https://nemenmanlab.org/~ilya/index.php?title=Physics_380,_2011:_Lecture_9&amp;diff=319&amp;oldid=prev</id>
		<title>Ilya: 1 revision imported</title>
		<link rel="alternate" type="text/html" href="https://nemenmanlab.org/~ilya/index.php?title=Physics_380,_2011:_Lecture_9&amp;diff=319&amp;oldid=prev"/>
		<updated>2018-07-04T16:28:41Z</updated>

		<summary type="html">&lt;p&gt;1 revision imported&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;1&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;1&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 16:28, 4 July 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-notice&quot; lang=&quot;en&quot;&gt;&lt;div class=&quot;mw-diff-empty&quot;&gt;(No difference)&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</summary>
		<author><name>Ilya</name></author>
		
	</entry>
	<entry>
		<id>https://nemenmanlab.org/~ilya/index.php?title=Physics_380,_2011:_Lecture_9&amp;diff=318&amp;oldid=prev</id>
		<title>nemenman&gt;Ilya: /* Warmup questions */</title>
		<link rel="alternate" type="text/html" href="https://nemenmanlab.org/~ilya/index.php?title=Physics_380,_2011:_Lecture_9&amp;diff=318&amp;oldid=prev"/>
		<updated>2011-09-27T13:58:27Z</updated>

		<summary type="html">&lt;p&gt;‎&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Warmup questions&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{PHYS380-2011}}&lt;br /&gt;
&lt;br /&gt;
In these lectures, we cover some background on information theory. A good physics-style introduction to the subject can be found in the upcoming book by Bialek (Bialek 2010). A very nice, and probably still the best, introduction to information theory as a theory of communication is (Shannon and Weaver, 1949). A standard and very good textbook on information theory is (Cover and Thomas, 2006).&lt;br /&gt;
&lt;br /&gt;
==Warmup questions==&lt;br /&gt;
#Does noise in signal transduction pathways affect information transmission?&lt;br /&gt;
#We would like to characterize how much information is transmitted by a cellular signaling pathway, say the NF-&amp;lt;math&amp;gt;\kappa&amp;lt;/math&amp;gt;B pathway depicted on the right (Cheong et al., 2011), or in ''E. coli'' transcription (Guet et al., 2002; Ziv et al., 2007), as shown on the left. What characteristics of the system should we measure in order to be able to quantify this? Specifically, do we need:&lt;br /&gt;
#*&amp;lt;math&amp;gt;\langle r\rangle&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\langle r|s\rangle&amp;lt;/math&amp;gt; only?&lt;br /&gt;
#*&amp;lt;math&amp;gt;\langle r\rangle&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\langle r|s\rangle&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;\sigma^2_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\sigma^2_{r|s}&amp;lt;/math&amp;gt; only?&lt;br /&gt;
#*&amp;lt;math&amp;gt;P(r|s)&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;s&amp;lt;/math&amp;gt; only?&lt;br /&gt;
#*&amp;lt;math&amp;gt;P(r|s)&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;s&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;P(s)&amp;lt;/math&amp;gt;, that is, the entire &amp;lt;math&amp;gt;P(r,s)&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
==Main lecture==&lt;br /&gt;
&lt;br /&gt;
*Setting up the problem: How do we measure information transmitted by a biological signaling system?&lt;br /&gt;
*Shannon's axioms and the derivation of entropy: if a variable &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is observed from a distribution &amp;lt;math&amp;gt;P(x)&amp;lt;/math&amp;gt;, then the amount of information we gain from this observation must obey the following properties.&lt;br /&gt;
*#For a uniform distribution, the measure of information grows with the cardinality (the number of equally likely outcomes).&lt;br /&gt;
*#The measure of information must be a continuous function of the distribution &amp;lt;math&amp;gt;P(x)&amp;lt;/math&amp;gt;.&lt;br /&gt;
*#The measure of information is additive. That is, for a fine graining of &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; into &amp;lt;math&amp;gt;\xi&amp;lt;/math&amp;gt;, we should have &amp;lt;math&amp;gt;S[\xi]=S[x]+\sum_x P(x) S[\xi|x]&amp;lt;/math&amp;gt;.&lt;br /&gt;
Up to a multiplicative constant, the measure of information is then &amp;lt;math&amp;gt;S=-\sum P \log P&amp;lt;/math&amp;gt;, which is also called the Boltzmann-Shannon entropy. We fix the constant by defining the entropy of a uniform binary distribution to be 1, so that &amp;lt;math&amp;gt;S=-\sum P \log_2 P&amp;lt;/math&amp;gt;. The entropy is then measured in ''bits''.&lt;br /&gt;
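As a quick illustration (a sketch in Python, assuming NumPy is available; not part of the original lecture), the formula above can be evaluated for a few simple distributions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Entropy in bits of a discrete distribution, S = -sum_i p_i log2 p_i.&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
def entropy_bits(p):&lt;br /&gt;
    p = np.asarray(p, dtype=float)&lt;br /&gt;
    p = p[p != 0]                    # by convention, 0 log 0 = 0&lt;br /&gt;
    return -np.sum(p * np.log2(p))&lt;br /&gt;
&lt;br /&gt;
print(entropy_bits([0.5, 0.5]))      # uniform binary distribution: 1 bit, by the normalization above&lt;br /&gt;
print(entropy_bits([0.25] * 4))      # uniform over 4 outcomes: 2 bits&lt;br /&gt;
print(entropy_bits([0.9, 0.1]))      # biased coin: about 0.47 bits&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;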
*Meaning of entropy: Entropy of 1 bit means that we have gained enough information to answer one yes or no (binary) question about the variable &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;.&lt;br /&gt;
*Properties of entropy (positive, bounded, concave):&lt;br /&gt;
*#&amp;lt;math&amp;gt;0\le S[X]\le \log_2k&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; is the cardinality of the distribution. Moreover, the first inequality becomes an equality iff the variable is deterministic (that is, one event has a probability of 1), and the second inequality is an equality iff the distribution is uniform.&lt;br /&gt;
*#Entropy is a concave function of the distribution.&lt;br /&gt;
*#Entropies of independent variables add.&lt;br /&gt;
*#Entropy is an extensive quantity: for a joint distribution &amp;lt;math&amp;gt;P(x_1,x_2,\dots,x_n)&amp;lt;/math&amp;gt;, we can define an entropy ''rate'' &amp;lt;math&amp;gt;S_0=\lim_{n\to\infty} S[X_1,\dots,X_n]/n&amp;lt;/math&amp;gt;.&lt;br /&gt;
*Differential entropy: a continuous variable &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; can be discretized with a step &amp;lt;math&amp;gt;\Delta x&amp;lt;/math&amp;gt;, and then the entropy is &amp;lt;math&amp;gt;S[X]=-\sum P(x)\Delta x\log_2 \left(P(x)\Delta x\right)\to -\int dx\, P(x)\log_2P(x)  +\log_2(1/\Delta x)&amp;lt;/math&amp;gt;. This formally diverges at fine discretization: we need infinitely many bits to fully specify a continuous variable. The first term in the above expression is called the ''differential entropy'', and whenever we write &amp;lt;math&amp;gt;S[X]&amp;lt;/math&amp;gt; for continuous variables, we mean the differential entropy.&lt;br /&gt;
*Entropy of a normal distribution with variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;S=1/2\log_2\sigma^2 + {\rm const}&amp;lt;/math&amp;gt;.&lt;br /&gt;
*Multivariate entropy is defined by summing/integrating the log-probability of the joint distribution over all of the variables; cf. the entropy rate above.&lt;br /&gt;
*Conditional entropy is defined as the (negative) averaged log-probability of the conditional distribution.&lt;br /&gt;
*Mutual information: what if we want to know about a variable &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, but instead we measure a variable &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;? How much do we then learn about &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;? This is given by the difference of entropies of &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; before and after the measurement: &amp;lt;math&amp;gt;\begin{array}{ll}I[X;Y]&amp;amp;=S[X]-\langle S[X|Y]\rangle_y\\&amp;amp;=S[X]+S[Y]-S[X,Y]\\&amp;amp;=\left\langle\log_2\frac{P(x,y)}{P(x)P(y)}\right\rangle_{x,y}\end{array}&amp;lt;/math&amp;gt;.&lt;br /&gt;
*Meaning of mutual information: mutual information of 1 bit between two variables means that by observing one of them, even completely, we gain one bit of information about the other.&lt;br /&gt;
*Properties of mutual information:&lt;br /&gt;
*#Limits: &amp;lt;math&amp;gt;0\le I[X;Y]\le \min(S[X],S[Y])&amp;lt;/math&amp;gt;. Note that the first inequality becomes an equality iff the two variables are statistically independent.&lt;br /&gt;
*#Mutual information is well-defined for continuous variables.&lt;br /&gt;
*#Reparameterization invariance: for any invertible &amp;lt;math&amp;gt;\xi=\xi(x),\, \eta=\eta(y)&amp;lt;/math&amp;gt;, we have &amp;lt;math&amp;gt;I[X;Y]=I[\Xi;\Eta]&amp;lt;/math&amp;gt;.&lt;br /&gt;
*#Data processing inequality: For &amp;lt;math&amp;gt;P(x,y,z)=P(x)P(y|x)P(z|y)&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;I[X;Z]\le \min (I[X;Y], I[Y;Z])&amp;lt;/math&amp;gt;. That is, information cannot be created by a transformation of a variable, whether deterministic or probabilistic.&lt;br /&gt;
*#Information rate: Information is also an extensive quantity, so that it makes sense to define an information rate &amp;lt;math&amp;gt;I_0=\lim_{n\to\infty}I[X_1,\dots,X_n;Y_1\dots Y_n]/n&amp;lt;/math&amp;gt;.&lt;br /&gt;
*Mutual information of a bivariate normal with a correlation coefficient &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;I=-\frac{1}{2} \log_2(1-\rho^2)&amp;lt;/math&amp;gt;.&lt;br /&gt;
*For Gaussian variables &amp;lt;math&amp;gt;y=g(x+\eta)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is the signal, &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; is the response, and &amp;lt;math&amp;gt;\eta&amp;lt;/math&amp;gt; is the noise added to the input, &amp;lt;math&amp;gt;I[X;Y]=\frac{1}{2}\log_2\left(1+\frac{\sigma^2_x}{\sigma^2_\eta}\right)=\frac{1}{2}\log_2(1+{\rm SNR})&amp;lt;/math&amp;gt; (see the homework problem; a quick numerical consistency check of this and the previous formula is sketched below).&lt;br /&gt;
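As a quick numerical sanity check (a sketch in Python with arbitrarily chosen example variances; not part of the original lecture), the two Gaussian formulas above agree: for &amp;lt;math&amp;gt;y=x+\eta&amp;lt;/math&amp;gt; with independent zero-mean Gaussians, &amp;lt;math&amp;gt;\rho^2=\sigma^2_x/(\sigma^2_x+\sigma^2_\eta)&amp;lt;/math&amp;gt;, so &amp;lt;math&amp;gt;-\frac{1}{2}\log_2(1-\rho^2)=\frac{1}{2}\log_2(1+{\rm SNR})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Check that the bivariate-normal and SNR forms of the Gaussian information agree.&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
var_x, var_eta = 2.0, 0.5                 # example signal and noise variances (arbitrary)&lt;br /&gt;
snr = var_x / var_eta&lt;br /&gt;
rho2 = var_x / (var_x + var_eta)          # squared correlation coefficient of x and y = x + eta&lt;br /&gt;
&lt;br /&gt;
info_from_rho = -0.5 * np.log2(1.0 - rho2)&lt;br /&gt;
info_from_snr = 0.5 * np.log2(1.0 + snr)&lt;br /&gt;
print(info_from_rho, info_from_snr)       # both are about 1.16 bits&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>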
		<author><name>nemenman&gt;Ilya</name></author>
		
	</entry>
</feed>