Physics 380, 2011: Lecture 10
Back to Physics 380, 2011: Information Processing in Biology.
In these lectures, we cover some background on information theory. A good physics-style introduction to the subject can be found in the upcoming book by Bialek (Bialek, 2010). A very nice, and probably still the best, introduction to information theory as a theory of communication is (Shannon and Weaver, 1949). A standard and very good textbook on information theory is (Cover and Thomas, 2006).
Finishing the previous lecture
- Mutual information: what if we want to know about a variable $X$, but are instead measuring a variable $Y$? How much are we learning about $X$ then? This is given by the difference of entropies of $X$ before and after the measurement, averaged over the measurement outcomes: $I[X;Y] = S[X] - \langle S[X|Y=y] \rangle_y$.
- Meaning of mutual information: mutual information of 1 bit between two variables means that by querying one of them as much as possible, we can get one bit of information about the other.
- Properties of mutual information
- Limits: $0 \le I[X;Y] \le \min(S[X], S[Y])$ (the upper bound holds for discrete variables). Note that the first inequality becomes an equality iff the two variables are completely statistically independent.
- Mutual information is well-defined for continuous variables.
- Reparameterization invariance: for any invertible $f$ and $g$, the following is true: $I[f(X); g(Y)] = I[X;Y]$.
- Data processing inequality: for a chain $X \to Y \to Z$, $I[X;Z] \le I[X;Y]$. That is, information cannot be created in a transformation of a variable, whether deterministic or probabilistic.
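- Information rate: information is also an extensive quantity, so that it makes sense to define an information rate $\mathcal{I} = \lim_{T\to\infty} I(T)/T$, where $I(T)$ is the information obtained from an observation of duration $T$.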
- Mutual information of a bivariate normal with a correlation coefficient $\rho$ is $I = -\frac{1}{2}\log_2(1-\rho^2)$ bits.
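- For Gaussian variables $r = g(s + \eta)$, where $s$ is the signal, $r$ is the response, and $\eta$ is the noise referred to the input, $I[s;r] = \frac{1}{2}\log_2\left(1 + \sigma_s^2/\sigma_\eta^2\right)$.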
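The two Gaussian results above can be checked against each other numerically. The sketch below assumes the standard closed forms $I = -\frac{1}{2}\log_2(1-\rho^2)$ for a bivariate normal and $I = \frac{1}{2}\log_2(1 + \sigma_s^2/\sigma_\eta^2)$ for a linear Gaussian channel with noise referred to the input. All function names are illustrative and are not taken from the lecture.

```python
import math


def gaussian_entropy_bits(variance):
    """Differential entropy of a one-dimensional Gaussian, in bits."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * variance)


def mi_bivariate_normal(rho):
    """Mutual information of a bivariate normal with correlation rho, in bits."""
    return -0.5 * math.log2(1.0 - rho ** 2)


def mi_entropy_difference(var_x, rho):
    """Same quantity as S[X] - S[X|Y].

    For a bivariate normal, the variance of X given Y is var_x * (1 - rho^2),
    whatever the measured value of Y is.
    """
    conditional_variance = var_x * (1.0 - rho ** 2)
    return gaussian_entropy_bits(var_x) - gaussian_entropy_bits(conditional_variance)


def mi_gaussian_channel(var_signal, var_noise):
    """I = 1/2 log2(1 + SNR) for r = g (s + eta), with Gaussian s and eta."""
    return 0.5 * math.log2(1.0 + var_signal / var_noise)


def channel_correlation(var_signal, var_noise):
    """Correlation between s and r = g (s + eta); it does not depend on g > 0."""
    return math.sqrt(var_signal / (var_signal + var_noise))


# Assumed example values: signal variance 3, input-referred noise variance 1.
var_signal, var_noise = 3.0, 1.0
rho = channel_correlation(var_signal, var_noise)
print("rho =", rho)
print("I from correlation:     ", mi_bivariate_normal(rho))
print("I from entropy change:  ", mi_entropy_difference(var_signal, rho))
print("I from signal-to-noise: ", mi_gaussian_channel(var_signal, var_noise))
```

The three numbers agree because the signal and the response of a linear Gaussian channel are jointly normal, so the channel formula is a special case of the bivariate normal one.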
Warmup question
- For transmitting information through a synthetic transcriptional circuit in E. coli (Guet et al., 2002) -- see picture on the board -- which of the following quantities might constrain the mutual information between the chemical signal and the expressed reporter response?
- The mean molecular copy number of the reporter molecule.
- The mean molecular copy number of the other, non-reporter genes.
- The probability distribution of the input signals.
In this lecture, we will try to derive the limits on the quality of information processing in molecular circuits.
Main Lecture
- We follow the discussion of Ziv et al., 2007.
- Consider a chain: signal $s$ -> mean response $\bar r$ -> actual noisy response $r$. Due to the data processing inequality, $I[s;r] \le I[\bar r; r]$.
- Assuming that , and following the general formula for noise propagation from two lectures ago, we get , where .
- A simple counting argument suggests that for $N$ molecules with $\sim\sqrt{N}$ error, there are $\sim N/\sqrt{N} = \sqrt{N}$ distinguishable states, and the information is limited by $I \lesssim \log_2\sqrt{N} = \frac{1}{2}\log_2 N$.
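- For a fixed mean response $N$, we can calculate the distribution $P(\bar r)$ that maximizes $I[\bar r; r]$. For small counting noise, $\sigma_r^2 = \bar r$, we get $P(\bar r) \propto \bar r^{-1/2} e^{-\bar r/(2N)}$. And the information becomes $I = \frac{1}{2}\log_2 N$.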
- This is the best possible result. But in a biochemical system not every $P(\bar r)$ is possible. Biochemical systems are typically modeled with Hill activation/suppression dynamics, with the production rate of $x$ proportional to $\frac{y^n}{K^n + y^n}$ (activation of $x$ by $y$) or $\frac{K^n}{K^n + y^n}$ (suppression of $x$ by $y$). Different values of the chemical signals may switch the binding/Michaelis constant $K$. To achieve good information transmission, one would need to make sure that the responses to different chemical inputs have well-separated means and narrow distributions -- but one wouldn't be able to have the optimal distribution $P(\bar r)$ stated above (see the picture on the board for how these distributions look).
- Ziv et al., 2007, have shown that the inability of biochemical networks to realize the optimal $P(\bar r)$ is not very important. Even with this constraint, the maximum information that can be transmitted through these systems is still close to $\frac{1}{2}\log_2 N$.
- Biochemical systems can be very efficient in transmitting information, with their intrinsic stochastic noisiness being, essentially, the main constraint.
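The $\frac{1}{2}\log_2 N$ scaling can be checked with a short numerical sketch. It rests on assumptions not spelled out in the notes: the noise is Poisson-like, $\sigma_r^2 = \bar r$, and the small-noise approximation $I \approx S[\bar r] - \langle \frac{1}{2}\log_2(2\pi e\,\sigma_r^2) \rangle$ applies. In the variable $u = \sqrt{\bar r}$ the noise standard deviation is the constant $1/2$, which removes the singularity at $\bar r = 0$. The function names and the choice $N = 100$ are illustrative.

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoidal rule for f on [a, b] with n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def small_noise_information_bits(density_u, u_max, n=100_000):
    """Small-noise information for sigma_r^2 = r_mean, in bits.

    density_u is an unnormalized density of u = sqrt(r_mean) on [0, u_max].
    In this variable the noise has standard deviation 1/2, so
    I = S[u] - 1/2 log2(2 pi e / 4).
    """
    norm = trapezoid(density_u, 0.0, u_max, n)

    def minus_p_log2_p(u):
        p = density_u(u) / norm
        return -p * math.log2(p) if p > 0.0 else 0.0  # 0 log 0 is taken as 0

    entropy_u = trapezoid(minus_p_log2_p, 0.0, u_max, n)
    noise_entropy = 0.5 * math.log2(2.0 * math.pi * math.e * 0.25)
    return entropy_u - noise_entropy


def optimal_density_u(mean_response):
    """P(r) ~ r^(-1/2) exp(-r / 2N) is a half-Gaussian in u = sqrt(r)."""
    return lambda u: math.exp(-u * u / (2.0 * mean_response))


def uniform_density_u(mean_response):
    """Uniform P(r) on [0, 2N], which has mean N, is p(u) ~ u up to sqrt(2N)."""
    edge = math.sqrt(2.0 * mean_response)
    return lambda u: u if u <= edge else 0.0


N = 100.0                      # assumed mean molecule number
u_max = 12.0 * math.sqrt(N)    # far enough into the Gaussian tail
info_optimal = small_noise_information_bits(optimal_density_u(N), u_max)
info_uniform = small_noise_information_bits(uniform_density_u(N), u_max)
print("optimal P:", info_optimal, " vs 1/2 log2 N:     ", 0.5 * math.log2(N))
print("uniform P:", info_uniform, " vs 1/2 log2(N/pi): ", 0.5 * math.log2(N / math.pi))
```

Under these assumptions, a flat distribution of mean responses with the same mean comes out lower than the optimal one by the constant $\frac{1}{2}\log_2\pi$ bits, independent of $N$. The small-noise approximation is not reliable at very low copy numbers, so the comparison is meant as an order-of-magnitude illustration only.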