Physics 434, 2012: Lecture 5

From Ilya Nemenman: Theoretical Biophysics @ Emory

We are continuing our review of some basic concepts of probability theory, such as probability distributions, conditionals, marginals, expectations, etc. We will discuss the central limit theorem and will derive some properties of random walks. Finally, we will study some specific useful probability distributions. In the course of this whole lecture block, we should be thinking about E. coli chemotaxis in the background -- all of these concepts will be applicable.

A very good introduction to probability theory can be found in Introduction to Probability by CM Grinstead and JL Snell.

Main Lecture

  • We are still answering the question: what will the distribution of E. coli positions be if it starts at 0 and moves for time $t$?
  • Using the addition of CGFs, we show that when independent random variables add, their cumulants, and, in particular, their means and variances add.
  • Why does measuring a quantity many times improve the measurement?
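    • Frequencies and probabilities: Law of large numbers. If $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ is the average of $N$ independent measurements, each with mean $\mu$ and variance $\sigma^2$, then $\langle \bar{x} \rangle = \mu$ and $\sigma^2_{\bar{x}} = \sigma^2/N$. This follows from the addition of means and variances. So, we can calculate the mean and the variance of the E. coli motion.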
  • Warmup question: Now consider an idealized spherical cell of radius $R$ whose entire surface is covered with disk-like receptors of radius $a$. This is a reasonably good model for an immune cell, such as a mast cell. There are $N$ such receptors. Using the Berg-Purcell limit from the first lecture, we know that the accuracy with which a single receptor determines the concentration $c$ is $\delta c / c \sim 1/\sqrt{D a c t}$, where $D$ is the diffusion coefficient and $t$ is the observation time. Since we have $N$ receptors, we use the law of large numbers to calculate that the overall accuracy of the concentration determination by the cell should be $\delta c / c \sim 1/\sqrt{N D a c t}$. On the other hand, if we consider the entire cell a single large receptor of size $R$, the Berg-Purcell limit gives: $\delta c / c \sim 1/\sqrt{D R c t}$. Can you reconcile the differences between these two estimates?
  • Central limit theorem: the sum of many i.i.d. random variables (with finite variances) approaches a certain distribution, which we call a Gaussian distribution. This is the most remarkable law in probability theory. It is thought to explain why experimental noise is often Gaussian distributed as well. More precisely, suppose $x_1, \dots, x_N$ are i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Then the CLT says that $\frac{\sum_{i=1}^{N} x_i - N\mu}{\sigma \sqrt{N}}$ is distributed according to $\mathcal{N}(0,1)$ (called the standard normal distribution), provided $N$ is sufficiently large. We prove this assuming that none of the cumulants of the i.i.d. variables is infinite.
    • The same holds if the variables have different variances and means, but all variances are bounded. Convergence will be slower though.
  • The central limit distribution has only the first two cumulants that are nonzero. What is this distribution?
    • It's a Gaussian with a given mean and variance. We show this by explicitly computing the CGF of a Gaussian.
    • Numerical simulation of the CLT for exponential and binary distributions: CLT.m
    • E. coli motion has a Gaussian distribution of end points. Moreover, we will show in a homework that $\langle x^2 \rangle \propto t$ for E. coli. It is a diffusive motion as well, just like the diffusion of small molecules. We demonstrate this by numerical simulations (homework).
  • Additional distributions to remember:
    • normal: diffusive motion
    • $\delta$-distribution: deterministic limit of the normal distribution, $\sigma^2 \to 0$; $\int f(x)\, \delta(x - \mu)\, dx = f(\mu)$.
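    • multivariate normal: $P(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d \det \Sigma}} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$, here $\Sigma$ is the covariance matrix and $d$ is the number of dimensions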
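
The CLT statements above can be checked numerically. The following is a minimal Python sketch, not the course's CLT.m script: all function names, the seed, and the choices of N and number of trials are assumptions made for illustration. It draws sums of N exponential or binary variables, standardizes them by subtracting N times the mean and dividing by the standard deviation times the square root of N, and reports the sample mean, standard deviation, and skewness. If the CLT holds, the first two should stay near 0 and 1, while the skewness (the normalized third cumulant) should shrink as N grows.

```python
import math
import random
import statistics


def draw_exponential(rng):
    # Exponential variable with rate 1: mean 1, variance 1.
    return rng.expovariate(1.0)


def draw_binary(rng):
    # Fair binary variable taking values 0 or 1: mean 0.5, variance 0.25.
    return 1.0 if rng.random() < 0.5 else 0.0


def sample_standardized_sums(draw, mu, sigma, n, trials, seed=0):
    """Return `trials` samples of (sum of n draws - n*mu) / (sigma*sqrt(n))."""
    rng = random.Random(seed)
    scale = sigma * math.sqrt(n)
    return [
        (sum(draw(rng) for _ in range(n)) - n * mu) / scale
        for _ in range(trials)
    ]


def skewness(values):
    """Sample skewness: third central moment divided by the cubed std."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)


if __name__ == "__main__":
    cases = [
        ("exponential", draw_exponential, 1.0, 1.0),
        ("binary", draw_binary, 0.5, 0.5),
    ]
    for name, draw, mu, sigma in cases:
        for n in (1, 10, 100):
            zs = sample_standardized_sums(draw, mu, sigma, n, trials=5000)
            print(
                f"{name:12s} N={n:4d} "
                f"mean={statistics.mean(zs):+.3f} "
                f"std={statistics.pstdev(zs):.3f} "
                f"skew={skewness(zs):+.3f}"
            )
```

The same samples also illustrate the law of large numbers: the average of N draws equals the mean plus the standardized sum times the standard deviation over the square root of N, so its spread shrinks as one over the square root of N. The standard library is used here only to keep the sketch dependency-free; an array library would be faster for large N.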