Physics 434, 2012: Homework 2

From Ilya Nemenman: Theoretical Biophysics @ Emory

Back to Physics 434, 2012: Information Processing in Biology.

Please turn in the assignment either as a PDF file sent to me by email or as a printout in my mailbox in physics. For full credit, the solutions must include detailed derivations (with explanations) and calculations.

  1. Let be three possible outcomes of an experiment. Let . What is the probability that or will happen? That won't happen?
  2. Let's calculate some expectation values:
    • Calculate the mean and the variance of a uniform distribution, .
    • Calculate the cumulant generating function of an exponential distribution , and use it to calculate the cumulants of the exponential distribution.
    • Calculate the mean and the variance of a Poisson distribution, , directly, without using the CGF method, and compare to the CGF-based results from class.
  3. An E. coli moving on a 2-dimensional surface is being tracked in an experiment. It chooses a direction at random and runs, then tumbles and reorients randomly, runs for the second time, tumbles yet again, and keeps running. What is the probability that all three of the directions it chooses fall no farther than from each other? That is, what is the probability that the bacterium moves in roughly the same direction all three times? For graduate students: Can you generalize this to tumbles, instead of three?
  4. In class we discussed an approximation for the motion of E. coli, where the bacterium, moving in two dimensions, would tumble and reorient completely, moving with the velocity of between the tumbles. Suppose the E. coli tumbles at random times, and the distribution of intervals between two successive tumbles is the exponential distribution with the mean .
    • What is the distribution of the number of times the E. coli will tumble over a time ?
    • Remember that means and variances of independent random variables add, and use this fact repeatedly to calculate the mean and the variance of the displacement of E. coli in this model. How does the variance of the displacement grow with time?
    • For graduate students: If we complicate the model even further, and say that the velocity for each run is sampled independently from a Gaussian distribution , how does the variance grow with time then? How should the bacterium move to make the variance grow faster than linearly with time?
  5. Let's verify the law of large numbers numerically. Take a variable that can have three outcomes, with probabilities as in Problem 1 above. Using Matlab/Octave, generate 10 random realizations of this variable. Calculate the observed frequency of the outcome . What is the squared difference of the frequency from the true probability of 1/2? Repeat the procedure 100 times to get a good estimate of the variance of the difference between the frequency and the probability. Now do the same for 30, 100, 300, 1000, 3000, and 10000 samples from the distribution. Plot the variance of the frequency-probability difference vs. the number of samples. Do you see the expected 1/N trend?
  6. This problem will not be graded -- it's again an exercise to get you up to speed with Matlab/Octave quicker. Write Matlab code that generates random E. coli trajectories as described in Problem 4: constant velocity motion, exponential waiting time between tumbles, and random re-orientation during a tumble. Make sure your code can build trajectories of arbitrary durations. Your program should end up looking something like this
  7. This problem will not be graded. Let's simulate the Luria-Delbruck experiment. Consider a colony of a million bacteria. First, suppose that selection acts on standing variation. Then a fraction of the bacteria is resistant, and the rest are not. The dynamics of this fraction can be modeled as follows: a resistant mutation is generated after a long exponential waiting time, and then the number of resistant individuals does an unbiased random walk (can you explain why?). Can we do the walk for a billion steps? (Instead of doing the walk, we may be able to simulate where the number of resistant mutants ends up using a diffusion approximation.) Then a small fraction of the colony (say, 1000 cells) is taken out at random. How many of these cells are resistant (and will form resistant colonies)? Repeat this many times to get the mean and the variance. If, instead, selection causes variation, then we model the entire million bacteria as non-resistant, choose 1000 of them, and then flip each of the 1000 into resistance randomly and independently. What are the means and the variances you get? Compare the two models in the case when the background mutation rate is small. Which of the two distributions has a higher variance?
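
For the ungraded trajectory exercise in Problem 6, here is a minimal sketch of one way to structure the generator. It is written in Python rather than Matlab/Octave purely for illustration, and it is not the example program that the problem refers to. The names `simulate_trajectory`, `speed`, and `mean_run_time` are hypothetical. The sketch also assumes that the bacterium starts at the origin and picks its first direction at random.

```python
import math
import random


def simulate_trajectory(duration, speed=1.0, mean_run_time=1.0, rng=None):
    """Return a list of (t, x, y) points: the start, every tumble, and the end.

    Assumptions (not from the assignment): start at the origin at t = 0,
    first direction chosen at random, default parameter values of 1.
    """
    rng = rng or random.Random()
    t, x, y = 0.0, 0.0, 0.0
    points = [(t, x, y)]
    while t < duration:
        # Time of the next tumble: exponential waiting time with the given
        # mean, capped so that the trajectory ends exactly at `duration`.
        t_next = min(t + rng.expovariate(1.0 / mean_run_time), duration)
        run_time = t_next - t
        # Complete reorientation: new direction uniform on the circle.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # Constant-speed straight run until the next tumble.
        x += speed * run_time * math.cos(angle)
        y += speed * run_time * math.sin(angle)
        t = t_next
        points.append((t, x, y))
    return points
```

Calling `simulate_trajectory(100.0)` returns the corner points of one trajectory of duration 100 (in whatever time units `mean_run_time` uses). Because the loop runs until the requested duration is reached, trajectories of arbitrary length can be built, and the same three steps (draw a waiting time, draw an angle, advance the position) carry over directly to Matlab/Octave.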