Physics 380, 2010: Random Walks

From Ilya Nemenman: Theoretical Biophysics @ Emory

Back to Physics 380, 2010: Information Processing in Biology.

Lectures 4 and 5

In these lectures, we talk about random walks, investigating the properties of their trajectories.

  • Biased random walk
    • Biased random walk: steps of length <math>\ell</math> each, and the probabilities of right and left steps, <math>p</math> and <math>q=1-p</math>, are not the same. For the total displacement after <math>N</math> steps, <math>\langle x\rangle = N\ell(p-q)</math> and <math>\sigma_x^2 = 4N\ell^2 pq</math> (a small numerical check appears after these lecture notes).
  • First passage times: what is the distribution of the time until a random walk or diffusion reaches a particular point?
    • Example of neural action potential generation: first passage time to a threshold. We model the neural voltage as changing in discrete steps <math>\pm\delta v</math>, starting from the resting potential <math>V_0</math>, releasing a spike when the voltage reaches <math>V_{\rm threshold}</math>, and then resetting the voltage to <math>V_0</math>. The right-step probability is <math>p</math> and the left-step probability is <math>q=1-p</math>. The bias of the walk is influenced by the current into the cell. What is the distribution of the time to the next spike, <math>P(T)</math>?
    • First passage for the right-biased walk. Let's suppose that <math>p\to 1</math>. Then always stepping right is the most probable situation. Reaching the threshold will require <math>N=(V_{\rm threshold}-V_0)/\delta v</math> steps. They are independent, and their times will add. As a result, <math>\mu_T=N\mu_{\rm one\ step}</math> and <math>\sigma_T^2=N\sigma_{\rm one\ step}^2</math>, so that the coefficient of variation <math>C_V=\sigma_T/\mu_T\propto 1/\sqrt{N}</math>. Hence the coefficient of variation for an extremely right-biased walk is small. In particular, this means that this neuron, if driven to fire often, will fire regularly, with very small noise. It becomes nearly deterministic.
    • First passage for the left-biased walk. Let's now suppose <math>p<q</math>. In this case, the state of the system is going to return to <math>V_0</math> many times before it finally reaches <math>V_{\rm threshold}</math>. Starting at <math>V_0</math>, if we go forward without turning back, the probability of such a trajectory is <math>p^N</math>. If we turn back once, we have to make one more forward step as well, resulting in <math>p^{N+1}q</math>. But there could have been <math>\sim N</math> different places where the step back could have occurred. So the total probability of any such 1-step-back trajectory is <math>\sim Np^{N+1}q</math>. Hence the most direct, no-turn trajectory is more probable than the 1-turn ones when <math>Npq<1</math>. Thus for a small enough <math>N</math> the direct trajectory is always the most probable exit trajectory. Hence the exit process consists of a long waiting time for this unlikely event, and then the event itself. Hence the waiting time to exit is exponentially distributed, with <math>C_V\approx 1</math>.
    • Overall: the coefficient of variation of the exit time changes from 1 to 0 as <math>p</math> goes from 0 to 1. We have discussed these processes in detail in (Bel et al., 2010).
  • E. coli chemotaxis as a biased random walk: going up the gradient of an attractant, time to a tumble increases. This is described very well in (Berg 2000, Berg and Brown 1972).
    • Showing that the E. coli can find the greener pastures with this protocol: look at two nearby points <math>x_1</math> and <math>x_2</math>, closer than the length of a single typical run. The rate at which bacteria leave <math>x_1</math> for <math>x_2</math> is <math>\sim n_1/\tau_1</math>, where <math>\tau_1=\tau(c(x_1))</math> is the mean waiting time to a tumble at a concentration <math>c(x_1)</math>, and <math>n_1</math> is the number of bacteria near <math>x_1</math>. Similarly, the rate of leaving <math>x_2</math> for <math>x_1</math> is <math>\sim n_2/\tau_2</math>. In steady state: <math>n_1/\tau_1=n_2/\tau_2</math>. Therefore, <math>n_1/n_2=\tau_1/\tau_2</math>, so that <math>n</math> is higher in the direction where <math>\tau</math>, and hence the attractant concentration <math>c</math>, increases.
    • Simulations of E. coli trajectories and intro to Matlab. See Matlab simulation code.
    • Generation of exponential random numbers: minus the logarithm of a uniform random number, divided by the rate, is an exponentially distributed random number (see the short sketch after these lecture notes).
  • First passage and first return
    • Connections between the two moment generating functions: typically, problems of first/eventual passage/return/location analysis are solved using moment generating functions. E.g., the probability of being at point <math>x</math> at time <math>t</math> is equal to the probability of first passing through <math>x</math> at time <math>t'</math> and then returning to <math>x</math> in the remaining time <math>t-t'</math>, summed over <math>t'</math>: <math>P(x,t)=\sum_{t'=0}^{t}F(x,t')P(0,t-t')</math>. Hence the corresponding generating functions multiply, <math>\hat{P}(x,z)=\hat{F}(x,z)\hat{P}(0,z)</math> (a numerical check of the convolution appears after these lecture notes).
    • Return and passage probabilities in different dimensions: mean return times diverge in all dimensions; the probability of eventual return is 1 in 1-d and 2-d, and only about 0.34 in 3-d.
  • Return times and Berg-von Hippel transcription factor searching for a binding site. What is an optimal strategy for a transcription factor to search for a binding site?
    • For a diffusive process, the radius of the explored region grows as <math>r\sim\sqrt{Dt}</math>. The number of different sites in the explored region is <math>\sim r^d\sim(Dt)^{d/2}</math>. But the total number of site visits is <math>\sim t</math>, one per step. Hence each site is explored about <math>t/(Dt)^{d/2}\sim t^{1-d/2}</math> times. Hence in 1-d each site is explored many times, in 2-d each site is (barely) explored, and in 3-d very few sites are ever explored (see the sketch after these lecture notes).
    • Why would a 1-d search fail? Because too much time is spent on exploration -- you always come back.
    • Why would a 3-d search fail? Because very few sites are ever explored, and the TF will not come close to its needed target.
    • Why is a combined 1-d/3-d search faster? You can move fast between patches (in 3-d), and then explore each patch thoroughly in a 1-d way. Details of the 1-d/3-d search (following Slutsky and Mirny, 2004):
      • Search partitioned into 1-3d search rounds.
      • Total search time is the sum of the search times in both modes: <math>t_s=N_r(\tau_{1d}+\tau_{3d})</math>, where <math>N_r</math> is the number of rounds.
      • In the 3-d search mode the protein almost never comes back to the same search patch.
      • In the 1-d search mode the protein explores <math>n</math> sites per round. Hence <math>N_r=L/n</math>, where <math>L</math> is the DNA length.
      • We get <math>t_s=\frac{L}{n}(\tau_{1d}+\tau_{3d})</math>.
      • <math>n=\sqrt{D_1\tau_{1d}}</math> for this model, where <math>D_1</math> is the 1-d diffusion constant. In general, we get <math>n=n(\tau_{1d})</math>.
      • Thus <math>t_s=\frac{L(\tau_{1d}+\tau_{3d})}{\sqrt{D_1\tau_{1d}}}</math>.
    • Is there an optimal time to spend on a 1-d search? Differentiating <math>t_s</math> w.r.t. <math>\tau_{1d}</math> and setting the derivative to zero, we get <math>\tau_{1d}=\tau_{3d}</math>. The transcription factor should spend the same amount of time in the 1-d and the 3-d search modes (see the numerical sketch after these lecture notes). Slutsky and Mirny (2004) review experimental confirmations of this.
  • Wiener process: A good model of a random walk at long temporal and spatial scales is diffusion. That is, the displacement is Gaussian with a variance growing linearly in time, <math>\langle\delta x^2\rangle\propto t</math>. It is useful to represent such an <math>x(t)</math> as a solution of an ordinary differential equation <math>dx/dt=\eta(t)</math>, where <math>\eta(t)</math> is a Gaussian random variable with zero mean and the covariance <math>\langle\eta(t)\eta(t')\rangle=\sigma^2\delta(t-t')</math>. See Homework problem No. 1 for the derivation of this. The resulting random variable <math>x(t)</math> is called the Wiener process, after Norbert Wiener, who invented it (a simulation sketch appears below).
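
Numerical sketch for the biased random walk bullet above. This is a minimal check, with arbitrary illustrative parameters (my choices, not values from the lecture), of the moments <math>\langle x\rangle=N\ell(p-q)</math> and <math>\sigma_x^2=4N\ell^2pq</math>; it should run in Octave or Matlab.

<pre>
% Minimal check of the biased random walk moments; p, l, N, Nwalk are
% illustrative values only.
p = 0.7;  q = 1 - p;       % right- and left-step probabilities
l = 1;                     % step length
N = 100;                   % steps per walk
Nwalk = 1e4;               % number of independent walks
steps = l * (2*(rand(Nwalk, N) < p) - 1);  % +l with prob. p, -l with prob. q
x = sum(steps, 2);                         % total displacement of each walk
printf('<x>   = %8.2f   (theory %8.2f)\n', mean(x), N*l*(p - q));
printf('var x = %8.2f   (theory %8.2f)\n', var(x), 4*N*l^2*p*q);
</pre>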
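
Sketch of the exponential random number trick mentioned above: the negative logarithm of a uniform random number, divided by the rate, is exponentially distributed. The rate below is an arbitrary illustrative value.

<pre>
% Inverse-transform generation of exponential random numbers:
% if u is uniform on (0,1), then -log(u)/lambda is exponential with rate lambda.
lambda = 2;                            % illustrative tumbling rate
tau = -log(rand(1, 1e5)) / lambda;     % exponential run durations
printf('sample mean %.3f vs 1/lambda = %.3f\n', mean(tau), 1/lambda);
</pre>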
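
Numerical check of the first-passage/return convolution referenced above, for an unbiased 1-d lattice walk with illustrative parameters. It estimates the occupation probability <math>P(x,t)</math>, the return probability <math>P(0,s)</math>, and the first-passage probability <math>F(x,t')</math> from simulated walks, and compares <math>P(x,t)</math> with <math>\sum_{t'=1}^{t}F(x,t')P(0,t-t')</math>.

<pre>
% Sketch: check P(x,t) = sum_{t'} F(x,t') P(0,t-t') for a symmetric 1-d walk.
Nwalk = 2e4;  T = 30;  xtar = 2;          % illustrative parameters
steps = 2*(rand(Nwalk, T) > 0.5) - 1;     % unbiased +/-1 steps
pos   = cumsum(steps, 2);                 % positions after each step
Pxt = mean(pos == xtar, 1);               % P(xtar, t),  t  = 1..T
P0  = [1, mean(pos == 0, 1)];             % P(0, s),     s  = 0..T
hit = (cumsum(pos == xtar, 2) == 1) & (pos == xtar);  % first arrivals only
F   = mean(hit, 1);                       % F(xtar, t'), t' = 1..T
Pconv = zeros(1, T);
for t = 1:T
  Pconv(t) = sum(F(1:t) .* P0(t:-1:1));   % sum_{t'=1}^{t} F(t') P(0, t-t')
end
disp([Pxt; Pconv]);                       % the two rows should agree within noise
</pre>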
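
Sketch for the "how many times is each site visited" argument above, for a 1-d lattice walk: the number of distinct sites visited grows roughly like <math>\sqrt{t}</math>, so each site is visited about <math>\sqrt{t}</math> times. The walk length is an illustrative choice.

<pre>
% Distinct sites visited by a 1-d walk vs. time; compare with sqrt(t).
T = 1e5;
x = cumsum(2*(rand(1, T) > 0.5) - 1);     % one long unbiased walk
for t = round(logspace(2, 5, 7))
  printf('t = %6d   distinct sites = %5d   sqrt(t) = %6.1f\n', ...
         t, numel(unique(x(1:t))), sqrt(t));
end
</pre>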
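
Sketch for the optimal 1-d/3-d search time referenced above. The values of <math>L</math>, <math>D_1</math>, and <math>\tau_{3d}</math> are arbitrary illustrative numbers; the point is only that the minimum of <math>t_s=L(\tau_{1d}+\tau_{3d})/\sqrt{D_1\tau_{1d}}</math> sits at <math>\tau_{1d}=\tau_{3d}</math>.

<pre>
% Search time of the combined 1-d/3-d strategy as a function of tau_1d.
L    = 1e6;                  % DNA length (sites); illustrative
D1   = 1e2;                  % 1-d diffusion constant; illustrative
tau3 = 1;                    % mean duration of one 3-d excursion; illustrative
tau1 = logspace(-2, 2, 401); % candidate 1-d search times per round
ts   = L * (tau1 + tau3) ./ sqrt(D1 * tau1);
[tsmin, imin] = min(ts);
printf('optimal tau1 = %.3f, tau3 = %.3f\n', tau1(imin), tau3);
loglog(tau1, ts); xlabel('\tau_{1d}'); ylabel('t_s');
</pre>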
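
Simulation sketch for the Wiener process bullet above, with illustrative parameters: iterate the finite-difference rule <math>x(t+\Delta t)=x(t)+a\Delta t+b\sqrt{\Delta t}\,\xi</math> (with <math>a=0</math>, <math>b=1</math>) and check that the variance of <math>x</math> grows linearly in time.

<pre>
% Finite-difference realization of the Wiener process; dt, T, Nwalk illustrative.
a = 0;  b = 1;
dt = 0.01;  T = 10;  N = round(T / dt);
Nwalk = 1000;                               % number of realizations
dx = a*dt + b*sqrt(dt) * randn(Nwalk, N);   % increments over each dt
x  = cumsum(dx, 2);                         % trajectories x(t)
t  = (1:N) * dt;
plot(t, var(x, 0, 1), t, b^2 * t);          % empirical variance vs. b^2 t
xlabel('t'); ylabel('var(x)'); legend('simulation', 'b^2 t');
</pre>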

Homework (due Sep 17)

Note that from now on, we will have a lot of numerical simulations in our homework assignments. Unless you have access to Matlab (university-owned computers do), I suggest that you download Octave as described above and do all of these simulations in Octave. Save your programs -- we will keep reusing some pieces of them later in the course.

  1. Suppose the variable <math>x</math> undergoes a diffusive motion with the mean drift of <math>\langle x(t)\rangle=at</math> and the variance of <math>\langle\delta x^2(t)\rangle=b^2t</math>. I would like to numerically simulate this stochastic dynamics on time scales <math>\Delta t</math> much larger than the time of a single hop. For this, I write that <math>x(t+\Delta t)=x(t)+a\Delta t+c\xi</math>, where <math>a</math> and <math>c</math> are deterministic numbers, and <math>\xi</math> is a Gaussian random number with zero mean and unit variance. Find a relation between <math>c</math>, <math>b</math>, and <math>\Delta t</math>. Now let's take <math>x(t)</math>, move it to the left of the equal sign, and divide everything by <math>\Delta t</math>. We will get <math>\frac{x(t+\Delta t)-x(t)}{\Delta t}=a+\eta_{\Delta t}</math>, where <math>a</math> is deterministic, and <math>\eta_{\Delta t}=c\xi/\Delta t</math> is a random number. Show that <math>\langle\eta_{\Delta t}\rangle=0</math>, <math>\langle\eta_{\Delta t}^2\rangle=b^2/\Delta t\to\infty</math> as <math>\Delta t\to 0</math>, and <math>\langle\eta_{\Delta t}(t)\eta_{\Delta t}(t')\rangle=0</math> if <math>|t-t'|>\Delta t</math>. This is a very interesting differential equation that has a random term of an infinite variance in its right hand side. However, the random terms are independent from one moment of time to the next, and the infinities cancel, leaving only a small random component over long times. As a shorthand, we write such equations as (recall our definition of the <math>\delta</math>-function): <math>dx/dt=a+b\eta(t)</math>, with <math>\langle\eta(t)\rangle=0</math> and <math>\langle\eta(t)\eta(t')\rangle=\delta(t-t')</math>. This is called a stochastic differential equation (SDE), and, if <math>a=0</math> and <math>b=1</math>, then <math>x(t)</math> is called the Wiener process. While these definitions might sound confusing, especially with the infinities floating around, they will turn out to be very useful later. To avoid confusion, whenever we see such an SDE, we always interpret it as obeying the finite difference equation above, <math>x(t+\Delta t)=x(t)+a\Delta t+b\sqrt{\Delta t}\,\xi</math>.
  2. In class, we have discussed the first passage time in the random walk model of action potential generation. Let's compare our findings to numerical simulations. Let's suppose a neuron starts at rest with the voltage <math>V_0</math>. Every time step its voltage can either go up or down by <math>\delta v</math> with probabilities <math>p</math> and <math>1-p</math>, respectively. If the voltage reaches <math>V_0</math> again, it cannot be lowered anymore. If it reaches <math>V_{\rm threshold}</math>, then the neuron releases an action potential (that is, it fires). Write an Octave program to simulate this random walk for arbitrary <math>p</math> and <math>V_{\rm threshold}</math> and to record the time it takes for the neuron to fire. Run this program a sufficient number of times to estimate the mean and the standard deviation of the time to firing for several values of <math>p</math> spanning the range between 0 and 1, at a fixed value of <math>V_{\rm threshold}</math>. Plot the curve of the coefficient of variation of the firing time as a function of <math>p</math>. Is it similar to what we saw in class? Note that Octave is very inefficient in implementing for or while loops. However, it is very fast when operating on entire arrays (vectors) of numbers. Knowing this will help you to write programs that operate faster (a small vectorization example appears after this problem set). The programs we wrote during the Wed and Thu study sessions can be downloaded here.
  3. For Graduate Students and especially devoted Undergraduates (read: an Extra Credit assignment): Let's verify whether what I told you in class about random walk return probabilities is correct. We will solve Problem 1.1.17 in the Grinstead and Snell book. Mathematicians have been known to get some of the best ideas while sitting in a cafe, riding on a bus, or strolling in the park. In the early 1900s the famous mathematician George Pólya lived in a hotel near the woods in Zurich. He liked to walk in the woods and think about mathematics. Pólya describes the following incident:
At the hotel there lived also some students with whom I usually took my meals and had friendly relations. On a certain day one of them expected the visit of his fiancee, what (sic) I knew, but I did not foresee that he and his fiancee would also set out for a stroll in the woods, and then suddenly I met them there. And then I met them the same morning repeatedly, I don’t remember how many times, but certainly much too often and I felt embarrassed: It looked as if I was snooping around which was, I assure you, not the case.
This set him to thinking about whether random walkers were destined to meet. Pólya considered random walkers in one, two, and three dimensions. In one dimension, he envisioned the walker on a very long street. At each intersection the walker flips a fair coin to decide which direction to walk next. In two dimensions, the walker is walking on a grid of streets, and at each intersection he chooses one of the four possible directions with equal probability. In three dimensions (we might better speak of a random climber), the walker moves on a three-dimensional grid, and at each intersection there are now six different directions that the walker may choose, each with equal probability.
  • Write a program to simulate a random walk in one dimension starting at 0. Have your program print out the lengths of the times between returns to the starting point (returns to 0). See if you can guess from this simulation the answer to the following question: Will the walker always return to his starting point eventually or might he drift away forever?
  • The paths of two walkers in two dimensions who meet after n steps can be considered to be a single path that starts at (0, 0) and returns to (0, 0) after 2n steps. This means that the probability that two random walkers in two dimensions meet is the same as the probability that a single walker in two dimensions ever returns to the starting point. Thus the question of whether two walkers are sure to meet is the same as the question of whether a single walker is sure to return to the starting point. Write a program to simulate a random walk in two dimensions and see if you think that the walker is sure to return to (0, 0). If so, Pólya would be sure to keep meeting his friends in the park. Perhaps by now you have conjectured the answer to the question: Is a random walker in one or two dimensions sure to return to the starting point? Pólya answered this question for dimensions one, two, and three. He established the remarkable result that the answer is yes in one and two dimensions and no in three dimensions.
  • Write a program to simulate a random walk in three dimensions and see whether, from this simulation and the results of (a) and (b), you could have guessed Pólya’s result.
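
A small illustration of the vectorization advice in problem 2 (a sketch, not a homework solution; the trajectory length is arbitrary): building a random walk with cumsum is much faster in Octave than an explicit loop, although both give the same trajectory.

<pre>
% Vectorized vs. looped construction of a random walk trajectory.
N = 1e6;
steps = 2*(rand(1, N) < 0.5) - 1;
tic; x = cumsum(steps); toc                 % vectorized: fast
tic;
y = zeros(1, N);  y(1) = steps(1);
for k = 2:N
  y(k) = y(k - 1) + steps(k);
end
toc                                         % same trajectory, far slower
</pre>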