Physics 212, 2019: Lecture 15
Back to the main Teaching page.
Back to Physics 212, 2019: Computational Modeling.
Optimization is hard
While a lot more can be added here (see the appropriate section of the Numerical Recipes book), it suffices to say that optimization is hard. And the more dimensional the optimization problem is, the harder it is, generally. Overall, this in because
- In large dimensions, the volume of the parameter space grows quickly. If one needs to find the minimum to the accuracy , and the typical size of the system along each dimension is , then there are spaces to explore in dimensions, which very quickly becomes hard even for relatively small .
- There's usually just one global minimum, but many local minima that we are not interested in. The optimizers get stuck in the local minima and can't escape them to find the global ones. This is more likely in large dimensional problems.
- There are also regions of space where the loss function is nearly flat, so that the optimizer does not know which way to go towards the minimum. Again, this is more likely in higher dimensions.
- In higher dimensions there are a lot of other exotic situations, such as valleys, narrow canyons, funnels, discontinuous walls -- and all those make optimization harder.
There are some exceptions -- problems like the multivariate linear regression one -- that are known as "convex", which means that there's just one minimum (global or local), and there are no flat regions, and so one can easily find the minimum. Nonetheless, even in non-convex problems, one often can find at least a local minimum by simply following the gradient, and sliding downhill. We will explore this in the later lectures.
Why is fitting hard: Large dimensionality, multiple minima, flat regions. What is the maximum accuracy of optimization?
For future lectures, start reading:
- http://docs.scipy.org/doc/scipy/reference/optimize.html -- Optimization with Python/SciPy
In the class lectures for this module, we will talk about the following methods:
- Multi-dimensional linear regression (no derivations) using numpy.linalg.lstsq http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.linalg.lstsq.html
- General considerations for nonlinear minimization (with and without derivatives; with and without second derivatives)
- Golden mean 1-d search
- Parabolic interpolation
- Newton method
- Your turn
- Write down a parabolic interpolation or Newton method minimizer of 1-d functions. Minimize a using your minimizer. Do you get the expected result?