{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Random numbers\n", "Module 5 of Phys212/Biol212 class\n", "### by Ilya Nemenman, 2016-2019" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Defining High Performance Computing\n", "\n", "Note to self: need to expand the introduction\n", "Nowadays news are full with articles about new technological breakthroughs in artificial intelligence: self-driving cars, automatic translation, and even computer-generated art. What makes all of this possible, among other things, is a dramatic increase in our computational capabilities in the last decade. Interestingly, the speed of computer processors has not changed that much over this time -- an individual processor now is only a bit faster than is used to be ten years ago. What has changed is that we figured out how to put more processors into a single computer (and sometimes dramatically more -- literally thousands of specialized processors in some video cards) and how to make all of these processors to work collectively on solving large computational problems we want to solve. In this module, we discuss the basics of such parallel computing and will figure out how to implement it on your laptops.\n", "\n", "What is high performance computing? To answer this, we need to define a series of terms. \n", " - **Processor** , or Central Processing Unit (**CPU**): The *brain* of the computer, the part in which most computing operations are controlled and executed. \n", " - **Serial Processing**: Execution of a single program step by step on one CPU.\n", " - **Sequential Processing**: Execution of multiple programs on a single CPU in a sequence.\n", " - **Concurrent Processing**: Execution of multiple programs at the same time. \n", " - **Parallel Processing**: Execution of multiple programs (or parts of the same one) on multiple CPUs. 
\n", " - **High Performance Computing**: It is basically another name for solving computational problems concurrently on multiple CPUs.\n", "It gets a bit more complicated. One can have serial and sequential processing (completing one task before starting the other on a single CPU). One can also have serial and concurrent processing (alternating between two or more tasks on the same single CPU). Finally, one can have triw parallel and concurrent processing, which means executing multiple tasks simultaneously at the same instant of time on multiple CPUs.\n", "\n", "### Organization of Parallel Computing\n", "What are the different ways of organizing parallel processing? We can develop the intuition by considering how you can arrange for multiple people in a group to do a certain complex task that consists of many different subtasks. For example, let's suppose that a group of people are arriving to a supermarket to buy a long list of supplies for a party. How should the long shopping list be split among the group? There are many different ways of doing this.\n", " - The simplest option is not to delegate different parts of the shopping list to different people, but to have just one person to complete all of the steps of the task. This is equivalent to sequential processing.\n", " - The second option is to have the group captain assign a list of subtasks to each of the people in the group, and let them all focus on each of their assigned tasks. This incurs communication cost. First, the partitioning of the full list of tasks into pieces must be done, and then these sublists must be communicated to the people. When they complete their individual tasks, the results of the completion must be communicated to the leader. Further, there are additional inefficiencies. While the task is being partitioned, the people will be waiting for their assignments, doing nothing. 
Then some of them will finish their subtasks before the others, and will again wait, doing nothing.\n", " - The arrangement above may be improved by having a person who finishes his/her task start helping the others who are still doing theirs. But this is easier said than done. Whom should they help? To know this, each person must constantly communicate his/her progress, either broadcasting it to everyone in the group, or at least sending it to the captain, so that either everyone knows who needs help, or the captain can tell everyone. Further, when a helper arrives, the task that is running behind needs to be partitioned again, which takes additional time and resources.\n", " - To avoid some of these problems, one may want to duplicate the tasks from the start, but prioritize them, so that every person first works on his/her own task, and then switches to other pre-assigned tasks, constantly reporting results. You can fill in for yourself what kinds of problems this solution carries with it.\n", "\n", ">### Your turn\n", "Can you suggest other arrangements of how the tasks can be divided over multiple individuals?\n", "\n", "Note that, in all of these (and your) suggestions, the more complicated the arrangement is, the more communication needs to be done. And communication takes time -- so that, at a certain point, the communication cost outweighs the savings generated by a complex task-partitioning scheme.\n", "\n", "With these different arrangements, it makes sense to have a bit finer characterization of concurrent / parallel processing. 
\n", " - **Parallel processing** per se: Collection of connected, tightly coupled, strongly communicating processors concurrently solving a problem.\n", " - **Distributed processing**: Loosely coupled processors, perhaps great distances from each other, working concurrently on problems with minimal communication among them.\n", "\n", "True parallel processing is further distinguished by how memory of the computers is organized\n", " - **Shared memory**: Parallel processors access the same memory during concurrent processing, requiring minimal communication beyond the shared memory.\n", " - **Distributed memory**: Each processor has it's own memory to operate with, and synchronization of tasks is achieved through communication.\n", "\n", "Crucially, a program written for sequential processing won't execute on many processors. It needs to be **parallelized** or **scaled** to many processors. This usually involves careful consideration of how concurrent processing will be organized, how the tasks will be partitioned, how multiple data streams and execution streams will be synchronized, etc. A useful metric of how well the code is parallelized is the **speedup**, the ratio of the time it takes to execute it on $n$ processors compared to execution on $1$. It's essentially impossible to achieve the speedup of larger than $n$, and it is usually less than $n$ due to communication costs. \n", "\n", "### Types of parallel algorithms\n", "Note to self: need pictures of different parallelization schemes.\n", "Some problems are easier to parallelize than others, and different approaches are required for parallelization. These include:\n", " - **Embarrassingly parallel algorithms**: The task can be divided into subtasks that have virtually no communication.\n", "Typically problems such as generating statistics (e.g., producing many DLA clusters of Module 4) are embarrassingly parallelizable. 
For example, you could have run multiple Jupyter notebooks at the same time, each producing a single cluster, and the cluster sizes could then be averaged.\n", " - **Data partitioning algorithms**: For example, when calculating a large sum, the data set can be partitioned into parts, each part summed independently, and the partial sums then added together. Here a *root* process partitions the data into subsets, and then collects the results. The root has a privileged status -- it distributes the tasks, but doesn't execute them itself.\n", " - **Divide and conquer algorithms**: Work is done without a root. Here each processor gets a task and, if it can't execute it in a requested time, it subdivides the task into two, leaves one half for itself, and recruits another processor for the remaining half. Both processors then continue doing the same, until each of the recruited processors has a manageable task. When the tasks are completed, each processor reports its results to the processor that originally gave it the task. \n", " - **Coarsening algorithms**: Here the processors solve the problem on multiple scales. Note to self: more to be added.\n", "\n", "For all of these algorithms, it is crucial to think about how the data must be organized to speed up communication." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problems to be solved with parallel processing\n", "We already saw a few such problems in Module 4. Essentially every problem solved using agent-based simulations can be parallelized using the embarrassingly parallel approach. Embarrassing parallelization won't work, however, for approaches using cellular automata. The problems we discussed last time (DLA, annihilation) can be solved by data partitioning, where each of the processors analyzes a sub-part of the lattice.\n", "\n", "There's a special class of problems, closely related to cellular automata, for which parallel processing is usually used (and for which, in fact, it was originally developed). 
These are called *partial differential equations*. We will introduce one of them here -- the heat diffusion equation; the Chemical Master Equation from Track 2 of Module 4 is another example. Note to self: maybe stick with CME instead of diffusion? \n", "\n", "Note to self: This needs to be expanded a lot. \n", "The heat diffusion equation is one of many partial differential equations (differential equations involving partial derivatives).\n", " - Fick's first law of diffusion $J=-D\\frac{d\\phi}{dx}$ or $\\vec{J}=-D\\nabla \\phi$ (the same form as the Fourier law of heat conduction when $\\phi=T$).\n", " - Fick's second law of diffusion $\\partial_t \\phi = D\\nabla^2 \\phi$.\n", " - Finite difference form of the heat diffusion equation; boundary conditions are needed to complete the solution.\n", " - Boundary conditions (implemented by extending the grid matrix):\n", " - Absorbing\n", " - Reflecting\n", " - Periodic\n", "\n", "**Solution of the equation using Python code.** To be added.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }