{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# HPX + Cling + Jupyter\n", "This tutorial works in a special Jupyter notebook that can be used in one of two ways:\n", "* From this website: https://hpx-jupyter.cct.lsu.edu\n", "* From the docker image: stevenrbrandt/fedora-hpx-cling\n", "* Normally, each cell should contain declarations, e.g. definitions of functions,\n", " variables, or `#include` statements.\n", "
\n", " ```#include \n", "using namespace std;```
\n", "* If you wish to process an expression, e.g. ```cout << \"hello world\\n\"``` you\n", " can put ```.expr``` at the front of the cell.\n", "
\n", " ```.expr cout << \"hello, world\\n\";```
\n", "* Sometimes you will want to test a cell because you are uncertain whether\n", " it might cause a segfault or some other error that will kill your kernel.\n", " Other times, you might want to test a definition without permanently adding\n", " it to the current namespace. You can do this by prefixing your cell with\n", " ```.test```. Whatever is calculated in a test cell will be thrown away\n", " after evaluation and will not kill your kernel.\n", "
\n", " ```.test.expr int foo[5];\n", "foo[10] = 1;```
\n", "## Docker Instructions\n", "* First, install Docker on your local machine\n", "* Second, start Docker, e.g. ```sudo service docker start```\n", "* Third, run the fedora-hpx-cling container, e.g.\n", "\n", "
```$ docker pull stevenrbrandt/fedora-hpx-cling\n", "$ docker run -it -p 8000:8000 stevenrbrandt/fedora-hpx-cling```
\n", " \n", " After you do this, docker will respond with something like\n", " \n", "
`http://0.0.0.0:8000/?token=5d1eb8a4797851910de481985a54c2fdc3be80280023bac5`
\n", " \n", " Paste that URL into your browser, and you will be able to interact with the notebook.\n", "* Fourth, play with the existing ipynb files or create new ones.\n", "* Fifth, save your work! This is an important step. If you simply quit the container, everything you did will be lost. To save your work, first find the ID of your running container using ```docker ps```.\n", "\n", "
```$ docker ps\n", "CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES\n", "4f806b5f4fb3 stevenrbrandt/fedora-hpx-cling \"/bin/sh -c 'jupyter \" 11 minutes ago Up 11 minutes 0.0.0.0:8000->8000/tcp dreamy_turing```
\n", "\n", " Once you have the container ID (in this case, it's 4f806b5f4fb3), you can use ```docker cp``` to transfer files to or from your container.\n", " \n", "
```$ docker cp 4f806b5f4fb3:/home/jup/HPX_by_example.ipynb .\n", "$ docker cp HPX_by_example.ipynb 4f806b5f4fb3:/home/jup```
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#include " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using namespace std;\n", "using namespace hpx;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What is a (the) Future?\n", "\n", "Many ways to get hold of a future, simplest way is to use (std) async:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "int universal_answer() { return 42; }\n", "void deep_thought()\n", "{\n", " future promised_answer = async(util::annotated_function(&universal_answer,\"universal answer\"));\n", " // do other things for 7.5 million years\n", " cout << promised_answer.get() << endl; // prints 42\n", " apex::dump(true);\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to do something other than a declaration, use the \".expr\" prefix." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "42\n", "\n", "Elapsed time: 65.7861 seconds\n", "Cores detected: 4\n", "Worker Threads observed: 2\n", "Available CPU time: 131.572 seconds\n", "\n", "Timer : #calls | mean | total | % total \n", "------------------------------------------------------------------------------------------------\n", " : 1 0.00e+00 0.00e+00 0.000\n", " APEX MAIN : 1 6.58e+01 6.58e+01 100.000\n", " background_work : 1 2.01e-05 2.01e-05 0.000\n", " call_startup_functions_action : 1 0.00e+00 0.00e+00 0.000\n", " load_components_action : 1 0.00e+00 0.00e+00 0.000\n", " pre_main : 1 0.00e+00 0.00e+00 0.000\n", " run_helper : 1 0.00e+00 0.00e+00 0.000\n", " task_object::apply : 1 6.58e+01 6.58e+01 49.984\n", " universal answer : 1 1.49e-05 1.49e-05 0.000\n", " APEX Idle : 6.58e+01 50.016\n", "------------------------------------------------------------------------------------------------\n", " Total timers : 9\n" ] }, { "data": { "text/plain": [] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ ".expr deep_thought()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Compositional Facilities" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "future<string> make_string()\n", "{\n", " future<int> f1 = async([]()->int { return 123; });\n", " future<string> f2 = f1.then(\n", " [](future<int> f) -> string\n", " {\n", " return to_string(f.get()); // here .get() won't block\n", " });\n", " return f2;\n", "}" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Elapsed time: 5.3791 seconds\n", "Cores detected: 4\n", "Worker Threads observed: 2\n", "Available CPU time: 10.7582 seconds\n", 
"\n", "Timer : #calls | mean | total | % total \n", "------------------------------------------------------------------------------------------------\n", " : 1 0.00e+00 0.00e+00 0.000\n", " APEX MAIN : 1 5.38e+00 5.38e+00 100.000\n", " background_work : 1 2.06e-05 2.06e-05 0.000\n", " call_startup_functions_action : 1 0.00e+00 0.00e+00 0.000\n", " load_components_action : 1 0.00e+00 0.00e+00 0.000\n", " pre_main : 1 0.00e+00 0.00e+00 0.000\n", " run_helper : 1 0.00e+00 0.00e+00 0.000\n", " task_object::apply : 1 5.42e+00 5.42e+00 50.420\n", " universal answer : 1 0.00e+00 0.00e+00 0.000\n", " APEX Idle : 5.33e+00 49.579\n", "------------------------------------------------------------------------------------------------\n", " Total timers : 9\n", "123\n" ] }, { "data": { "text/plain": [] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ ".expr cout << make_string().get() << endl << apex::dump(true);\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Parallel Algorithms\n", "HPX allows you to write loop parallel algorithms in a generic fashion, applying to specify the way in which parallelism is achieved (i.e. threads, distributed, cuda, etc.) through polcies." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#include \n", "#include \n", "#include " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "vector v = { 1, 2, 3, 4, 5, 6 };" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transform\n", "Here we demonstrate the transformation of a vector, and the various mechnanisms by which it can performed in parallel." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "// This parallel tranformation of vector v\n", "// is done using thread parallelism. 
An\n", "// implicit barrier is present at the end.\n", "parallel::transform (\n", " parallel::par,\n", " begin(v), end(v), begin(v),\n", " [](int i) -> int\n", " {\n", " return i+1; \n", " });\n", "for(int i : v) cout << i << \",\";" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "// This parallel transformation of vector v\n", "// is done using thread parallelism. There\n", "// is no implicit barrier. Instead, the\n", "// transform returns a future.\n", "auto f = parallel::transform (\n", " parallel::par (parallel::v3::task),\n", " begin(v), end(v), begin(v),\n", " [](int i) -> int\n", " {\n", " return i+1; \n", " });\n", " \n", "// wait for the future to be ready.\n", "f.wait();\n", "\n", "for(int i : v) cout << i << \",\";" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#include \n", "#include \n", "#include " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "auto host_targets = hpx::compute::host::get_local_targets();\n", "typedef hpx::compute::host::block_executor<> executor_type;\n", "executor_type exec(host_targets);" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "// Print out a list of the localities, i.e. 
hosts\n", "// that can potentially be involved in this calculation.\n", "// This notebook will probably show 1, alas.\n", "for(auto host : host_targets)\n", " cout << host.get_locality() << endl;" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "// This parallel transformation of vector v\n", "// is done using distributed parallelism.\n", "parallel::transform (\n", " parallel::execution::par.on(exec),\n", " begin(v), end(v), begin(v),\n", " [](int i) -> int\n", " {\n", " return i+1; \n", " });\n", "\n", "for(int i : v) cout << i << \",\";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other Algorithms\n", "There are a great many algorithms. Here we demonstrate \"fill\"." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "std::vector<float> vd;\n", "for(int i=0;i<10;i++) vd.push_back(1.f);\n", "parallel::fill(parallel::execution::par.on(exec),vd.begin(),vd.end(),0.0f);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Let’s Parallelize It – Adding Real Asynchrony\n", "\n", "Here we take a step back. 
Instead of using a pre-designed parallel operation on a vector, we introduce task-level parallelism to an existing program.\n", "\n", "Calculate Fibonacci numbers in parallel (1st attempt)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "uint64_t fibonacci(uint64_t n)\n", "{\n", " // if we know the answer, we return the value\n", " if (n < 2) return n;\n", " // asynchronously calculate one of the sub-terms\n", " future<uint64_t> f = async(launch::async, &fibonacci, n-2);\n", " // synchronously calculate the other sub-term\n", " uint64_t r = fibonacci(n-1);\n", " // wait for the future and calculate the result\n", " return f.get() + r;\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr cout << fibonacci(10) << endl;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Let’s Parallelize It – Introducing Control of Grain Size\n", "Parallel calculation, switching to serial execution below a given threshold" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "const int threshold = 20;\n", "\n", "uint64_t fibonacci_serial(uint64_t n)\n", "{\n", " if (n < 2) return n;\n", " uint64_t f1 = fibonacci_serial(n-2);\n", " uint64_t f2 = fibonacci_serial(n-1);\n", " return f1 + f2;\n", "}\n", "\n", "uint64_t fibonacci2(uint64_t n)\n", "{\n", " if (n < 2) return n;\n", " if (n < threshold) return fibonacci_serial(n);\n", " // asynchronously calculate one of the sub-terms\n", " future<uint64_t> f = async(launch::async, &fibonacci2, n-2);\n", " // synchronously calculate the other sub-term\n", " uint64_t r = fibonacci2(n-1);\n", " // wait for the future and calculate the result\n", " return f.get() + r;\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr cout << fibonacci2(22) << endl;" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "# Let’s Parallelize It – Apply Futurization\n", "Parallel way, futurize algorithm to remove suspension points" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "future fibonacci3(uint64_t n)\n", "{\n", " if(n < 2) return make_ready_future(n);\n", " if(n < threshold) return make_ready_future(fibonacci_serial(n));\n", "\n", " future f = async(launch::async, &fibonacci3, n-2);\n", " future r = fibonacci3(n-1);\n", "\n", " return dataflow(\n", " [](future f1, future f2) {\n", " return f1.get() + f2.get();\n", " },\n", " f, r);\n", "}\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr cout << fibonacci3(22).get() << endl;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Let’s Parallelize It – Unwrap Argument Futures" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#include \n", "\n", "using hpx::util::unwrapped;\n", "\n", "future fibonacci4(uint64_t n)\n", "{\n", " if(n < 2) return make_ready_future(n);\n", " if(n < threshold) return make_ready_future(fibonacci_serial(n));\n", "\n", " future f = async(launch::async, &fibonacci4, n-2);\n", " future r = fibonacci4(n-1);\n", "\n", " return dataflow(\n", " unwrapped([](uint64_t f1, uint64_t f2) {\n", " return f1+f2;\n", " }),\n", " f, r);\n", "}\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr cout << fibonacci4(22).get() << endl;" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Excercise: Parallelize a sort" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "using namespace std;\n", "function&)> myqsort 
= [](vector<int>& v)->void {};" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ ".expr\n", "myqsort = [](vector<int>& v)->void {\n", " if(v.size()<2) return;\n", " vector<int> pre, eq, post;\n", " int pivot = v[rand() % v.size()];\n", " for(int val : v) {\n", " if(val < pivot) pre.push_back(val);\n", " else if(pivot < val) post.push_back(val);\n", " else eq.push_back(val);\n", " }\n", " myqsort(pre);\n", " myqsort(post);\n", " // copy the sorted partitions back into v\n", " for(int i=0;i<pre.size();i++) v[i] = pre[i];\n", " for(int i=0;i<eq.size();i++) v[pre.size()+i] = eq[i];\n", " for(int i=0;i<post.size();i++) v[pre.size()+eq.size()+i] = post[i];\n", "};\n", "vector<int> vv{20};\n", "for(int i=0;i<20;i++) vv.push_back(rand() % 100);\n", "for(int val : vv) cout << val << \" \";\n", "cout << endl;\n", "myqsort(vv);\n", "for(int val : vv) cout << val << \" \";\n", "cout << endl;" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "C++14", "language": "", "name": "cling-cpp14" }, "language_info": { "codemirror_mode": "c++", "file_extension": ".c++", "mimetype": "text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 2 }