# Guide:TAUChapel

### From TAU Wiki

Revision as of 23:23, 4 October 2013, by Scottb (previous revision: 00:37, 3 October 2013, by Scottb)

## Revision as of 23:23, 4 October 2013

# Chapel

## MonteCarlo example

To test out some of Chapel's language features, let us program a Monte Carlo simulation to calculate PI. We can estimate PI by counting how many random points with coordinates x,y fall inside the unit circle, i.e. satisfy x^2+y^2<=1.
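As a sketch of why counting points estimates PI, assume the n points are drawn uniformly from the unit square with 0 <= x,y <= 1 (the text above does not say how the points are generated, so this is an assumption). With c the number of points inside the circle, the hit fraction approximates the ratio of the quarter-circle area to the square area:

$$
\frac{c}{n} \;\approx\; \frac{\pi/4}{1}
\quad\Longrightarrow\quad
\pi \;\approx\; \frac{4c}{n}
$$

This is the expression `c * 4.0 / n` returned by the routines in the following sections.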

### Basic

Here is the basic routine that computes PI:

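    proc compute_pi(p_x: [] real(64), p_y: [] real(64)) : real {
      var c : sync int;   // shared counter, guarded by full/empty state
      c = 0;
      forall i in 1..n {
        // Test the i-th point; n is assumed to be a global holding the point count.
        if (p_x[i] ** 2 + p_y[i] ** 2 <= 1) then
          c += 1;         // every update synchronizes on c
      }
      return c * 4.0 / n;
    }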

Notice that the **forall** loop may execute its iterations in parallel, hence the need to declare **c** as a **sync** variable. Performance here is limited by the need to synchronize every access to **c**: a profile of this version shows 70% of the time spent in synchronization. Let's see if we can do better.

### Procedure promotion

One feature of Chapel is procedure promotion: when a procedure that takes scalar arguments is called with arrays instead, it behaves as if each element of the arrays were passed to the procedure, in parallel:

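    proc compute_pi(p_x: [] real(64), p_y: [] real(64)) : real {
      var c : sync int;
      c = 0;              // a sync variable starts out empty, so fill it first
      // in_circle() is promoted over the arrays and yields one bool per point.
      forall inside in in_circle(p_x, p_y) {
        c += inside;
      }
      return c * 4.0 / n;
    }

    proc in_circle(x: real(64), y: real(64)): bool {
      return (x ** 2 + y ** 2) <= 1;
    }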
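To see promotion in isolation, here is a minimal sketch that is separate from the PI example. The helper `square` and the variable names are hypothetical, chosen only for illustration. It calls a scalar procedure with an array and compares the result with an explicit loop over the same elements.

```chapel
// Hypothetical scalar helper, not part of the original guide.
proc square(x: real(64)): real(64) {
  return x * x;
}

var a: [1..4] real(64) = [1.0, 2.0, 3.0, 4.0];

// Calling the scalar procedure with an array promotes it:
// square() is applied to every element, potentially in parallel.
var promoted = square(a);

// The same computation written as an explicit parallel loop.
var looped: [1..4] real(64);
forall i in a.domain do
  looped[i] = square(a[i]);

writeln(promoted);
writeln(looped);
```

Both lines printed should be identical, since the promoted call and the explicit loop compute the same element-wise result.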

### Reduction

A further reorganization lets us take advantage of Chapel's built-in reduction:

    proc compute_pi(p_x: [] real(64), p_y: [] real(64)) : real {
      var c : int;
      c = + reduce in_circle(p_x, p_y);
      return c * 4.0 / n;
    }

This also improves performance.
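For readers who want to try the reduction version end to end, the following is a sketch of a complete single-locale program. Several parts are assumptions rather than the guide's own code: `n` is taken to be a `config const`, the points are generated with `fillRandom` from the standard `Random` module, and the two seed values are arbitrary. It also assumes your Chapel version accepts a `seed` argument to `fillRandom`.

```chapel
use Random;

// Number of sample points; assumed to be a config const so that it can
// be overridden on the command line (for example --n=5000000).
config const n = 1000000;

proc in_circle(x: real(64), y: real(64)): bool {
  return (x ** 2 + y ** 2) <= 1;
}

proc compute_pi(p_x: [] real(64), p_y: [] real(64)): real {
  // in_circle() is promoted over both arrays; + reduce counts the hits.
  const c: int = + reduce in_circle(p_x, p_y);
  return c * 4.0 / n;
}

var p_x: [1..n] real(64);
var p_y: [1..n] real(64);

// Fill both arrays with uniform values in [0, 1). The seeds are
// arbitrary odd numbers, kept distinct so x and y are not identical.
fillRandom(p_x, seed=314159265);
fillRandom(p_y, seed=271828183);

const estimate = compute_pi(p_x, p_y);
writeln("pi is approximately ", estimate);
```

Fixed seeds make the run repeatable, which is convenient when comparing profiles of the different versions.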

### Multiple Locales

Let's look at how the arrays of x and y values are allocated:

    var p_x: [1..n] real(64);
    var p_y: [1..n] real(64);

However, Chapel provides a way to distribute these arrays across multiple locales:

    use BlockDist;   // provides the Block distribution
    const space = {1..n};
    var Dom: domain(1) dmapped Block(boundingBox=space) = space;
    var p_x: [Dom] real(64);
    var p_y: [Dom] real(64);

This **Block** mapping allocates the elements block-wise among the locales, so each locale holds one contiguous chunk of the index range. Furthermore, the reduction used earlier will continue to work.
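One way to check how the elements are spread out is to record, for every index, the id of the locale that executes the corresponding loop iteration. The sketch below does that for a small domain. The array name `owner` and the small default for `n` are assumptions made for illustration, and it keeps the `dmapped Block(...)` spelling used in this guide, which newer Chapel releases may spell differently.

```chapel
use BlockDist;

// A small default so the printed map stays readable (assumed value).
config const n = 8;

const space = {1..n};
var Dom: domain(1) dmapped Block(boundingBox=space) = space;

// owner[i] will hold the id of the locale that stores element i.
var owner: [Dom] int;

// A forall over a distributed domain runs each iteration on the
// locale that owns the index, so here.id identifies the owner.
forall i in Dom do
  owner[i] = here.id;

writeln("number of locales: ", numLocales);
writeln("owner of each index: ", owner);
```

When the program is launched on several locales (for example with `-nl 4`), the printed map should show contiguous runs of the same locale id. On a single locale every entry is 0.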