from the city of Vienna: Notes for a multiprocessor skeleton V

Node programming model.

It's hardware that makes a machine fast,

it's software that makes a fast machine slow

Craig Bruce

This post describes the programming model for the input, output and processing nodes.

As shown in the “Notes on SPMD architecture” series, the SPMD architecture consists of three functionally different types of nodes. Consequently, at most three different software components are necessary: those corresponding to the input, the output, and the processing nodes.

As a general principle, the nodes’ programming model must keep the implementation of the algorithm independent of the implementation of the Skeleton.
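One possible way to picture this independence, as a minimal sketch rather than the Skeleton's actual interface, is to let the Skeleton call the algorithm only through a function pointer. The names AlgorithmFn, Node, nodeProcess and scaleByTwo are hypothetical and do not come from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature: the algorithm sees only buffers, never channels. */
typedef void (*AlgorithmFn)(const float *in, float *out, size_t len);

/* Skeleton side: knows when to call the algorithm, not what it computes. */
typedef struct {
    AlgorithmFn algorithm;
} Node;

static void nodeProcess(const Node *node, const float *in, float *out, size_t len)
{
    assert(node != NULL && node->algorithm != NULL);
    node->algorithm(in, out, len);
}

/* Algorithm side: a toy computation that doubles every sample. */
static void scaleByTwo(const float *in, float *out, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        out[i] = 2.0f * in[i];
    }
}
```

With this split, replacing scaleByTwo with another function of the same signature changes the algorithm without touching the Skeleton code.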

In this post, as in others in this series, wherever it was considered necessary for clarity to explain something related to the programming language, C-like pseudocode has been used.

--------------------

We will consider that the operation states for the input, output, and processing nodes are the same. The following state machine (Figure 5-1) shows these operation states and the transitions between them.

Figure 5-1

The Init, Run and Exit states correspond to the normal operation of the node. The Error state has been included to give the machine a controlled state to enter when an error occurs that the error management considers non-recoverable.

We will assume that all the nodes implement this state machine, although the functionality of the states varies depending on the type of node.
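As a rough sketch, the state machine can be written as a transition function. The transitions used here are only those named in the comments of the state pseudocode in this post (Init to Run when finished, Run to Exit on a terminate signal, any state to Error on an error); Figure 5-1 may define others. The enum names, the event names, and the choice to make Error terminal are assumptions made for illustration.

```c
#include <assert.h>

typedef enum { STATE_INIT, STATE_RUN, STATE_EXIT, STATE_ERROR } NodeState;

/* Hypothetical events: step finished, terminate signal, non-recoverable error. */
typedef enum { EVENT_DONE, EVENT_TERMINATE, EVENT_ERROR } NodeEvent;

static NodeState nextState(NodeState current, NodeEvent event)
{
    assert(event == EVENT_DONE || event == EVENT_TERMINATE || event == EVENT_ERROR);

    if (current == STATE_ERROR) {
        return STATE_ERROR;            /* assumption: Error is left only with outside help */
    }
    if (event == EVENT_ERROR) {
        return STATE_ERROR;            /* on error, go to Error State */
    }
    switch (current) {
    case STATE_INIT:
        return event == EVENT_DONE ? STATE_RUN : STATE_INIT;
    case STATE_RUN:
        return event == EVENT_TERMINATE ? STATE_EXIT : STATE_RUN;
    case STATE_EXIT:
        return STATE_EXIT;
    default:
        return STATE_ERROR;
    }
}
```

Each node type would share this function and differ only in the work done inside each state.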

Next, the states’ functionality is described for the processing node. The C-like functions used were described in the previous post “Channels”. Error management is not addressed.

Two versions of the Run state are shown. The first covers 1-D processing and 2-D processing on local memory; the second covers 2-D processing on distributed memory. In the first case a Redistribution channel is not necessary, since there are no Corner Turns, but in the second case it is.

The pseudocode for the input and output nodes can be easily derived from the pseudocode presented for the processing node in the first of the cases mentioned above.

State Init

/* Get resources; when finished go to State Run */

/* On error, go to Error State */

createChannel(inputDistributionChannel);

createChannel(outputCollectionChannel);

createChannel(inputRedistributionChannel);

createChannel(outputRedistributionChannel);

...

State Exit

/* Free resources; when finished, exit the code */

/* On error, go to Error State */

destroyChannel(inputDistributionChannel);

destroyChannel(outputCollectionChannel);

destroyChannel(inputRedistributionChannel);

destroyChannel(outputRedistributionChannel);

…

State Error

/* Do nothing or almost nothing. Ask or wait for help */

…

The Run state for the cases of 1-D processing and 2-D processing on local memory is described next. Note that, in the pseudocode presented, the inputChannel of the input node is the channel corresponding to the input link, and the outputChannel of the output node is the channel corresponding to the output link.

State Run

/* 1-D processing and 2-D processing on local memory*/

/* Execute the algorithm */

/* On signal of terminate, go to Exit State */

/* On error, go to Error State */

/* Get a data buffer received through the input distributionChannel */

get(inputBuffer, inputDistributionChannel);

/* Get an empty buffer from the output collectionChannel to store the results of the process */

get(outputBuffer, outputCollectionChannel);

/* Process the received buffer */

algorithm();

/* Return an empty buffer to the input distributionChannel */

put(inputBuffer, inputDistributionChannel);

/* Send a full buffer through the output collectionChannel */

put(outputBuffer, outputCollectionChannel);
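The buffer cycle above can be tried out with a toy stand-in for the channels. This is only a sketch: ToyChannel, toyGet, toyPut, runOnce and negate are hypothetical names, and the single-slot channel is an assumption made to keep the example short. It is not the channel interface described in the “Channels” post.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 4

typedef struct { float data[BUF_LEN]; } Buffer;

/* Toy channel (assumption): a single slot holding at most one buffer. */
typedef struct { Buffer *slot; } ToyChannel;

static Buffer *toyGet(ToyChannel *ch)
{
    assert(ch->slot != NULL);          /* a buffer must be waiting */
    Buffer *b = ch->slot;
    ch->slot = NULL;
    return b;
}

static void toyPut(ToyChannel *ch, Buffer *b)
{
    assert(ch->slot == NULL);          /* the slot must be free */
    ch->slot = b;
}

/* One pass of the Run state: get both buffers, process, put both back. */
static void runOnce(ToyChannel *in, ToyChannel *out,
                    void (*algorithm)(const Buffer *, Buffer *))
{
    Buffer *inputBuffer = toyGet(in);     /* full data buffer */
    Buffer *outputBuffer = toyGet(out);   /* empty buffer for the results */
    algorithm(inputBuffer, outputBuffer);
    toyPut(in, inputBuffer);              /* return the consumed buffer */
    toyPut(out, outputBuffer);            /* send the full result buffer */
}

/* Toy algorithm: negate every sample. */
static void negate(const Buffer *src, Buffer *dst)
{
    for (size_t i = 0; i < BUF_LEN; ++i) {
        dst->data[i] = -src->data[i];
    }
}
```

After one call to runOnce, both buffers are back in their channels: the input buffer is free for reuse and the output buffer carries the results.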

The Run state for the case of 2-D processing on distributed memory is described next. The pseudocode shown corresponds to an n-stage algorithm with n-1 Corner Turns, one between each pair of consecutive stages.

State Run

/* 2-D processing on distributed memory */

/* Execute the algorithm */

/* On signal of terminate, go to Exit State */

/* On error, go to Error State */

switch (stage)
{

case 0:

/* Get a data buffer received through the input distributionChannel */

get(inputBuffer, inputDistributionChannel);

/* Get an empty buffer from the output redistributionChannel to store the results of the process */

get(outputBuffer, outputRedistributionChannel);

/* Process the received buffer applying the first stage (stage 0) of the algorithm */

algorithmStage(0);

/* Return an empty buffer to the input distributionChannel */

put(inputBuffer, inputDistributionChannel);

/* Send a full buffer through the output redistributionChannel */

put(outputBuffer, outputRedistributionChannel);

break;

case 1:

get(inputBuffer, inputRedistributionChannel);

get(outputBuffer, outputRedistributionChannel);

/* Process the received buffer applying the second stage (stage 1) of the algorithm */

algorithmStage(1);

put(inputBuffer, inputRedistributionChannel);

put(outputBuffer, outputRedistributionChannel);

break;

…

case n-1:

get(inputBuffer, inputRedistributionChannel);

get(outputBuffer, outputCollectionChannel);

/* Process the received buffer applying the last stage (stage n-1) of the algorithm */

algorithmStage(n-1);

put(inputBuffer, inputRedistributionChannel);

put(outputBuffer, outputCollectionChannel);

break;

default:

break;

}

Note that, for the case of 2-D processing on distributed memory, Corner Turns have to be performed between stages of the algorithm. Consequently, in the intermediate stages (stages 1 to n-2), the Redistribution channel is both the input and the output channel. In the first stage, however, the input and output channels are the Distribution and the Redistribution channels, respectively, and in the last stage they are the Redistribution and the Collection channels, respectively.
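The rule in the previous paragraph can be condensed into two small functions. This is an illustrative sketch only: ChannelKind, inputChannelFor and outputChannelFor are hypothetical names, and the sketch assumes at least two stages (n >= 2), numbered from 0 as in the switch statement.

```c
#include <assert.h>

typedef enum { CH_DISTRIBUTION, CH_REDISTRIBUTION, CH_COLLECTION } ChannelKind;

/* Only the first stage reads from the Distribution channel. */
static ChannelKind inputChannelFor(int stage, int n)
{
    assert(n >= 2 && stage >= 0 && stage < n);
    return stage == 0 ? CH_DISTRIBUTION : CH_REDISTRIBUTION;
}

/* Only the last stage writes to the Collection channel. */
static ChannelKind outputChannelFor(int stage, int n)
{
    assert(n >= 2 && stage >= 0 && stage < n);
    return stage == n - 1 ? CH_COLLECTION : CH_REDISTRIBUTION;
}
```

For a three-stage algorithm, stage 1 gets the Redistribution channel from both functions, which is the intermediate-stage case.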

Franco Battiato (No Time No Space. Mondi lontanissimi, 1985) collaborated in the writing of this article in an involuntary but decisive way.