from the city of Vienna: Notes for a multiprocessor skeleton I

Introduction.

Work together, help each other and communicate

Mauricio Pellegrino

The object of this work is to describe the multiprocessor skeleton (the Skeleton) as a component of the software of Single Program Multiple Data (SPMD) machines dedicated to signal and/or image processing applications.

The Skeleton is the logical communications infrastructure that connects the processors. This work first addresses the requirements. It then covers the channels (the software objects that support the transfers), the configuration, and the nodes' programming model. Finally, some points of the presented model are discussed.

The series “Notes for a multiprocessor skeleton” takes the series “Notes on SPMD architecture” as its starting point. In particular, the Architecture model defined there is the reference for this work.

Within this text, the terms computer and machine are used interchangeably.

--------------------

In order to state the concept of Skeleton, we will make use of the Architecture model described throughout the "Notes on SPMD Architecture" series. A summary of that description is included below.

SPMD machines are parallel processing computers that operate using an identical copy of the program in each processor, and in which each processor acts on different chunks of data. The following figure shows the topology of the Architecture model.

Figure 1-1

This work focuses on SPMD machines dedicated to signal and image processing applications, so hereinafter we will use the term “algorithm” instead of “program”.

In terms of nodes, the previous figure depicts:

A set of N processing nodes (n_pi, i=1,..,N). Each node runs a “copy” of the algorithm over different blocks of input data.
One input node (n_i), which manages the input link and distributes the input data to the processing nodes.
One output node (n_o), which collects the processing results and manages the output link.

And in terms of data transfers:

Data distribution: the transfers from the input node to the processing nodes.
Data collection: the transfers from the processing nodes to the output node.
Data redistribution: the transfers among the processing nodes.
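
The node roles and the distribution and collection transfers listed above can be pictured with a minimal single-process sketch. It is only an illustration under stated assumptions: the function names (`distribute`, `collect`, `run_spmd`, `scale_by_two`) and the contiguous block-splitting rule are hypothetical and do not come from the Architecture model, and the "nodes" here are plain loop iterations rather than real processors.

```python
from typing import Callable, List, Sequence

Block = List[float]


def distribute(data: Sequence[float], n_nodes: int) -> List[Block]:
    """Input node (n_i): split the input into n_nodes contiguous blocks.

    Assumption: blocks are contiguous and as even as possible.
    """
    size, extra = divmod(len(data), n_nodes)
    blocks: List[Block] = []
    start = 0
    for i in range(n_nodes):
        stop = start + size + (1 if i < extra else 0)
        blocks.append(list(data[start:stop]))
        start = stop
    return blocks


def collect(results: List[Block]) -> Block:
    """Output node (n_o): concatenate the per-node results in node order."""
    return [x for block in results for x in block]


def run_spmd(data: Sequence[float], n_nodes: int,
             algorithm: Callable[[Block], Block]) -> Block:
    """Simulate the topology: distribute, process, collect."""
    blocks = distribute(data, n_nodes)
    # Every processing node runs the same algorithm on its own block.
    results = [algorithm(block) for block in blocks]
    return collect(results)


def scale_by_two(block: Block) -> Block:
    """Stand-in for the algorithm; any block-wise function would do."""
    return [2 * x for x in block]


if __name__ == "__main__":
    print(run_spmd([1, 2, 3, 4, 5], n_nodes=2, algorithm=scale_by_two))
```

Data redistribution is left out of this sketch because it only matters when a node needs data held by another processing node.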

Data distribution and collection capabilities are required by the topology itself. Data redistribution is necessary at least to the extent of covering the distributed matrix transposition required by 2-D processing on distributed memory (see the Appendix “1-D and 2-D Processing implementation” of the “Notes on SPMD architecture” series). In the previous figure, the Data redistribution capability is represented by the curved arrow that connects the outputs of the processing node set to its inputs.
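
To make the role of redistribution concrete, the following sketch simulates a distributed matrix transposition in a single process. It is a hedged illustration, not the method of the referenced Appendix: it assumes a row-block decomposition, assumes that both matrix dimensions are divisible by the number of nodes, and uses hypothetical names (`split_rows`, `redistribute_transpose`). Each node cuts its row block into one tile per destination node, the tiles are exchanged, and each node transposes what it received.

```python
from typing import List

Matrix = List[List[int]]


def split_rows(matrix: Matrix, n_nodes: int) -> List[Matrix]:
    """Give each node a contiguous block of rows (rows divisible by n_nodes)."""
    rows_per_node = len(matrix) // n_nodes
    return [matrix[i * rows_per_node:(i + 1) * rows_per_node]
            for i in range(n_nodes)]


def redistribute_transpose(local_blocks: List[Matrix]) -> List[Matrix]:
    """Return the row blocks of the transposed matrix, one per node."""
    n_nodes = len(local_blocks)
    n_cols = len(local_blocks[0][0])
    cols_per_node = n_cols // n_nodes

    # Step 1 (redistribution): outbox[src][dst] is the tile of src's rows
    # restricted to the columns that dst will own after the transposition.
    outbox = [
        [[row[dst * cols_per_node:(dst + 1) * cols_per_node] for row in block]
         for dst in range(n_nodes)]
        for block in local_blocks
    ]

    # Step 2 (local work): each node stacks the tiles it received, in source
    # order, and transposes the stacked tile locally.
    result: List[Matrix] = []
    for dst in range(n_nodes):
        received = [row for src in range(n_nodes) for row in outbox[src][dst]]
        result.append([list(col) for col in zip(*received)])
    return result


if __name__ == "__main__":
    m = [[4 * r + c for c in range(4)] for r in range(4)]
    for node, block in enumerate(redistribute_transpose(split_rows(m, 2)), 1):
        print(f"n_p{node}:", block)
```

Step 1 is the only part in which processing nodes exchange data with each other, which is why the topology needs the redistribution path in addition to distribution and collection.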

The input node and/or the output node may sometimes need to process the data with a lightweight section of the algorithm (see Figure 1-2 in the Introduction post of the “Notes on SPMD architecture” series). In any case, we will continue to use the term “algorithm” for the code that runs on the processing nodes, since it is the heaviest section.

We will consider that the hardware of the multiprocessor (MP) machine is not tied to the application, so the hardware is reusable for different applications; it is the software that customizes the machine for each application.

From a physical point of view, an MP computer consists of processor boards connected by a high-speed bus, plus input and output interfaces. From a logical point of view, it consists of nodes connected by channels. Nodes support processing, and channels support communications.
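
The logical view of nodes connected by channels can be sketched as a small data structure. Everything in this sketch is an assumption made for illustration: the `Channel` class, its `send` and `receive` methods, the node names, and in particular the choice of one redistribution channel per ordered pair of processing nodes are hypothetical and are not taken from the channel library described in this series.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class Channel:
    """Hypothetical one-way FIFO link between two named nodes."""
    source: str
    destination: str
    queue: Deque[object] = field(default_factory=deque)

    def send(self, item: object) -> None:
        self.queue.append(item)

    def receive(self) -> object:
        return self.queue.popleft()


def build_channels(n_processing: int) -> List[Channel]:
    """Build the logical links of the topology for N processing nodes."""
    procs = [f"n_p{i}" for i in range(1, n_processing + 1)]
    # Data distribution: input node to every processing node.
    channels = [Channel("n_i", p) for p in procs]
    # Data collection: every processing node to the output node.
    channels += [Channel(p, "n_o") for p in procs]
    # Data redistribution (assumed: one channel per ordered pair of nodes).
    channels += [Channel(a, b) for a in procs for b in procs if a != b]
    return channels


if __name__ == "__main__":
    for ch in build_channels(2):
        print(ch.source, "->", ch.destination)
```

Under these assumptions, N processing nodes need N distribution channels, N collection channels, and N(N-1) redistribution channels.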

As said above, the Skeleton is the logical communications infrastructure that connects the nodes. In the previous figure, it is represented by the thick black arrows and the curved gray arrow.

From the perspective of the software implementation, the MP machine application code can be partitioned into two layers: the Skeleton code layer and the algorithm code layer. The Skeleton layer is supported by the channel library and the algorithm layer by the mathematical function library, as shown in the next figure. The operating system, the channel library, and the mathematical function library are each supplied, recommended, or supported by the machine provider.

Multiprocessor machine software layers diagram

Figure 1-2

This approach enables code reusability and independent code development. Code is reusable because different applications, based on different algorithms, can use the same Skeleton implementation. Development is independent because the Skeleton code and the algorithm code sit in separate layers, each supported by its own library.

Paco de Lucía, John McLaughlin and Al Di Meola (Mediterranean Sundance, Pavarotti & Friends for War Child) collaborated in the writing of this article in an involuntary but decisive way.