from the city of Vienna: Notes on SPMD architecture III

Requirements.


Wenn man rudert, ist es nicht das Rudern, was das Schiff bewegt, sondern Rudern ist nur eine magische Zeremonie, durch welche man einen Dämon zwingt, das Schiff zu bewegen^[1]

Friedrich Nietzsche

This post states the high-level (system-level) requirements that the SPMD architecture model (Architecture model) must meet to support 1-D and 2-D processing, with either Local memory or Distributed memory, and under both steady and non-steady node workloads over time.

This post builds on the 1-D and 2-D processing implementation described in Appendix "1-D and 2-D processing implementation". Appendix "Local and distributed memory processing" addresses when Local memory processing is possible and when Distributed memory processing is necessary. Appendix "Nodes’ workload over time" defines the steady and non-steady workload conditions of the nodes.

--------------------

The simple SPMD topology presented in Figure 1-1 of the previous article “Introduction” (Introduction) is reproduced here for convenience (see Figure 3-1). The figure depicts:

A set of N processing nodes (n_pi, i=1,..,N). Each node runs a “copy” of the algorithm over different blocks of input data.
One input node (n_i), which manages the input link and distributes the input data to the processing nodes.
One output node (n_o), which collects the processing results and manages the output link.

In Figure 3-1, the following types of data transfers among nodes are considered:

Data distribution: the transfers from the input node to the processing nodes.
Data collection: the transfers from the processing nodes to the output node.
Data redistribution: the transfers among the processing nodes.
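As an illustration of Data distribution and Data collection, here is a minimal Python sketch. It is not the implementation discussed in this series: the function names (`distribute`, `process`, `collect`) and the squaring "algorithm" are assumptions made only for the example, and the nodes are simulated as plain list entries rather than real processors.

```python
def distribute(data, n_nodes):
    """Data distribution (assumed policy): the input node splits the input
    into one contiguous block per processing node, as evenly as possible."""
    base, extra = divmod(len(data), n_nodes)
    blocks, start = [], 0
    for i in range(n_nodes):
        size = base + (1 if i < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def process(block):
    """Placeholder for the "copy" of the algorithm that every node runs."""
    return [x * x for x in block]


def collect(results):
    """Data collection: the output node concatenates the results in node order."""
    return [y for block in results for y in block]


samples = list(range(10))
blocks = distribute(samples, 3)                  # input node -> processing nodes
output = collect([process(b) for b in blocks])   # processing nodes -> output node
```

Every processing node applies the same `process` function to a different block, which is the single-program, multiple-data idea in its simplest form.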

Figure 3-1

Data distribution and collection capabilities are required by the topology itself. Data redistribution is necessary at least to the extent of supporting distributed matrix transposition in 2-D processing on distributed memory. In Figure 3-1, the Data redistribution capability is represented by the curved arrow that connects the outputs of the processing node set back to its inputs.
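The distributed matrix transposition can be sketched as an all-to-all exchange of tiles among the processing nodes. The Python code below is only a simulation under stated assumptions: the matrix is distributed by blocks of rows, the numbers of rows and columns are both multiples of the number of nodes, and the names `split_rows` and `redistribute_transpose` are hypothetical.

```python
def split_rows(matrix, n_nodes):
    """Distribute a matrix by contiguous row blocks, one block per node."""
    rows_per_node = len(matrix) // n_nodes
    return [matrix[i * rows_per_node:(i + 1) * rows_per_node]
            for i in range(n_nodes)]


def redistribute_transpose(local_blocks):
    """Simulate Data redistribution: node i sends to node j the tile made of
    its own rows restricted to the columns that node j will own afterwards."""
    n_nodes = len(local_blocks)
    n_cols = len(local_blocks[0][0])
    cols_per_node = n_cols // n_nodes

    # outbox[i][j] is the tile that node i sends to node j.
    outbox = [[[row[j * cols_per_node:(j + 1) * cols_per_node] for row in block]
               for j in range(n_nodes)]
              for block in local_blocks]

    transposed_blocks = []
    for j in range(n_nodes):
        # Node j receives one tile from every node, in node order.
        received = [outbox[i][j] for i in range(n_nodes)]
        # Stack the tiles vertically, then transpose locally.
        stacked = [row for tile in received for row in tile]
        transposed_blocks.append([list(col) for col in zip(*stacked)])
    return transposed_blocks


matrix = [[r * 4 + c for c in range(4)] for r in range(4)]
row_blocks = split_rows(matrix, 2)           # after the 1-D pass over rows
col_blocks = redistribute_transpose(row_blocks)
```

After the exchange, each node holds complete rows of the transposed matrix, so the second 1-D pass can again run on local data only.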

To get the maximum possible Throughput from the MP machine, the nodes must exhibit internal asynchrony: the devices of a node have to work asynchronously with respect to one another. Processing and communications proceed asynchronously with respect to each other, and so do the input and output communications. As a consequence, internal asynchrony gives rise to a second level of asynchrony: the nodes themselves work asynchronously with respect to one another (external asynchrony).

In addition to processor and memory devices, every node has a communications device that supports the transfers among nodes; this device will be referred to as the bridge device. Moreover, the input and output nodes have communications devices that support the transfers from the input link and to the output link, respectively. These devices will be referred to as the input link communications device and the output link communications device.

The communications devices allow transfers to proceed asynchronously with respect to the rest of the processing. This means that, while they perform transfers, the processor can perform signal or image processing or any other task.

If, additionally, the input and output communications of the nodes are asynchronous, then the nodes as a whole will work asynchronously.

As noted in Appendix "Local and Distributed memory processing", the Architecture model has to be scalable so that the number of processing nodes can be matched to the needs of the processing, in terms of memory and/or Throughput.
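The internal asynchrony of a node can be illustrated with a small Python sketch in which input communications, processing, and output communications run as three independent threads connected by bounded queues. This is an analogy, not a model of real hardware: the threads stand in for the communications devices and the processor, and `run_node` and `buffer_depth` are names assumed for the example.

```python
import queue
import threading


def run_node(blocks, algorithm, buffer_depth=2):
    """Simulate one node: input comms, processor, and output comms overlap."""
    in_q = queue.Queue(maxsize=buffer_depth)    # input comms -> processor
    out_q = queue.Queue(maxsize=buffer_depth)   # processor -> output comms
    results = []
    stop = object()                             # end-of-stream marker

    def input_comms():
        for block in blocks:
            in_q.put(block)      # blocks only when the input buffer is full
        in_q.put(stop)

    def processor():
        while True:
            block = in_q.get()
            if block is stop:
                out_q.put(stop)
                return
            out_q.put(algorithm(block))

    def output_comms():
        while True:
            result = out_q.get()
            if result is stop:
                return
            results.append(result)

    threads = [threading.Thread(target=f)
               for f in (input_comms, processor, output_comms)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_node([[1, 2], [3, 4], [5, 6]], sum)
```

While the processor thread works on one block, the input thread can already be delivering the next one and the output thread can be sending the previous result, which is the overlap the text describes.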
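As a rough illustration of what scalability is used for, the following Python sketch sizes the number of processing nodes from the memory and Throughput needs. It assumes, optimistically, that both memory and Throughput scale linearly with the number of nodes; the function name `nodes_required` and its parameters are hypothetical, and any consistent units can be used.

```python
import math


def nodes_required(memory_needed, memory_per_node,
                   throughput_needed, throughput_per_node):
    """Smallest number of processing nodes that covers both needs,
    assuming ideal linear scaling (an assumption of this sketch)."""
    by_memory = math.ceil(memory_needed / memory_per_node)
    by_throughput = math.ceil(throughput_needed / throughput_per_node)
    return max(1, by_memory, by_throughput)


# Example: memory alone would need 2 nodes, Throughput needs 4.
n = nodes_required(memory_needed=8, memory_per_node=4,
                   throughput_needed=100, throughput_per_node=25)
```

The larger of the two counts wins: a data set that fits in one node's memory may still need several nodes for Throughput, and the reverse can also happen.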

Hereinafter, we will consider that the architecture meets the following requirements:

It supports the Data distribution, redistribution and collection capabilities
It is scalable
It gets the maximum possible Throughput from the MP Machine
It supports continuous input and output data flows without losing data (no-data-loss)
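The no-data-loss requirement can be illustrated with a simple backlog calculation. The Python sketch below is a toy model, not a result from this series: time is divided into equal slots, `arrivals` gives the blocks arriving in each slot, `capacity` gives the blocks the nodes can absorb in that slot (which varies when the workload is non-steady), and the names `max_backlog` and `loses_data` are hypothetical.

```python
def max_backlog(arrivals, capacity):
    """Peak number of blocks waiting in the input buffer over all time slots."""
    backlog = peak = 0
    for arrived, absorbed in zip(arrivals, capacity):
        backlog = max(0, backlog + arrived - absorbed)
        peak = max(peak, backlog)
    return peak


def loses_data(arrivals, capacity, buffer_blocks):
    """True if the backlog ever exceeds the available buffer."""
    return max_backlog(arrivals, capacity) > buffer_blocks


arrivals = [2, 2, 2, 2]    # steady input flow
capacity = [3, 1, 1, 3]    # non-steady workload: capacity dips in the middle
peak = max_backlog(arrivals, capacity)
```

In this toy model, a buffer of `peak` blocks is just enough to ride out the dip in capacity; anything smaller would drop input data.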

In the writing of this article, Saodaj' (Pokor Lèr) have collaborated in an involuntary but decisive way.

---------------------

1. When one rows it is not the rowing which moves the ship: rowing is only a magical ceremony by means of which one compels a demon to move the ship.

2. Picture: Embarcadero Center, San Francisco | Pixabay https://pixabay.com/en/spiral-staircase-architecture-1149509/

3. I want to thank Carol G. for her revision of this text.


Leopoldo Gomez
