Exercise 4: putting it all together, using the SDF3 design-flow

In the previous exercises you have seen how to model resource allocation decisions (e.g., buffer sizes) and architecture components (e.g., interconnect) in an FSM-SADF graph. Using sdf3analyze-fsmsadf, the throughput of the FSM-SADF graph under this resource allocation can be analyzed. A design-flow that maps a throughput-constrained application, modeled with an FSM-SADF graph, onto a multi-processor platform should perform exactly these two steps: model the resource allocation decisions in the FSM-SADF graph, and then check whether the throughput constraint is met. If so, the design-flow can continue with the next step. Otherwise, the design-flow needs to look for an alternative solution (i.e., a different mapping or schedule).

The SDF3 tool-kit contains a tool called sdf3flow-fsmsadf that takes as input an FSM-SADF graph modeling an application and an (abstract) description of a multi-processor platform. In a series of steps, the tool maps the application onto the multi-processor platform. As a first step, the tool determines the buffer size requirements of the channels in the FSM-SADF graph. Next, the tool refines the throughput constraint into a number of additional constraints that are used to guide the binding of actors to processors. The actual binding of actors to processors is performed in the third step.

Each scenario in the SADF graph could in principle use a different mapping. To implement this, a run-time reconfiguration mechanism is needed that can transfer data items (tokens) and code (actors) between different memories whenever a scenario switch occurs. To provide timing guarantees, a design flow should take the overhead of the run-time reconfiguration into account. In the worst case, a reconfiguration is performed after executing a single iteration of the graph. Hence, scenario switches can occur very frequently (the MPEG-4 decoder may switch scenarios 20 times per second). Providing timing guarantees while allowing such frequent reconfigurations may lead to large resource reservations. Therefore, sdf3flow-fsmsadf assumes that the actors of an SADF graph are mapped to the same resources in all scenarios. This unified mapping avoids the need to move data items or code between different memories when switching between scenarios.

After the binding step, the tool constructs static-order schedules for all processors to which actors have been bound. Finally, the tool computes the minimal TDMA time slices needed on these processors to guarantee that the throughput constraint of the application is met. By minimizing the TDMA time slices, processor resources are saved for other applications.
The output of the tool is a set of Pareto optimal mappings that provide a trade-off in their resource usage. In some of these mappings, the application could, for example, use a lot of computational resources but limited storage resources, whereas the opposite may hold in other mappings. At run-time, the most suitable mapping can then be selected based on the resource usage of the applications that are already running on the platform.
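
As a rough illustration of what "Pareto optimal" means in this context, the sketch below keeps only those candidate mappings that are not dominated by another candidate. The two cost dimensions (processor_share and memory_bytes), the candidate values, and the function names are hypothetical. They are not taken from sdf3flow-fsmsadf or from its output format.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    processor_share: float  # hypothetical cost: fraction of processor time reserved
    memory_bytes: int       # hypothetical cost: amount of storage reserved


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b in both costs and strictly better in one."""
    no_worse = (a.processor_share <= b.processor_share
                and a.memory_bytes <= b.memory_bytes)
    strictly_better = (a.processor_share < b.processor_share
                       or a.memory_bytes < b.memory_bytes)
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up candidates: A is light on memory, B is light on processor time,
# and C is worse than both A and B in every respect.
candidates = [
    Candidate("A", 0.80, 1000),
    Candidate("B", 0.30, 4000),
    Candidate("C", 0.85, 4500),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Candidates A and B each win on one resource, so neither can be discarded without knowing what the other applications on the platform need. Candidate C is never the better choice and is dropped.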

In this exercise, you will be using the sdf3flow-fsmsadf tool to map a very simple FSM-SADF graph onto a homogeneous two-processor platform. The figure below shows the graph as well as the platform used in this exercise. The platform consists of two tiles connected through (abstract) connections. In reality, these connections may be implemented using, for example, dedicated FIFOs or a network-on-chip. From the perspective of sdf3flow, these connections provide timing guarantees on the amount of time needed to send a token from one tile to another (i.e., it should be possible to model the timing behavior of the interconnect with a dataflow graph similar to the models we have used in the previous exercise). Each tile contains a processor, a local memory, and a network interface that connects the tile to its connections.

Open your file browser and go to the folder '<path where you unpacked the archive>/hands-on/exercise 4'.
Open the file 'example.xml' in a text-editor and study the contents of this file.
Is the platform used in this example a homogeneous or heterogeneous processor platform?
The platform contains two processors that are both of type p1. Hence, this platform is homogeneous. The sdf3flow tool can also handle heterogeneous platforms. To map an application onto a heterogeneous platform, a designer can simply change the processor type of one of the processors to something other than p1. Of course, the tool will only be able to use this different processor type when one or more actors support it. In our example, actor R supports processors of both type p1 and type p2. So, this actor could potentially be mapped to a processor of type p2 if one were available in the platform.
What is the throughput constraint that must be met when mapping the application onto the platform?
The application, when running on the platform, should achieve a throughput of at least 0.001 iterations/time-unit (see line 89 of example.xml).
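
To make the constraint more tangible, the short calculation below turns the throughput value into a time budget per graph iteration. Only the value 0.001 comes from example.xml. The helper function and the two achieved-throughput values are hypothetical and serve only as an illustration.

```python
# Throughput constraint from example.xml, in iterations per time-unit.
throughput_constraint = 0.001

# The reciprocal is the largest average number of time-units that one
# iteration of the graph may take while still meeting the constraint.
max_avg_period = 1.0 / throughput_constraint


def meets_constraint(achieved_throughput: float) -> bool:
    """Hypothetical helper: a mapping is acceptable when it is at least this fast."""
    return achieved_throughput >= throughput_constraint


print(f"time budget per iteration: {max_avg_period:.0f} time-units")
print(meets_constraint(0.002))   # made-up mapping, one iteration per 500 time-units
print(meets_constraint(0.0005))  # made-up mapping, one iteration per 2000 time-units
```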

As mentioned before, the tool sdf3flow-fsmsadf can be used to map an application onto a multi-processor platform. You are now going to run this tool to generate such a mapping.

Open a terminal (command window) and change the local directory to '<path where you unpacked the archive>/hands-on/exercise 4'.
Execute the following command, depending on your platform:
../../sdf3/linux/sdf3flow-fsmsadf --output mapping.xml
or
..\..\sdf3\windows\sdf3flow-fsmsadf --output mapping.xml
At start-up, sdf3flow-fsmsadf reads the contents of sdf3.opt. This file contains the location of the application graph and the architecture graph that are used in the mapping flow. If you have renamed the file example.xml, then you should change the file attribute in the applicationGraph and architectureGraph elements inside the file sdf3.opt. These two file attributes should point to the correct location.
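
If you want to check which files the tool will read, a small script along the following lines could list the two file attributes. This is only a sketch: the element names applicationGraph and architectureGraph and their file attribute come from the description above, whereas the root element and overall layout of SAMPLE_OPT are a hypothetical stand-in for the real sdf3.opt, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of sdf3.opt. The real file may have a
# different root element and additional settings.
SAMPLE_OPT = """<root>
  <applicationGraph file="example.xml"/>
  <architectureGraph file="example.xml"/>
</root>"""


def graph_files(opt_xml: str) -> dict:
    """Collect the file attribute of the two graph elements, wherever they occur."""
    root = ET.fromstring(opt_xml)
    files = {}
    for tag in ("applicationGraph", "architectureGraph"):
        # iter() searches the whole tree, so the nesting depth does not matter.
        for elem in root.iter(tag):
            files[tag] = elem.get("file")
    return files


files = graph_files(SAMPLE_OPT)
for tag, path in files.items():
    print(f"{tag}: {path}")
```

To inspect your own settings, pass the contents of sdf3.opt to graph_files instead of SAMPLE_OPT.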

The tool will print information on its decisions in the various steps of the flow to the terminal. At the end, it will report to you whether it has found one or more feasible mappings of the application onto the platform. These mappings have been stored in the file 'mapping.xml'.

Open the file 'mapping.xml' in a text-editor and study the contents of this file.
How many different mappings have been found by sdf3flow-fsmsadf?

There are three mappings listed inside the sdf3 element. These mappings are called 'initial', '5', and '11'. The last two mappings provide information on which actors have been bound to which processor. They also specify all memory and interconnect resource requirements for these mappings. Both of these mappings could be implemented on the platform. Whenever the resource allocation is performed as indicated in these mappings, we can guarantee that our application will always meet its throughput constraint, independent of whatever other applications are running simultaneously on the platform.

The first mapping (called 'initial') is not a real mapping. It can be used to provide mapping constraints to the tool. You could copy this initial mapping to the 'example.xml' file and specify a partial mapping in it. When running sdf3flow-fsmsadf again, the tool would use this initial mapping to constrain its search for valid mappings. We will not explore this option any further in this tutorial, but you can of course always give it a try yourself.

What is the difference between the various mappings found by sdf3flow-fsmsadf?

The two mappings '5' and '11' are in fact very similar. The only difference is that mapping '5' uses tile t1 whereas mapping '11' uses tile t2. Note that in this example, both mappings use only a single processor. This is due to the fact that our interconnect has a relatively large delay compared to the actor execution times. As a result, a single-processor solution is always more resource efficient than a two-processor solution. Therefore, the tool discards all two-processor solutions that it considers along the way.
