
The DOL framework features a code generation back-end for efficiently executing applications on the Sony/Toshiba/IBM Cell Broadband Engine (CBE). The back-end relies on a lightweight run-time system based on protothreads and windowed FIFOs. The technical details are described in the following paper:

W. Haid, L. Schor, K. Huang, I. Bacivarov, and L. Thiele. Efficient Execution of Kahn Process Networks on Multi-Processor Systems Using Protothreads and Windowed FIFOs. In Proc. IEEE Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia), pages 35–44, Grenoble, France, Oct. 2009.

The aim of this page is to briefly summarize the main ideas and show how to use the back-end. In addition, the page links to further documents and hosts the material for replicating the experiments performed in the paper.

  • Protothreads: Protothreads are typically used for programming embedded systems that are constrained in memory and performance. They are a simple yet effective approach to executing (non-preemptive) processes using a single CPU context and a single stack. The context-switch overhead is therefore very low, and no further multi-threading support is required to execute multiple processes on a single processor.

  • Windowed FIFO: Unlike standard FIFOs, windowed FIFOs support direct access to a contiguous data segment in the (circular) FIFO buffer. These segments are called "windows", hence the name "windowed FIFO". Compared to standard FIFOs, windowed FIFOs are more efficient because unnecessary memory copies are avoided. The Kahn process network semantics are not affected by using windowed FIFOs instead of standard FIFOs.
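To make the protothread idea concrete, here is a minimal sketch in C. It is not the code of the DOL run-time system: the macros PT_BEGIN, PT_YIELD, and PT_END, the types pt_t and process_t, and the functions process_fire and pt_demo are hypothetical names chosen for illustration. The sketch assumes the common switch-based implementation technique, in which a process stores only the line at which it should resume, so all processes can share one stack.

```c
/* Hypothetical minimal protothread macros (illustrative only, not the DOL API).
 * A protothread remembers just the source line at which to resume. */
typedef struct { int line; } pt_t;

#define PT_BEGIN(pt) switch ((pt)->line) { case 0:
#define PT_YIELD(pt) do { (pt)->line = __LINE__; return 0; case __LINE__: ; } while (0)
#define PT_END(pt)   } (pt)->line = 0; return 1

/* State that must survive a yield lives in the process structure,
 * because local variables are lost when the function returns. */
typedef struct {
    pt_t pt;
    int  i;     /* loop counter */
    char name;  /* tag written to the trace */
} process_t;

static char trace[16];
static int  trace_len = 0;

/* One process: records its name three times, yielding after each step.
 * Returns 0 while it still has work to do and 1 when it has finished. */
static int process_fire(process_t *p)
{
    PT_BEGIN(&p->pt);
    for (p->i = 0; p->i < 3; p->i++) {
        trace[trace_len++] = p->name;
        PT_YIELD(&p->pt);  /* cooperative switch: hand control back to the scheduler */
    }
    PT_END(&p->pt);
}

/* Round-robin scheduler: calls each unfinished process in turn on a
 * single stack and returns the recorded interleaving. */
const char *pt_demo(void)
{
    process_t procs[2] = { { {0}, 0, 'A' }, { {0}, 0, 'B' } };
    int done[2] = { 0, 0 };
    int remaining = 2;

    trace_len = 0;
    while (remaining > 0) {
        for (int k = 0; k < 2; k++) {
            if (!done[k] && process_fire(&procs[k])) {
                done[k] = 1;
                remaining--;
            }
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Calling pt_demo() from a main function returns the string recording the order in which the two processes ran. A context switch here is only a function return followed by a later call, which is why the overhead stays low.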

The main features of the developed run-time system are:

  • cooperative multi-threading on the PPE and the SPEs

  • direct windowed FIFO communication between processes mapped to SPEs (PPE not involved)

  • overlapping of computation and communication by making use of DMA engines (memory flow controllers)

In summary, this allows for an efficient, completely distributed execution of Kahn process networks on the CBE.
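The windowed FIFO access pattern can be sketched as follows. This is an illustrative single-processor model in C, not the CBE implementation, which moves data between local stores using DMA. The type wfifo_t, the capacity WFIFO_CAPACITY, and the functions wfifo_init, wfifo_reserve, wfifo_release, wfifo_capture, wfifo_consume, and wfifo_demo are hypothetical names. The sketch also assumes that a window is shortened when it would otherwise wrap around the end of the circular buffer; the actual run-time system may handle this differently.

```c
#include <stddef.h>

#define WFIFO_CAPACITY 4  /* number of float tokens; hypothetical size */

typedef struct {
    float  buf[WFIFO_CAPACITY];
    size_t head;      /* index of the next token to read */
    size_t tail;      /* index of the next free slot */
    size_t used;      /* tokens currently stored */
    size_t reserved;  /* size of the open write window */
    size_t captured;  /* size of the open read window */
} wfifo_t;

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

void wfifo_init(wfifo_t *f)
{
    f->head = f->tail = f->used = f->reserved = f->captured = 0;
}

/* Writer: obtain a contiguous window of up to len free slots.
 * Returns the window size, which may be smaller than len. */
size_t wfifo_reserve(wfifo_t *f, float **win, size_t len)
{
    size_t free_slots = WFIFO_CAPACITY - f->used;
    size_t to_end     = WFIFO_CAPACITY - f->tail;  /* slots before wrap-around */
    size_t n = min_sz(len, min_sz(free_slots, to_end));
    *win = n ? &f->buf[f->tail] : NULL;
    f->reserved = n;
    return n;
}

/* Writer: make the tokens written into the window visible to the reader. */
void wfifo_release(wfifo_t *f)
{
    f->tail = (f->tail + f->reserved) % WFIFO_CAPACITY;
    f->used += f->reserved;
    f->reserved = 0;
}

/* Reader: obtain a contiguous window of up to len stored tokens. */
size_t wfifo_capture(wfifo_t *f, const float **win, size_t len)
{
    size_t to_end = WFIFO_CAPACITY - f->head;
    size_t n = min_sz(len, min_sz(f->used, to_end));
    *win = n ? &f->buf[f->head] : NULL;
    f->captured = n;
    return n;
}

/* Reader: free the slots of the window once the tokens have been processed. */
void wfifo_consume(wfifo_t *f)
{
    f->head = (f->head + f->captured) % WFIFO_CAPACITY;
    f->used -= f->captured;
    f->captured = 0;
}

/* Producer writes three tokens in place; consumer sums them in place.
 * No token is copied into or out of a private buffer. */
float wfifo_demo(void)
{
    wfifo_t f;
    float *w;
    const float *r;
    float sum = 0.0f;
    size_t n;

    wfifo_init(&f);
    n = wfifo_reserve(&f, &w, 3);
    for (size_t i = 0; i < n; i++) w[i] = (float)(i + 1);
    wfifo_release(&f);

    n = wfifo_capture(&f, &r, 3);
    for (size_t i = 0; i < n; i++) sum += r[i];
    wfifo_consume(&f);
    return sum;
}
```

With a standard FIFO, the producer would fill a local array and copy it in, and the consumer would copy it out again before computing. Here both sides work directly on the buffer, which is the source of the saved memory copies.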


[Figure: CBE back-end overview]


CBE Package
The CBE package includes the following directories and files. Refer to the Get Started section below for instructions on how to set up the CBE framework on a computer.

  • dol: DOL distribution including the CBE run-time environment, code generator, and programmer's guide. This is an extended version of DOL compared to the one contained in dol_ethz.zip.
  • multiprocessor: experiments for execution on the CBE
  • singleprocessor: experiments for execution on a single Linux workstation
  • source: source files of all the experiments
  • README: explanations for executing the experiments

Get Started
This section provides the basic steps, starting from the download of the CBE package to the execution of an example application on the CBE. More detailed information is available in the tool guide which is included in the DOL framework. The programmer's guide describing CBE-specific issues is available in the CBE package.

The requirements for executing applications leveraging the CBE package are:

If the above-mentioned environments are in place, do the following to execute an application. For illustration, a simple producer-consumer example referred to as examplecell is used.

Note: examplecell is located in the source directory of the CBE package. It includes the process network, the architecture, and the mapping, all described in a DOL-compliant manner (see the DOL documentation for further details).

  1. Set up the DOL framework, as described on the DOL page.

  2. After the build directory has been created, change to the following directory:
    $ cd build/bin/main

  3. Run the code generation for the example:
    $ ant -f runexample.xml -Dnumber=cell cell

    The output should look similar to the following:

    $ ant -f runexample.xml -Dnumber=cell cell

    Buildfile: runexample.xml
    showversion:
    showantversion:
    [echo] Use Apache Ant version 1.6.5 compiled on February 17 2006.
    showjavaversion1:
    [echo] Use Java version 1.5.0_06 (required version: 1.5.0 or higher).
    showjavacversion1:
    [echo] Use Java version 1.5.0_06 (required version: 1.5.0 or higher).
    cell:
    prepare:
    [echo] Create directory examplecell.
    [echo] Copy C source files.
    validate:
    [echo] check XML compliance of examplecell_flattened.xml.
    [java] /home/user/DOL/DOLCrt/dolPrototype/trunk/examples/examplecell/examplecell.xml is valid.
    flatten1:
    [echo] Create flattened XML examplecell_flattened.xml.
    [java] .............................................
    [javac] Compiling 1 source file to /home/user/DOL/DOLCrt/dolPrototype/trunk/build/bin/main/examplecell
    dol_cell1:
    [echo] Run cell generation.

    [java] Read process network from XML file
    [java] -- full filename: file:/home/user/DOL/DOLCrt/dolPrototype/trunk/build/bin/main/examplecell/examplecell_flattened.xml
    [java] -- Process network model from XML [Finished]

    [java] Read architecture from XML file
    [java] -- full filename: file:/home/user/DOL/DOLCrt/dolPrototype/trunk/examples/examplecell/cell.xml
    [java] -- Architecture model from XML [Finished]

    [java] Read mapping from XML file
    [java] -- full filename: file:/home/user/DOL/DOLCrt/dolPrototype/trunk/examples/examplecell/mapping.xml
    [java] -- Mapping from XML [Finished]

    [java] Consistency check:
    [java] APPL: Checking resource name ...
    [java] APPL: Checking channel ports ...
    [java] APPL: Checking channel connection ...
    [java] APPL: Checking Process connection ...
    [java] ARCH: Checking resource name ...
    [java] ARCH: Checking network simulators ...
    [java] MAP: Checking multiple bindings ...
    [java] MAP: Checking that all processes have a binding ...
    [java] -- Consistency check [Finished]

    [java] Generating Mapping in Dotty format:
    [java] -- Generation [Finished]

    [java] Generating Cell-package:
    [java] Cell: Use predefined mapping.
    [java]       All other parameters are ignored.
    [java] Read architecture from XML file
    [java] -- full filename: file:/home/user/DOL/DOLCrt/dolPrototype/trunk/examples/examplecell/cell.xml
    [java] -- Architecture model from XML [Finished]

    [java] Read mapping from XML file
    [java] -- full filename: file:/home/user/DOL/DOLCrt/dolPrototype/trunk/examples/examplecell/mapping.xml
    [java] -- Mapping from XML [Finished]
    [java] Mapped process generator to the PPU
    [java] Mapped process square_0 to the SPU_4
    [java] Mapped process square_1 to the SPU_1
    [java] Mapped process square_2 to the SPU_2
    [java] Mapped process square_3 to the SPU_3
    [java] Mapped process square_4 to the SPU_4
    [java] Mapped process square_5 to the SPU_5
    [java] Mapped process square_6 to the SPU_1
    [java] Mapped process square_7 to the SPU_2
    [java] Mapped process square_8 to the SPU_3
    [java] Mapped process consumer to the SPU_0
    [java] Cell: Nr of SPE is 6
    [java] Cell: Mapped some processes to the PPE
    [java] -- Generation [Finished]
    BUILD SUCCESSFUL
    Total time: 7 seconds

  4. Copy the generated source files to the CBE platform (or simulator).

  5. On the CBE platform, change to the directory examplecell/cell and compile the application by executing make.

  6. Once the executable has been created, run the application by executing ./sc_application.

    The output should look similar to the following:

    $ ./sc_application

    )PPE: spe thread start run
    )PPE: spe thread start run
    )PPE: spe thread start run
    )PPE: spe thread start run
    )PPE: spe thread start run
    )PPE: spe thread start run
    SPU 0: start to execute
    SPU 4: start to execute
    SPU 5: start to execute
    SPU 1: start to execute
    SPU 2: start to execute
    SPU 3: start to execute
    consumer: 0.000000
    consumer: 512.000000
    consumer: 1024.000000
    consumer: 1536.000000
    consumer: 2048.000000
    consumer: 2560.000000
    )PPE: spe thread finish run
    )PPE: spe thread finish run
    )PPE: spe thread finish run
    )PPE: spe thread finish run
    )PPE: spe thread finish run
    )PPE: spe thread finish run
    )PPE:) Complete running all super-fast SPEs and PPE


  Note: To execute the experiments, several run.sh scripts are provided in the singleprocessor and multiprocessor directories.
 
   
!!! This document is stored in the ETH Web archive and is no longer maintained !!!