Package at.dms.kjc.vanillaSlice

This package was written to provide a test for the classes in backendSupport.

Class Summary
EmitStandaloneCode Takes a ComputeNode collection, a collection of Channels, and a mapping from Channel x end -> ComputeNode and emits code for the ComputeNode.
UniBackEnd The entry to the back end for a uniprocessor or cluster.
UniBackEndFactory Specialization of BackEndFactory for uniprocessor backend.
UniComputeCodeStore Modest extension to ComputeCodeStore.
UniProcessor Completely vanilla extension to ComputeNode for a processor (computation node) with no quirks.
UniProcessors Implementation of at.dms.kjc.backendSupport.ComputeNodesI to provide a collection of UniProcessors.
 

Package at.dms.kjc.vanillaSlice Description

This package was written to provide a test for the classes in backendSupport.

This package currently supports compiling to a uniprocessor. It is planned to extend the package to compile for a shared-memory multiprocessor, invoked as:

strc -newSimple N XXX.str

where N is the number of processors to use in a shared memory multiprocessor, and defaults to 1 if not specified. This number is actually just the number of threads to create and could be set to more than the number of processors if necessary for good load balancing.

Once at.dms.kjc.backendSupport supports all the compiler features currently supported by --cluster 1 --standalone, it is proposed that the strc switch be renamed to uni.

Current shortcomings of this package:

Any unimplemented compiler features inherited from at.dms.kjc.backendSupport

We still need to complete the implementation for shared-memory multi-cores / shared-memory multiprocessors: if more than one processor is specified, the back end needs to create a (p)thread per processor and add code to start the (p)threads at the beginning.
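As a rough illustration of the missing thread-per-processor startup, the following Java sketch creates one thread per requested processor, starts them all at the beginning, and waits for them to finish. It is only an analogy: the back end would presumably emit pthread code rather than Java, and the class name ThreadPerProcessorSketch, the method runOnThreads, and the per-thread work callback are hypothetical names that do not exist in this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical illustration only; not part of at.dms.kjc.vanillaSlice.
class ThreadPerProcessorSketch {

    /**
     * Creates one thread per requested processor, starts them all, and
     * waits for every thread to finish. Returns the number of threads used.
     * The work callback stands in for the code assigned to each thread.
     */
    static int runOnThreads(int requested, IntConsumer work) {
        // Mirror the documented default: fewer than one means one thread.
        int count = Math.max(1, requested);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            final int id = i;
            threads.add(new Thread(() -> work.accept(id), "uni-" + id));
        }
        // Start every thread before any steady-state work is awaited.
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // N comes from the command line and defaults to 1 if not specified.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        AtomicInteger ran = new AtomicInteger();
        int used = runOnThreads(n, id -> ran.incrementAndGet());
        System.out.println("threads used: " + used + ", work items run: " + ran.get());
    }
}
```

Because the count is just a number of threads, nothing in this sketch prevents it from exceeding the number of physical processors.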

There are a few possible schemes for distributing code between threads, none implemented yet:

All of the following schemes can run with double (multi) buffering and blocking: block the producer when all buffers are full, and block the consumer when all buffers are empty. A thread should be awake when every consumer of its output has at least one empty buffer available and every producer of the data it consumes has filled at least one buffer. (This allows bounding buffer sizes when there is a static schedule.)

  1. Software pipelining: each thread takes all stateless filters at different steady states. Stateful filters are somehow divided up. One copy of the code for each filter exists and is called from each thread.
  2. Space multiplexing: each thread takes a subgraph (not necessarily a connected subgraph). The CodeStore can inline the filter work functions since a filter will only appear once in a thread. (The existing Uniprocessor implementation does this inlining but does not generate threads.)
  3. Data parallelism: if all filters except I/O are data parallel, then each thread can run in parallel, with special threads assigned for I/O (once again calling shared filter code).
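The blocking rule shared by all three schemes (block the producer when all buffers are full, block the consumer when all buffers are empty) can be sketched as follows. This is a minimal Java illustration under stated assumptions, not the code the back end emits: the class MultiBufferSketch, the BufferedChannel type, and the run method are hypothetical names, and a single producer thread feeding a single consumer stands in for two threads of a static schedule.

```java
import java.util.ArrayDeque;

// Hypothetical names throughout; not part of at.dms.kjc.backendSupport.
class MultiBufferSketch {

    /** A channel of at most `capacity` filled buffers between two threads. */
    static final class BufferedChannel {
        private final ArrayDeque<int[]> filled = new ArrayDeque<>();
        private final int capacity;
        private int maxFilled = 0;

        BufferedChannel(int capacity) {
            this.capacity = capacity;
        }

        /** Blocks the producer while all buffers are full. */
        synchronized void put(int[] buffer) {
            try {
                while (filled.size() == capacity) {
                    wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            filled.addLast(buffer);
            maxFilled = Math.max(maxFilled, filled.size());
            notifyAll(); // wake a consumer waiting for a filled buffer
        }

        /** Blocks the consumer while all buffers are empty. */
        synchronized int[] take() {
            try {
                while (filled.isEmpty()) {
                    wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            int[] buffer = filled.removeFirst();
            notifyAll(); // wake a producer waiting for an empty buffer
            return buffer;
        }

        synchronized int maxFilled() {
            return maxFilled;
        }
    }

    /**
     * Runs a producer thread against a consumer on the calling thread.
     * Returns { sum of all consumed items, peak number of filled buffers }.
     */
    static long[] run(int capacity, int steadyStates, int itemsPerBuffer) {
        BufferedChannel channel = new BufferedChannel(capacity);
        Thread producer = new Thread(() -> {
            int next = 0;
            for (int s = 0; s < steadyStates; s++) {
                int[] buffer = new int[itemsPerBuffer];
                for (int i = 0; i < itemsPerBuffer; i++) {
                    buffer[i] = next++;
                }
                channel.put(buffer);
            }
        });
        producer.start();
        long sum = 0;
        for (int s = 0; s < steadyStates; s++) {
            for (int v : channel.take()) {
                sum += v;
            }
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { sum, channel.maxFilled() };
    }

    public static void main(String[] args) {
        long[] result = run(2, 100, 4); // capacity 2 = double buffering
        System.out.println("sum=" + result[0] + " peak filled buffers=" + result[1]);
    }
}
```

The peak number of filled buffers never exceeds the chosen capacity, which is the sense in which blocking bounds buffer sizes.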

A back end for a COW (cluster of workstations) is proposed. This will present a challenge to the idea of "one compute node - one code store", since each node in the cluster may itself be a multi-core that needs code assembled for multiple threads.

If the issues with code stores can be worked out, it is proposed that each machine in the cluster be responsible for some group of threads. This would require socket communication and some degree of asynchrony between the machines in the cluster.

Janis' original cluster implementation, which gave a separate thread to each filter, splitter, and joiner, had some problems with undiagnosable hangs.

Please update this document to keep pace with design and implementation.