The Microcode Level Timeslicing Processor Architecture

Daniel Curtis McCrackin, McMaster University

Abstract

The von Neumann computer model, upon which most modern computers are based, is lacking in both parallelism and support for multitasking. One convenient measure of processor parallelism is the degree to which a processor utilizes its available memory bandwidth for useful work: its Memory Bandwidth Efficiency (MBE). A processor which exhibits an MBE of 100% is operating as fast as possible for its given memory speed.
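As a rough illustration of the MBE measure, the sketch below treats it as the ratio of memory cycles that carry useful work to the memory cycles available. The function name, its parameters, and the sample figures are assumptions made for illustration; the abstract does not give a formula, so this is one plausible reading rather than the thesis's exact definition.

```python
def memory_bandwidth_efficiency(useful_cycles: int, total_cycles: int) -> float:
    """Fraction of available memory cycles spent on useful transfers.

    Assumed reading of MBE: useful memory cycles / available memory cycles.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= useful_cycles <= total_cycles:
        raise ValueError("useful_cycles must lie between 0 and total_cycles")
    return useful_cycles / total_cycles


# Hypothetical figures: a processor that wastes one memory cycle in four
# (for example on a prefetch discarded after a taken jump).
print(f"{memory_bandwidth_efficiency(750, 1000):.0%}")  # prints 75%
```

Under this reading, an MBE of 100% means every available memory cycle is used for useful work, which is the limit described above.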

Conventional processor accelerators like prefetching and pipelining increase parallelism but suffer from the jump problem, in which a taken jump may cause incorrect prefetching. These accelerators are not well adapted to multitasking, although enhancements like selectable register files may be used to reduce context switching overhead. Pipelined multistreaming achieves a high degree of parallelism, avoids the jump problem and supports efficient context switching, but its performance is load dependent and it is awkward to implement. Furthermore, none of these architectures support efficient process resource sharing.

Microcode Level Timeslicing (MLT) is a multistream processor architecture that achieves very high processor MBE, has no-overhead context switching and provides support for resource sharing. Within the processor, process state information is replicated N-fold. Prefetching occurs horizontally across streams, allowing the jump problem to be circumvented. Context switching occurs at microinstruction boundaries, giving no overhead for up to N streams. The fetching and executing mechanisms are controlled by a Stream Control Unit (SCU), which contains task status information for each stream. Efficient process control and resource sharing operations are readily supported in the SCU.
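The switching scheme described above can be sketched as a toy software model. Everything here is an assumption made for illustration: the class and field names, the round-robin selection policy, and the single ready bit per stream are hypothetical and are not taken from the prototype's design.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Per-stream state; one copy is kept for each of the N streams."""
    remaining: int = 0   # microinstructions this stream still has to execute
    ready: bool = True   # task status bit, of the kind an SCU might hold
    micro_pc: int = 0    # position in this stream's microinstruction sequence


class StreamControlUnit:
    """Hypothetical SCU: selects the next ready stream in round-robin order."""

    def __init__(self, streams):
        self.streams = streams
        self.last = -1  # index of the most recently selected stream

    def select(self):
        n = len(self.streams)
        for offset in range(1, n + 1):
            idx = (self.last + offset) % n
            stream = self.streams[idx]
            if stream.ready and stream.remaining > 0:
                self.last = idx
                return idx
        return None  # no stream is ready with work left


def run(scu):
    """Execute one microinstruction per cycle, switching stream every cycle."""
    trace = []
    while (idx := scu.select()) is not None:
        stream = scu.streams[idx]
        stream.micro_pc += 1   # state is already resident, so nothing is saved or restored
        stream.remaining -= 1
        trace.append(idx)
    return trace


streams = [Stream(remaining=3), Stream(remaining=2), Stream(remaining=3, ready=False)]
trace = run(StreamControlUnit(streams))
print(trace)  # [0, 1, 0, 1, 0]
```

In this model the blocked third stream is simply skipped, and the switch between streams costs nothing because each stream's state is held in its own replicated copy.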

The hardware and software design of a prototype processor demonstrating the MLT principle for up to 16 streams is presented. Process control and synchronization operations are implemented at the microcode level. Two high-level language benchmarks, the Sieve of Eratosthenes and the Dhrystone, are used to evaluate the prototype's performance.