Microprocessor interfaces

Lecture 10 on Dedicated systems

Teacher: Giuseppe Scollo

University of Catania
Department of Mathematics and Computer Science
Graduate Course in Computer Science, 2018-19

Table of Contents

  1. Microprocessor interfaces
  2. lecture topics
  3. memory-mapped register
  4. integration in the memory hierarchy
  5. mailbox
  6. FIFO queue
  7. shared memory
  8. coprocessor interfaces
  9. custom instruction interfaces
  10. ASIP design flow
  11. example: the Nios-II custom-instruction interface
  12. register files for Nios-II custom instructions
  13. references

lecture topics

outline:

memory-mapped register

memory-mapped interfaces are the most general type of HW/SW interface

Schaumont, Figure 11.1 - A memory-mapped register
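
a minimal sketch of how software typically accesses such a register, assuming a 32-bit register; the names fake_register, MMREG, mmreg_write and mmreg_read are hypothetical, and an ordinary variable stands in for the hardware register so that the example also runs on a host machine:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware register (assumption: on a real target this
   would be a fixed bus address taken from the system's address map). */
static uint32_t fake_register;

/* Hypothetical register handle; volatile forces a real load or store
   for every access made through it. */
#define MMREG ((volatile uint32_t *)&fake_register)

/* Write a value to the memory-mapped register. */
static void mmreg_write(volatile uint32_t *reg, uint32_t value) {
    assert(reg != 0);
    *reg = value;
}

/* Read the current value of the memory-mapped register. */
static uint32_t mmreg_read(volatile uint32_t *reg) {
    assert(reg != 0);
    return *reg;
}
```

on a target system only the definition of MMREG would change, e.g. to a cast of the register's bus address.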

integration in the memory hierarchy

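why must the pointer be a volatile pointer? because the register can change outside the control of the program, so the compiler must not keep its value in a processor register or optimize away repeated accesses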

Schaumont, Figure 11.2 - Integrating a memory-mapped register in a memory hierarchy

however, defining a memory-mapped register with a volatile pointer will not prevent that memory address from being cached!

two approaches to deal with this problem:

  • allocation into a non-cacheable memory area, if the processor has a configurable cache (e.g. a Microblaze)
  • use of specific cache-bypass instructions of the processor (e.g. a Nios-II)

mailbox

simple extension of a memory-mapped register with a handshake mechanism, whereby the communicating parties signal the register state to each other

Schaumont, Figure 11.3 - A mailbox register between hardware and software

the protocol shown in the figure has two synchronization points, namely just after req and ack take the same value
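
a minimal sketch of such a handshake, assuming a four-phase protocol in which the software raises and lowers req and the hardware follows with ack; the types mailbox_t and hw_model_t and the functions hw_step and mailbox_send are hypothetical, and the hardware side is modelled in software so that the example is self-contained:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mailbox: a data register plus req/ack flags
   (memory-mapped registers on a real system). */
typedef struct {
    volatile uint32_t dat;
    volatile uint32_t req;
    volatile uint32_t ack;
} mailbox_t;

/* Software model of the hardware side, for illustration only. */
typedef struct {
    uint32_t received[8];
    int count;
} hw_model_t;

/* One step of the hardware model: ack follows req. */
static void hw_step(mailbox_t *mb, hw_model_t *hw) {
    if (mb->req == 1 && mb->ack == 0) {
        assert(hw->count < 8);
        hw->received[hw->count++] = mb->dat;  /* take the data */
        mb->ack = 1;  /* first synchronization point: req == ack == 1 */
    } else if (mb->req == 0 && mb->ack == 1) {
        mb->ack = 0;  /* second synchronization point: req == ack == 0 */
    }
}

/* Software side: send one word and return the number of polling steps. */
static int mailbox_send(mailbox_t *mb, hw_model_t *hw, uint32_t value) {
    int polls = 0;
    mb->dat = value;
    mb->req = 1;
    while (mb->ack != 1) { hw_step(mb, hw); polls++; }
    mb->req = 0;
    while (mb->ack != 0) { hw_step(mb, hw); polls++; }
    return polls;
}
```

in a real system hw_step would not exist: the polling loops would simply read the ack register until the hardware changes it.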

two main disadvantages of this protocol:

FIFO queue

the use of a FIFO queue compensates for temporary imbalances between the read and write throughputs

Schaumont, Figure 11.4 - A FIFO with handshakes on the read and write ports

a FIFO may be built by chaining multiple FIFO sections, each acting as a slave on input and as a master on output

Schaumont, Figure 11.5 - A one-place FIFO with a slave input handshake and a master output handshake
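
a behavioural sketch of a FIFO built by chaining one-place sections, assuming four sections and one transfer per section per clock step; the names fifo_chain_t, fifo_write, fifo_read and fifo_step are hypothetical, and the handshake signals are abstracted into a full flag per section:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STAGES 4  /* assumption: four one-place sections */

typedef struct {
    uint32_t data[STAGES];
    bool full[STAGES];
} fifo_chain_t;

/* Write port (slave side of section 0): refused while the section is full. */
static bool fifo_write(fifo_chain_t *f, uint32_t value) {
    if (f->full[0]) return false;
    f->data[0] = value;
    f->full[0] = true;
    return true;
}

/* Read port (master side of the last section): fails while it is empty. */
static bool fifo_read(fifo_chain_t *f, uint32_t *out) {
    assert(out != 0);
    if (!f->full[STAGES - 1]) return false;
    *out = f->data[STAGES - 1];
    f->full[STAGES - 1] = false;
    return true;
}

/* One clock step: every full section hands its word to an empty successor.
   Scanning from the output end moves each word by at most one section. */
static void fifo_step(fifo_chain_t *f) {
    for (int i = STAGES - 2; i >= 0; i--) {
        if (f->full[i] && !f->full[i + 1]) {
            f->data[i + 1] = f->data[i];
            f->full[i + 1] = true;
            f->full[i] = false;
        }
    }
}
```

words written at the input ripple towards the output one section per step, and they are read in the order in which they were written.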

shared memory

instead of controlling access to a single register, a single handshake can also be used to control access to a region of memory

Schaumont, Figure 11.6 - A double-buffered shared memory with a memory-mapped request/acknowledge handshake

in one phase of the protocol in the figure, changes are allowed to region 1 of the memory, while in the other phase of the protocol, changes are allowed to region 2 of the memory
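
a sketch of this double-buffering scheme, assuming that the value of req determines which region the software may write and that the hardware acknowledges by copying req into ack; the names shared_mem_t, sw_fill, sw_release and hw_process are hypothetical, the two regions are indexed 0 and 1, and the hardware is modelled by a function that sums the words of the region handed to it:

```c
#include <assert.h>
#include <stdint.h>

#define REGION_WORDS 4  /* assumption: four words per region */

typedef struct {
    uint32_t mem[2][REGION_WORDS];  /* the two regions of the shared memory */
    uint32_t req;                   /* toggled by software */
    uint32_t ack;                   /* set equal to req by hardware */
} shared_mem_t;

/* Software: fill the region it currently owns (region 0 when req == 0). */
static void sw_fill(shared_mem_t *sm, const uint32_t *src) {
    int region = sm->req ? 1 : 0;
    for (int i = 0; i < REGION_WORDS; i++) sm->mem[region][i] = src[i];
}

/* Software: hand the filled region to hardware by toggling req.
   Only legal once hardware has acknowledged the previous region. */
static void sw_release(shared_mem_t *sm) {
    assert(sm->req == sm->ack);
    sm->req ^= 1u;
}

/* Hardware model: process the region just released, then acknowledge. */
static uint32_t hw_process(shared_mem_t *sm) {
    assert(sm->req != sm->ack);
    int region = sm->req ? 0 : 1;
    uint32_t sum = 0;
    for (int i = 0; i < REGION_WORDS; i++) sum += sm->mem[region][i];
    sm->ack = sm->req;
    return sum;
}
```

after sw_release the software may already call sw_fill on the other region while the hardware is still processing the first one, which is the point of double buffering.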

coprocessor interfaces

Schaumont, Figure 11.7 - Coprocessor interface

when high data-throughput between the software and the custom hardware is needed, a dedicated processor interface outperforms memory-mapped interfaces

  • a coprocessor interface does not make use of the on-chip bus; instead, it uses a dedicated port on the processor, driven by coprocessor instructions

both the coprocessor instruction set and the specific coprocessor interface depend on the type of processor; not all processors have a coprocessor interface

main advantages of a coprocessor interface over an on-chip bus:

custom instruction interfaces

the integration of hardware and software can be considerably accelerated as follows:

  1. reserve a portion of the opcodes from a microprocessor for new instructions
  2. integrate the custom-hardware modules directly into the micro-architecture of the micro-processor
  3. control the custom-hardware modules using new instructions derived from the reserved opcodes

the resulting design is called an Application-Specific Instruction-set Processor (ASIP)

ASIP design automates some of the more difficult aspects of HW/SW codesign:

ASIP design flow

Schaumont, Figure 11.12 - ASIP design flow

sequential ASIP design does not generally deliver better performance than SoC design based on custom hardware modules, yet it does deliver less error-prone results

example: the Nios-II custom-instruction interface

the Nios-II softcore processor has a coprocessor interface through which custom instructions may be defined and custom hardware modules may be attached

Schaumont, Figure 11.15 - Nios-II custom-instruction interface timing

the interface supports variable-length execution of custom instructions through a two-way handshake

the clk_en input is used to mask off the clock to the custom hardware when the instruction is inactive
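
a cycle-level behavioural sketch of a variable-length custom instruction, assuming a start/done handshake and a hypothetical hardware module that computes a greatest common divisor by repeated subtraction, one step per clock cycle; the names ci_gcd_t, ci_clock and run_custom are hypothetical, and the model illustrates the handshake only, not the exact signal timing of the figure:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* State of the hypothetical custom hardware module. */
typedef struct {
    uint32_t a, b;
    bool busy;
    bool done;
    uint32_t result;
} ci_gcd_t;

/* One clock edge of the module. With clk_en low the clock is masked
   and the state does not change. */
static void ci_clock(ci_gcd_t *m, bool clk_en, bool start,
                     uint32_t dataa, uint32_t datab) {
    if (!clk_en) return;
    m->done = false;
    if (start) {              /* processor issues the instruction */
        m->a = dataa;
        m->b = datab;
        m->busy = true;
        return;
    }
    if (m->busy) {
        if (m->a == m->b) {   /* finished: signal done to the processor */
            m->result = m->a;
            m->done = true;
            m->busy = false;
        } else if (m->a > m->b) {
            m->a -= m->b;
        } else {
            m->b -= m->a;
        }
    }
}

/* Processor-side model: start the instruction, then stall until done. */
static uint32_t run_custom(ci_gcd_t *m, uint32_t x, uint32_t y, int *cycles) {
    assert(x > 0 && y > 0 && cycles != 0);
    int n = 1;
    ci_clock(m, true, true, x, y);
    while (!m->done) {
        ci_clock(m, true, false, 0, 0);
        n++;
    }
    *cycles = n;
    return m->result;
}
```

the number of cycles returned through cycles depends on the operands, which is what the two-way handshake makes possible.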

register files for Nios-II custom instructions

the use of a local register file in the custom hardware module is also supported

Schaumont, Figure 11.16a - Nios-II custom-instruction integration with processor register file

Schaumont, Figure 11.16b - Nios-II custom-instruction integration with local register file

a custom instruction may take operands from either register file: registers prefixed with r are located in the processor, while registers prefixed with c are located in the custom hardware

  • instructions that use both are allowed, such as custom 0x5, c2, c3, r5

figure 11.16b shows the case for the first input operand only: the control signal reada selects either the processor's or the local register file

  • in the former case, the operand is provided through the dataa port, which is associated with a processor register
  • in the latter case, the input a selects the local register to use as operand
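
the operand selection described above can be sketched as a small behavioural model; the names local_regfile_t and select_operand_a are hypothetical, and both the size of the local register file (32 registers) and the polarity of reada (true selects the processor's register file) are assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LOCAL_REGS 32  /* assumption: 32 local registers c0..c31 */

/* Local register file inside the custom hardware module. */
typedef struct {
    uint32_t c[LOCAL_REGS];
} local_regfile_t;

/* Select the first input operand:
   reada true  -> value from the processor register, delivered on dataa;
   reada false -> local register selected by the index a. */
static uint32_t select_operand_a(bool reada, uint32_t dataa, uint8_t a,
                                 const local_regfile_t *rf) {
    assert(rf != 0 && a < LOCAL_REGS);
    return reada ? dataa : rf->c[a];
}
```

for an instruction whose first operand is c3, the model is called with reada false and a equal to 3, so the value on dataa is ignored.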

references

recommended readings:

for further consultation: