| Computer System Architecture

Pipelining & RISCs

Introduction to Pipelining

Pipelining and parallelism are 2 methods used to achieve concurrency.

· - Pipelining increases concurrency by dividing a computation into a number of steps.

· - Parallelism is the use of multiple resources to increase concurrency.

Pipelining is a key implementation technique used to build fast processors that can be seen in RISC architecture. It allows the execution of multiple instructions to overlap in time.

In a non-pipelined processing, by contrast, the next data/instruction is processed after the entire processing of the previous data/instruction is complete.

Pipelined Laundry: Start work ASAP

LESSONS:

Pipelining doesn’t help latency of single task, it helps throughput of entire workload
Multiple tasks operating simultaneously using different resources
Potential speedup = Number pipe stages
Pipeline rate limited by slowest pipeline stage
Unbalanced lengths of pipe stages reduces speedup

Instruction of Pipelining

Typical instruction execution sequences:

— fetch, decode, read, execute, write, etc

In a non-pipelined CPU, instructions are performed “one at a time”.ie. before an instruction is begun, the preceding instruction is completed.

To implement instruction pipelining, desirable features of (instruction set) IS:

o all instructions same length

o registers specified in same place in instruction

o memory operands only in loads or stores, i.e. RISC

Two Stage Instruction Pipeline

Pipelining of Unequal Stages

Important for pipelining where stages are unequal:

— Always take the largest of the stage delay to be the cycle time

— No stage overlaps and latency must be constant

— Ensure that instruction overlap is the same as the cycle time else get timing diagram is wrong

Pipiline Performance

• n:number of instructions

• k: stages in pipeline

• t : cycletime (time in seconds needed to advance a set of instructions one stage through the pipeline)

• T_k: total time for pipelining

• T₀ : total time without pipelining

Total time for equal stages

• For n instructions, with k equal stages and delay of t for each stage

Total time with no pipelining, T₀ = nkt

Total time with pipelining, T_k = (k + (n-1))t

Speedup

• Speedup of a k-stage pipeline for n instructions :

S = T0 / Tk

= nk t / ((k+(n-1)) t

à k (for large n)

Throughput

• Pipelined Throughput= n/Tk (n)

• Non-pipeline Throughput = n/To (n)

Limits to Pipelining

• Factors that limits performance enhancement:

— Unequal duration/delay of stages

— Conditional branch instruction or interrupts. Ex:

– Instruction 3 is a conditional branch to instruction 15

– No instructions completed during time units 9-12. This is performance penalty incurred because we could not anticipate the branch

– Flushing of pipeline

– Pipelined operation cannot be maintained in the presence of branch or jump instructions.

Hazards as limitations to pipelining

• 3 types of hazards:

- Resource hazards : HW cannot support this combination of instructions (single person to fold and put clothes away, washer-drier)

- Data hazards: Instruction depends on result of prior instruction still in the pipeline

- Data dependencies example

A = B + C

D = E + A

C = G x H

- Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).

RISC: Reduced Instruction Set Computers

• Major advances in computer :

— The family concept

– Separates architecture from implementation

— Microprogrammed control unit

— Cache memory

— Solid State RAM

— Microprocessors

— Pipelining

– Introduces parallelism into fetch execute cycle

— Multiple processors

CISC and RISC

• The next step: Reduced Instruction Set Computer in processor architecture

• Key features of CISC:

— Large number of predefined instructions making high level programming languages easy to design and implement.

— Supports microprogramming to simplify computer architecture

— Key features of RISC

— Limited and simple instruction set

— Large number of general purpose registers or use of compiler technology to optimize register use.

— Emphasis on optimizing the instruction pipeline

Arguments for CISC

A rich instruction set should simplify the compiler by having instructions which match the high-level language instructions.

Since the programs are smaller in size, they have better performance

Drawbacks of CISC

-CPU complexity

-System size and cost

-Complex machine instructions may not match high-level language statements exactly, in which case they may be of little use

CISC characteristics

• Varying number of instructions per cycle

• Small number of general purpose registers

• More addressing modes

• More instruction formats : fewer instructions can be used to implement a given task

• Use microcode

• Variable length instruction

• Simplified compiler: microprogram instructions could be written to match constructs of high level languages

RISC Characteristics

• One instruction per cycle

• Register to register operations

• Few, simple addressing modes

• Few, simple instruction formats

• Hardwired design (no microcode)

• Fixed instruction format

• More compile time/effort

RISC vs. CISC