Pipeline execution
Let’s look at how pipelined execution can be affected by resource hazards, control hazards, and the instruction set architecture. Consider the following fragment of code:
ADD X5, X2, X1
LDUR X3, [X5, #4]
LDUR X2, [X2, #0]
ORR X3, X5, X3
STUR X3, [X5, #0]
Assume that all branches are perfectly predicted (this eliminates all potential control hazards) and that no delay slots are needed. If we have only one memory for both instructions and data, there is a structural hazard every time we need to fetch an instruction in the same cycle in which another instruction accesses data. To guarantee forward progress, this structural hazard has to be resolved in favor of the instruction that accesses data. What is the total execution time of the sequence in a 5-stage pipeline that has only one memory? Explain your answer.
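One way to check the single-memory reasoning is to model it in a few lines of Python. The sketch below is only an illustration under stated assumptions: the stages are IF, ID, EX, MEM, WB; full forwarding removes all data-hazard stalls (the problem statement does not say this); and a fetch is simply pushed to the next cycle in which no older instruction is in MEM. The names `fetch_cycles` and `program` are hypothetical and not part of the exercise.

```python
# Sketch: a 5-stage pipeline (IF ID EX MEM WB) with ONE memory port.
# Assumptions (not given in the exercise): full forwarding, so only the
# IF-versus-MEM structural hazard can delay an instruction.

def fetch_cycles(accesses_data):
    """Return the cycle (1-indexed) in which each instruction is fetched.

    accesses_data[i] is True when instruction i is a load or a store.
    """
    busy = set()   # cycles in which some MEM stage is using the memory
    fetch = []
    cycle = 1
    for uses_mem in accesses_data:
        # The data access wins, so the fetch waits for a free cycle.
        while cycle in busy:
            cycle += 1
        fetch.append(cycle)
        if uses_mem:
            busy.add(cycle + 3)   # MEM comes three stages after IF
        cycle += 1
    return fetch

# ADD, LDUR, LDUR, ORR, STUR: the loads and the store access data memory.
program = [False, True, True, False, True]
fetch = fetch_cycles(program)
total = fetch[-1] + 4             # the last instruction still needs ID..WB
print(fetch, total)
```

Under these assumptions the fetch of STUR collides with the MEM stages of both loads, so it is fetched two cycles late, and the sequence takes 11 cycles instead of the 9 it would take with separate instruction and data memories.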
Now assume again that all branches are perfectly predicted (this eliminates all potential control hazards) and that no delay slots are needed. If we change the load/store instructions to use a register (without an offset) as the address, these instructions no longer need to use the ALU. As a result, the MEM and EX stages can be overlapped, and the pipeline has only 4 stages. Change the code to accommodate the changed ISA. What speedup is achieved for this instruction sequence? Explain your answer.
Show a pipeline execution diagram for this series of instructions, both for the initial code and for the final (modified) code.
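A pipeline execution diagram can be drawn by hand, but a small helper makes it easy to redraw after each change. The following Python sketch is one possible way to render such a diagram as text; `pipeline_diagram` and its parameters are hypothetical names, and it assumes that an instruction, once fetched, moves through one stage per cycle with no further stalls.

```python
# Sketch: render a text pipeline diagram, one row per instruction and one
# column per clock cycle. Assumes no stalls after an instruction is fetched.

def pipeline_diagram(instructions, stages, fetch=None):
    """Return the diagram as a string.

    fetch[i] is the 1-indexed cycle in which instruction i is fetched;
    by default one instruction is fetched per cycle.
    """
    if fetch is None:
        fetch = list(range(1, len(instructions) + 1))
    total = fetch[-1] + len(stages) - 1          # cycle of the last WB
    width = max(len(s) for s in stages) + 1      # column width
    label = max(len(i) for i in instructions) + 2
    lines = []
    for text, start in zip(instructions, fetch):
        cells = [""] * total
        for k, stage in enumerate(stages):
            cells[start - 1 + k] = stage
        row = text.ljust(label) + "".join(c.ljust(width) for c in cells)
        lines.append(row.rstrip())
    return "\n".join(lines)

code = [
    "ADD X5, X2, X1",
    "LDUR X3, [X5, #4]",
    "LDUR X2, [X2, #0]",
    "ORR X3, X5, X3",
    "STUR X3, [X5, #0]",
]
print(pipeline_diagram(code, ["IF", "ID", "EX", "MEM", "WB"]))
```

For the 4-stage variant, the same helper can be called with a stage list such as `["IF", "ID", "EX/MEM", "WB"]` and the rewritten instruction sequence; a delayed fetch can be shown by passing explicit cycles through `fetch`.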