hw4_cse490-590-su2021_sol.pdf

School: SUNY Buffalo State College
Course: 590LR
Subject: Computer Science
Date: Jan 9, 2024
Type: pdf
Pages: 6

CSE 490/590 Summer 2021 Homework 4

1. For the code sequence shown below:

    loop: l.d    $f12, 0($f5)
          add.d  $f6, $f6, $f12
          daddui $f5, $f5, -8
          bne    $f5, $f9, loop    // $f9 holds the address of the last value to be operated on

a) Show loop unrolling so that there are four copies of the loop body. Assume $f5, $f9 (that is, the size of the array) are initially a multiple of 32, which means that the number of loop iterations is a multiple of 4. Eliminate any obviously redundant computations and do not reuse any of the registers.

    loop: l.d    $f12, 0($f5)
          add.d  $f7, $f7, $f12
          l.d    $f13, -8($f5)
          add.d  $f8, $f8, $f13
          l.d    $f14, -16($f5)
          add.d  $f10, $f10, $f14
          l.d    $f15, -24($f5)
          add.d  $f11, $f11, $f15
          daddui $f5, $f5, -32
          bne    $f5, $f9, loop
          add.d  $f16, $f7, $f8
          add.d  $f17, $f10, $f11
          add.d  $f18, $f16, $f17

or

    loop: l.d    $f12, 0($f5)
          add.d  $f6, $f6, $f12
          l.d    $f13, -8($f5)
          add.d  $f6, $f6, $f13
          l.d    $f14, -16($f5)
          add.d  $f6, $f6, $f14
          l.d    $f15, -24($f5)
          add.d  $f6, $f6, $f15
          daddui $f5, $f5, -32
          bne    $f5, $f9, loop

b) Compute the number of cycles needed for 4 iterations.

     1. l.d    $f12, 0($f5)
     2. stall
     3. add.d  $f7, $f7, $f12
     4. l.d    $f13, -8($f5)
     5. stall
     6. add.d  $f8, $f8, $f13
     7. l.d    $f14, -16($f5)
     8. stall
     9. add.d  $f10, $f10, $f14
    10. l.d    $f15, -24($f5)
    11. stall
    12. add.d  $f11, $f11, $f15
    13. daddui $f5, $f5, -32
    14. stall
    15. bne    $f5, $f9, loop
    16. add.d  $f16, $f7, $f8
    17. add.d  $f17, $f10, $f11
    18. stall
    19. stall
    20. stall
    21. add.d  $f18, $f16, $f17

Total: 21 cycles (15 for the unrolled loop body, plus 6 for the final additions that combine the four partial sums).

or

     1. l.d    $f12, 0($f5)
     2. stall
     3. add.d  $f6, $f6, $f12
     4. stall
     5. l.d    $f13, -8($f5)
     6. stall
     7. add.d  $f6, $f6, $f13
     8. stall
     9. l.d    $f14, -16($f5)
    10. stall
    11. add.d  $f6, $f6, $f14
    12. stall
    13. l.d    $f15, -24($f5)
    14. stall
    15. add.d  $f6, $f6, $f15
    16. daddui $f5, $f5, -32
    17. stall
    18. bne    $f5, $f9, loop

Total: 18 cycles.

2. For the code sequence shown below:

    L.D   F0,0(R1)
    ADD.D F4,F0,F2
    S.D   F4,0(R1)
    L.D   F0,-8(R1)
    ADD.D F4,F0,F2
    S.D   F4,-8(R1)

Rename the registers as needed and schedule the sequence to minimize the stalls.

Renaming:

    L.D   F0,0(R1)
    stall
    ADD.D F4,F0,F2
    stall
    stall
    S.D   F4,0(R1)
    L.D   F5,-8(R1)
    stall
    ADD.D F6,F5,F2
    stall
    stall
    S.D   F6,-8(R1)

Scheduling:

    L.D   F0,0(R1)
    L.D   F5,-8(R1)
    ADD.D F4,F0,F2
    ADD.D F6,F5,F2
    stall
    S.D   F4,0(R1)
    S.D   F6,-8(R1)

3. For the given code sequence below, executed on a 2-issue processor:

    I1: LW  r2, 0(r1)
    I2: LW  r3, 4(r1)
    I3: LW  r4, 8(r1)
    I4: LW  r4, 12(r1)
    I5: ADD r6, r4, r5
    I6: ADD r7, r2, r3
    I7: ADD r8, r7, r6
    I8: LW  r9, 4(r8)

a) Draw a pipeline diagram. [Consider data forwarding.] You can also follow the datapath design from lecture.

b) Calculate IPC.
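
The stall counts in Problem 2 can be cross-checked with a small script. The sketch below is only an illustration, not part of the original solution: it models a single-issue, in-order pipeline, and the `LATENCY` table is an assumption inferred from the stalls shown in the listings (1 stall from a load to an FP add, 3 between dependent FP adds, 2 from an FP add to a store, 1 from an integer op to a branch). The names `count_cycles`, `unscheduled`, and `scheduled` are hypothetical helpers introduced here.

```python
# Stall cycles needed between a producer and a dependent consumer.
# ASSUMPTION: values inferred from the stalls in the listings above,
# not taken from a stated course latency table.
LATENCY = {
    ("load", "fpalu"): 1,
    ("fpalu", "fpalu"): 3,
    ("fpalu", "store"): 2,
    ("int", "branch"): 1,
}


def count_cycles(program):
    """Return total cycles for single-issue, in-order execution.

    program is a list of (kind, dest, sources) tuples; dest may be None.
    """
    produced = {}  # register -> (issue cycle, kind) of its latest producer
    cycle = 0
    for kind, dest, sources in program:
        earliest = cycle + 1  # at most one instruction issues per cycle
        for reg in sources:
            if reg in produced:
                issued, producer_kind = produced[reg]
                gap = LATENCY.get((producer_kind, kind), 0)
                earliest = max(earliest, issued + gap + 1)
        cycle = earliest
        if dest is not None:
            produced[dest] = (cycle, kind)
    return cycle


# Problem 2 after renaming, in original program order.
unscheduled = [
    ("load", "F0", ["R1"]),
    ("fpalu", "F4", ["F0", "F2"]),
    ("store", None, ["F4", "R1"]),
    ("load", "F5", ["R1"]),
    ("fpalu", "F6", ["F5", "F2"]),
    ("store", None, ["F6", "R1"]),
]

# Problem 2 after scheduling: loads first, then adds, then stores.
scheduled = [
    ("load", "F0", ["R1"]),
    ("load", "F5", ["R1"]),
    ("fpalu", "F4", ["F0", "F2"]),
    ("fpalu", "F6", ["F5", "F2"]),
    ("store", None, ["F4", "R1"]),
    ("store", None, ["F6", "R1"]),
]

print(count_cycles(unscheduled))  # 12
print(count_cycles(scheduled))    # 7
```

Under these assumed latencies the renamed sequence takes 12 cycles and the scheduled one takes 7, matching the line counts of the two listings in Problem 2. The model ignores structural hazards and branch delay slots, so it should be read as a sanity check rather than a pipeline simulator.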