1. [20%] Consider a MIPS processor with an additional floating-point unit. Assume the functional unit delays in the processor are as follows: memory (2 ns), ALU and adders (2 ns), register file access (1 ns), FPU add (8 ns), FPU multiply (16 ns), and the remaining units (0 ns). Also assume the instruction mix is as follows: loads (31%), stores (21%), R-format instructions (27%), branches (5%), jumps (2%), FP adds and subtracts (7%), and FP multiplies and divides (7%).

(a) What is the delay in nanoseconds to execute a load, store, R-format, branch, jump, FP add/subtract, and FP multiply/divide instruction in a single-cycle MIPS design?
(b) What is the average delay in nanoseconds to execute a load, store, R-format, branch, jump, FP add/subtract, and FP multiply/divide instruction in a multicycle MIPS design?
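
One way to set up the arithmetic for both parts is sketched below. The critical paths and cycle counts are assumptions, not given in the problem: a jump is assumed to use only instruction memory in the single-cycle path, the multicycle clock is assumed to be 2 ns, and the FPU is assumed to occupy 4 cycles for an add and 8 cycles for a multiply.

```python
# Functional-unit delays in ns (from the problem statement).
MEM, ALU, REG, FP_ADD, FP_MUL = 2, 2, 1, 8, 16

# Instruction mix in percent (from the problem statement).
MIX = {"load": 31, "store": 21, "r_format": 27, "branch": 5,
       "jump": 2, "fp_add": 7, "fp_mul": 7}

# ASSUMPTION: units on each class's critical path in a single-cycle datapath.
PATHS = {
    "load":     [MEM, REG, ALU, MEM, REG],
    "store":    [MEM, REG, ALU, MEM],
    "r_format": [MEM, REG, ALU, REG],
    "branch":   [MEM, REG, ALU],
    "jump":     [MEM],
    "fp_add":   [MEM, REG, FP_ADD, REG],
    "fp_mul":   [MEM, REG, FP_MUL, REG],
}
path_delay = {name: sum(units) for name, units in PATHS.items()}

# In a single-cycle design the clock must fit the slowest instruction,
# so every instruction takes one full clock period.
single_cycle_clock = max(path_delay.values())

# ASSUMPTION: 2 ns multicycle clock and these cycle counts per class
# (FP execute is assumed to take FPU delay / clock cycles).
CLOCK_NS = 2
CYCLES = {"load": 5, "store": 4, "r_format": 4, "branch": 3, "jump": 3,
          "fp_add": 2 + FP_ADD // CLOCK_NS + 1,
          "fp_mul": 2 + FP_MUL // CLOCK_NS + 1}

# Sum of (percent * cycles); dividing by 100 gives the average CPI.
weighted_cycles = sum(MIX[name] * CYCLES[name] for name in MIX)
average_multicycle_ns = weighted_cycles * CLOCK_NS / 100

print(path_delay)
print(single_cycle_clock, average_multicycle_ns)
```

Under these assumptions the single-cycle clock is 20 ns, set by the FP multiply path, and the multicycle average is 9.88 ns per instruction. Different cycle-count assumptions give a different average.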

2. [15%] Consider a pipelined processor that executes the MIPS code shown in Figure 1 using the hazard detection and data forwarding logic shown in Figure 2. If the code cannot be executed correctly, how should the logic in Figure 2 be revised so that it executes correctly?

Figure 1: The MIPS code:

add $1, $1, $5
add $1, $1, $6
add $1, $1, $7

Figure 2: The logic of hazard detection and data forwarding unit:

if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd != 0)
    and (EXE/MEM.RegisterRd = ID/EX.RegisterRs)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
then ForwardA = 01

if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd != 0)
    and (EXE/MEM.RegisterRd = ID/EX.RegisterRt)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
then ForwardB = 01

The logic should be revised as follows:
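
A minimal Python sketch of one standard fix, following the usual textbook forwarding conditions: forward from EX/MEM when it holds the most recent result, and forward from MEM/WB only when EX/MEM is not also writing the same register. The `Latch` type and the `forward_a` function are hypothetical names used for illustration. ForwardB is analogous, with Rt in place of Rs.

```python
from typing import NamedTuple


class Latch(NamedTuple):
    """Hypothetical view of a pipeline register: RegWrite flag and RegisterRd."""
    reg_write: bool
    rd: int


def forward_a(ex_mem: Latch, mem_wb: Latch, id_ex_rs: int) -> str:
    """Return the ForwardA mux select as a bit string."""
    # EX hazard: the instruction one ahead writes the register we read.
    ex_hazard = ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == id_ex_rs
    if ex_hazard:
        return "10"  # take the newer value from EX/MEM
    # MEM hazard: only considered when there is no EX hazard.
    mem_hazard = mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == id_ex_rs
    if mem_hazard:
        return "01"  # take the value from MEM/WB
    return "00"      # no forwarding; use the register file value


# Third add of Figure 1 in EX: the second add is in EX/MEM, the first in MEM/WB,
# and both write $1, which the third add reads as Rs.
select = forward_a(Latch(True, 1), Latch(True, 1), 1)
print(select)
```

In terms of Figure 2, this corresponds to replacing the condition `(EXE/MEM.RegisterRd = ID/EX.RegisterRs)` with `not (EXE/MEM.RegWrite and (EXE/MEM.RegisterRd != 0) and (EXE/MEM.RegisterRd = ID/EX.RegisterRs))`, and likewise for Rt, so that the stale MEM/WB value is never chosen over the newer EX/MEM value.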

3. [10%] What is the IEEE 754 single-precision representation (with biased exponent) of 0.9375? What is the purpose of biasing the exponent of floating-point numbers?

(1) 0011 1111 0111 0000 0000 0000 0000 0000
(2) Biasing keeps the stored exponent field non-negative, so floating-point numbers can be compared quickly using integer comparison.
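
The bit pattern in (1) can be checked with a short Python sketch. It uses the standard `struct` module to reinterpret the float as a 32-bit integer. The helper name `float_bits` is a hypothetical name chosen for illustration.

```python
import struct


def float_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


bits = float_bits(0.9375)
sign = bits >> 31
exponent = (bits >> 23) & 0xFF   # stored (biased) exponent field
fraction = bits & 0x7FFFFF       # 23-bit fraction field

print(f"{bits:032b}")
print(sign, exponent, exponent - 127, hex(fraction))

# For non-negative floats, sorting by raw bit pattern gives the numeric order,
# which is what the biased exponent makes possible.
values = [3.5, 0.25, 1.0, 0.9375]
same_order = sorted(values) == sorted(values, key=float_bits)
print(same_order)
```

Here 0.9375 = 1.111 (binary) x 2^-1, so the stored exponent is -1 + 127 = 126 and the fraction field begins with 111.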

4. [5%] Which of the following techniques can resolve control hazards?

(a) Branch prediction
(b) Stall
(c) Delayed branch

5. [15%] In a demand-paging system using a system-wide inverted page table,

(a) why is a per-process page table still required?
(b) when is a per-process page table accessed?

(a) Because the inverted page table only describes pages that are currently in physical memory. It does not contain complete information about the logical address space of a process, and that information is required when a referenced page is not in memory.
(b) When a page fault occurs, the per-process page table is accessed to locate the missing page.
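
The division of labor between the two tables can be sketched in Python. Everything below is hypothetical illustration data: the dictionary stands in for a hashed inverted page table keyed by (process id, virtual page number), and the `swap:` strings stand in for backing-store locations.

```python
# Inverted page table: describes only pages that are resident in memory.
# Key: (pid, virtual page number); value: physical frame number.
inverted_table = {(1, 0): 7, (1, 2): 3, (2, 0): 5}

# Per-process page tables: record where every virtual page lives on the
# backing store, including pages that are not currently in memory.
per_process_tables = {
    1: {0: "swap:100", 1: "swap:101", 2: "swap:102"},
    2: {0: "swap:200", 1: "swap:201"},
}

per_process_lookups = 0  # counts how often the per-process table is consulted


def translate(pid, vpn):
    """Return ("frame", n) on a hit, or ("fault", location) on a page fault."""
    global per_process_lookups
    frame = inverted_table.get((pid, vpn))
    if frame is not None:
        return ("frame", frame)  # resident page: inverted table is enough
    # Page fault: the inverted table has no entry for this page, so the
    # per-process table is needed to find it on the backing store.
    per_process_lookups += 1
    return ("fault", per_process_tables[pid][vpn])


hit = translate(1, 0)
miss = translate(1, 1)
print(hit, miss, per_process_lookups)
```

The per-process table is touched only on the faulting reference, which is why it can itself be paged out without slowing down normal translations.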

6. [15%] Which of the following may reduce TLB misses? Explain your answers briefly.

(a) increase the level of paging
(b) use pre-paging
(c) decrease the page size

7. [10%] In Linux, a process cannot hold a spinlock while attempting to acquire a semaphore. Please explain why this policy is in place.

8. [10%] Please explain why UNIX inodes support large files while allowing fast access to small files.