Example exam: Computer Architecture and Organization

28 March 2000

Always choose the most correct answer.
The exam lasts 2 hours.
There are 22 tasks:

2 "open" questions, and
20 multiple-choice questions.

For the moment, the judgment goes as follows. For the open questions 20 points can be gained, 10 per question. The amount A of points for the multiple choice is conputed as follows. Let G be the number of multiple-choice questions answered correctly. Then A = max{0, 5*G - 2}. The final mark is equal to the total number of points achieved, divided by 10.

Task 1

Write a MIPS assembly function that compares two arrays of integers. The function must put the value 1 in register $v0 if the two arrays are equal with each other, and the value 0 otherwise. The start addresses and the length of both arrays are parameters of the function. You can consider that the two arrays have the same length. Follow the MIPS register conventions as given in the bijlage .

Task 2

You received on a separate sheet a copy of the single-cycle MIPS processor. We want now to implement the instruction jal (jump and link). Indicate in the figure which elements need to be included in the data-path and which extra control signals are necessary. Explain also (on the back of the separate sheet) how the control signals (the old as well as the new ones) should be activated for this new instruction.

Task 3

Consider the following statements:

On computer C₁ all instructions last 10 nsec. On computer C₂ all instructions last 5 nsec. C₂ is therefore faster than C₁.
The Pentium has a complex, CISC instruction set, because compilers can generate better code for CISC computers than for RISC computers.

Of these statements is:

a. A incorrect en B incorrect.
b. A correct en B incorrect.
c. A incorrect en B correct.
d. A correct en B correct.

Task 4

An instruction set has three classes of instructions: A, B and C. Four realizations are designed, - M₁, M₂, M₃ and M₄ – with the following properties:

	M₁	M₂	M₃	M₄
Amount cycles for class A instructions	1	2	2	1
Amount cycles for class B instructions	2	2	2	1
Amount cycles for class C instructions	3	2	4	1
Clock speed	200 MHz	250 MHz	300 MHz	100 MHz

A program P has the following mix of instructions: 46% class A instructions, 30% class B instructions en 24% class C instructions. Which realization executes program P the fastest?

a. M₁
b. M₂
c. M₃
d. M₄

Task 5

The translation of n MIPS assembly-instructions into the MIPS machine language results in k machine-instructions. For k it holds:

a. k <= n
b. k = n
c. k >= n
d. k can be smaller, equal or bigger than n

Task 6

The MIPS-instruction srl$1,$2,1 (shift right logical) pushes the bits in register $2 1 position to the right and puts the result in $1. Which of the following statements is correct?

a. This instruction is equivalent with ``divide integer by 2'', provided that no overflow occurs.
b. This instruction is equivalent with ``divide integer by 2'', provided that the rightmost bit of the operand is 0.
c. This instruction is equivalent with ``divide integer by 2'', provided that the leftmost bit of the operand is 0.
d. None of the answers a, b or c is correct.

Task 7

Consider that $a0 , at the beginning of the execution of the following piece of code, contains the value 0x0000a7f3 (hexadecimal). What is the value of $v0 at the end?

                    addi $v0,$zero,0

                 L: beq  $a0,$zero,exit

                    andi $t0,$a0,1

                    add  $v0,$v0,$t0

                    srl  $a0,$a0,1

                    j    L

              exit: ...

a. 11
b. 15
c. 21
d. Another value.

Task 8

Consider the following statements:

A 32-bit constant can be loaded in a register by first loading a 16-bit constant with lui and then by summing to it a 16-bit constant with addi. For instance: lui 0xffff; addi 0xffff loads the 32-bit constant 0xffffffff (= -1 decimal).
If one of the operands is 0, an instruction sub cannot cause an overflow.

Of these statements is:

a. A incorrect and B incorrect.
b. A correct and B incorrect.
c. A incorrect and B correct.
d. A correct and B correct.

Task 9

Someone writes an octal number of 3 ciphers, represented by xyz. A second person interpretes this number as decimal, and obtains from it 82 (decimal) in excess. What is the cipher y?

a. 1
b. 3
c. 5
d. 7

Task 10

The IEEE 754 single-precision floating-point representation makes use of an 8-bits exponent, a 23-bits significand, and 127 as a bias. What is the decimal value of the single precision FP number denoted by 01010000110101000000000000000000?

a. 2.254857 x 10¹⁰
b. 2.845415 x 10¹⁰
c. 3.836441 x 10⁴⁸
d. 4.841224 x 10⁴⁸

Task 11

Consider the single-cycle implementation of the MIPS processor and the format of the beq instruction (see bijlage, Figure 2, 3 and 5). Which control signals are not relevant (i.e., can be substituted by a don't care) for the execution of the beq instruction?

a. RegDst and RegWrite
b. RegWrite and MemWrite
c. MemtoReg and MemWrite
d. RegDst and MemtoReg

Task 12

We want to add to the MIPS architecture an instruction swap :

        swap    $t0,$t1   # exchange the content of the registers $t0 and $t1

Hereafter two statements are given:

In order to add this instruction to the single-cycle implementation, the register file needs to be changed.
In order to add this instruction to the multi-cycle implementation, the register file needs to be changed.

Of these statements is:

a. A incorrect and B incorrect.
b. A correct and B incorrect.
c. A incorrect and B correct.
d. A correct and B correct.

Task 13

Consider the multi-cycle data path of the MIPS processor and the description of the microinstructions (see bijlage, Figure 6 and 8). Which instruction is implemented by the micro-program below?

	ALU			Register		PCWrite
Label	control	SRC1	SRC2	control	Memory	Control	Sequencing
Fetch	Add	PC	4		Read PC	ALU	Seq
	Add	PC	Extshft	Read			Dispatch 1
XXX1	Add	A	Extend				Dispatch 2
YYY2					Write ALU		Fetch

a. lw
b. sw
c. beq
d. An R-format instruction

Task 14

Someone proposes to modify the multi-cycle implementation of the MIPS architecture as follows: the update of a register occurs in the same cycle of an ALU operation or a readout operation from memory. (Hint: which changes does this imply for the FSM in Figure 7?). This however is possible only by lowering the clock speed from 500 to 400 MHz. What is the speedup of this modified implementation compared with the original, if it is given that 22% of all the instructions are loads, 49% are R-format instructions and that the CPI of the original implementation amounts to 4.04 ?

a. 0.82
b. 0.97
c. 1.03
d. 1.30

Task 15

In the multi-cycle data path of the MIPS processor (see bijlage, Figure 6) the program counter (PC) is incremented in the first cycle (during the instruction fetch step). Why?

a. That can be done immediately after that the following instruction is acquired.
b. Because otherwise for instance the instruction beq $s0,$s1,100 should result in the wrong instruction-address (PC+400 instead of PC+404).
c. Because the ALU is free during the first cycle.
d. There is no special reason; the PC can be incremented also in another cycle.

Task 16

Consider the pipelined implementation of the MIPS processor with a forwarding unit and a hazard detection unit (see bijlage, Figure 9 and 11). How many clock cycles are necessary to execute the following piece of code ?

        lw      $2, 0($2)

        lw      $3, 4($2)

        add     $5, $5, $3

        addi    $5, $5, 1

a. 8
b. 10
c. 12
d. 14

Task 17

A processor has a 32-bits address-space, a byte-addressable memory and a cache with the following properties:

4-way set-associative,
16 kilobytes size (1K = 1024),
the block-size is 4 words,
one word is 4 bytes.

How many bits does the cache-index consist of?

a. 8
b. 10
c. 16
d. 20

Task 18

A computer system has a byte addressable memory and a (mini-) cache with the following properties:

Size: 4 bytes.
Block size: 1 byte.

The processor generates in succession the following addresses:

0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,... etc.

Which cache-organization shows the best performance for this sequence of addresses?

a. Direct-mapped cache
b. 2-way set-associative cache with LRU replacement
c. Fully-associative cache with LRU replacement
d. Fully-associative cache with MRU replacement (i.e., the most recently used block is substituted)

Task 19

A computer system uses virtual memory. The size of one page is 4KB, the physical memory is 4MB, the virtual address-space is 16MB and the background memory (hard disk) is 128MB. How many entries does a page table have?

a. 2¹⁰
b. 2¹²
c. 2¹⁵
d. Another amount.

Task 20

Almost every processor has a number of privileged instructions. For what purpose are these necessary?

a. In order to implement semaphores.
b. In order to make sure that user processes do not damage the hardware.
c. In order to give to certain groups of users more rights (professors have more rights than students, etc.).
d. In order to be able to enforce protection..

Task 21

There are two processes that have access to a shared buffer. Process I writes items in the buffer, process II reads items from the same buffer. The buffer is initially empty and can contain N items. By means of 3 semaphores A, B en C it is taken care that the processes do not try

to read from the buffer and to write into it simultaneously,
to read from an empty buffer,
to write in a full buffer.

The processes have the following structure:

     process I                           process II

     --------                           ---------

     while (TRUE){                      while (TRUE){

       wait(A);                            wait(C);

       wait(B);                            wait(B);

       ... write item ...                ... read item ...

       signal(B);                          signal(B);

       signal(C);                          signal(A);

     }                                   }

It must be possible to utilize the whole capacity of the buffer. What are the correct initial values for the semaphores A and B?

a. Semaphore A: 0. Semaphore B: 0.
b. Semaphore A: 1. Semaphore B: 0.
c. Semaphore A: 1. Semaphore B: N.
d. Semaphore A: N. Semaphore B: 1.

Task 22

In the OS notes a small threads library is described. Which of the following statements is correct?

a. A thread context switch is implemented by means of a jr (jump register) instruction.
b. A thread context switch is implemented by making the ebp register point to a location in the stack of another thread.
c. The threads cannot communicate by means of shared variables.
d. Since each thread has its own stack, threads are not allowed to call any function.