Cycles per element (CPE): CPE expresses a program's performance in a form that guides code improvement. It describes loop performance at a detailed level for an iterative program: the number of clock cycles spent per element processed. It is suitable for programs that perform a repetitive calculation, such as processing the elements of a vector. The sequencing of the processor's activities is controlled by a clock that provides a regular signal of some frequency.
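One way to turn this definition into a number, and the approach CS:APP takes, is to time the function for several vector lengths and take the slope of the least-squares line through the (length, cycles) points. The sketch below assumes the cycle counts have already been measured; the name estimate_cpe and the sample numbers in the note after it are illustrative and do not come from the problem.

```c
#include <stddef.h>

/* Least-squares slope of cycles versus element count.
 * The slope is the CPE estimate; the intercept (fixed overhead) is ignored.
 * Returns 0.0 when the slope is undefined, e.g. fewer than two
 * distinct element counts. */
double estimate_cpe(const long *n, const double *cycles, size_t count)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t k = 0; k < count; k++) {
        double x = (double) n[k];
        sx  += x;
        sy  += cycles[k];
        sxx += x * x;
        sxy += x * cycles[k];
    }
    double denom = (double) count * sxx - sx * sx;
    if (denom == 0.0)
        return 0.0;
    return ((double) count * sxy - sx * sy) / denom;
}
```

For example, hypothetical measurements of 350, 650, 950, and 1250 cycles for lengths 100, 200, 300, and 400 lie on the line 50 + 3n, so the estimate is a CPE of 3.0 with 50 cycles of fixed overhead.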

Answer 1

Question

Chapter 5, Problem 5.13HW

A.

Expert Solution

Explanation of Solution

Diagram for instruction sequence:

[Two diagrams, not reproduced here: the instruction sequence decoded into operations, and the resulting data-flow graph with its critical path. Source: Computer Systems: A Programmer's Perspective (3rd Edition), Chapter 5, Problem 5.13.]

Explanation:

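The given instruction sequence is decoded into operations: a load of udata[i] (vmovsd), a load of vdata[i] followed by a multiply (vmulsd), a floating-point add (vaddsd), an integer add (addq), a compare (cmpq), and a jump (jne).
The diagrams show the data dependencies between these operations and the flow of data from one iteration to the next.
Two values are carried from iteration to iteration: sum (in %xmm0) and i (in %rcx).
The chain of add operations that updates sum forms the critical path.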

B.

Expert Solution

Explanation of Solution

Lower bound on CPE:

The lower bound on the CPE is determined by the critical path.
For data type double, the critical path runs through the floating-point add that updates sum.
Floating-point addition has a latency of 3 cycles on the reference machine, so the lower bound on the CPE is 3.0.

C.

Expert Solution

Explanation of Solution

Lower bound on CPE:

The lower bound on the CPE is determined by the critical path.
For integer data, the critical path runs through the integer (long) add that updates sum.
Integer addition has a latency of 1 cycle on the reference machine, so the lower bound on the CPE is 1.0.
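The bounds in parts B and C follow the same rule: in steady state, each iteration must wait for the slowest chain of operations that carries a value from one iteration to the next. A minimal sketch of that rule, assuming the latencies used in this solution (3 cycles for floating-point add, 1 cycle for integer add); the struct, function, and table names are made up for illustration.

```c
#include <stddef.h>

/* One loop-carried dependency chain: the operations a register value
 * passes through between one iteration and the next. */
struct chain {
    const char *reg;       /* value carried across iterations */
    int latency_cycles;    /* total latency of the operations on the chain */
};

/* The critical path is the longest loop-carried chain; its latency is
 * the lower bound on cycles per element. */
int cpe_lower_bound(const struct chain *chains, size_t count)
{
    int bound = 0;
    for (size_t k = 0; k < count; k++) {
        if (chains[k].latency_cycles > bound)
            bound = chains[k].latency_cycles;
    }
    return bound;
}

/* Chains in the inner4 loop. Latencies are assumptions matching the
 * values used in this solution: FP add = 3, integer add = 1. */
static const struct chain double_chains[] = {
    { "sum (%xmm0)", 3 },  /* vaddsd: floating-point add */
    { "i (%rcx)",    1 },  /* addq: integer add */
};
static const struct chain long_chains[] = {
    { "sum", 1 },          /* integer add into sum */
    { "i",   1 },          /* integer add for the index */
};
```

The multiply does not appear in either table because its inputs come from memory, not from the previous iteration, so it is not part of any loop-carried chain.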

D.

Expert Solution

Explanation of Solution

Given C code:

// Inner product of u and v, accumulated in a local temporary
void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    // Loop index
    long i;
    // Length of the vectors (taken from u)
    long length = vec_length(u);
    // Start of the first vector's data
    data_t *udata = get_vec_start(u);
    // Start of the second vector's data
    data_t *vdata = get_vec_start(v);
    // Accumulator, kept in a local variable instead of *dest
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++) {
        // Multiply corresponding elements and add to the running sum
        sum = sum + udata[i] * vdata[i];
    }
    // Store the result once, after the loop
    *dest = sum;
}
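The listing relies on vec_ptr, data_t, vec_length, and get_vec_start, which belong to the textbook's vector package and are not shown on this page. To compile and try inner4 on its own, stand-in definitions along the following lines would do. The struct layout and helper bodies are assumptions made for this sketch, and data_t is fixed to double.

```c
typedef double data_t;              /* assumed element type */

/* Stand-in vector record (hypothetical layout) */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

/* Stand-in accessors with the names the listing expects */
long vec_length(vec_ptr v)       { return v->len; }
data_t *get_vec_start(vec_ptr v) { return v->data; }

/* inner4 as given: accumulate the inner product in a local temporary */
void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

With u = {1, 2, 3} and v = {4, 5, 6}, inner4 stores 1*4 + 2*5 + 3*6 = 32 in *dest.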

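CPE value for floating-point versions:

The products are accumulated in the temporary variable sum.
Only the floating-point add lies on the critical path, because it is the only operation whose input depends on the previous iteration.
Each multiplication has a latency of 5 clock cycles, but it depends only on the loaded values udata[i] and vdata[i], so multiplications for successive iterations overlap in the pipelined multiplier.
In steady state, one iteration therefore completes every 3 cycles, the latency of the floating-point add.
Hence, the CPE for the floating-point versions is 3.0.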
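The argument for part D can be checked with a toy timing model. This is a sketch under simplifying assumptions, not a description of the actual processor: one iteration's operands become available per cycle, load latency is ignored, and only the multiplier and the adder are modeled. The function name and parameters are hypothetical.

```c
/* Cycle at which the last add into sum completes after n iterations.
 * mul_issue is the minimum number of cycles between successive
 * multiplies (1 for a fully pipelined unit). */
long simulate_cycles(long n, int mul_latency, int mul_issue, int add_latency)
{
    long mul_free = 0;   /* earliest cycle the multiplier accepts a new op */
    long add_done = 0;   /* cycle at which the previous add to sum finished */

    for (long k = 0; k < n; k++) {
        /* Assumed: operands for iteration k are ready at cycle k */
        long operands_ready = k;
        long mul_start = operands_ready > mul_free ? operands_ready : mul_free;
        long mul_done = mul_start + mul_latency;
        mul_free = mul_start + mul_issue;

        /* The add needs both this product and the previous sum */
        long add_start = mul_done > add_done ? mul_done : add_done;
        add_done = add_start + add_latency;
    }
    return add_done;
}
```

With a multiply latency of 5, an issue time of 1, and an add latency of 3, 1000 iterations finish at cycle 3005, about 3 cycles per element: the 5-cycle multiply appears only as a one-time startup cost. Setting the issue time to 5 (a hypothetical unpipelined multiplier) gives 5003 cycles, about 5 per element, which shows that the overlap of multiplications is what keeps the multiply off the critical path.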

Students have asked these similar questions

Inner loop of inner4. data_t = double, OP = *
udata in %rbp, vdata in %rax, sum in %xmm0
i in %rcx, limit in %rbx

.L15:                                     loop:
  vmovsd 0(%rbp,%rcx,8), %xmm1              Get udata[i]
  vmulsd (%rax,%rcx,8), %xmm1, %xmm1        Multiply by vdata[i]
  vaddsd %xmm1, %xmm0, %xmm0                Add to sum
  addq $1, %rcx                             Increment i
  cmpq %rbx, %rcx                           Compare i:limit
  jne .L15                                  If !=, goto loop

Assume that the functional units have the characteristics listed in Figure 5.12.

A. Diagram how this instruction sequence would be decoded into operations and show how the data dependencies between them would create a critical path of operations, in the style of Figures 5.13 and 5.14.
B. For data type double, what lower bound on the CPE is determined by the critical path?
C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data?
D. Explain how the floating-point versions can have CPEs of 3.00, even though the multiplication operation requires…

Q2: Implement function F (A, B, C) = m (2,4,6,7) using 1:4 deMUX F (B, C).

Write pseudocode for a function Det-Quicksort(A, p, r) that receives array A[1..n], and indices p and r. The function should sort the subarray A[p..r] recursively (meaning you should call itself). You can also use a function LinearSearch(A, p, r, v) that searches subarray A[p..r] for an element of value v and return its index (in case it exists) in O(r − p) time. (Just to give you something to compare to, the solution has 8 lines.)
