Errata


Posted by reinders on Wednesday March 16, 2016 at 06:53:15
This is a list of known errata for the first printing of Intel High Performance Parallel Programming Pearls, Volume 2, ed. Jim Jeffers and James Reinders; Morgan Kaufmann (Elsevier), 2015.
If you find other items we should review for possible inclusion, please email us.

Chapter 8:

  • Page 118: The first line of the while loop should read float c = euro_call_payout(Si, X, r, r - b, sigma, time). Without this correction, the Newton method does not converge at all.
    In the code that can be downloaded from the web site, the values also appear to be improperly initialized. First, “float b = -0.04” looks wrong, since b should be >= 0. Moreover, the CostofCarry is usually lower than RISKFREE, which is not the case here. With those changes, the Newton iteration converges, and it should not need 100 iterations to do so, so the limit could easily be lowered to 10. A better approach would be a stopping criterion that depends on the value of g.
  • Page 117: In bs_euro_call, the “float” arguments are passed by const reference, which is unusual for a small built-in type; passing them by value would be more conventional.
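The stopping criterion suggested for page 118 can be sketched as follows. This is a minimal illustration, not the book's code: g and dg are hypothetical stand-ins (here g(x) = x*x - 2, with its root at sqrt(2)), and newton_solve, tol, and max_iter are names chosen for this example. The floats are passed by value, in line with the page 117 remark.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for the book's g: root at sqrt(2). */
static float g(float x)  { return x * x - 2.0f; }

/* Derivative of the stand-in g. */
static float dg(float x) { return 2.0f * x; }

/* Newton iteration that stops as soon as |g(x)| <= tol, or after
 * max_iter steps, whichever comes first. Scalars are passed by value.
 * Stores the final iterate in *root and returns the number of Newton
 * steps actually taken. */
int newton_solve(float x0, float tol, int max_iter, float *root) {
    assert(root != NULL && max_iter > 0 && tol > 0.0f);

    float x = x0;
    int iter = 0;
    while (iter < max_iter && fabsf(g(x)) > tol) {
        float slope = dg(x);
        if (slope == 0.0f) {
            break;              /* avoid dividing by zero */
        }
        x -= g(x) / slope;      /* Newton update */
        iter++;
    }
    *root = x;
    return iter;
}
```

Calling newton_solve(1.0f, 1e-5f, 10, &r) with this stand-in g stops well before the cap of 10 iterations, which is the behavior the erratum expects once the initial values are corrected.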

Posted by reinders on Thursday July 18, 2013 at 04:41:04
This is a list of known errata for the first printing of Intel Xeon Phi Coprocessor High Performance Programming, Jim Jeffers and James Reinders; Morgan Kaufmann (Elsevier), 2013.
If you find other items we should review for possible inclusion, please email us.

General comments:

  • none

Typographical errors:

  • Page 38
    • "Petal to the metal" should read "Pedal to the metal"
  • Page 39
    • Figure 2.5. The “0” under Core 1, Thread 0 in the “Scatter Affinity” portion of the figure should be “1”
  • Page 99
    • Line 2 of terminal output "% export OMP_NUM__THREADS = 244" has an extra underscore before THREADS. Should be:
      % export OMP_NUM_THREADS = 244
    • Line 9 of terminal output "% Export OMP_NUM_THREADS = 183": the word export should not be capitalized. Should be:
      % export OMP_NUM_THREADS = 183
  • Page 141
    • Line 5: there is a missing space in "the-vec-report6 output". Should be:
      "the -vec-report6 output".
  • Page 381
    • Last line on the page: "widely in used" should be:
      "widely used"

Code errors:

  • Page 352 - MPI + Offload Trapezoidal Rule Source Code

    The MPI Offload Trapezoidal example no longer works with the latest Intel compiler versions. The Intel compiler no longer supports function inlining in offload regions. This is the corrected code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <mpi.h>

    #define NUM_TRAPEZOIDS 1000000000

    __attribute__((target(mic))) inline double f(double x) {
        return 1.00 * x*x * exp(-(x-0.0)*(x-0.0)/(2.0*0.25*0.25))
             + 0.50 * x*x * exp(-(x-0.2)*(x-0.2)/(2.0*0.50*0.50))
             + 0.50 * x*x * exp(-(x+0.2)*(x+0.2)/(2.0*0.50*0.50))
             + 0.25 * x*x * exp(-(x-0.4)*(x-0.4)/(2.0*1.00*1.00))
             + 0.25 * x*x * exp(-(x+0.4)*(x+0.4)/(2.0*1.00*1.00));
    }

    __attribute__((target(mic))) double kernel(const int chunk_size, const double x0, const double width) {
        double integral = 0;

        /* Sum the trapezoid areas for this rank's chunk in parallel. */
        #pragma omp parallel
        #pragma omp for reduction(+:integral)
        for (int i = 0; i < chunk_size; i++) {
            integral += 0.5 * width * (f(x0+width*i) + f(x0+width*(i+1)));
        }

        return integral;
    }

    int main (int argc, char *argv[]) {
        int namelen, rank, size;
        char name[MPI_MAX_PROCESSOR_NAME];
        double upper_bound = 5.0, lower_bound = -5.0;
        double x0, x1, width;
        double integral = 0;
        double compute_time, total_time;
        int chunk_size;

        MPI_Init(&argc, &argv);

        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Get_processor_name(name, &namelen);

        chunk_size = NUM_TRAPEZOIDS / size;

        x0 = lower_bound + (upper_bound - lower_bound)*rank/size;
        x1 = x0 + (upper_bound - lower_bound)/size;
        width = (x1-x0)/chunk_size;

        MPI_Barrier(MPI_COMM_WORLD);

        compute_time = total_time = MPI_Wtime();
        #pragma offload target(mic)
        integral = kernel(chunk_size, x0, width);
        compute_time = MPI_Wtime() - compute_time;

        MPI_Allreduce(MPI_IN_PLACE, &integral, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);
        total_time = MPI_Wtime() - total_time;

        printf("rank %d of %d running on %s: %f seconds\n", rank, size, name, compute_time);

        if (rank == 0) {
            printf("integral = %f, time = %f seconds\n", integral, total_time);
        }

        MPI_Finalize();

        return(0);
    }

  • Page 232 - Fortran asynchronous data transfer code example
    The code example has two errors.

    Line 04 should be:
    integer:: signal_1 = 1, signal_2 = 2

    Explanation: For the signaling to work properly in the latest compiler version, a value must be assigned to each signal variable.

    Line 09 should be:
    f1 = 1.0

    Explanation: f1 was inadvertently missed in printing the book.