How to run HPL/HPCG/IO500 in WSL

3. Compile HPL

Get HPL from www.netlib.org/benchmark/h….

wget https://www.netlib.org/benchmark/hpl/hpl-2.3.tar.gz

Unpack the archive and change into the source directory.
```
tar -xzvf hpl-2.3.tar.gz && cd hpl-2.3
```
Copy a sample configuration from the setup folder.
```
cp setup/Make.Linux_PII_CBLAS ./
```
Edit the Make.Linux_PII_CBLAS file. Any editor works (vi, vim, gedit, nano); I'll use nano as an example.
```
nano Make.Linux_PII_CBLAS
```
And change the following lines:
```
TOPdir       = $(HOME)/hpl-2.3

MPdir        = /usr/lib/x86_64-linux-gnu/openmpi
MPlib        = $(MPdir)/lib/libmpi.so

LAdir        = /usr/lib/x86_64-linux-gnu/openblas-pthread
LAlib        = $(LAdir)/libopenblas.a $(LAdir)/libblas.a

CC           = /usr/bin/mpicc
LINKER       = /usr/bin/gfortran
```
In nano, press Ctrl+O and then Enter to save, and Ctrl+X to exit.
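If you prefer to script the edit rather than open an editor, the same changes can be applied from the shell. The following is only a sketch: `set_make_var` is a helper name made up for this example (it is not part of HPL), and it assumes each variable is assigned on a single line of the form `NAME = value`, as in the sample configuration.

```shell
# Hypothetical helper: overwrite the value of one variable in an HPL Make.<arch> file.
# Assumes the assignment sits on one line ("NAME   = value"); the original
# spacing before "=" is preserved.
set_make_var() {
    smv_file=$1
    smv_name=$2
    smv_value=$3
    smv_tmp=$(mktemp) || return 1
    # "|" is the sed delimiter so that "/" in paths needs no escaping.
    sed "s|^\(${smv_name}[[:space:]]*=\).*|\1 ${smv_value}|" "$smv_file" > "$smv_tmp" &&
        cat "$smv_tmp" > "$smv_file"
    smv_status=$?
    rm -f "$smv_tmp"
    return "$smv_status"
}

# Example calls (single quotes keep $(HOME) literal for make to expand):
#   set_make_var Make.Linux_PII_CBLAS TOPdir '$(HOME)/hpl-2.3'
#   set_make_var Make.Linux_PII_CBLAS CC     /usr/bin/mpicc
#   set_make_var Make.Linux_PII_CBLAS LINKER /usr/bin/gfortran
```

The helper only rewrites lines that already exist, so a misspelled variable name silently changes nothing; it is worth checking the result with `grep` afterwards.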

If you have other MPI or BLAS libraries installed, point MPdir, MPlib, LAdir, and LAlib at the corresponding installation paths and library files instead.
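Because a wrong library path only shows up as a link error late in the build, it can help to confirm that the files exist before compiling. A minimal sketch, where `check_files` is a hypothetical helper name and the example paths are simply the ones used in this post (they may differ on your system):

```shell
# Hypothetical helper: report which of the given files exist.
# Returns non-zero if at least one is missing.
check_files() {
    cf_status=0
    for cf_path in "$@"; do
        if [ -e "$cf_path" ]; then
            echo "found:   $cf_path"
        else
            echo "missing: $cf_path"
            cf_status=1
        fi
    done
    return "$cf_status"
}

# Example call with the paths from the configuration above:
#   check_files /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so \
#               /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblas.a \
#               /usr/bin/mpicc /usr/bin/gfortran
```

With Open MPI, `mpicc --showme:link` prints the linker flags the wrapper would use, which is one way to discover the MPI library directory.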
Compile HPL. The -j8 option sets how many jobs make runs in parallel; a larger number can speed up compilation.
```
make arch=Linux_PII_CBLAS -j8
```
Wait for compilation to finish. If your configuration is correct, you will find HPL.dat and xhpl under bin/Linux_PII_CBLAS.

If xhpl is missing from bin but present under testing, you have probably entered a wrong path in the configuration. If it is not in the testing directory either, the configuration itself is likely broken, so recheck the Make.Linux_PII_CBLAS file.
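The check described above can be sketched as a small shell function. `check_hpl_build` is a hypothetical name for this example, and since the exact location of a stray xhpl under testing is not stated here, the sketch just searches the whole testing tree.

```shell
# Hypothetical helper: report where (or whether) xhpl was built.
# $1 = HPL top directory, $2 = arch name (e.g. Linux_PII_CBLAS).
check_hpl_build() {
    chb_top=$1
    chb_arch=$2
    if [ -x "$chb_top/bin/$chb_arch/xhpl" ]; then
        echo "ok: $chb_top/bin/$chb_arch/xhpl"
    elif [ -n "$(find "$chb_top/testing" -type f -name xhpl 2>/dev/null)" ]; then
        echo "misplaced: xhpl found under testing, check the paths in Make.$chb_arch"
        return 2
    else
        echo "missing: xhpl was not built, check Make.$chb_arch"
        return 1
    fi
}

# Example call from the home directory:
#   check_hpl_build "$HOME/hpl-2.3" Linux_PII_CBLAS
```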

To run the sample HPL test, launch it with mpirun. The number after -np is the number of MPI processes to start.

cd bin/Linux_PII_CBLAS
touch HPL.out
mpirun -np 8 xhpl
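HPL needs at least P × Q processes for every process grid listed in HPL.dat, so the value passed to -np must cover the largest grid. The sample HPL.dat uses the grids 2×2, 1×4, and 4×1 (see the P and Q rows in the output below), so 4 processes would already be enough and -np 8 leaves the extra ones idle. A small sketch for working out the minimum, with `max_grid_procs` being a hypothetical helper name:

```shell
# Hypothetical helper: largest P*Q over the process grids in HPL.dat.
# $1 = space-separated P values, $2 = space-separated Q values (same order).
max_grid_procs() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n = split($0, p, " ") }
        NR == 2 {
            split($0, q, " ")
            for (i = 1; i <= n; i++)
                if (p[i] * q[i] > max) max = p[i] * q[i]
            print max + 0
        }'
}

# Example with the sample grids; prints 4:
#   max_grid_procs "2 1 4" "2 4 1"
# Compare the result with the number of cores reported by `nproc`.
```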

The run prints a lot of output; only a short excerpt is shown here.

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :      29       30       34       35
NB     :       1        2        3        4
PMAP   : Row-major process mapping
P      :       2        1        4
Q      :       2        4        1
PFACT  :    Left    Crout    Right
NBMIN  :       2        4
NDIV   :       2
RFACT  :    Left    Crout    Right
BCAST  :   1ring
DEPTH  :       0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2          29     1     2     2               0.00             1.9114e-02
HPL_pdgesv() start time

HPL_pdgesv() end time

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   1.88218349e-02 ...... PASSED

...
...
...

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00R2R4          35     4     4     1               0.00             5.9052e-01
HPL_pdgesv() start time

HPL_pdgesv() end time

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   1.99396688e-02 ...... PASSED
================================================================================

Finished    864 tests with the following results:
            864 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================

The value to focus on is Gflops. My maximum with the sample configuration was 1.3211e+00.
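With hundreds of tests in one run, picking out the best Gflops value by eye is tedious. The sketch below assumes the output was saved to a file (the name `hpl.log` is just an example) and that result rows look like the ones above: seven fields, starting with an encoded variant such as WR00L2L2 and ending with the Gflops value. `max_gflops` is a hypothetical helper name.

```shell
# Hypothetical helper: print the largest Gflops value found in HPL output.
# Reads the named files, or standard input if none are given.
max_gflops() {
    awk '
        # Result rows: T/V N NB P Q Time Gflops
        /^W[RC][0-9]/ && NF == 7 {
            if (!seen || $NF + 0 > max) { max = $NF + 0; best = $NF; seen = 1 }
        }
        END { if (seen) print best }
    ' "$@"
}

# Example:
#   mpirun -np 8 xhpl | tee hpl.log
#   max_gflops hpl.log
```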
