Compiling LAMMPS with the Intel compiler


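Environment: CentOS 7.9

VMware Workstation Pro 16.2.5

parallel_studio_xe_2020_update4_cluster_edition is already installed and its environment variables are configured; see my earlier article.

Package to install: lammps-3Mar2020.tar.gz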

Building libfftw3xf_intel.a

Under your own installation path, change into ~/intel/compilers_and_libraries_2020.4.304/linux/mkl/interfaces/fftw3xf and run make libintel64 to build libfftw3xf_intel.a.
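Where the archive ends up depends on the makefile settings, so it can be worth locating it before moving on. A minimal sketch, assuming a hypothetical helper named `find_fftw_wrapper` (not part of MKL or LAMMPS) and that the library sits somewhere below the MKL directory passed in:

```shell
# Hypothetical helper: search an MKL tree for the FFTW3 wrapper archive.
# Usage: find_fftw_wrapper <mkl_root>
find_fftw_wrapper() {
    mkl_root="$1"
    # Print the path of every libfftw3xf_intel.a found below mkl_root.
    find "$mkl_root" -type f -name 'libfftw3xf_intel.a' 2>/dev/null
}
```

For the path used in this article the call would be `find_fftw_wrapper ~/intel/compilers_and_libraries_2020.4.304/linux/mkl`. No output means the build did not produce the archive in that tree.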

Extract and build

Extract the LAMMPS archive with tar -xvf lammps-*****, then change into lammps-3Mar20/src.

Run the following commands in order:

make yes-all
make no-lib
make -j n intel_cpu_intelmpi   # n = number of parallel build jobs, e.g. the number of CPU cores

When the build finishes, you get the executable lmp_intel_cpu_intelmpi.
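The three make steps can be wrapped so that the job count is picked automatically and the build stops at the first failure. This is only a sketch: the function names `build_jobs` and `build_lammps` are hypothetical, and it assumes `nproc` (GNU coreutils) is available, falling back to a single job otherwise.

```shell
# Hypothetical helper: number of parallel make jobs (nproc if present, else 1).
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Hypothetical wrapper: run the three make steps inside a LAMMPS src directory.
# The subshell keeps the caller's working directory unchanged.
# Usage: build_lammps /path/to/lammps-3Mar20/src
build_lammps() {
    src_dir="$1"
    (
        cd "$src_dir" &&
        make yes-all &&
        make no-lib &&
        make -j "$(build_jobs)" intel_cpu_intelmpi
    )
}
```

Chaining with `&&` means a failed package step does not silently lead into a long compile.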

Test

mkdir lammps_test && cd lammps_test
[root@mgt lammps_test]# cp -r /share/lammps-3Mar20/examples/shear . && cd shear/
[root@mgt shear]# mpirun -np 4 /share/lammps-3Mar20/src/lmp_intel_cpu_intelmpi < in.shear

The number after -np is the number of MPI processes (not threads); choose it to match your hardware.
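Because the run log reports MPI tasks and OpenMP threads separately, one way to choose -np is to divide the available cores by the OpenMP threads per task. A small sketch, where `mpi_ranks` is a hypothetical helper name and the core counts are example values:

```shell
# Hypothetical helper: MPI rank count such that ranks * threads <= cores.
# Usage: mpi_ranks <total_cores> <omp_threads_per_rank>
mpi_ranks() {
    cores="$1"
    threads="$2"
    ranks=$(( cores / threads ))
    # Always launch at least one rank.
    if [ "$ranks" -lt 1 ]; then
        ranks=1
    fi
    echo "$ranks"
}

# Example for a 4-core machine with 1 OpenMP thread per MPI rank:
# export OMP_NUM_THREADS=1
# mpirun -np "$(mpi_ranks 4 1)" /share/lammps-3Mar20/src/lmp_intel_cpu_intelmpi < in.shear
```

Setting OMP_NUM_THREADS explicitly also avoids the "OMP_NUM_THREADS environment is not set" notice that LAMMPS prints at startup.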

LAMMPS (3 Mar 2020)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (../comm.cpp:94)
  using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 3.52 3.52 3.52
Created orthogonal box = (0 0 0) to (56.32 35.2 9.95606)
  2 by 2 by 1 MPI processor grid
Lattice spacing in x,y,z = 3.52 4.97803 4.97803
Created 1912 atoms
  create_atoms CPU = 0.00295113 secs
Reading potential file Ni_u3.eam with DATE: 2007-06-11
264 atoms in group lower
264 atoms in group upper
528 atoms in group boundary
1384 atoms in group mobile
Setting atom values ...
  264 settings made for type
Setting atom values ...
  264 settings made for type
WARNING: Temperature for thermo pressure is not for group all (../thermo.cpp:485)
Neighbor list info ...
  update every 1 steps, delay 5 steps, check yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 5.1
  ghost atom cutoff = 5.1
  binsize = 2.55, bins = 23 14 4
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair eam, perpetual
      attributes: half, newton on
      pair build: half/bin/atomonly/newton
      stencil: half/bin/3d/newton
      bin: standard
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.001
Per MPI rank memory allocation (min/avg/max) = 3.372 | 3.372 | 3.372 Mbytes
Step Temp E_pair E_mol TotEng Press Volume 
       0          300   -8317.4367            0   -8263.8067   -7100.7667     19547.02 
      25    219.81848   -8272.1577            0   -8232.8615    5206.8057     19547.02 
      50          300   -8238.3413            0   -8184.7112    13308.809    19688.933 
      75    294.78636   -8232.2217            0   -8179.5237    13192.782    19748.176 
     100          300   -8248.1223            0   -8194.4923    7352.0246    19816.321 
Loop time of 0.0675873 on 4 procs for 100 steps with 1912 atoms

Performance: 127.835 ns/day, 0.188 hours/ns, 1479.568 timesteps/s
90.3% CPU use with 4 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.056151   | 0.057039   | 0.058211   |   0.3 | 84.39
Neigh   | 0.0018907  | 0.001902   | 0.0019073  |   0.0 |  2.81
Comm    | 0.0053503  | 0.0066736  | 0.0077041  |   1.1 |  9.87
Output  | 9.3606e-05 | 0.0001099  | 0.00014841 |   0.0 |  0.16
Modify  | 0.00062962 | 0.00069847 | 0.00081757 |   0.0 |  1.03
Other   |            | 0.001165   |            |       |  1.72

Nlocal:    478 ave 490 max 466 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Nghost:    1036.25 ave 1046 max 1027 min
Histogram: 1 1 0 0 0 0 0 1 0 1
Neighs:    11488 ave 11948 max 11157 min
Histogram: 1 0 1 0 1 0 0 0 0 1

Total # of neighbors = 45952
Ave neighs/atom = 24.0335
Neighbor list builds = 4
Dangerous builds = 0
WARNING: Temperature for thermo pressure is not for group all (../thermo.cpp:485)
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.001
Per MPI rank memory allocation (min/avg/max) = 3.372 | 3.372 | 3.372 Mbytes
Step Temp E_pair E_mol TotEng Press Volume 
       0    302.29407   -8248.1223            0   -8212.0956    6393.6774     19845.81 
     100    291.61298   -8259.5472            0   -8224.7933   -1300.9229     19874.36 
     200    293.36405   -8256.9998            0   -8222.0373   -799.49219    19965.148 
     300    305.94188   -8252.9181            0   -8216.4566   -1335.0012    20062.063 
     400    309.95918   -8247.5756            0   -8210.6354   -1062.2448    20094.446 
     500    301.94062   -8239.3596            0    -8203.375    797.08496    20172.635 
     600    302.21507   -8230.7027            0   -8194.6854    3987.1988     20265.23 
     700    296.32595   -8221.2036            0   -8185.8881    5409.7911    20394.703 
     800    291.23487   -8207.8671            0   -8173.1583     10667.09     20510.74 
     900    297.88948   -8196.1164            0   -8160.6146     13967.96     20646.32 
    1000    301.54921   -8182.0007            0   -8146.0627    17939.885    20752.586 
    1100    308.95153   -8164.9247            0   -8128.1046    22823.971    20889.388 
    1200    301.95399    -8153.476            0   -8117.4898    25618.698    21000.539 
    1300          300   -8143.3818            0   -8107.6284    26668.263    21122.684 
    1400          300   -8136.2928            0   -8100.5395    26328.325    21252.157 
    1500          300   -8132.5465            0   -8096.7931    23584.447    21379.187 
    1600          300   -8129.9298            0   -8094.1764    20684.486    21497.667 
    1700          300    -8131.655            0   -8095.9016    15384.272    21617.369 
    1800          300   -8149.3135            0   -8113.5601    9698.7054    21738.292 
    1900          300   -8156.1776            0   -8120.4243    9887.2669    21861.658 
    2000          300   -8161.9857            0   -8126.2324    8382.4517    21988.688 
    2100          300   -8163.9644            0    -8128.211    5288.1872    22107.168 
    2200     309.9432   -8171.1806            0   -8134.2422    331.97612    22234.198 
    2300          300    -8173.679            0   -8137.9256   -2756.1784    22346.571 
    2400          300   -8183.2429            0   -8147.4895   -6494.1612     22472.38 
    2500    309.13407   -8186.7918            0   -8149.9499   -8827.4368     22599.41 
    2600    299.71761   -8177.7445            0   -8142.0248   -7906.1647    22721.555 
    2700          300   -8174.4672            0   -8138.7138   -8920.5441    22832.706 
    2800    306.09492   -8173.4147            0    -8136.935   -10981.226    22960.958 
    2900    303.27397   -8168.2141            0   -8132.0706   -8905.5017    23078.216 
    3000    301.48023   -8165.8151            0   -8129.8854   -10668.385    23201.582 
Loop time of 2.17941 on 4 procs for 3000 steps with 1912 atoms

Performance: 118.931 ns/day, 0.202 hours/ns, 1376.521 timesteps/s
94.2% CPU use with 4 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 1.8085     | 1.8659     | 1.9289     |   3.1 | 85.61
Neigh   | 0.11359    | 0.11912    | 0.12499    |   1.2 |  5.47
Comm    | 0.079575   | 0.15125    | 0.21504    |  12.4 |  6.94
Output  | 0.00070049 | 0.00082434 | 0.0011907  |   0.0 |  0.04
Modify  | 0.019464   | 0.01998    | 0.020798   |   0.4 |  0.92
Other   |            | 0.02237    |            |       |  1.03

Nlocal:    478 ave 509 max 446 min
Histogram: 2 0 0 0 0 0 0 0 0 2
Nghost:    1009.5 ave 1054 max 963 min
Histogram: 2 0 0 0 0 0 0 0 0 2
Neighs:    11210.5 ave 12215 max 10197 min
Histogram: 1 0 1 0 0 0 0 1 0 1

Total # of neighbors = 44842
Ave neighs/atom = 23.4529
Neighbor list builds = 225
Dangerous builds = 0
Total wall time: 0:00:02
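To compare runs with different -np values, the ns/day figures can be pulled out of the log instead of read by eye. A minimal sketch, assuming the output was saved to a file and using a hypothetical helper name `perf_ns_per_day`:

```shell
# Hypothetical helper: print the ns/day value from each "Performance:" line of a LAMMPS log.
# Usage: perf_ns_per_day <logfile>
perf_ns_per_day() {
    # Field 2 of a line such as "Performance: 127.835 ns/day, ..." is the ns/day number.
    awk '/^Performance:/ { print $2 }' "$1"
}
```

LAMMPS writes log.lammps in the working directory by default, so `perf_ns_per_day log.lammps` after the run above should print one value per run command in the input script.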

Test script

#!/bin/sh
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --output=%j.out
#SBATCH --error=%j.err
#source /share/intel/parallel_studio_xe_2020.4.912/bin/psxevars.sh intel64
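# The Intel environment is assumed to be configured system-wide already, so the line above can stay commented out or be uncommented with no effect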
export PATH=/share/lammps-3Mar20/src:$PATH
mpirun -np $SLURM_NTASKS /share/lammps-3Mar20/src/lmp_intel_cpu_intelmpi < in.shear
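The script above requests a single task. To try other task counts without editing it by hand, the same script can be generated from a parameter. This is a sketch only: `make_job_script` and the file name `shear.slurm` are hypothetical, and the partition name and paths are copied from the script above.

```shell
# Hypothetical generator: print a job script like the one above for a given task count.
# Usage: make_job_script <ntasks> > shear.slurm
make_job_script() {
    ntasks="$1"
    # \$SLURM_NTASKS is escaped so it is expanded by Slurm at run time, not here.
    cat <<EOF
#!/bin/sh
#SBATCH -N 1
#SBATCH -n $ntasks
#SBATCH --ntasks-per-node=$ntasks
#SBATCH --partition=normal
#SBATCH --output=%j.out
#SBATCH --error=%j.err
mpirun -np \$SLURM_NTASKS /share/lammps-3Mar20/src/lmp_intel_cpu_intelmpi < in.shear
EOF
}
```

A typical use would be `make_job_script 4 > shear.slurm` followed by `sbatch shear.slurm` from the directory containing in.shear; the job's output then lands in the %j.out file named after the job ID.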