How to extract data from an irregularly formatted data file with Python


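I need to extract some data from a file, but the file was not written to be machine-readable, so its format is irregular. To begin with, there is a large amount of text before any actual data starts: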


DL_POLY Version 2.20

                        Running on   10 nodes





***************     DLPOLY: LiNbO3                                                                       >***************




SIMULATION CONTROL PARAMETERS
simulation temperature           1.4500E+03
simulation pressure (katm)       0.0000E+00
selected number of timesteps         8000
equilibration period                  500
data printing interval                 80
statistics file interval               80
simulation timestep              5.0000E-04
Nose-Hoover  (Melchionna) isotropic N-P-T 
  thermostat relaxation time       1.0000E-01
  barostat relaxation time         5.0000E-01
trajectory file option on
  trajectory file start                   1
  trajectory file interval               80
  trajectory file info key                2
  ...
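
One possible way to get past a preamble like this is sketched below. It assumes the data section starts right after the first line whose first whitespace-separated token is `step`, and that no earlier line starts with that token; the elided part of the preamble is not shown here, so that is unverified. The names `skip_preamble` and `marker` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator


def skip_preamble(lines: Iterable[str], marker: str = "step") -> Iterator[str]:
    """Yield only the lines that come after the first header line.

    Assumption: the header line is the first one whose first token equals
    `marker`. If no such line exists, nothing is yielded.
    """
    it = iter(lines)
    for line in it:
        tokens = line.split()
        if tokens and tokens[0] == marker:
            break  # header found; stop discarding lines
    # Whatever is left in the iterator follows the header line.
    yield from it
```

Because the function takes any iterable of strings, an open file object can be passed to it directly, and the file is then read lazily rather than loaded into memory at once.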

The actual data appears some way further down, but it is laid out in this odd form:

step     eng_tot    temp_tot     eng_cfg     eng_vdw     eng_cou     eng_bnd     >    eng_ang     eng_dih     eng_tet
  time(ps)      eng_pv    temp_rot     vir_cfg     vir_vdw     vir_cou     vir_bnd     >vir_ang     vir_con     vir_tet
  cpu  (s)      volume    temp_shl     eng_shl     vir_shl       alpha        beta       >gamma     vir_pmf       press


1 -1.1289E+05  1.4750E+03 -1.1386E+05  1.7276E+04 -1.3114E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
         0.0 -1.1545E+05  0.0000E+00  9.6539E+03 -1.2118E+05  1.3083E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
         0.8  5.3733E+04  1.2367E+02  0.0000E+00  0.0000E+00  5.6396E+01  5.6396E+01  >5.6396E+01  0.0000E+00 -7.5549E+01

rolling -1.1289E+05  1.4750E+03 -1.1386E+05  1.7276E+04 -1.3114E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
  averages -1.1545E+05  0.0000E+00  9.6539E+03 -1.2118E+05  1.3083E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
            5.3733E+04  1.2367E+02  0.0000E+00  0.0000E+00  5.6396E+01  5.6396E+01  >5.6396E+01  0.0000E+00 -7.5549E+01


80 -1.1290E+05  1.5021E+03 -1.1392E+05  2.1894E+04 -1.3726E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
         0.0 -1.1256E+05  0.0000E+00  8.6671E+02 -1.3974E+05  1.3707E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
        10.6  5.3149E+04  1.1377E+03  1.4419E+03  3.5382E+03  5.6396E+01  5.6396E+01  >5.6396E+01  0.0000E+00  1.1119E+01

rolling -1.1290E+05  1.6145E+03 -1.1398E+05  2.0750E+04 -1.3588E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
  averages -1.1333E+05  0.0000E+00  3.3694E+03 -1.3512E+05  1.3565E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
            5.3481E+04  1.0997E+03  1.1430E+03  2.8391E+03  5.6396E+01  5.6396E+01  >5.6396E+01  0.0000E+00 -1.2096E+01


160 -1.1287E+05  1.2629E+03 -1.1376E+05  2.1450E+04 -1.3633E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
         0.1 -1.1249E+05  0.0000E+00  3.8761E+02 -1.3824E+05  1.3612E+05  0.0000E+00  >0.0000E+00  0.0000E+00  0.0000E+00
        20.5  5.3375E+04  4.9015E+02  1.1243E+03  2.5052E+03  5.6396E+01  5.6396E+01  >5.6396E+01  0.0000E+00  1.2676E+01

rolling -1.1288E+05  1.4677E+03 -1.1389E+05  2.1589E+04 -1.3663E+05  0.0000E+00  0.0000E+00  0.0000E+00  0.
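
A minimal sketch of one way to read blocks like these, under several assumptions: each record spans exactly three consecutive lines, the first of which starts with an integer step number; the `rolling` / `averages` blocks are not wanted and can be skipped; and the `>` characters in the paste are line-wrap markers rather than part of the real file. The names `read_records` and `_numbers` are hypothetical, and the lines passed in are assumed to belong to the data section already.

```python
from typing import Iterable, Iterator, List, Tuple


def _numbers(line: str) -> List[str]:
    """Split a line into tokens, stripping a leading '>' (assumed wrap marker)."""
    cleaned = (tok.lstrip(">") for tok in line.split())
    return [tok for tok in cleaned if tok]


def read_records(lines: Iterable[str]) -> Iterator[Tuple[int, List[float]]]:
    """Yield (step, values) for every three-line record."""
    it = iter(lines)
    for line in it:
        tokens = _numbers(line)
        # A record starts on a line whose first token is a plain integer.
        # Blank lines, header lines and "rolling averages" lines fail this test.
        if not tokens or not tokens[0].isdigit():
            continue
        step = int(tokens[0])
        values = [float(tok) for tok in tokens[1:]]
        # The record continues on the next two lines.
        for _ in range(2):
            values.extend(float(tok) for tok in _numbers(next(it, "")))
        yield step, values
```

For the sample shown, each record yields 29 values in header order: `values[0]` is `eng_tot`, `values[9]` is `time(ps)`, `values[19]` is `cpu (s)` and `values[28]` is `press`. A continuation line that contains non-numeric text raises `ValueError`, which seems preferable to silently misaligning the columns.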