When reading a tab-delimited text file with the Python pandas library, every text column shows up as NaN. A sample of the input file:
Blah Blah
Blah Blah
Blah Blah
Blah Blah
Blah Blah
Blah Blah
Blah Blah
Period: Oct 28 2013 - Apr 27 2014
Note:
Brand Variant Industry Major Category Market Media Type Parent Company Product Category Report Period (multiple) PCC Sub Group Subsidiary Units $$$ (000)
3 LADIES HAND-DIPPED CANDIES CANDY CONFECT., SNACKS & SOFT DRINKS CONFECTIONERY & SNACKS Columbus Combo Local Newspaper COTTAGE FOOD PRODUCTION OPERATION CANDY 11/18/13 - 11/24/13 F211 CANDY & GUM COTTAGE FOOD PRODUCTION OPERATION 1 0.286
3 MUSKETEERS CANDY BAR CONFECT., SNACKS & SOFT DRINKS CONFECTIONERY & SNACKS Atlanta Combo Spot Radio MARS INC CANDY BAR 11/04/13 - 11/10/13 F211 CANDY & GUM MARS SNACKFOOD US LLC 22 1.403
The file is read with the following Python code:
import pandas as pd
df = pd.read_csv(csvFile, delimiter='\t', header=[9])
print(df)
The output is:
Brand Variant \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Industry \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Major Category \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Market \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Media Type \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Parent Company \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Product Category \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Report Period (multiple) \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
PCC Sub Group \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Subsidiary \
3 LADIES HAND-DIPPED CANDIES CANDY NaN
3 MUSKETEERS CANDY BAR NaN
Units $$$ (000)
3 LADIES HAND-DIPPED CANDIES CANDY NaN NaN
3 MUSKETEERS CANDY BAR NaN NaN
As shown, every text column is NaN.
- Solution
import pandas as pd
df = pd.read_csv(csvFile, sep='\t', skiprows=9, index_col=False)
print(df)
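Setting skiprows=9 skips the first 9 lines of the file (the seven "Blah Blah" lines plus the "Period:" and "Note:" lines), so the column-name line is used as the header. Setting index_col=False stops pandas from automatically using the first column as the index, which it does when the data rows contain one more field than the header row (most likely here because each data line ends with a trailing tab). With both options set, the text columns are read correctly.
The output is: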
Brand Variant Industry Major Category Market Media Type Parent Company Product Category Report Period (multiple) PCC Sub Group Subsidiary Units $$$ (000)
0 3 LADIES HAND-DIPPED CANDIES CANDY CONFECT., SNACKS & SOFT DRINKS CONFECTIONERY & SNACKS Columbus Combo Local Newspaper COTTAGE FOOD PRODUCTION OPERATION CANDY 11/18/13 - 11/24/13 F211 CANDY & GUM COTTAGE FOOD PRODUCTION OPERATION 1 0.286
1 3 MUSKETEERS CANDY BAR CONFECT., SNACKS & SOFT DRINKS CONFECTIONERY & SNACKS Atlanta Combo Spot Radio MARS INC CANDY BAR 11/04/13 - 11/10/13 F211 CANDY & GUM MARS SNACKFOOD US LLC 22 1.403
The text columns are now displayed correctly.
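To see the effect of index_col=False in isolation, here is a minimal, self-contained sketch. It does not use the original file: the in-memory text below is a hypothetical miniature of the report, and the trailing tab at the end of each data row is an assumption about what the real file looks like. The names raw, shifted, and fixed are illustrative only.

```python
import io

import pandas as pd

# Hypothetical miniature of the report: two preamble lines, a header row,
# and data rows that end with an extra trailing tab (assumed, not confirmed
# for the real file).
raw = (
    "Blah Blah\n"
    "Note:\n"
    "Brand\tMarket\tUnits\n"
    "3 MUSKETEERS\tAtlanta\t22\t\n"
    "3 LADIES\tColumbus\t1\t\n"
)

# Default behaviour: each data row has one more field than the header,
# so pandas uses the first column as the index and the remaining values
# shift one column to the left.
shifted = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2)

# index_col=False keeps every header name paired with its own column.
fixed = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2, index_col=False)

print(shifted)
print(fixed)
```

Comparing the two printed frames shows whether the misalignment comes from an implicit index. If the real file has no extra trailing field, index_col=False should be harmless, but this sketch would not reproduce the symptom.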