当使用 LOAD DATA INFILE 命令将 CSV 文件直接导入 MySQL 时,可能会遇到一些字段残留了 Excel 公式字段,例如值之前带有一个等号 (=)。当目标表结构设置允许第一个字段设置为 VARCHAR(100) 时,理想情况下希望将其设置为 INT。以下是一个 CSV 内容示例,其中第一个字段包含等号:
"MID","DBA Name","Partner ID","Partner Name","Sub Partner ID","Sub Partner Name","Active Months","Bonus Amount","Bonus Applied Date","Partner Percentage","Partner Share","Total Payment"
="0008788014065741","company2","7968","me,"11839","Joe Blow","0","350.64"
="0008788014065756","company2","7968","you","11839","Joe Blow","0","294.60"
以下是要使用的 MySQL 导入命令:
sql = """
LOAD DATA LOCAL INFILE '%(upload)s' IGNORE INTO TABLE %(table)s
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES ;
""" % {"upload": file, "table": report}
self.db.query( sql )
任务是弄清楚如何在导入时处理这些公式值,以使其在导入到 MySQL 表时被正确解析。
2、解决方案
有两种方法可以使用 LOAD DATA INFILE 解决此问题:
方法一:使用 SET 子句 这种方法涉及在 LOAD DATA INFILE 命令中使用 SET 子句。SET 子句允许在导入过程中对列值进行转换和修改。在 SET 子句中,我们可以使用 TRIM 函数删除公式值开头的等号 (=) 和双引号 (")。我们还可以对其他列进行必要的转换,例如删除货币符号、将日期转换为正确的格式等。
以下是一个使用 SET 子句的示例:
LOAD DATA LOCAL INFILE '/path/to/your/file.csv'
IGNORE INTO TABLE table_name
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(@MID, DBAName, PartnerID, PartnerName, SubPartnerID, SubPartnerName, ActiveMonths,
@BonusAmount, @BonusAppliedDate, @PartnerPercentage, @PartnerShare, @TotalPayment)
SET MID = TRIM(BOTH '"' FROM SUBSTR(@MID, 2)), -- here we get rid of equal sign and double quotes
BonusAmount = TRIM(LEADING '$' FROM NULLIF(@BonusAmount, '')),
BonusAppliedDate = NULLIF(@BonusAppliedDate, ''),
PartnerPercentage = NULLIF(@PartnerPercentage, ''),
PartnerShare = TRIM(LEADING '$' FROM NULLIF(@PartnerShare, '')),
TotalPayment = TRIM(LEADING '$' FROM NULLIF(@TotalPayment, ''))
方法二:使用 LINES STARTING BY 子句 这种方法涉及使用 LINES STARTING BY 子句。LINES STARTING BY 子句允许指定导入文件中数据的开始行。我们可以使用 LINES STARTING BY 子句指定公式值所在行的开始标志,例如等号 (=)。然后,我们可以跳过这些行,只导入不包含公式值的数据。
以下是一个使用 LINES STARTING BY 子句的示例:
LOAD DATA LOCAL INFILE '/path/to/your/file.csv'
IGNORE INTO TABLE table_name
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n' STARTING BY '='
IGNORE 1 LINES
(MID, DBAName, PartnerID, PartnerName, SubPartnerID, SubPartnerName, ActiveMonths, @BonusAmount, BonusAppliedDate, PartnerPercentage, PartnerShare, @TotalPayment)
SET BonusAmount = TRIM(LEADING '$' FROM NULLIF(@BonusAmount, '')),
BonusAppliedDate = NULLIF(@BonusAppliedDate, ''),
PartnerPercentage = NULLIF(@PartnerPercentage, ''),
PartnerShare = TRIM(LEADING '$' FROM NULLIF(@PartnerShare, '')),
TotalPayment = TRIM(LEADING '$' FROM NULLIF(@TotalPayment, ''))
这两种方法都可以有效地从 CSV 文件中删除公式值,并将其正确导入到 MySQL 表中。