How to Process Large CSV Files with Laravel

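Processing large CSV files is routine in business, especially when you have a lot of data to analyze, report on, or move around. If you are using Laravel and need to handle large CSV files, you have come to the right place. We will walk through the smoothest way to handle this task without bogging down your application's performance. Rather than a band-aid approach, we will use a neat package called Simple Excel by Spatie. If you are nodding because you expected Spatie to have a solution, you are not alone.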

composer require spatie/simple-excel

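Assuming you already have your CSV file ready, we will load it with SimpleExcelReader. The nice part is that, by default, its getRows() method returns a LazyCollection: think of it as a more considerate way to work through your data without exhausting the server's memory. This means you can process the file bit by bit and keep your application light on its feet.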

$rows is an instance of Illuminate\Support\LazyCollection
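To make the "bit by bit" idea concrete, here is a minimal plain-PHP sketch of lazy CSV reading with a generator. It is not how spatie/simple-excel is implemented; the readCsvLazily helper and the sample data are assumptions made up for illustration, and the sketch assumes PHP 8 and that every data row has as many fields as the header row.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yield CSV rows as associative arrays, one row at a time.
 * Hypothetical helper for illustration; not part of spatie/simple-excel.
 */
function readCsvLazily(string $path, string $delimiter = ','): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    try {
        // Treat the first line as the header row.
        $headers = fgetcsv($handle, null, $delimiter, '"', '\\');
        if ($headers === false) {
            return;
        }

        while (($fields = fgetcsv($handle, null, $delimiter, '"', '\\')) !== false) {
            // fgetcsv reports a blank line as [null]; skip those.
            if ($fields === [null]) {
                continue;
            }
            // Only the current row is held in memory at this point.
            yield array_combine($headers, $fields);
        }
    } finally {
        fclose($handle);
    }
}

// Demo with a small temporary file (sample data invented for illustration).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "ID,title,description\n1,Lamp,Desk lamp\n2,Chair,Office chair\n");

foreach (readCsvLazily($path) as $row) {
    echo $row['ID'], ': ', $row['title'], PHP_EOL;
}

unlink($path);
```

Because rows are produced only when the loop asks for them, memory use stays roughly constant no matter how long the file is. This is the property that makes a LazyCollection a good fit here.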

Now, before we dive into the code, let's set up a Laravel job to manage our CSV processing.

php artisan make:job ImportCsv

Our ImportCsv job now looks like this:

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**
     * Create a new job instance.
     */
    public function __construct()
    {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        SimpleExcelReader::create(storage_path('app/public/products.csv'))
            ->useDelimiter(',')
            ->useHeaders(['ID', 'title', 'description'])
            ->getRows()
            ->chunk(5000)
            ->each(function ($chunk) {
                // Here we have a chunk of up to 5000 products
            });
    }
}

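Chunking the CSV: we break the file into manageable pieces, which gives us a LazyCollection to work with.

Job dispatching: for each chunk, we dispatch a job. This lets us process the data in batches, which is easier on your server.

Database insertion: each chunk is then inserted into the database, nice and simple.

Once the LazyCollection is ready, we slice the CSV into chunks. Think of it as cutting a giant sandwich into bite-sized pieces, which are much easier to handle.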
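The chunking step can be sketched in plain PHP as follows. In the article this role is played by chunk() on the LazyCollection; the chunkLazily helper below is a hypothetical stand-in, and the 12-row generator is invented sample data standing in for a large CSV.

```php
<?php

declare(strict_types=1);

/**
 * Group any iterable into arrays of at most $size items, lazily.
 * Hypothetical helper for illustration only.
 */
function chunkLazily(iterable $rows, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $chunk = [];
    foreach ($rows as $row) {
        $chunk[] = $row;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }

    // Emit the final, possibly smaller, chunk.
    if ($chunk !== []) {
        yield $chunk;
    }
}

// Stand-in for a large CSV: a generator that produces 12 rows on demand.
$rows = (function (): Generator {
    for ($i = 1; $i <= 12; $i++) {
        yield ['ID' => $i, 'title' => "Product {$i}"];
    }
})();

foreach (chunkLazily($rows, 5) as $index => $chunk) {
    // In the article, this is the point where one job is dispatched per chunk.
    printf("chunk %d holds %d rows\n", $index, count($chunk));
}
```

With 12 rows and a chunk size of 5, the loop sees chunks of 5, 5, and 2 rows. At most one chunk is held in memory at a time.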

php artisan make:job ImportProductChunk

For each chunk of the CSV, we create and dispatch a job. These jobs are like diligent workers: each one takes a chunk of data and carefully inserts it into your database.

<?php

namespace App\Jobs;

use App\Models\Product;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Str;

class ImportProductChunk implements ShouldBeUnique, ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $uniqueFor = 3600;

    /**
     * Create a new job instance.
     */
    public function __construct(
        public $chunk
    ) {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        $this->chunk->each(function (array $row) {
            // Match on product_id only; update the remaining columns.
            Model::withoutTimestamps(fn () => Product::updateOrCreate(
                ['product_id' => $row['ID']],
                [
                    'title' => $row['title'],
                    'description' => $row['description'],
                ]
            ));
        });
    }

    public function uniqueId(): string
    {
        return Str::uuid()->toString();
    }
}
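Laravel's updateOrCreate takes two arrays: the attributes used to find an existing record, and the values to set on it. The plain-PHP sketch below imitates that behaviour to show why the choice of match columns matters when a file is imported more than once. The updateOrCreateRow helper and the in-memory array standing in for the products table are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Update the first row matching every key in $match, or append a new row.
 * Hypothetical in-memory stand-in for a products table.
 */
function updateOrCreateRow(array $table, array $match, array $values): array
{
    foreach ($table as $index => $existing) {
        // A row matches when every key/value pair in $match is present in it.
        if (count(array_intersect_assoc($match, $existing)) === count($match)) {
            $table[$index] = array_merge($existing, $values);
            return $table;
        }
    }

    $table[] = array_merge($match, $values);
    return $table;
}

$table = [];

// First import: the row is created.
$table = updateOrCreateRow($table, ['product_id' => '1'], ['title' => 'Lamp']);

// Re-import with an edited title, matching on product_id only: the row is updated.
$table = updateOrCreateRow($table, ['product_id' => '1'], ['title' => 'Desk lamp']);
echo count($table), PHP_EOL; // 1

// Matching on every column instead: the edited title no longer matches,
// so a second row for the same product appears.
$table = updateOrCreateRow($table, ['product_id' => '1', 'title' => 'Floor lamp'], []);
echo count($table), PHP_EOL; // 2
```

If the import is expected to run repeatedly on updated files, matching on the product identifier alone is usually the safer choice.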

Ensuring Uniqueness

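One important thing to remember is to use $uniqueFor and uniqueId in your job. It is like giving each worker a unique ID badge so that you do not accidentally have two of them doing the same work, which is a big no-no for efficiency.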
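One caveat worth noting: Laravel keys the unique-job lock on whatever uniqueId() returns. A freshly generated UUID is different for every job, so on its own it will not stop the same chunk from being queued twice. A possible alternative, sketched here in plain PHP, is to derive the ID from the chunk's contents. The chunkFingerprint helper and the sample rows are assumptions for illustration, not part of Laravel.

```php
<?php

declare(strict_types=1);

/**
 * Build a stable identifier from a chunk's rows.
 * Hypothetical helper: identical rows in the same order give the same ID.
 */
function chunkFingerprint(array $rows): string
{
    return hash('sha256', json_encode($rows, JSON_THROW_ON_ERROR));
}

$chunk = [
    ['ID' => '1', 'title' => 'Lamp', 'description' => 'Desk lamp'],
    ['ID' => '2', 'title' => 'Chair', 'description' => 'Office chair'],
];

// The same chunk always maps to the same ID, so a duplicate dispatch can be detected.
var_dump(chunkFingerprint($chunk) === chunkFingerprint($chunk)); // bool(true)

// A different chunk maps to a different ID.
var_dump(chunkFingerprint($chunk) === chunkFingerprint([['ID' => '3']])); // bool(false)
```

Inside the job, uniqueId() could then return something like chunkFingerprint($this->chunk->all()). Whether hashing 5000 rows per job is an acceptable cost depends on your data.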

Back in our ImportCsv job, we dispatch a job for each chunk inside the each method. It is like saying, "You get a chunk, you get a chunk, everybody gets a chunk!"

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**
     * Create a new job instance.
     */
    public function __construct()
    {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        SimpleExcelReader::create(storage_path('app/public/products.csv'))
            ->useDelimiter(',')
            ->useHeaders(['ID', 'title', 'description'])
            ->getRows()
            ->chunk(5000)
            ->each(
                fn ($chunk) => ImportProductChunk::dispatch($chunk)
            );
    }
}

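And there you have it! Your chunks are processed independently, with no memory drama. If you are in a hurry, just add more queue workers and, like a well-oiled machine, your data will be processed faster.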