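Problem description
I added a data folder to my Git working directory, and it contained files larger than 100 MB. Without first listing that folder in .gitignore, I ran a commit followed by git push. The remote rejected the push with the following errors: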
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git push
Counting objects: 23, done.
Delta compression using up to 96 threads.
Compressing objects: 100% (22/22), done.
Writing objects: 100% (23/23), 192.99 MiB | 9.48 MiB/s, done.
Total 23 (delta 13), reused 0 (delta 0)
remote: Resolving deltas: 100% (13/13), completed with 9 local objects.
remote: warning: File dataset_files/spider.zip is 95.12 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: warning: File dataset_files/sparc.zip is 94.83 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: error: Trace: a96964555cb62419f429ef26997f589c792978711199bd3e3219ea8280af366b
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File dataset_files/cosql_dataset.zip is 100.44 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
To https://github.com/JiexingQi/picard.git
! [remote rejected] main -> main (pre-receive hook declined)
error: failed to push some refs to 'https://<redacted-token>@github.com/JiexingQi/picard.git'
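Afterwards I edited .gitignore and added the directory "/dataset_files" to it, but that did not fix the problem: .gitignore only keeps untracked files from being added, and the large files were already recorded in a local commit that every later push tried to upload. Because I kept modifying some program files, each subsequent commit succeeded locally, yet every push failed with the same error. Checking GitHub confirmed that none of the versions committed after the large files appeared had reached the remote.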
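To see which files in the local history would trip the size check before pushing, something like the following could be used. It is a minimal sketch: the function name list_large_blobs and the threshold argument are illustrative choices, not part of Git or of GitHub's tooling.

```shell
# List every blob reachable from any ref whose size exceeds a byte limit.
# list_large_blobs is a hypothetical helper name; run it inside the repository.
list_large_blobs() {
    limit_bytes="$1"
    # rev-list prints "<sha> <path>" for each object; batch-check adds type and size.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v limit="$limit_bytes" '
            $1 == "blob" && $2 + 0 > limit + 0 {
                size = $2
                sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the path
                print size, $0
            }'
}

# Example (assumed threshold): list_large_blobs 50000000
```

Each output line is a size in bytes followed by the path. Whether GitHub's "MB" means 10^6 or 2^20 bytes is not stated in the log above, so the threshold is best chosen conservatively.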
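Solution
Method 1: git filter-branch
Searching online first, I found that many people run
git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch large_file_path'
to remove the large file from the index of each commit, rewriting the history, and then push again.
But when I followed these steps, I hit another error: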
Cannot rewrite branches: You have unstaged changes.
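This message means filter-branch found uncommitted modifications and refused to start. One possible way past it, not verified on this repository, is to stash the changes, rewrite the history, and restore the stash. The sketch below makes several assumptions: purge_path_from_history is a hypothetical helper name, it must be run from the top level of the working tree, the path must not contain single quotes, and -r is added so a directory path also works.

```shell
# Stash local edits, strip a path from every commit on the current branch,
# then restore the edits. purge_path_from_history is a hypothetical name.
purge_path_from_history() {
    target_path="$1"
    stashed=0
    # filter-branch refuses to run with a dirty tree, so park local edits first.
    if ! git diff --quiet || ! git diff --cached --quiet; then
        git stash -q || return 1
        stashed=1
    fi
    # FILTER_BRANCH_SQUELCH_WARNING skips the warning pause in newer Git versions.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --index-filter \
        "git rm -r -q --cached --ignore-unmatch -- '$target_path'" || return 1
    # Only pop when this function created the stash entry.
    if [ "$stashed" -eq 1 ]; then
        git stash pop -q || return 1
    fi
}

# Example: purge_path_from_history dataset_files
```

filter-branch may also delete the removed files from the working tree when it finishes, so it is safer to keep a copy of the data outside the repository before trying this.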
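Method 2: git reset
Later I found another approach: move the local branch back to the last commit made before the large files were added, recommit the changes, and then push to the remote. A plain git reset (mixed mode, the default) moves the branch pointer and resets the index but leaves the working tree untouched, so none of the later edits are lost. Because /dataset_files is now in .gitignore, git add . no longer picks up the large files.
The procedure has four steps, where version_not_include_large_file stands for the hash of that commit: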
- git reset version_not_include_large_file
- git add .
- git commit -m "changes"
- git push origin main
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git reset c60a1ebd6b72b5
Unstaged changes after reset:
M .gitignore
M README.md
M configs/train_0228_sparc.json
M seq2seq/datasets/cosql/cosql.py
M seq2seq/datasets/sparc/sparc.py
M seq2seq/datasets/spider/spider.py
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git status
On branch main
Your branch is up to date with 'origin/main'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: .gitignore
modified: README.md
modified: configs/train_0228_sparc.json
modified: seq2seq/datasets/cosql/cosql.py
modified: seq2seq/datasets/sparc/sparc.py
modified: seq2seq/datasets/spider/spider.py
no changes added to commit (use "git add" and/or "git commit -a")
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git add .
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git commit -m "make dataset locally"
[main 5898539] make dataset locally
6 files changed, 24 insertions(+), 9 deletions(-)
(picard) jxqi@han-server-01:~/text2sql/picard_train_no_docker$ git push origin main
Counting objects: 14, done.
Delta compression using up to 96 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (14/14), 1.54 KiB | 1.54 MiB/s, done.
Total 14 (delta 9), reused 0 (delta 0)
remote: Resolving deltas: 100% (9/9), completed with 9 local objects.
To https://github.com/JiexingQi/picard.git
c60a1eb..5898539 main -> main
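This approach solved the problem completely. Note that all the commits that had not been pushed are collapsed into the single new commit.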
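The first three steps can be wrapped in a small helper, leaving the push as a separate manual step. This is only a sketch: recommit_without_large_files is a hypothetical name, and it assumes the large-file directory is already listed in .gitignore before it runs. In the session above the reset target c60a1eb was also where origin/main pointed, so origin/main could serve as the target when everything since the last successful push is to be recommitted.

```shell
# Redo all commits after <target> as one commit that omits ignored files.
# recommit_without_large_files is a hypothetical helper name.
recommit_without_large_files() {
    target="$1"     # commit that predates the large files, e.g. origin/main
    message="$2"
    # Mixed reset: branch and index move back to $target, working tree is kept.
    git reset -q "$target" || return 1
    # Paths matched by .gitignore are skipped, so only the real edits are staged.
    git add . || return 1
    git commit -q -m "$message"
}

# Example: recommit_without_large_files origin/main "make dataset locally"
# then:    git push origin main
```

The data files stay on disk throughout; only the commits that recorded them are dropped from the branch.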