1 Installing Git

On Windows 7, download the installer from https://git-scm.com/download/win.

Figure 1.1: Download page

Figure 1.2: Download a recent version

Figure 1.3: Allow Git to be called by third-party software, which RStudio needs later

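2 Cloning your first project

Figure 2.1: Copy the project URL

Figure 2.2: Enter the username and password of your registered remote account

Open Git Bash and change to the Desktop:

cd ~/Desktop
git clone http://xxx/xxx/xxx.git

Enter your username and password when prompted.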

$ git clone http://xxx/xxx/xxx.git
Cloning into 'xxx'...
remote: Counting objects: 2729, done.
remote: Compressing objects: 100% (89/89), done.
Receiving objects:  17% (483/2729), 182.68 MiB | 31.63 MiB/s

This output shows the download in progress; the MiB/s figure is the download speed.

cd xxx
cd sql-script/xxx/
mv ~/Desktop/xxx_roi ./
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        xxx_roi/

nothing added to commit but untracked files present (use "git add" to track)

git status reports xxx_roi/ as untracked, which confirms that the folder was moved into the repository.

git add .

The . means add everything under the current directory.

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   xxx_roi/01-01-log.sql
        new file:   xxx_roi/01-02-log.sql
        new file:   xxx_roi/02-01-first-register.sql

...
git commit -m 'add all files related to xxx roi metrics'
git push

push means upload.

$ git push
Enumerating objects: 73, done.
Counting objects: 100% (73/73), done.
Delta compression using up to 2 threads
Compressing objects: 100% (69/69), done.
Writing objects: 100% (70/70), 30.18 KiB | 1.68 MiB/s, done.
Total 70 (delta 11), reused 0 (delta 0)
To http://xxx/xxx/xxx.git
   0094def..c5054d9  master -> master

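add, commit, push

  1. add only stages files; if a file is later moved away with mv, Git notices the change.
  2. Staged changes need a description, which is what commit is for.
  3. push uploads the commits to the remote.

The output above shows that the new commit was pushed.

It shows that more than 60 files were uploaded successfully.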
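The add, commit, push cycle above can be tried without a server. This is a minimal sketch, assuming a bare repository in a temporary directory stands in for the http remote; the directory names, the file content, and the identity set with git config are hypothetical.

```shell
# A throwaway "remote": a bare repository in a temporary directory.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# Clone it (a local path stands in for the http URL used in section 2).
git clone -q "$work/remote.git" "$work/project" 2>/dev/null
cd "$work/project"
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

# Put a new folder into the working tree.
mkdir xxx_roi
echo "select 1;" > xxx_roi/01-01-log.sql

git add .                                  # 1. stage everything under the current directory
git commit -q -m "add xxx roi scripts"     # 2. describe the staged changes
git push -q origin HEAD                    # 3. upload the commit to the remote

# The remote now holds the commit.
git --no-pager -C "$work/remote.git" log --all --oneline
```

Pushing HEAD avoids hard-coding the branch name, which may be master or main depending on the Git configuration.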

xxx@xxx MINGW64 ~/Desktop/xxx/sql-script/xxx (master)
$ ls
xxx_roi/  roi/  test.sql

xxx@xxx MINGW64 ~/Desktop/xxx/sql-script/xxx (master)
$ rm -rf roi

xxx@xxx MINGW64 ~/Desktop/xxx/sql-script/xxx (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        deleted:    roi/.gitkeep

no changes added to commit (use "git add" and/or "git commit -a")
xxx@xxx-0301000855 MINGW64 /d/Work/xxx (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   xxx.Rproj

no changes added to commit (use "git add" and/or "git commit -a")

xxx@xxx-0301000855 MINGW64 /d/Work/xxx (master)
$ git checkout xxx.Rproj

If a metadata file such as .Rproj was modified by accident, git checkout restores it.

3 Uploading files in bulk

 /e/work
$ cd xxx/

 /e/work/xxx (master)
$ mv ../xxx/mod
mod-2019-04-30.rds        xxx-feature-perf/ mod-overfitting.rds

 /e/work/xxx (master)
$ mv ../xxx/xxx-feature-perf ./

 /e/work/xxx (master)
$ mkdir output

 /e/work/xxx (master)
$ mv xxx-feature-perf output/
 /e/work/xxx (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        output/

nothing added to commit but untracked files present (use "git add" to track)

 /e/work/xxx (master)
$ git add output/

 /e/work/xxx (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   output/xxx-feature-perf/cmcnt_suc_1.png
        new file:   output/xxx-feature-perf/cmnum_app_brw_1.png
        new file:   output/xxx-feature-perf/cmnum_app_inv_2.png
        new file:   output/xxx-feature-perf/cmnum_app_inv_4.png
        new file:   output/xxx-feature-perf/cmstr_char_tran.png
        new file:   output/xxx-feature-perf/cmstr_thir_rela_3.png
        new file:   output/xxx-feature-perf/cmstr_thir_rela_5.png
        new file:   output/xxx-feature-perf/cmstr_thir_rela_6.png
        new file:   output/xxx-feature-perf/sec_pro_pro.png
        new file:   output/xxx-feature-perf/thir_pro_pro.png


 /e/work/xxx (master)
$ git commit -m '上传模型的图片'
[master c59efaf] 上传模型的图片
 Committer: unknown <caojie@xxxai.com>
Your name and email address were configured automatically based
on your xxx and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:

    git config --global --edit

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 10 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 output/xxx-feature-perf/cmcnt_suc_1.png
 create mode 100644 output/xxx-feature-perf/cmnum_app_brw_1.png
 create mode 100644 output/xxx-feature-perf/cmnum_app_inv_2.png
 create mode 100644 output/xxx-feature-perf/cmnum_app_inv_4.png
 create mode 100644 output/xxx-feature-perf/cmstr_char_tran.png
 create mode 100644 output/xxx-feature-perf/cmstr_thir_rela_3.png
 create mode 100644 output/xxx-feature-perf/cmstr_thir_rela_5.png
 create mode 100644 output/xxx-feature-perf/cmstr_thir_rela_6.png
 create mode 100644 output/xxx-feature-perf/sec_pro_pro.png
 create mode 100644 output/xxx-feature-perf/thir_pro_pro.png

 /e/work/xxx (master)
$ git push
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 8 threads
Compressing objects: 100% (14/14), done.
Writing objects: 100% (14/14), 34.24 KiB | 6.85 MiB/s, done.
Total 14 (delta 10), reused 0 (delta 0)
To http://xxx.git
   f779347..c59efaf  master -> master

4 Git

git diff, git status, git add and git commit

These are the four most basic git commands. Learn them well first, then pick up the rest as needed.

4.1 git diff

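  1. Before add, git status lists a changed file as modified but not yet staged.
  2. After add, the files are staged; this step is also described as "Add one or more files to the staging area."
    1. git add filename
  3. git diff shows the changes that have not been staged yet.
    1. git diff shows all changes
    2. git diff filename
    3. git diff directory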
$ git diff
diff --git a/data/northern.csv b/data/northern.csv
index 5eb7a96..5a2a259 100644
--- a/data/northern.csv
+++ b/data/northern.csv
@@ -22,3 +22,4 @@ Date,Tooth
 2017-08-13,incisor
 2017-08-13,wisdom
 2017-09-07,molar
+2017-11-01,bicuspid

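Interpreting the output

  1. The first line, diff --git a/data/northern.csv b/data/northern.csv, names the file that changed.
  2. @@ -22,3 +22,4 @@ Date,Tooth is the hunk header:
    1. -22,3 means the hunk covers 3 lines starting at line 22 of the old version; +22,4 means it covers 4 lines starting at line 22 of the new version.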
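The hunk header can be reproduced on a small scale. The sketch below assumes a hypothetical 24-line file, numbers.txt, in a throwaway repository; appending one line gives three context lines plus one added line, hence the same -22,3 +22,4 ranges as in the example above (with Git's default of three context lines).

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

seq 1 24 > numbers.txt                     # a file with 24 lines
git add numbers.txt
git commit -q -m "add numbers"

echo 25 >> numbers.txt                     # append one line, do not stage it

# The hunk starts at line 22 in both versions:
# 3 lines in the old file, 4 lines in the new one.
git --no-pager diff numbers.txt
```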

4.1.1 git diff -r HEAD

  1. -r flag means “compare to a particular revision”
  2. HEAD is a shortcut meaning “the most recent commit”.

git diff -r HEAD path/to/file

https://campus.datacamp.com/courses/introduction-to-git-for-data-science/basic-workflow?ex=8

4.1.2 git diff ID1..ID2

Compare two commits:

  1. git diff abc123..def456
  2. git diff HEAD~1..HEAD~3

4.2 nano

nano filename

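  1. Opens the file if it exists
  2. Creates the file if it does not exist

  1. Arrow keys move the cursor; the Delete key deletes characters.
  2. Ctrl-K: delete a line.
  3. Ctrl-U: un-delete a line.
  4. Ctrl-O: save the file ('O' stands for 'output').
  5. Ctrl-X: exit the editor.
  6. command+z: also noted as a way to exit nano.

nano has been tested successfully.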

4.2.1 Git launches a text editor

  1. Run git commit on its own, without -m.
  2. Write a readable message.
  3. Save and exit with Ctrl-O and Ctrl-X.

4.3 git commit

git commit -m "Program appears to have become self-aware."
git commit --amend -m "new message"
  1. If the previous commit message was wrong, --amend rewrites it.
  2. Note the spaces around -m.
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   report.txt

The status shows staged changes that are waiting to be committed.
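To see that --amend replaces the last commit rather than adding a new one, here is a small sketch in a throwaway repository; the file name, the messages, and the identity are made up for illustration.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo "draft" > report.txt
git add report.txt
git commit -q -m "wrong mesage"            # message with a typo

git commit -q --amend -m "new message"     # rewrite the last commit's message

# Still a single commit, now carrying the corrected message.
git --no-pager log --format=%s
```

Because amending rewrites the commit, it is safest on commits that have not been pushed yet.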

4.4 git log

$ git log
commit 2e3b95ed8bdbcbe08c2dfbce318027085d09f597 (HEAD -> master, origin/master, origin/HEAD)
Merge: ef8cb3e 38ac7e2
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Fri Nov 23 23:21:41 2018 +0800

    merge

commit ef8cb3e8d8b00eb460b9dceb84bceef049b41caa
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Fri Nov 23 19:48:16 2018 +0800

    commit for merge before
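  1. Press q to quit.
  2. ef8cb3e8d8b00eb460b9dceb84bceef049b41caa is the hash.
A hash is a unique identifier:
  1. It is produced by a pseudo-random number generator called a hash function.
  2. It is written as a 40-character hexadecimal string.
  3. In practice the first 6-8 characters are enough to identify a commit.

git log path

git log -3 filename shows the three most recent commits that touched filename.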

4.5 git show

Show the details of a commit.

$ git show
commit 9a8320c4b8e2765aae74ce55f44038248651e95a
Author: Rep Loop <repl@datacamp.com>
Date:   Thu Sep 13 13:18:14 2018 +0000

    Added year to report title.

diff --git a/report.txt b/report.txt
index e713b17..4c0742a 100644
--- a/report.txt
+++ b/report.txt
@@ -1,4 +1,4 @@
-# Seasonal Dental Surgeries 2017-18
+# Seasonal Dental Surgeries (2017) 2017-18

 TODO: write executive summary.

The output includes what git diff would show; see the part starting at diff --git a/report.txt b/report.txt.

4.5.1 HEAD~1

$ git show HEAD~1
commit 65f236a2ec2e6cd80e77a6c37c2d6b24b2707907
Author: Rep Loop <repl@datacamp.com>
Date:   Thu Sep 13 13:18:14 2018 +0000

    Adding fresh data for western region.

diff --git a/data/western.csv b/data/western.csv
index f6d6374..f7c4509 100644
--- a/data/western.csv
+++ b/data/western.csv
@@ -27,3 +27,6 @@ Date,Tooth
 2017-10-05,molar
 2017-10-06,incisor
 2017-10-07,incisor
+2017-10-15,molar
+2017-10-17,bicuspid
+2017-10-18,bicuspid
HEAD~1 refers to the commit just before the most recent one.

4.6 git annotate

$ git annotate report.txt
9a8320c4        (  Rep Loop     2018-09-13 13:18:14 +0000       1)# Seasonal Dental Surgeries (2017) 2017-18
56f80e3e        (  Rep Loop     2018-09-13 13:18:14 +0000       2)
56f80e3e        (  Rep Loop     2018-09-13 13:18:14 +0000       3)TODO: write executive summary.
56f80e3e        (  Rep Loop     2018-09-13 13:18:14 +0000       4)
56f80e3e        (  Rep Loop     2018-09-13 13:18:14 +0000       5)TODO: include link to raw data.
7f4b3efa        (  Rep Loop     2018-09-13 13:18:14 +0000       6)
7f4b3efa        (  Rep Loop     2018-09-13 13:18:14 +0000       7)TODO: remember to cite funding sources!

This shows who last changed each line. Taking the first output line as an example, each line contains:

  1. The first eight digits of the hash, 9a8320c4.
  2. The author, Rep Loop.
  3. The time of the commit, 2018-09-13 13:18:14 +0000.
  4. The line number, 1.
  5. The contents of the line, # Seasonal Dental Surgeries (2017) 2017-18.

4.7 git add

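  1. Files that have never been added are not tracked, so they are not under version control. Use git status to check for them. (Wilson 2017)
  2. git add lets successive modifications to the same file go into separate commits, which keeps each change under version control.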

4.8 .gitignore

If .gitignore contains

build
*.mpl

then Git ignores the build directory and every file matching *.mpl.
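A quick way to check the two patterns is git check-ignore, which reports the paths Git would ignore. This is a sketch in a throwaway repository; the file names are invented for the test.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q

printf 'build\n*.mpl\n' > .gitignore       # the two patterns from above

mkdir build
echo "artifact" > build/output.txt         # inside the ignored directory
echo "plot" > figure.mpl                   # matches *.mpl
echo "notes" > notes.txt                   # matches no pattern

# Lists only the ignored paths: build/output.txt and figure.mpl.
git check-ignore build/output.txt figure.mpl notes.txt

# git status no longer mentions build/ or figure.mpl.
git status --short
```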

4.9 git clean -n

git clean -n lists the untracked files that would be deleted; git clean -f actually deletes them.
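The difference between the dry run and the real deletion can be checked as follows. This is a sketch in a throwaway repository with a hypothetical untracked file, scratch.tmp.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q

echo "scratch" > scratch.tmp               # an untracked file

git clean -n                               # dry run: only reports what would be removed
after_dry_run="$(ls)"                      # scratch.tmp is still present here
echo "after -n: $after_dry_run"

git clean -f                               # really delete untracked files
echo "after -f: $(ls)"
```

Deleted untracked files cannot be recovered through Git, so running -n first is a cheap safeguard.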

4.10 git config

git config --global setting.name setting.value
git config --global user.name "Jiaxiang Li"
git config --global user.email alex.xxx@foxmail.com

4.11 git reset

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   data/eastern.csv
        modified:   data/northern.csv

no changes added to commit (use "git add" and/or "git commit -a")
$ git add -A
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   data/eastern.csv
        modified:   data/northern.csv

When a file has been added by mistake, git reset HEAD returns it to the unstaged state.

$ git reset HEAD
Unstaged changes after reset:
M       data/eastern.csv
M       data/northern.csv
$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   data/eastern.csv
        modified:   data/northern.csv

no changes added to commit (use "git add" and/or "git commit -a")

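If no file or path is given in git reset HEAD filename, every staged file is unstaged.

git reset HEAD~1 works even after a commit was made: it undoes the commit while leaving the modifications themselves untouched in the working tree (see the Zhihu Zhuanlan post).

git checkout -- . discards the modifications in all unstaged files under the current path.

A common scenario: local work has not been committed, git pull is run by accident, and the local files end up in an unwanted state. Running git checkout -- . restores the files to the last commit, after which git pull can be run again. Note that this throws away the uncommitted edits.

For that reason, committing all local changes before every git pull is a good habit.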
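The three commands can be tried in sequence. The sketch below uses a throwaway repository with a hypothetical data.csv; the commit messages and identity are invented, and the comments describe what each step leaves behind.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo "a" > data.csv
git add data.csv
git commit -q -m "first"
echo "b" >> data.csv
git add data.csv
git commit -q -m "second"

# Stage an edit by mistake, then unstage it; the edit stays in the working tree.
echo "c" >> data.csv
git add data.csv
git reset -q HEAD
git status --short                         # data.csv is modified but not staged

# Discard the unstaged edit: the file goes back to the last commit.
git checkout -- .

# Undo the commit "second" but keep its change in the working tree.
git reset -q HEAD~1
git status --short                         # data.csv shows as modified again
git --no-pager log --format=%s             # only "first" remains
```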

4.11.1 test git add

$ git add .
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   commit.Rmd

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   commit.Rmd
$ git reset HEAD
Unstaged changes after reset:
M   commit.Rmd
$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   commit.Rmd

no changes added to commit (use "git add" and/or "git commit -a")

4.11.2 test git commit

$ git log
commit 3e165b620a50eeb10f3bd06b60517e91db73af36 (HEAD -> master)
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Thu Mar 12 15:03:02 2020 +0800

    test git commit
$ git reset 3e165b620a50eeb10f3bd06b60517e91db73af36
$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   commit.Rmd

no changes added to commit (use "git add" and/or "git commit -a")

4.12 git checkout

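Use git checkout -- filename to discard the edits in an unstaged file, for example git checkout -- data/northern.csv.

Combining git reset with git checkout therefore discards the edits in a file that has already been staged.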

$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   data/eastern.csv
        modified:   data/northern.csv

$ git reset data/northern.csv
Unstaged changes after reset:
M       data/northern.csv
$ git checkout -- data/northern.csv

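checkout sometimes fails because a lock file (usually .git/index.lock) was left behind in the local repository; deleting it fixes the problem.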

4.12.1 版本恢复

Use git log to find the commit to restore from:

commit ab8883e8a6bfa873d44616a0f356125dbaccd9ea
Author: Rep Loop <repl@datacamp.com>
Date:   Thu Oct 19 09:37:48 2017 -0400

    Adding graph to show latest quarterly results.

commit 2242bd761bbeafb9fc82e33aa5dad966adfe5409
Author: Rep Loop <repl@datacamp.com>
Date:   Thu Oct 16 09:17:37 2017 -0400

    Modifying the bibliography format.
  1. git checkout c5f567 -- file1/to/restore file2/to/restore restores the files to that version; see Stack Overflow.
  2. Then commit the restored files.
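The two steps above, sketched end to end in a throwaway repository. The file report.txt, its contents, and the identity are hypothetical; the abbreviated hash is read with git rev-parse instead of being copied by hand from git log.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo "version 1" > report.txt
git add report.txt
git commit -q -m "v1"
old="$(git rev-parse --short HEAD)"        # hash of the version to go back to

echo "version 2" > report.txt
git commit -q -am "v2"

# 1. Restore the file as it was in the old commit (the result is already staged).
git checkout "$old" -- report.txt
cat report.txt

# 2. Record the restored version as a new commit.
git commit -q -m "restore report.txt from $old"
```

History is kept intact: the restore appears as a third commit instead of rewriting the second one.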

4.13 git init

$ pwd
/home/repl/dental
$ git init
Initialized empty Git repository in /home/repl/dental/.git/

4.14 git clone

git clone copies the full history of the source along with its files.

It can also clone a local repository:

git clone /existing/project newprojectname

4.15 git remote

Show where the repository was cloned from.

$ git remote -v
origin  /home/thunk/repo (fetch)
origin  /home/thunk/repo (push)
$ git remote -v
origin  https://github.com/JiaxiangBU/imp_rmd.git (fetch)
origin  https://github.com/JiaxiangBU/imp_rmd.git (push)

4.16 git pull

$ git push origin dental
error: src refspec dental does not match any.
error: failed to push some refs to '/home/thunk/repo'
$ git push origin master
To /home/thunk/repo
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to '/home/thunk/repo'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
$ git pull
Merge made by the 'recursive' strategy.

git push often fails like this, so run git pull first to

bring your repository up to date with origin. It will open up an editor that you can exit with Ctrl+X.

To avoid opening the editor, run

$ git pull --no-edit origin master

4.16.1 other-branch

$ git pull origin other-branch

See Stack Overflow.

4.17 git branch

The most intuitive way to understand branches is:

Two commits have more than one parent.

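4.17.1 Before creating a branch

  1. Commit and push first; then, if a merge fails, it can be attempted again.
  2. To keep things tidy, merged branches are usually deleted. If a branch was pushed beforehand, the remote still holds a copy that can be pulled back down.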

4.17.2 Viewing the differences between branches

$ cd dental
$ git diff summary-statistics master
diff --git a/bin/summary b/bin/summary
deleted file mode 100755
...

Each file's diff begins with a diff --git line.

4.17.3 Using it together with git rm

$ git checkout -b deleting-report
Switched to a new branch 'deleting-report'
$ git rm report.txt
rm 'report.txt'
$ git commit -m 'delete report.txt'
[deleting-report 97cb6e1] delete report.txt
 1 file changed, 7 deletions(-)
 delete mode 100644 report.txt
$ git diff master..deleting-report
diff --git a/report.txt b/report.txt
deleted file mode 100644
index 4c0742a..0000000
--- a/report.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-# Seasonal Dental Surgeries (2017) 2017-18
-
-TODO: write executive summary.
-
-TODO: include link to raw data.
-
-TODO: remember to cite funding sources!

Working on a separate branch this way means an accidental deletion cannot harm master.

4.17.4 git merge source destination

destination can be omitted.

<<<<<<< destination-branch-name
...changes from the destination branch...
=======
...changes from the source branch...
>>>>>>> source-branch-name
  1. Most of the time destination-branch-name is HEAD, meaning the current branch.
  2. Use git status to see which files have conflicts that need resolving.
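The conflict markers are easiest to understand by producing one on purpose. This sketch makes two branches edit the same line of a hypothetical report.txt in a throwaway repository; the branch name alter-title and the identity are invented.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo "title" > report.txt
git add report.txt
git commit -q -m "base"
main="$(git symbolic-ref --short HEAD)"    # default branch name (master or main)

# Change the line on a side branch.
git checkout -q -b alter-title
echo "title from branch" > report.txt
git commit -q -am "branch edit"

# Change the same line on the default branch.
git checkout -q "$main"
echo "title from main line" > report.txt
git commit -q -am "main edit"

# The merge stops with a conflict; '|| true' keeps the script going.
git merge alter-title || true

git status --short                         # report.txt is listed as unmerged
cat report.txt                             # shows the <<<<<<< / ======= / >>>>>>> markers
```

To finish the merge, edit report.txt so that only the wanted text remains, then run git add report.txt and git commit.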

4.18 git rm

Here is an example: with several branches present, delete a file on just one of them.

$ cd dental
$ git branch
  alter-report-title
* master
  summary-statistics
$ git checkout summary-statistics
Switched to branch 'summary-statistics'
$ ls
bin  data  report.txt  results
$ git rm report.txt
rm 'report.txt'
$ ls
bin  data  results
$ git commit -m 'rm report'
[summary-statistics c845641] rm report
 1 file changed, 7 deletions(-)
 delete mode 100644 report.txt
$ git checkout master
Switched to branch 'master'
$ ls
bin  data  report.txt  results

4.19 git remote

Add or change the remote address; see www.jianshu.com.

git remote rm origin
git remote rename origin default 

4.20 .gitignore

touch .gitignore

4.20.1 Using it together with git status

When a folder contains sensitive data, list it in .gitignore. Keep adding entries and checking git status until no sensitive data shows up any more.

4.21 Related problems

4.21.1 Submodule

One thing you should not do is create one Git repository inside another. While Git does allow this, updating nested repositories becomes very complicated very quickly, since you need to tell Git which of the two .git directories the update is to be stored in. Very large projects occasionally need to do this, but most programmers and data analysts try to avoid getting into this situation. (Wilson 2017)

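It is best not to nest repositories, because the logic becomes very complicated, unless the project is large.

According to Chacon and Straub (2014), however, it is not all that complicated.

If a nested repository causes a submodule problem, simply moving the folder out solves it.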

4.21.2 create a new repository on the command line

echo "# rong360" >> README.md
git init
git add README.md
git commit -m "first commit"
git remote add origin https://github.com/JiaxiangBU/rong360.git
git push -u origin master

4.21.3 push an existing repository from the command line

git remote add origin https://github.com/JiaxiangBU/rong360.git
git push -u origin master

4.21.4 Push fails because files are too large

See the CSDN blog and the Atlassian documentation.

error: RPC failed; curl 56 LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 60
fatal: The remote end hung up unexpectedly

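A VPN (for example Lantern) may be needed. This worked once, at 2018-12-11 11:40:44.

See the CSDN blog.

git config http.postBuffer 524288000

This sets the HTTP post buffer to 524288000 bytes (500 MiB).

See GitHub Help.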

$ git rm --cached giant_file
## Stage our giant file for removal, but leave it on disk
$ git commit --amend -CHEAD
## Amend the previous commit with your change
## Simply making a new commit won't work, as you need
## to remove the file from the unpushed history as well
$ git push
Here is an example.
$ git rm --cached  kaggle/PS_20174392719_1491204439457_log.csv
rm 'kaggle/PS_20174392719_1491204439457_log.csv'

$ git commit --amend -CHEAD
[master 680607e] knit rmd
 Date: Wed Nov 28 13:18:30 2018 +0800
 36 files changed, 233 insertions(+), 6362723 deletions(-)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-19-1.png (96%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-24-1.png (96%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-25-1.png (98%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-26-1.png (89%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-27-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-29-1.png (93%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-33-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-36-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-38-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-40-1.png (98%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-42-1.png (96%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-42-2.png (96%)
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-54-1.png
 rewrite datacamp_files/figure-gfm/unnamed-chunk-55-1.png (99%)
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-55-2.png
 rewrite datacamp_files/figure-gfm/unnamed-chunk-56-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-57-1.png (99%)
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-57-2.png
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-59-1.png
 rewrite datacamp_files/figure-gfm/unnamed-chunk-60-1.png (99%)
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-60-2.png
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-64-1.png
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-64-2.png
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-67-1.png
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-72-1.png
 rewrite datacamp_files/figure-gfm/unnamed-chunk-73-1.png (99%)
 rewrite datacamp_files/figure-gfm/unnamed-chunk-74-1.png (99%)
 create mode 100644 datacamp_files/figure-gfm/unnamed-chunk-76-1.png
 rewrite datacamp_files/figure-gfm/unnamed-chunk-77-1.png (99%)
 delete mode 100644 kaggle/PS_20174392719_1491204439457_log.csv
See GitHub Help.
Take the large file kaggle/PS_20174392719_1491204439457_log.csv as an example.
git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch kaggle/PS_20174392719_1491204439457_log.csv' \
--prune-empty --tag-name-filter cat -- --all

git add .gitignore

git commit -m "Add kaggle/PS_20174392719_1491204439457_log.csv to .gitignore"

## Double-check that you've removed everything you wanted to from your repository's history, and that all of your branches are checked out.

git push origin --force --all

4.21.5 Git error after the computer crashes

$ git status
error: bad signature
fatal: index file corrupt

See this blog post.

4.21.6 Removing a large file that was already committed

See this blog post on OSChina.

git filter-branch -f --index-filter "git rm -rf --cached --ignore-unmatch .RDataTmp"

Also list this path in .gitignore.

4.21.7 Git push master fatal: You are not currently on a branch

See Stack Overflow.

git branch temp-branch
git checkout master
git merge temp-branch
git push origin master
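The same recovery can be rehearsed locally, leaving out the final push. This is a sketch in a throwaway repository: a commit is made in detached HEAD state, given a branch name, and merged back. The file f.txt and the identity are hypothetical, and the default branch name is read from Git instead of assuming master.

```shell
repo="$(mktemp -d)"                        # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo 1 > f.txt
git add f.txt
git commit -q -m "one"
main="$(git symbolic-ref --short HEAD)"    # default branch name (master or main)

git checkout -q --detach                   # now "not currently on a branch"
echo 2 >> f.txt
git commit -q -am "two (made while detached)"

git branch temp-branch                     # give the detached commit a name
git checkout -q "$main"
git merge -q temp-branch                   # fast-forwards the branch to the commit
git branch -q -d temp-branch               # the temporary name is no longer needed

git --no-pager log --format=%s
```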

4.21.8 Syncing too many or too large files

error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
git config --global http.postBuffer 524288000
git config --list

git config --list shows all settings that apply to the project. Press q to quit.

See the CSDN blog.

4.21.9 fatal: refusing to merge unrelated histories

See the CSDN blog.

git pull origin master --allow-unrelated-histories

4.21.10 push hangs

Hangs
$ git push
Counting objects: 43, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (41/41), done.
Writing objects: 100% (43/43), 22.58 MiB | 14.22 MiB/s, done.
Total 43 (delta 14), reused 0 (delta 0)
git push
$ git push
Counting objects: 43, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (41/41), done.
Writing objects: 100% (43/43), 22.58 MiB | 14.33 MiB/s, done.
Total 43 (delta 14), reused 0 (delta 0)
error: RPC failed; curl 56 LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 50
fatal: The remote end hung up unexpectedly
git pull
$ git pull
remote: Enumerating objects: 34, done.
remote: Counting objects: 100% (34/34), done.
remote: Compressing objects: 100% (23/23), done.
error: RPC failed; curl 56 LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 54
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: unpack-objects failed

This is because of long process running in the server side. Stack Overflow

A few possible fixes:

Following the CSDN blog, add a hosts entry, because DNS resolution for GitHub is blocked by the GFW:

199.27.74.133 assets-cdn.github.com

On a Mac:

  1. In Finder, press shift + command + G.
  2. Enter the path /etc/hosts.

Increase the HTTP post buffer (see the GitHub issue):

git config http.postBuffer 524288000
git config https.postBuffer 524288000

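Switch to a mobile data connection (China Telecom) and retry pull and push. Only the China Telecom network performed well.

The network near home is poor, so a phone connection from elsewhere was used instead.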

4.21.11 github_document

One solution for github_document is knitr::kable().

4.22 Certificate of completion

4.23 loose object is corrupt

The error:

jiaxiang@jiaxiang-VirtualBox:~/Documents/job-automation$ git pull
error: object file .git/objects/3d/a9a910ba76b85f89f6f2da9a0d7715a5fbe113 is empty
error: object file .git/objects/3d/a9a910ba76b85f89f6f2da9a0d7715a5fbe113 is empty
fatal: loose object 3da9a910ba76b85f89f6f2da9a0d7715a5fbe113 (stored in .git/objects/3d/a9a910ba76b85f89f6f2da9a0d7715a5fbe113) is corrupt
jiaxiang@jiaxiang-VirtualBox:~/Documents/job-automation$ fatal: The remote end hung up unexpectedly

For the fix, see Stack Overflow and GitHub Issue 36.

cp -R foo foo-backup

-R copies the directory recursively, which gives a backup to fall back on.

git clone git@www.mydomain.de:foo foo-newclone
rm -rf foo/.git
mv foo-newclone/.git foo
rm -rf foo-newclone

This replaces the local history with the remote's latest commit, so locally modified files show up as unstaged changes and need to be committed again, but no files are lost.

4.24 How to fix a corrupt git index

error: bad index file sha1 signature
fatal: index file corrupt
rm .git/index
git reset

To be safe, make a backup of .git/index before you delete it. https://makandracards.com/makandra/5899-how-to-fix-a-corrupt-git-index

4.25 unable to create file Invalid argument

$ git pull origin master
From https://github.com/JiaxiangBU/tiny-pdf-folder
 * branch            master     -> FETCH_HEAD
error: unable to create file refs/Varian, H. R. (2010). Intermediate microeconomics : a modern approach. 8th ed.
New York: W.W. Norton & Co..pdf: Invalid argument

How to deal with this error?

The file was deleted directly on the remote.

4.26 merge some commit

# currently at some commit, not on a branch

git branch tmp
# copy

git checkout master

git merge tmp
/d/git/user_tags (master)
$ git checkout d962f8b
Note: switching to 'd962f8b'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false


HEAD is now at d962f8b fix conflict.

/d/git/user_tags ((d962f8b...))
$ git status
HEAD detached at d962f8b
nothing to commit, working tree clean

/d/git/user_tags ((d962f8b...))
$ git branch temp-branch

/d/git/user_tags ((d962f8b...))
$ git merge d962f8b
Already up to date.

/d/git/user_tags ((d962f8b...))

/d/git/user_tags (temp-branch)
$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.

/d/git/user_tags (master)
$ git merge temp-branch
Updating 0aa6d4f..d962f8b
Fast-forward
 analysis/credit_level_dist.Rmd                     |   2 +-
 analysis/_perf_by_t.Rmd                        | 378 ++++++++++++++++++---
 output/_ByT_jidai_perf.png                     | Bin 64043 -> 204548 bytes
 output/_ByT_perf.png                           | Bin 77426 -> 83442 bytes
 output/_ByT_tcjq_perf.png                      | Bin 0 -> 226792 bytes
 output/_ByT_total_perf.png                     | Bin 0 -> 235576 bytes
 output/_ByTandId_perf.png                      | Bin 76232 -> 103447 bytes
 output/_block_jd_ver2_20191216.csv             |  16 +-
 output/_block_tcjq_ver2_20191216.csv           |  11 +
 output/_block_total_ver2_20191216.csv          |  22 +-
 ..._20191216.csv" |  43 ---
 11 files changed, 353 insertions(+), 119 deletions(-)
 create mode 100644 output/_ByT_tcjq_perf.png
 create mode 100644 output/_ByT_total_perf.png
 create mode 100644 output/_block_tcjq_ver2_20191216.csv
 delete mode 100644 "output/_20191216.csv"

4.27 Listing every commit that touched a file, with details

For this kind of question, use git itself rather than git2r, whose output is too messy.

参考 https://www.atlassian.com/git/tutorials/git-log

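Add --stat rather than -p: -p prints the content changes themselves, which is cluttered. --stat also shows which files were modified, which makes the picture complete.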

$ git log --stat LICENSE | clip
commit 7c24e5f666029b7464099f7ea3367ee956844bf2
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Sun Jan 26 23:18:49 2020 +0800

    * Update the year of the LICENSE.
    * Remove empty files.
    * Make some addins not exported.
    * Add 'allow_non_interactive = TRUE' to some functions with 'clipr::write_clip' function.
    * Make some functions with 'donttest' in the examples.
    * Add examples into functions.
    * Add arguements into functions.

 LICENSE | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit a65d99e9420476ce56bb550d95c80b49746d368b
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Mon Sep 9 18:01:49 2019 +0800

    update license.

 LICENSE | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit 3be53ec5c365a9863f48fe4c9f43da035b2275b0
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Thu Nov 15 12:57:17 2018 +0800

    update readme

 LICENSE | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

commit 011a78997d06e81acc23667d4b1fd14517202c8e
Author: Jiaxiang Li <jli270@binghamton.edu>
Date:   Thu Nov 15 11:19:38 2018 +0800

    Initial commit

 LICENSE | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

One-line format

See 夕小瑶 (2020).

$ git log --pretty=oneline --graph --decorate --all | clip
* 73ae56efb8b3e7688c9556c2a9a240df471a8ebe (HEAD -> master, origin/master, origin/HEAD) update notes
* d02614431a3294ce40c27c9c4405bfaa4edb1d23 update notes
* 9d1984955ae57f5eecd87e26189193ed5c662c49 update notes
* 376a7ad9ec11f45097e0ef0a184ccaa4f5a73c04 update notes
* 122d8086254f4764d64c47e67803ee9ef05ccd5f update notes
* 40fbcfdec9cd08d815a71341b913c3d7d627656f update notes
* ba04f9e7fbdacaef8fcb983ec5b2fbaaebe7daab update notes
* 3165bae2d3758f4a7f0e4a2f41d4856930cd8325 update notes
* ff0d86a36bc06450cf3e1a19cb96eeb009cd2e59 update notes

4.28 Recent log entries

See 夕小瑶 (2020).

$ git whatchanged --since='1 days ago' | clip
commit 73ae56efb8b3e7688c9556c2a9a240df471a8ebe
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Thu Feb 20 19:46:51 2020 +0800

    update notes

:000000 100644 0000000 1dbb98a A    analysis/bfg/bfg-example2.Rmd
:100644 100644 046b7c1 f3b47f5 M    output/git-github-gitlab-learning-notes.Rmd
:100644 100644 042f856 b7401c7 M    output/git-github-gitlab-learning-notes.html

commit d02614431a3294ce40c27c9c4405bfaa4edb1d23
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Thu Feb 20 18:09:28 2020 +0800

    update notes

:100644 100644 98bb000 046b7c1 M    output/git-github-gitlab-learning-notes.Rmd
:100644 100644 29e1acf 042f856 M    output/git-github-gitlab-learning-notes.html

4.29 A better way to set up a project

dir.create("../dtmvisual")
git2r::clone("https://github.com/GSukr/dtmvisual.git", "../dtmvisual")

This is inconvenient; a plain git clone is simpler.

4.30 Resetting the remote for a release

A release requires upstream and local to match, so upstream has to be set again.

$ git remote rm upstream
$ git remote add upstream https://github.com/JiaxiangBU/nCov2019_analysis.git
> usethis::use_github_release()
 Setting active project to 'D:/work/nCov2019_analysis'
 Checking that remote branch 'upstream/master' has the changes in 'local/master'

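4.31 Extracting file names for documentation

Handy when writing documentation.

library(magrittr)  # provides the %>% pipe

# Copy the paths reported by git status to the clipboard as a numbered list
git2r::status() %>% unlist() %>%
    paste0("1. `", ., "`") %>%
    clipr::write_clip()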

4.32 Documenting a table's variables

Handy when writing GitHub and GitLab documents.

df %>% names() %>% 
    paste0("1. `", ., "` ") %>% 
    clipr::write_clip()

4.33 Finding who contributed to a file

See 夕小瑶 (2020).

$ git blame README.Rmd | clip
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  1) ---
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  2) output: github_document
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  3) ---
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  4) 
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  5) <!-- README.md is generated from README.Rmd. Please edit that file -->
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  6) 
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800  7) ```{r, include = FALSE}
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800  8) knitr::opts_chunk$set(
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800  9)   collapse = TRUE,
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 10)   comment = "#>",
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 11)   fig.path = "man/figures/README-",
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 12)   out.width = "100%"
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800 13) )
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800 14) ```
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 15) # learn_git
7115e38c (Jiaxiang Li 2018-11-25 20:40:28 +0800 16) 
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 17) <!-- badges: start -->
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 18) <!-- badges: end -->
5af62ccb (Jiaxiang Li 2018-11-29 14:35:58 +0800 19) 
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 20) The goal of learn_git is to ...
cce5dd2b (Jiaxiang Li 2019-01-21 01:05:24 +0800 21) 
f4946688 (Jiaxiang Li 2020-02-02 16:29:39 +0800 22) 1. [Git, GitHub, GitLab 学习笔记](output/git-github-gitlab-learning-notes.Rmd)
f4946688 (Jiaxiang Li 2020-02-02 16:29:39 +0800 23) 
5fb3828c (Jiaxiang Li 2020-02-02 16:28:06 +0800 24) <h4 align="center">**Code of Conduct**</h4>

<h6 align="center">Please note that the `learn_git` project is released with a [Contributor Code of Conduct](https://github.com/JiaxiangBU/learn_git/blob/master/CODE_OF_CONDUCT.md).<br>By contributing to this project, you agree to abide by its terms.</h6>

<h4 align="center">**License**</h4>

<h6 align="center">CC0 &copy; [Jiaxiang Li](https://github.com/JiaxiangBU/learn_git/blob/master/LICENSE.md)</h6>

4.34 Listing ignored files

See 夕小瑶 (2020).

$ git status --ignored | clip
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    analysis/git-base/blame.Rmd
    analysis/git-base/pretty-log.Rmd
    analysis/git-base/status-ignored.Rmd
    analysis/git-base/whatchanged.Rmd

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

    .RData
    .Rhistory
    .Rproj.user/

nothing added to commit but untracked files present (use "git add" to track)

4.35 diff-tree

List the files changed by the most recent commit, or by any given commit; see McGeary (2009).

Preferred Way (because it’s a plumbing command; meant to be programmatic):

$ git diff-tree --no-commit-id --name-only -r bd61ad98
index.html
javascript/application.js
javascript/ie6.js

Another Way (less preferred for scripts, because it’s a porcelain command; meant to be user-facing)

$ git show --pretty="" --name-only bd61ad98    
index.html
javascript/application.js
javascript/ie6.js

  • The --no-commit-id suppresses the commit ID output.
  • The --pretty argument specifies an empty format string to avoid the cruft at the beginning.
  • The --name-only argument shows only the file names that were affected (Thanks Hank). Use --name-status instead, if you want to see what happened to each file (Deleted, Modified, Added)
  • The -r argument is to recurse into sub-trees
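The options above can be tried safely in a throwaway repository. The following is a minimal sketch, assuming git is installed; the file names and the demo identity are hypothetical, and --name-status is used instead of --name-only to show the status letters mentioned above.

```shell
#!/bin/sh
# Sketch: list the files touched by one commit, with their status letters.
set -eu
# Hypothetical identity so that commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
echo "v1" > index.html
echo "old" > legacy.js
git add . && git commit -q -m "first commit"

echo "v2" > index.html            # modify one file
git rm -q legacy.js               # delete one file
mkdir javascript && echo "x" > javascript/application.js   # add one file
git add . && git commit -q -m "second commit"

# -r recurses into javascript/; --name-status prints A/M/D before each path
changed=$(git diff-tree --no-commit-id --name-status -r HEAD)
echo "$changed"
```

Note that diff-tree prints nothing for a root commit unless --root is given, which is why the sketch inspects the second commit.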

4.36 git rebase --continue

$ git rebase --continue
Applying: add notes for http://git.xxx.com/xxx/xxx/issues/84#note_139569
No changes - did you forget to use 'git add'?
If there is nothing left to stage, chances are that something else
already introduced the same changes; you might want to skip this patch.
Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

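Running "git rebase --abort" is enough here: it cancels the rebase and puts the branch back to the state it was in before the rebase started.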

4.37 git fetch

$ git fetch https
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 5 (delta 2), reused 4 (delta 2), pack-reused 0
Unpacking objects: 100% (5/5), done.
From https://github.com/liangzp/2020-Tencent-adAlgo
 * [new branch]      master          -> https/master
 * [new branch]      series-circuits -> https/series-circuits
 * [new branch]      tandem_y        -> https/tandem_y
 * [new branch]      write-commit    -> https/write-commit
 * [new branch]      zhipeng         -> https/zhipeng

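This is what fetch is for: it downloads the remote branches as remote-tracking branches (here https/*) without touching any local branch, so that one of them can then be checked out.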

$ git checkout tandem_y
Switched to a new branch 'tandem_y'
Branch 'tandem_y' set up to track remote branch 'tandem_y' from 'https'.

4.38 push all

See https://matthiasloibl.com/posts/git-push-all-branches/

$ git push --all https
Enumerating objects: 17, done.
Counting objects: 100% (14/14), done.
Delta compression using up to 4 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 5.61 KiB | 1.87 MiB/s, done.
Total 9 (delta 6), reused 0 (delta 0)
remote: Resolving deltas: 100% (6/6), completed with 3 local objects.
To https://github.com/JiaxiangBU/rumor_detection_2019_ncov.git
   f9eb174..821d4fd  master -> master
 * [new branch]      multi-domain -> multi-domain
 * [new branch]      multi-domain02 -> multi-domain02

4.39 cherry-pick

See GitHub.

git cherry-pick 4b1dbd7b42406898e71839c4f980ad956f5b3a09

This looks up that single commit, whichever branch it lives on, and applies it as a new commit on the current branch.

git log -n 1

Check whether it succeeded.

See 夕小瑶 (2020).

git checkout <branch-name> && git cherry-pick <commit-id>
$ git cherry-pick -x 8d1ead043c69d10e194a427b4a512d014d3d642f
[master bad794c] add file to xxx
 Date: Thu Jul 23 15:27:54 2020 +0800
 1 file changed, 46 insertions(+)
 create mode 100644 refs/baiduxueshu_papers_20200723151448.bi

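In more detail:

git cherry-pick <commit id>: apply a single commit on its own.
git cherry-pick -x <commit id>: same as above, except that a "(cherry picked from commit ...)" line is appended to the commit message to record where the commit came from.
git cherry-pick <start-commit-id>..<end-commit-id>: apply a range of commits, excluding <start-commit-id>.
git cherry-pick <start-commit-id>^..<end-commit-id>: apply a range of commits, including <start-commit-id>.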

When conflicts occur, delete everything reported by status_notes("obj").

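See https://stackoverflow.com/a/16068510/8625228

git cherry-pick -n 8297f22983a697f1445faf184b730631f2fe4beb

This applies all the changes of that commit to the working tree and index without creating a commit (-n means --no-commit). Remove the files that are not wanted, so that only a few files are picked, and then write the commit.

Several commits can also be picked in one go: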

$ git cherry-pick 22795c4^..44be008
[read-NiN 56759f0] 快速阅读了 limu 的 NiN。
 Date: Thu Oct 15 12:25:07 2020 +0800
 3 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 analysis/limu/read-NiN.Rmd
 create mode 100644 figure/20201015112906.png
[read-NiN a3fd2cd] 理解了 global avg pooling 的处理方式。
 Date: Thu Oct 15 12:37:43 2020 +0800
 1 file changed, 23 insertions(+)
[read-NiN cd1c6b2] 完成 NiN 的阅读
 Date: Thu Oct 15 12:46:38 2020 +0800
 3 files changed, 381 insertions(+), 1 deletion(-)
 create mode 100644 analysis/limu/read-NiN.html
 create mode 100644 figure/20201015124107.png

This avoids creating extra branches.

See https://stackoverflow.com/a/3933416/8625228

git cherry-pick A^..B
git cherry-pick A..B
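The difference between the two range forms is easy to check in a scratch repository. Below is a minimal sketch, assuming git is installed; the branch names (feature, exclusive, inclusive), the commits A, B, C and the demo identity are all hypothetical.

```shell
#!/bin/sh
# Sketch: compare "first..last" with "first^..last" for cherry-pick.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
echo base > base.txt
git add . && git commit -q -m "base"

# Three commits A, B, C on a feature branch, each adding one file
git checkout -q -b feature
for name in A B C; do
  echo "$name" > "$name.txt"
  git add . && git commit -q -m "$name"
done
first=$(git rev-parse feature~2)   # commit A
last=$(git rev-parse feature)      # commit C

# first..last skips the first commit: only B and C are picked
git checkout -q -b exclusive master
git cherry-pick "$first..$last" > /dev/null
exclusive=$(git log --format=%s master..exclusive | sort | xargs)

# first^..last starts from the parent of A, so A is picked as well
git checkout -q -b inclusive master
git cherry-pick "$first^..$last" > /dev/null
inclusive=$(git log --format=%s master..inclusive | sort | xargs)

echo "first..last  picked: $exclusive"
echo "first^..last picked: $inclusive"
```

In short, use the ^ form whenever the first commit of the range should be included.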

4.40 create a new git branch from an old commit

See https://stackoverflow.com/questions/7167645/how-do-i-create-a-new-git-branch-from-an-old-commit

git branch justin a9c146a09505837ec03b
git checkout -b justin a9c146a09505837ec03b

This combines very well with cherry-pick.

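4.41 Handling a file that shows up as fully deleted and fully re-added

Conflicting files model_notes.Rmd

In the current PR, add.bib shows up as entirely deleted and entirely re-added. Check what changed between master and DBSCAN.

The problem was introduced in f638f772eeda1fbcd827386d2809d0572512a12c

Look up the tree of its parent commit, 5c7a4fb84595b0898a16426c76bc6c39787ed1bc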

git checkout -b fix-add.bib 5c7a4fb84595b0898a16426c76bc6c39787ed1bc

Then git checkout the current branch and copy add.bib.

git checkout fix-add.bib, paste, deal with the newly added entries, then commit.

Then merge the two branches of the PR:

git merge master
git merge DBSCAN

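Resolve the conflicts.

Paste the part copied earlier and review it.

Another option is to use RStudio's stage/unstage chunk feature to go through the changes one by one.

Then delete the branches that are no longer needed, for example: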

$ git branch -D fix-add.bib

4.42 Resolving merge conflicts

Conflicting files analysis/US Insurance.Rmd

git checkout master
git pull origin master

Keep the local master up to date.

git checkout Sally
git pull origin Sally

A conflict appears:

$ git status
On branch Sally
Your branch and 'origin/Sally' have diverged,
and have 62 and 4 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)

You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Changes to be committed:

        new file:   analysis/US-Insurance.md
        modified:   analysis/resource_country-sjy-final.Rmd
        new file:   analysis/resource_country-sjy-final.md

Unmerged paths:
  (use "git add <file>..." to mark resolution)

        both added:      analysis/US Insurance.Rmd

This says that analysis/US Insurance.Rmd has to be resolved before the merge can complete.

<<<<<<< HEAD
作者运用一般均衡的模型来解释了这个理论和现象,模型主要基于美国的风险偏好相对于其他国家更高的鸡舍,理由是美国拥有的技术可以一定程度分散和化解风险。在国外普遍风险厌恶程度更大的假设下:经济正常运行时,国外的预防性储蓄相对美国较多,消费也会更少;在经济危机时,国外的消费下降幅度就会小于美国,美国的财富会流出美国,流向国外。
=======
作者运用一般均衡的模型来解释了这个理论和现象,模型主要基于美国的风险偏好相对于其他国家更高的假设,理由是美国拥有的技术可以一定程度分散和化解风险。在国外普遍风险厌恶程度更大的假设下:经济正常运行时,国外的预防性储蓄相对美国较多,消费也会更少;在经济危机时,国外的消费下降幅度就会小于美国,美国的财富会流出美国,流向国外。
>>>>>>> 3bb1ddacbebcea94a97eea9b4fb4f448d69ef2d2

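Here the text between <<<<<<< HEAD and ======= is the latest version on the current Sally branch, and the text between ======= and >>>>>>> 3bb1ddacbebcea94a97eea9b4fb4f448d69ef2d2 is the incoming version that is being merged in.

After merging by hand (keeping the incoming wording 假设, which corrects the typo 鸡舍 on the HEAD side), the paragraph reads:

作者运用一般均衡的模型来解释了这个理论和现象,模型主要基于美国的风险偏好相对于其他国家更高的假设,理由是美国拥有的技术可以一定程度分散和化解风险。在国外普遍风险厌恶程度更大的假设下:经济正常运行时,国外的预防性储蓄相对美国较多,消费也会更少;在经济危机时,国外的消费下降幅度就会小于美国,美国的财富会流出美国,流向国外。

Then add, commit and push.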
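Before the final add, it is worth confirming that no conflict markers were left behind in the edited files. A minimal sketch follows; the helper name has_conflict_markers and the sample files are hypothetical, and the check is a plain text search, so it does not need git.

```shell
#!/bin/sh
# Sketch: detect leftover merge conflict markers in a file.
set -eu

# Succeeds (exit status 0) if the file still contains conflict markers.
has_conflict_markers() {
  grep -Eq '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}

workdir=$(mktemp -d)
cat > "$workdir/conflicted.md" <<'EOF'
<<<<<<< HEAD
local version of the paragraph
=======
incoming version of the paragraph
>>>>>>> 3bb1dda
EOF
printf 'merged version of the paragraph\n' > "$workdir/resolved.md"

for f in "$workdir"/*.md; do
  if has_conflict_markers "$f"; then
    echo "unresolved: $(basename "$f")"
  else
    echo "clean: $(basename "$f")"
  fi
done
```

Inside a repository, git diff --name-only --diff-filter=U lists the paths git itself still considers unmerged.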

4.44 Copying files from another branch

$ git show master:analysis/HUAWEI.* analysis/
commit 7b88385d7ebdf2ba89af21ea48852add2885ec49 (HEAD -> sherryxu233-patch-1)
Merge: 272c6c3 20dff2d
Author: Jiaxiang Li <alex.xxx@foxmail.com>
Date:   Mon Sep 14 17:11:28 2020 +0800

    Merge branch 'master' into sherryxu233-patch-1

See https://xliska.wordpress.com/2010/09/22/copy-files-between-git-branches/

$ git checkout master analysis/HUAWEI.*
$ git checkout write-commit paper/'2020'$'\345\271\264''9'$'\346\234\210\347\254\254''17'$'\346\234\237''-'$'\350\247\202\345\257\237''34-1(1).pdf'

4.45 Rolling back a single file

See https://zhuanlan.zhihu.com/p/84843029

$ git checkout 3bb64166 -- commit.Rmd
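The effect of this command can be seen in a scratch repository. The sketch below assumes git is installed; the file contents and the demo identity are hypothetical. It restores one file from an earlier commit and shows that the restored version ends up staged.

```shell
#!/bin/sh
# Sketch: roll a single file back to the version stored in an older commit.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "first draft" > commit.Rmd
git add . && git commit -q -m "first draft"
good=$(git rev-parse --short HEAD)     # the commit to roll back to

echo "broken edit" > commit.Rmd
git commit -q -am "broken edit"

# Take commit.Rmd from the older commit; other files and HEAD stay untouched
git checkout "$good" -- commit.Rmd

restored=$(cat commit.Rmd)
status=$(git status --porcelain)
echo "content: $restored"
echo "status:  $status"
```

The rollback is only staged at this point; it still needs its own commit.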

4.46 git pull origin

$ git pull origin
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 9 (delta 4), reused 9 (delta 4), pack-reused 0
Unpacking objects: 100% (9/9), done.
From github.com:JiaxiangBU/usd-dea
 * [new branch]      DEA_Malmquist+window -> origin/DEA_Malmquist+window
 * [new branch]      DEA_note             -> origin/DEA_note
 * [new branch]      DEA_note_1           -> origin/DEA_note_1
 * [new branch]      NXH_LY               -> origin/NXH_LY
 * [new branch]      dazong               -> origin/dazong
Already up to date.

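This fetches every branch from origin and updates the remote-tracking branches; it does not merge them all into the current branch. Only the upstream of the current branch is merged, and here it was already up to date.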

4.47 Git and letter case

See https://coderwall.com/p/mgi8ja/case-sensitive-git-in-mac-os-x-like-a-pro for one approach.

git mv filename filename_tmp
git mv filename_tmp Filename
git commit -m "Set correct case for filename"
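
My situation was different, though: after merging online on GitHub, the repository contained files whose names differ only in case, while the local checkout ignored case, which produced the situation shown below.

Figure 4.1: Files whose names differ only in case after merging online on GitHub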

See https://gitlab.com/tortoisegit/tortoisegit/-/issues/2980 and https://blog.csdn.net/u013707249/article/details/79135639

git config core.ignorecase false

This command has to be run locally.

Then fix the names online: check out the source branch, rename the file there, and merge again. Finally, pull on the target branch.

PS D:\work\rumor_detection_2019_ncov> fsutil.exe file SetCaseSensitiveInfo . enable
错误:  不支持该请求。

This is not supported here: the message reads "Error: The request is not supported."
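To find out which paths actually collide before renaming anything, the list of tracked files can be grouped by its lower-cased form. This is a minimal sketch; the function name find_case_collisions and the sample paths are hypothetical, and the function only reads path names from standard input.

```shell
#!/bin/sh
# Sketch: report groups of paths that differ only by letter case.
set -eu

# Reads one path per line on stdin; prints each colliding group on one line.
find_case_collisions() {
  awk '{
         key = tolower($0)
         if (key in paths) paths[key] = paths[key] " | " $0
         else paths[key] = $0
         count[key]++
       }
       END { for (k in count) if (count[k] > 1) print paths[k] }' | sort
}

collisions=$(printf '%s\n' \
  README.md analysis/Plot.Rmd analysis/plot.Rmd data/a.csv |
  find_case_collisions)
echo "$collisions"
```

In a real repository the input would come from git ls-files, or from git ls-tree -r --name-only <branch> to inspect a branch that is not checked out.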

4.47.1 Can’t push to remote branch, cannot be resolved to branch

See https://stackoverflow.com/a/42802253/8625228

I was having this issue as well, and it was driving me crazy. I had something like feature/name but git branch -a showed me FEATURE/name. Renaming the branch, deleting and recreating it, nothing worked. What finally fixed it:

Go into .git/refs/heads

You’ll see a FEATURE folder. Rename it to feature.

This is a letter-case problem: the folder under .git has to be renamed to the correct case.

4.48 Viewing the files in recent commits

The project is ‘JiaxiangBU/tutoring2’.

$ git show -n 10 --oneline --stat
Result:
595fa37 fix https://github.com/JiaxiangBU/tutoring2/issues/62
 jinxiaosong/analysis/dict_comprehension.ipynb | 87 +++++++++++++++++++++++++++
 jinxiaosong/analysis/dict_comprehension.md    | 36 +++++++++++
 2 files changed, 123 insertions(+)
0b7867f use T to do the rowwise @slsongge
 jinxiaosong/analysis/custom-rowwise.ipynb | 194 ++++++++++++++++++++++++++++++
 jinxiaosong/analysis/custom-rowwise.md    | 134 +++++++++++++++++++++
 2 files changed, 328 insertions(+)
8cbb468 add notes for https://github.com/JiaxiangBU/tutoring2/issues/61#issuecomment-695986004
 commit.Rmd  |   23 +-
 commit.html | 2523 +++++++++++++++++++++++++++++++++++++++++++++++++-
 commit.md   | 2960 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 webshot.png |  Bin 43855 -> 57045 bytes
 4 files changed, 5442 insertions(+), 64 deletions(-)
527ea41 add notes for https://github.com/JiaxiangBU/tutoring2/issues/61#issuecomment-695952092
 jinxiaosong/analysis/some_join.ipynb | 814 ++++++++++++++++++++++++++++++++++-
 1 file changed, 805 insertions(+), 9 deletions(-)
e535bff add notes for https://github.com/JiaxiangBU/tutoring2/issues/61#issuecomment-695196270
 commit.Rmd  |  28 +-
 commit.html | 779 ++-------------------------------------------------
 commit.md   | 911 ++----------------------------------------------------------
 webshot.png | Bin 91943 -> 43855 bytes
 4 files changed, 62 insertions(+), 1656 deletions(-)
3bb6416 add notes for https://github.com/JiaxiangBU/tutoring2/issues/61#issuecomment-695195454
 .gitignore                           |   1 +
 commit.Rmd                           |  45 +-
 commit.html                          | 790 ++++++++++++++++++++++++++++--
 commit.md                            | 920 +++++++++++++++++++++++++++++++++--
 jinxiaosong/analysis/some_join.ipynb | 353 ++++++++++++++
 jinxiaosong/data/pivot_help.xlsx     | Bin 0 -> 11073 bytes
 webshot.png                          | Bin 60919 -> 91943 bytes
 7 files changed, 2003 insertions(+), 106 deletions(-)
a526ac1 fix https://github.com/JiaxiangBU/tutoring2/issues/55 @slsongge
 jinxiaosong/nested.Rmd | 24 ++++++++++++++++++++++++
 jinxiaosong/nested.md  | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+)
06685b3 add notes for https://github.com/JiaxiangBU/tutoring2/issues/53
 .gitignore                     |  2 ++
 pansiyu/analysis/for-loops.Rmd | 30 ++++++++++++++++++++++++++++++
 pansiyu/analysis/for-loops.md  | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+)
0d2fc76 update issue template.
 .github/ISSUE_TEMPLATE.md | 166 ++++++++++++++++++++++++----------------------
 1 file changed, 85 insertions(+), 81 deletions(-)
03cdbe3 Merge branch 'master' of https://github.com/JiaxiangBU/tutoring2

 .../nb2gitbook/toc_and_pyecharts/echarts.min.js    |    22 +
 .../nb2gitbook/toc_and_pyecharts/jquery-ui.css     |  1312 ++
 .../nb2gitbook/toc_and_pyecharts/jquery-ui.min.js  |     6 +
 .../nb2gitbook/toc_and_pyecharts/jquery.min.js     |     6 +
 jinxiaosong/nb2gitbook/toc_and_pyecharts/main.css  |   183 +
 .../nb2gitbook/toc_and_pyecharts/output_9_0.png    |   Bin 0 -> 114082 bytes
 .../nb2gitbook/toc_and_pyecharts/require.min.js    |    36 +
 jinxiaosong/nb2gitbook/toc_and_pyecharts/toc2.js   |   828 ++
 .../toc_and_pyecharts/toc_and_pyecharts.html       | 13667 +++++++++++++++++++
 9 files changed, 16060 insertions(+)

See https://stackoverflow.com/a/14208143/8625228

git log --stat --oneline + commit_id
$ git show 994b56aa00a52f1ea59a5428d340e1520b7801a2 --stat --oneline
994b56a (https/optimizer, optimizer) Merge pull request #77 from JiaxiangBU/read-limu

 analysis/read-Adagrad.Rmd  |   21 +
 analysis/read-RMSProp.Rmd  |  104 +++
 analysis/read-RMSProp.html | 1715 ++++++++++++++++++++++++++++++++++++++++++++
 paper/d2l-en.pdf           |  Bin 0 -> 27381461 bytes
 paper/d2l-zh.pdf           |  Bin 0 -> 12766903 bytes
 5 files changed, 1840 insertions(+)

4.49 Filtering git diff by status

See https://www.sitepoint.com/community/t/finding-all-deleted-files-in-a-git-diff-file/106793

--diff-filter=D can be found through tab completion; it filters the git diff output by change status (here D, deleted files).
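A scratch repository makes the filter easy to try out. The following is a minimal sketch, assuming git is installed; the file names and the demo identity are hypothetical. --no-renames is added so that a delete plus an add is never reported as a rename.

```shell
#!/bin/sh
# Sketch: filter the diff between two commits by change status.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo keep > keep.txt
echo drop > drop.txt
echo edit > edit.txt
git add . && git commit -q -m "add three files"

git rm -q drop.txt          # D: deleted
echo edited > edit.txt      # M: modified
echo new > new.txt          # A: added
git add . && git commit -q -m "delete, modify, add"

# Only deleted files between the two commits
deleted=$(git diff --no-renames --name-only --diff-filter=D HEAD~1 HEAD)
# Several status letters can be combined: added or modified files
added_or_modified=$(git diff --no-renames --name-only --diff-filter=AM HEAD~1 HEAD)

echo "deleted: $deleted"
echo "added or modified: $(echo "$added_or_modified" | xargs)"
```

A lowercase letter excludes a status instead, so --diff-filter=d shows everything except deletions.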

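4.49.1 Plain diff

See https://stackoverflow.com/questions/13964328/git-diff-two-files-on-same-branch-same-commit

diff fileA.php fileB.php

Comparing two files within the same commit is a job for the ordinary diff tool, so it falls outside what git diff is meant for.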

4.50 fatal: ambiguous argument ‘xi-an’: both revision and filename

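This happens when a folder and a branch have the same name.

See https://stackoverflow.com/a/26349250/8625228

Use -- to disambiguate: names before -- are read as revisions, names after it as paths.

git log --oneline branch_name --
git log --oneline -- folder_name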

4.50.1 fatal: ambiguous argument ‘guiyang’: both revision and filename

See https://stackoverflow.com/a/33153205/8625228

$ git diff guiyang..guizhou-nationalities-museum  --name-status

Use the .. range syntax, which git can only read as two revisions.

4.51 Filtering branch names with a pattern

See https://stackoverflow.com/a/32783008/8625228

git branch -a | grep Theme

4.52 using git how could i search for a string across all branches

See https://stackoverflow.com/questions/7151311/using-git-how-could-i-search-for-a-string-across-all-branches

The need arises because keywords on the other branches of a PR cannot be found with a normal search, since that content is not in master.

git grep "string/regexp" $(git rev-list --all)

If this raises an error:

$ git grep "Caldara" $(git rev-list --all)
bash: /mingw64/bin/git: Argument list too long

See https://stackoverflow.com/a/16615508/8625228

git rev-list --all | xargs git grep "Caldara" | clip
git rev-list --all | xargs git grep "Silver" | clip

This is somewhat slower, though.

4.53 how do i delete a git branch locally and remotely

See https://stackoverflow.com/questions/2003505/how-do-i-delete-a-git-branch-locally-and-remotely

$ git push -d <remote_name> <branch_name>
$ git branch -d <branch_name>

4.54 pull remote forked repo

4.54.1 pull/id/head

See https://stackoverflow.com/a/53246204/8625228

git checkout -b <branch>
git pull origin pull/8/head

Create a new branch (if the PR targets master, branch off from master), then pull the PR head into it.

4.54.2 fetch method

See https://stackoverflow.com/questions/5884784/how-to-pull-remote-branch-from-somebody-elses-repo

git fetch https://github.com/xuziyuanUT/usd-spatial_panels.git master:xuziyuanUT-master
git checkout xuziyuanUT-master

# fix fix fix

git push https://github.com/xuziyuanUT/usd-spatial_panels.git master

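When a PR is opened between the master of the original repo and the master of a forked repo, both branches have the same name (master), so the usual pull and push do not work; the approach above gets around that.

This method has a problem of its own, though. For example, if xuziyuanUT-master is pushed to JiaxiangBU, xuziyuanUT-master ends up pointing at two remotes.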

4.54.3 gh method

$ git fetch https://github.com/xuziyuanUT/usd-spatial_panels.git master:xuziyuanUT-master
fatal: Refusing to fetch into current branch refs/heads/xuziyuanUT-master of non-bare repository

gh can also be used, but it needs SSH authentication to be working.

choco install gh    

https://github.com/cli/cli#installation

$ "D:\install\gh_cli\gh.exe" pr checkout 280
Notice: authentication required
Press Enter to open github.com in your browser...
Authentication complete. Press Enter to continue...

Connection reset by 13.229.188.59 port 22
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
exit status 128

This did not work.

4.55 Listing remote branches

See https://jira.atlassian.com/browse/SRCTREE-1295

$ git ls-remote https
e9230ad65afb335fd7b28e16b273a146d4a9b6b0        HEAD
e9230ad65afb335fd7b28e16b273a146d4a9b6b0        refs/heads/master
a1b73589ba946d9bf1f956a5022ed8b9066d0e1e        refs/heads/update-git-base
a1b73589ba946d9bf1f956a5022ed8b9066d0e1e        refs/pull/49/head
84432224605eeb039459f15e3e01c8aa5b203cac        refs/pull/49/merge

4.56 push with (Enumerating objects: 24, done)

Enumerating objects: 24, done.
error: remote unpack failed: eof before pack header was fully read
git update-git-for-windows

See https://blog.csdn.net/qq_28193019/article/details/89164452 and https://stackoverflow.com/a/54692115/8625228

I had a similar error with git 2.19.0 which was fixed by updating to git 2.20.1. I believe in my case git crashed while trying to compress a specific object (it only got to “Compressing objects: 31%”) and then the server returned that error due to the sending process having crashed.

I hope this helps someone.

In other words, the push failed while git was still compressing objects.

4.57 three branch rule

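Relative to master:

read-wsj -> create a wsj-public branch from it; after that, all further changes keep going into read-wsj, which acts as the working branch.

Merging is then done with wsj-public, while read-wsj in between serves as a buffer. wsj-public is the branch that can be merged, or it may simply be used for literate programming.

This way read-wsj does not have to wait for wsj-public to be finished before it is updated.

Knowledge keeps flowing out.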

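4.58 Configuring git on a server

On a Linux server there are many permission settings to deal with.

The jupyter notebook terminal is also awkward to use, so it is easier to run the commands inside a notebook cell, for example: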

!cat /home/xxx/.ssh/id_rsa_xxx.pub
# mm_remote_jupyter_200930
ssh-rsa xxxxx
!ssh -T git@git.xxx.com -i ~/.ssh/id_rsa_xxx.pub
# https://serverfault.com/questions/295768/how-do-i-connect-to-ssh-with-a-different-public-key
# did not solve the problem
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0644 for '/home/xxx/.ssh/id_rsa_xxx.pub' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /home/xxx/.ssh/id_rsa_xxx.pub
git@git.xxx.com's password: 

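This fails the permissions check: -i expects the private key file (id_rsa_xxx), not the .pub file, so ssh ignores the key and falls back to a password prompt. So try HTTPS instead.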

!git clone http://git.xxx.com/xxx/debt_repayment_modelling.git
Initialized empty Git repository in /home/xxx/xxx/analysis/debt_repayment_modelling/.git/
error: The requested URL returned error: 401 Unauthorized while accessing http://git.xxx.com/xxx/debt_repayment_modelling.git/info/refs

fatal: HTTP request failed

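HTTPS is rejected as well (401 Unauthorized). Retrying SSH with tighter key permissions does not work either: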

!chmod 400 ~/.ssh/id_rsa_xxx
!chmod 400 /home/xxx/.ssh/id_rsa_xxx.pub
# https://blog.csdn.net/lxfHaHaHa/article/details/89577507
# tighten the permissions
!ssh -T --help
usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-e escape_char] [-F configfile]
           [-I pkcs11] [-i identity_file]
           [-L [bind_address:]port:host:hostport]
           [-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
           [-R [bind_address:]port:host:hostport] [-S ctl_path]
           [-W host:port] [-w local_tun[:remote_tun]]
           [user@]hostname [command]
!ssh -o "IdentitiesOnly=yes" -i ~/.ssh/id_rsa_xxx.pub git.xxx.com
# not allowed.
xxx@git.xxx.com's password: 

5 GitHub

5.1 Using Milestones

A milestone acts like a container for issues

Once you’ve collected a lot of issues, you may find it hard to find the ones you care about. Milestones, labels, and assignees are great features to filter and categorize issues.

Milestones are groups of issues that correspond to a project, feature, or time period. guides.github.com

A milestone is like a container for issues. Its main role is to act as a filter label, and it is usually set up around a feature, a project or a time period.

October Sprint — File issues that you’d like to work on in October. A great way to focus your efforts when there’s a lot to do.

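This is a sprint for October: the tasks are scheduled to be finished in October.

The issues in a milestone can also be reordered by dragging them with the mouse. help.github.com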

5.2 Data security

Cloning a private repository from an unfamiliar address requires a password.

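5.3 Using Project Boards

From now on, progress reports can be read directly from the project board:

  1. no time has to be spent compiling them
  2. the report does not have to be hosted under one specific project, such as xxx, which makes more sense: a report is a thing of its own and should not belong to any particular project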

Project boards on GitHub help you organize and prioritize your work. You can create project boards for specific feature work, comprehensive roadmaps, or even release checklists. With project boards, you have the flexibility to create customized workflows that suit your needs. help

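As the number of projects (repos) grows, many tasks cut across projects. A project board covers this need: it gathers issues and pull requests from several repos in one place and manages them as a kanban board.

Where to create a project board

Figure 5.1: Where to create a project board

See https://github.com/users/JiaxiangBU/projects/1#card-21863665. When using the filter cards feature, matching works on word beginnings: only the start of the text, or the start of a word after a space, is recognized.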

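5.3.1 Board columns

Four different types of column

Figure 5.2: Four different types of column

  1. Backlog: items that have been added to the project
  2. To do: items on the to-do list
  3. In progress: items being worked on
  4. Done: finished items

Backlog and To do are a split of what was originally a single To do column; together they measure how much is still unfinished.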

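5.4 Working around slow access to GitHub

Following a blog post (the link no longer exists): open IPAddress.com, enter github.com and github.global.ssl.fastly.net, and look up their IP addresses, 192.168.xx.xx and 185.31.17.xx (IPAddress.com has since been blocked).

See https://github.com/DrYaling/hosts-1

Copying the content of https://gitlab.com/ineo6/hosts/-/raw/master/next-hosts is enough.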

5.4.1 vim

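Open Terminal and enter

sudo vi /etc/hosts

Enter the password. Press i to enter insert mode and add the hosts:

192.168.xx.xx github.com
185.31.17.xx github.global.ssl.fastly.net

Press Esc to leave insert mode, type : (Shift + ;) to enter command mode, then type wq and press Enter to save and quit. sudo is required here; without it the following error appears.

E45: 'readonly' option is set (add ! to override)

Flush the DNS cache and it is done.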
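Editing by hand makes it easy to add the same host twice. The sketch below shows one way to append entries only when the host is not mapped yet. The function name add_host_entry is hypothetical, the 192.0.2.x addresses are documentation placeholders rather than real GitHub addresses, and the demo works on a temporary copy instead of /etc/hosts.

```shell
#!/bin/sh
# Sketch: append "ip host" to a hosts-format file unless the host is present.
set -eu

add_host_entry() {
  file=$1 ip=$2 host=$3
  # A host counts as present if it appears as a name field on a
  # non-comment line, before any inline "#" comment.
  if awk -v h="$host" '
       /^[[:space:]]*#/ { next }
       { for (i = 2; i <= NF; i++) { if ($i ~ /^#/) break; if ($i == h) found = 1 } }
       END { exit found ? 0 : 1 }' "$file"; then
    echo "skip: $host already present"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
    echo "added: $ip $host"
  fi
}

hosts_copy=$(mktemp)                       # stand-in for /etc/hosts
printf '127.0.0.1 localhost\n' > "$hosts_copy"

add_host_entry "$hosts_copy" 192.0.2.10 github.com
add_host_entry "$hosts_copy" 192.0.2.11 github.global.ssl.fastly.net
add_host_entry "$hosts_copy" 192.0.2.99 github.com    # skipped: already mapped
cat "$hosts_copy"
```

Against the real file the function would have to run with sudo, just like the editor above.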

5.4.2 Using nano

Open git bash, or press Ctrl+Shift+T if RStudio is open.

$ sudo nano /etc/hosts

Paste the following.

# GitHub
140.82.112.4 github.com
192.30.253.113 github.com
192.30.253.112 github.com
192.30.255.112 github.com
140.82.113.4 github.com
151.101.185.194 github.global.ssl.fastly.net
185.31.16.184 github.global.ssl.fastly.net
185.199.108.153 easystats.github.io
185.199.109.153 easystats.github.io
185.199.110.153 easystats.github.io
185.199.111.153 easystats.github.io
208.74.204.55 github.community # community
199.232.28.133 github.map.fastly.net # raw.githubusercontent.com
192.30.253.118 gist.github.com
192.30.253.119 gist.github.com

The copy shortcut is Command + Insert.

Press Ctrl + X to exit;

nano then asks whether to save:

Save modified buffer?  (Answering "No" will DISCARD changes.)
 Y Yes
 N No           ^C Cancel

Type Y,

then press Enter to confirm the file name.

File Name to Write: /etc/hosts
^G Get Help               M-D DOS Format            M-A Append                M-B Backup File
^C Cancel                 M-M Mac Format            M-P Prepend               ^T To Files

5.4.3 Flushing the DNS cache on Win7

Do this in CMD, not in bash.

ipconfig /flushdns

Windows IP Configuration

Successfully flushed the DNS Resolver Cache.

5.4.4 Flushing the DNS cache on Mac

sudo dscacheutil -flushcache

5.5 Save for later

See help

library(knitr)
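Click the notification

Figure 5.3: Click the notification

Click the label

Figure 5.4: Click the label

The item is added under Saved for later

Figure 5.5: The item is added under Saved for later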

5.6 Private Contributions

Include private contributions on my profile Get credit for all your work by showing the number of contributions to private repositories on your profile without any repository or organization information. Learn how we count contributions.

Once private contributions are included, the profile shows contributions more completely, because those made in private repositories are counted too.

You can opt into sharing your private contributions Details of the issues, pull requests, and commits you have made on private repositories are only visible to your fellow repository collaborators. github people without access to the private repositories you work in won’t be able to see the details of your private contributions. Instead, they’ll see the number of private contributions you made on any given day. help

The private details themselves are never shown.

5.7 GitHub profile page

Release activity is not visible on the GitHub profile page.

5.8 rpostback-askpass

error: cannot run rpostback-askpass: No such file or directory
fatal: could not read Username for 'https://github.com': Device not configured
Everything up-to-date
Error in namespaceExport(nsenv, exports) : 
  cannot add to exports of a sealed namespace

devtools::document() ran into problems as well.

Username for 'https://github.com': JiaxiangBU
error: cannot run rpostback-askpass: No such file or directory
Password for 'https://JiaxiangBU@github.com':
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 4 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 2.82 KiB | 1.41 MiB/s, done.
Total 9 (delta 5), reused 0 (delta 0)
remote: Resolving deltas: 100% (5/5), completed with 4 local objects.
To https://github.com/JiaxiangBU/travel_notes.git
   2e37fee..09f02ab  master -> master
(base) vijadeMacBook-Pro:shanghai vija$ git push gitee master
Everything up-to-date

Typing the password by hand like this solves it.

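5.9 Using the GitHub API

Following the documentation at https://developer.github.com/v3/projects/cards/#move-a-project-card, make the call with the function gh::gh.

  1. The preview media type application/vnd.github.inertia-preview+json has to be set
  2. pass it as .send_headers = c("Accept" = "application/vnd.github.inertia-preview+json")
  3. POST /projects/columns/cards/:card_id/moves is the first argument, and card_id is passed as an argument too
  4. the remaining values are set as listed under Parameters in the documentation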

5.10 Downloading only some of the files

$ git clone https://github.com/cerlymarco/MEDIUM_NoteBook.git
Cloning into 'MEDIUM_NoteBook'...
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (54/54), done.
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: the remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

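That is the complete error. The original plan was to download only part of the folders, but every attempt failed, so a workaround was needed in the end: fork the project on Gitee, which is hosted in China, and then either git pull from there or open the file-specific links to download the documents needed.

The import fails with an error when the repository does not belong to one's own GitHub account; after forking it on GitHub, using https://github.com/JiaxiangBU/MEDIUM_NoteBook.git works.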

5.10.1 Using a specific server

For example, a GitLab instance on a specific server.

Import Projects from GitHub To import a GitHub project, you can use a Personal Access Token.

This requires a GitHub token, so privacy is not protected.

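5.10.2 Switching network provider

The mobile network (China Telecom) did not work, the home broadband did not work, and a legitimate VPN did not work either.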

5.10.4 Sparse Checkout at Git 1.7.0

See https://blog.csdn.net/goodook/article/details/51571371

$ mkdir test-time2vec
$ cd test-time2vec/
$ git init
Initialized empty Git repository in D:/work/test-time2vec/.git/
$ git remote add -f origin https://github.com/cerlymarco/MEDIUM_NoteBook.git

This did not finish downloading either.
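The transcript above stops after adding the remote. For reference, the usual continuation of this recipe enables core.sparseCheckout, lists the wanted folder in .git/info/sparse-checkout and then pulls. The sketch below runs the whole sequence against a local stand-in repository, so it needs no network; the folder names Time2Vec and Other, the file names and the demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch: check out only one folder of a repository (classic sparse checkout).
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the remote: a local repository with two top-level folders
git init -q "$work/remote"
cd "$work/remote"
git symbolic-ref HEAD refs/heads/master
mkdir Time2Vec Other
echo "notebook" > Time2Vec/notes.txt
echo "big" > Other/large.txt
git add . && git commit -q -m "two folders"

# The sparse checkout itself
git init -q "$work/client"
cd "$work/client"
git remote add -f origin "$work/remote" 2> /dev/null
git config core.sparseCheckout true
mkdir -p .git/info
echo "Time2Vec/" >> .git/info/sparse-checkout   # only this folder is wanted
git pull -q origin master

ls
```

Note that this only limits what is written to the working tree: git remote add -f still fetches the full history, so sparse checkout by itself does not make a failing download smaller.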

5.10.5 raw.githubusercontent.com cannot be opened: ERR_CONNECTION_RESET

Add a hosts entry

199.232.28.133 github.map.fastly.net # raw.githubusercontent.com

This did not help either.

5.11 close issue

An issue that is not closed keeps taking up space in the Recent activity panel on github.com.

5.12 Making content easy to paste into an issue comment

$ jupyter nbconvert jinxiaosong/if-else.ipynb --to markdown
[NbConvertApp] Converting notebook jinxiaosong/if-else.ipynb to markdown
[NbConvertApp] Writing 3175 bytes to jinxiaosong\if-else.md

For how to use cat, see https://jiaxiangli.netlify.com/2018/01/04/shell/

$ cat jinxiaosong/if-else.md | clip

But Chinese characters come out garbled.

read_lines("../tutoring2/jinxiaosong/if-else.md") %>% clipr::write_clip()
append_comments_to_issue("https://github.com/JiaxiangBU/tutoring2/issues/30")

5.13 Organizing the todos in an issue

sort_todo_by_df(): it occurred to me that - [ ] task lists can now also be used to organize GitHub issues.

5.14 Replying by email

Reply directly to the notification email.

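5.16 Uploading large files

Upload large files to a release; files over 50 MB should not be committed to the repository. That keeps clone and pull fast for everyone else.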
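A quick scan before committing helps to enforce this rule. The sketch below lists every file above a size limit. The function name list_large_files and the demo files are hypothetical, and the demo uses a 1 KiB limit so that it runs instantly.

```shell
#!/bin/sh
# Sketch: list files above a size limit before committing them.
set -eu

# Prints "size<TAB>path" for every regular file under dir (outside .git)
# whose size in bytes is greater than limit.
list_large_files() {
  dir=$1 limit=$2
  find "$dir" -type f ! -path '*/.git/*' | while IFS= read -r path; do
    size=$(wc -c < "$path" | tr -d ' ')
    if [ "$size" -gt "$limit" ]; then
      printf '%s\t%s\n' "$size" "$path"
    fi
  done
}

demo=$(mktemp -d)
printf 'small' > "$demo/notes.md"
dd if=/dev/zero of="$demo/model.bin" bs=1024 count=2 2> /dev/null   # 2048 bytes

large=$(list_large_files "$demo" 1024)
echo "$large"
```

For the 50 MB rule the call would be list_large_files . $((50 * 1024 * 1024)), run from the project root.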

5.17 Editing .Renviron

Remember to restart R for the change to take effect.

5.18 Unable to create an issue

> create_issue("xxx")
i The issue is located 'https:/github.com/JiaxiangBU/xxx/issues/59'
$ gh issue status
Post https://api.github.com/graphql: read tcp xxx->xxx: wsarecv: An existing connection was forcibly closed by the remote host.

gh::gh goes through API v3 and gh through v4; both are unstable and keep throwing errors, but retrying a few times works.

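5.19 Taking a screenshot of a long comment

  1. First get the markdown source of the comment, either by opening the comment's menu and choosing edit, or with add2gh::collect_comments().
  2. Then create a new R Markdown document in RStudio and knit it to html.
  3. Take the screenshot with webshot::webshot.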

5.21 Downloading PDFs with Sourcegraph

Downloading a pdf is easy

Figure 5.7: Downloading a pdf is easy

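5.22 Resizing the comment box by dragging

Dragging the comment box larger helps when editing a lot of text.

Figure 5.8: Dragging the comment box larger helps when editing a lot of text.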

5.23 Pull Request

See https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-reviews

Reviews allow collaborators to

  1. comment on the changes proposed in pull requests,
  2. approve the changes, or
  3. request further changes before the pull request is merged.
  4. Repository administrators can require that all pull requests are approved before being merged.

These are the four things pull request reviews provide.

You can resolve a conversation in a pull request if you opened the pull request or if you have write access to the repository where the pull request was opened.

Resolving a conversation means the discussion has settled the question; the author of the pull request and anyone with write access to the repository can resolve it.

If the suggestion in a comment is out of your pull request’s scope, you can open a new issue that tracks the feedback and links back to the original comment.

When the point under discussion is unrelated to the pull request or issue, click the menu at the top right of the comment to open a new issue from it; see https://help.github.com/en/github/managing-your-work-on-github/opening-an-issue-from-a-comment

5.24 How to write a Pull Request Review

See https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/reviewing-proposed-changes-in-a-pull-request, which also covers suggestions and similar cases.

5.25 table md

Write the table in Excel and paste it in directly; it is recognized as a table automatically.

1 2 3
1 2 3
1 2 3

5.26 gh-pages branch

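See https://docs.github.com/cn/github/working-with-github-pages/configuring-a-publishing-source-for-your-github-pages-site. gh-pages can serve as the branch that GitHub Pages publishes from; just keep sensitive information off this branch and ignore most files. One benefit: many reading notes that are awkward to merge into master can still be merged into gh-pages.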

5.27 Viewing how branches relate

https://github.com/JiaxiangBU/xxx-xxx/network shows the branch graph, which is handy for merge and diff.

5.28 Transferring an issue

See https://docs.github.com/en/github/managing-your-work-on-github/transferring-an-issue-to-another-repository

Do the comments move along with the issue, as the documentation suggests? Yes.

6 GitHub cli

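6.1 Installation

Install from the release page https://github.com/cli/cli/releases/tag/v0.5.7. This could be added to learn_git.

6.2 Setting the environment variable

The environment variable has not been configured yet.

fs::dir_tree(here::here("../../install/gh_cli"))

There is no bin\ folder, so add D:\install\gh_cli\ itself to the PATH environment variable; after that it works.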

6.3 Configuration

$ gh issue list
Notice: authentication required
Press Enter to open github.com in your browser...

  1. Create a token
  2. Update .Renviron while at it, using usethis::edit_r_environ()

nano ~/.config/gh/config.yml
github.com:
  - user: JiaxiangBU
    oauth_token: 

6.4 wsarecv error

$  gh issue list
Post https://api.github.com/graphql: read tcp xxx->xxx: wsarecv: An existing connection was forcibly closed by the remote host.

See https://github.com/cli/cli/issues/536 and https://github.com/cli/cli/issues/286. It is the same problem; it goes away after a while, and there is no good fix at the moment.

6.5 Viewing the issue list

See https://cli.github.com/manual/

$ gh issue status
Relevant issues in JiaxiangBU/tutoring2

Issues assigned to you
  #35 code reproduction <https://lexparsimon.github.io/coronavirus/>  (bug) about 2 days ago
  #34 Error in `[.data.frame`(obj, ii, j, drop = FALSE) : undefined colum...  (bug) about 2 days ago
  #1 here::here 使用  (documentation, good first issue) about 6 months ago

Issues mentioning you
  There are no issues mentioning you

Issues opened by you
  #35 code reproduction <https://lexparsimon.github.io/coronavirus/>  (bug) about 2 days ago
  #8 plan: 关闭项目 about 5 months ago
  #7 swap analysis about 5 months ago
  #6 NLP XGBoost 不适应的情况 about 5 months ago
  #5 Python for 循环 about 5 months ago
  #1 here::here 使用  (documentation, good first issue) about 6 months ago

6.6 Viewing the content of an issue

Using the project tutoring2 as an example:

$ gh issue view 35 --preview | clip
code reproduction <https://lexparsimon.github.io/coronavirus/>
opened by JiaxiangBU. 2 comments. (bug)



    $ jupyter nbconvert --to markdown wuruiqi/coronavirus.ipynb
    [NbConvertApp] Converting notebook wuruiqi/coronavirus.ipynb to markdown
    [NbConvertApp] Writing 3648 bytes to wuruiqi\coronavirus.md

   wuruiqi/coronavirus.ipynb is the reproduction notebook; I reformatted the code.

    import numpy as np

    # initialize the population vector from the origin-destination flow matrix
    N_k = np.abs(np.diagonal(OD) + OD.sum(axis=0) - OD.sum(axis=1))
    locs_len = len(N_k)  # number of locations
    SIR = np.zeros(
        shape=(locs_len, 3)
    )  # make a numpy array with 3 columns for keeping track of the S, I, R
  groups
    SIR[:, 0] = N_k  # initialize the S group with the respective populations

    first_infections = np.where(
        SIR[:, 0] <= thresh, SIR[:, 0] // 20, 0
    )  # for demo purposes, randomly introduce infections
    SIR[:, 0] = SIR[:, 0] - first_infections
    SIR[:, 1] = SIR[:, 1] + first_infections  # move infections to the I group

    # row normalize the SIR matrix for keeping track of group proportions
    row_sums = SIR.sum(axis=1)
    SIR_n = SIR / row_sums[:, np.newaxis]

    # initialize parameters
    beta = 1.6
    gamma = 0.04
    public_trans = 0.5  # alpha
    R0 = beta / gamma
    beta_vec = np.random.gamma(1.6, 2, locs_len)
    gamma_vec = np.full(locs_len, gamma)
    public_trans_vec = np.full(locs_len, public_trans)

    # make copy of the SIR matrices
    SIR_sim = SIR.copy()
    SIR_nsim = SIR_n.copy()

    # run model
    print(SIR_sim.sum(axis=0).sum() == N_k.sum())
    from tqdm import tqdm_notebook

    infected_pop_norm = []
    susceptible_pop_norm = []
    recovered_pop_norm = []

    ---------------------------------------------------------------------------

    NameError                                 Traceback (most recent call last)

    <ipython-input-7-aa271ab2f1eb> in <module>
          2
          3 # initialize the population vector from the origin-destination flow
  matrix
    ----> 4 N_k = np.abs(np.diagonal(OD) + OD.sum(axis=0) - OD.sum(axis=1))
          5 locs_len = len(N_k)  # number of locations
          6 SIR = np.zeros(


    NameError: name 'OD' is not defined

   For this analysis, we will use the aggregated \(OD\) flow matrix of a
   typical day obtained from GPS data provided by local ride sharing
   company gg as a proxy for the mobility patterns in Yerevan city.

  This is the definition of OD. The data has to be downloaded from https://www.ggtaxi.com/signin.

   Next, we need the population counts in each 250×250 m grid cell, which
   we approximate by proportionally scaling the extracted flow counts so
   that the total inflows in different locations sum up to approximately
   half of Yerevan’s population of 1.1 million. This is actually a bold
   assumption, but since varying this portion yielded very similar
   results, we will stick to it.

  Then build a count matrix from the downloaded data and normalize it.
  For this part, you can download the data and I will process it on my side.



View this issue on GitHub: https://github.com/JiaxiangBU/tutoring2/issues/35

6.7 Viewing issue assignments

See https://github.com/cli/cli/issues/593

That issue explains how to see who the current issues are assigned to.

$ gh issue list -a JiaxiangBU

Issues for JiaxiangBU/tutoring2

#38  文本相似比较                                             (good first issue)
#35  code reproduction <https://lexparsimon.github.io/cor...  (bug)
#1   here::here 使用                                          (documentation, good first issue)


A new release of gh is available: 0.6.0 → v0.6.1
https://github.com/cli/cli/releases/tag/v0.6.1

7 GitLab

7.1 Permission management

The different permission models are really useful in cases where you have a senior developer who you want to give access to everything. However, if you hire an intern, this intern should only have access to pulling the code and not deleting branches and stuff. This way, you can make sure that the senior developer has reviewed the code before it was merged into the main branch. (Baarsen 2014)

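So for non-developers, the practical restriction is that they can pull but cannot push; they get code merged by opening a merge request, which goes through review.

The roles below control what each member can do in a project.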

Action                                  Guest      Reporter   Developer  Master     Owner
Create new issues                       *          *          *          *          *
Leave comments                          *          *          *          *          *
Pull the project code                              *          *          *          *
Download a project                                 *          *          *          *
Create code snippets                               *          *          *          *
Create new merge requests                                     *          *          *
Push changes to nonprotected branches                         *          *          *
Remove nonprotected branches                                  *          *          *
Add tags                                                      *          *          *
Write a wiki                                                  *          *          *
Manage the issue tracker                                      *          *          *
Add new team members                                                     *          *
Push changes to protected branches                                       *          *
Manage the branch protection                                             *          *
Manage Git tags                                                          *          *
Edit the project                                                         *          *
Add deploy keys to the project                                           *          *
Configure the project hooks                                              *          *
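
  1. Guest can only open issues and leave comments, so documents such as the README cannot be viewed; in practice Reporter is usually the role to grant.
  2. Reporter can additionally view and download the code.
  3. Developer is needed when the member must be able to merge.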
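
The permission table can also be read as a lookup from an action to the lowest role allowed to perform it. The following is a minimal sketch in Python that transcribes a few rows of the table; the names ROLES, MIN_ROLE and can are made up for illustration and are not part of any GitLab API.

```python
# Roles ordered from least to most privileged, as in the table.
ROLES = ["Guest", "Reporter", "Developer", "Master", "Owner"]

# Lowest role allowed to perform each action (a few rows transcribed from the table).
MIN_ROLE = {
    "Create new issues": "Guest",
    "Pull the project code": "Reporter",
    "Create new merge requests": "Developer",
    "Push changes to nonprotected branches": "Developer",
    "Push changes to protected branches": "Master",
}


def can(role, action):
    """Return True if `role` is at least the minimum role required for `action`."""
    return ROLES.index(role) >= ROLES.index(MIN_ROLE[action])


print(can("Reporter", "Pull the project code"))      # True
print(can("Reporter", "Create new merge requests"))  # False
```

Because every higher role includes the rights of the lower ones in this table, a single ordered list is enough to model it.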


7.2 Downloading a file

Change blob in the file's URL to raw, then press Ctrl + S to save the page.
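
The URL rewrite can be sketched as a small Python helper. This is only an illustration: the function name blob_to_raw and the example host git.example.com are assumptions, and the sketch assumes the file URL contains a /blob/ path segment.

```python
def blob_to_raw(url):
    """Replace the first '/blob/' path segment of a file URL with '/raw/'."""
    if "/blob/" not in url:
        raise ValueError("not a blob URL: " + url)
    # Only the first occurrence is the view selector; later ones may be folder names.
    return url.replace("/blob/", "/raw/", 1)


# Hypothetical file URL, for illustration only.
print(blob_to_raw("http://git.example.com/group/project/blob/master/README.md"))
# http://git.example.com/group/project/raw/master/README.md
```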

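7.3 Using GitLab

Following the on-screen guide, you can create a board. It is the counterpart of GitHub's Project feature: a kanban board that tracks the issues within each project. The current version is fairly simple and mainly sorts issues into

  1. to do,
  2. doing, and
  3. done,

but that is largely enough for project management.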

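7.4 A small example of uploading files

This section shows how to upload files in bulk.

Open Git Bash and change to the location where the project should be downloaded, for example the desktop.

Figure 7.1: Copy the project URL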

cd ~/Desktop
git clone http://git.xxx.com/xxx/xxxxx.git

The output below shows the download in progress; xx MiB/s is the download speed.

$ git clone http://git.xxx.com/xxx/xxxxx.git
Cloning into 'xxxxx'...
remote: Counting objects: 2729, done.
remote: Compressing objects: 100% (89/89), done.
Receiving objects:  17% (483/2729), 182.68 MiB | 31.63 MiB/s

cd xxxxx
cd sql-script/xxx/
mv ~/Desktop/xxxxx_roi ./

Change into xxxxx, the project that was just cloned, then into the folder sql-script/xxx/, and move the xxxxx_roi folder from the desktop into it.

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        xxxxx_roi/

nothing added to commit but untracked files present (use "git add" to track)

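This shows that the folder was moved successfully: Git now lists xxxxx_roi/ as untracked.

git add .

The . means add everything in the current directory.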

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   xxxxx_roi/01-01-log.sql
        new file:   xxxxx_roi/01-02-log.sql
        new file:   xxxxx_roi/02-01-first-register.sql

git status reports the current state of the project; the new file entries show what has been staged.

git commit -m 'add all files related to xxxxx roi metrics'

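git commit records the staged changes as a new snapshot in the local history; the -m message tells other people what this change does.

git push

push = upload.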

$ git push
Enumerating objects: 73, done.
Counting objects: 100% (73/73), done.
Delta compression using up to 2 threads
Compressing objects: 100% (69/69), done.
Writing objects: 100% (70/70), 30.18 KiB | 1.68 MiB/s, done.
Total 70 (delta 11), reused 0 (delta 0)
To http://git.xxx.com/xxx/xxxxx.git
   0094def..c5054d9  master -> master

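Summary

add, commit, push

  1. add only stages changes; if a file has been moved with mv, Git detects that.
  2. What was added needs a description, so use commit.
  3. push uploads the commits to the remote.

Figure 7.2: The commit shows up on the remote

Figure 7.3: Click a specific commit to see the exact changes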

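7.5 Notes on collaborative development

So far I have found that GitLab does not support the following scenario.

Project A is forked, producing project A'. After project A is updated with several new commits, a merge request cannot be opened to merge those commits into A'.

Developers should therefore maintain only project A from then on, rather than opening merge requests across projects.

The current workaround is to raise the collaborator's role to developer. The collaborator then runs git clone, makes changes locally, and uses git push and git pull.

In general, a self-created project (not a fork) carries a protected label; in that case the permissions need to be changed.

Figure 7.4: The protected label

Open the project settings.

Figure 7.5: Setting the developer permissions

Figure 7.6: Setting the developer permissions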

7.6 Setting up SSH keys

The error looks like this:

$ git push
git@xxx's password:
Permission denied, please try again.
git@xxx's password:
Permission denied, please try again.
git@xxx's password:
git@xxx: Permission denied (publickey,password).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

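Uploading the public key again and using SSH solves it.

GitHub's documentation on this topic is clearly better than GitLab's; see GitHub Help, SSH Keys. If you forget the passphrase, the only option is to generate a new key; see GitHub Help. The steps below were done mainly on macOS; other systems are similar, with minor differences.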

ssh-keygen -t rsa -b 4096 -C "alex.xxx@foxmail.com"

This generates the key pair.

The terminal responds:

> Enter a file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter]

Press Enter to accept the default path.

> Enter passphrase (empty for no passphrase): [Type a passphrase]
> Enter same passphrase again: [Type passphrase again]

Enter a passphrase here. If you are the only person who uses this machine, you can press Enter to skip it.

eval "$(ssh-agent -s)"
> Agent pid 59566

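It reports a line of the form Agent pid xxxxx.

On Windows, see https://stackoverflow.com/questions/52113738/starting-ssh-agent-on-windows-10-fails-unable-to-start-ssh-agent-service-erro and https://www.jvandertil.nl/posts/2019-01-18_usingwindowssshwithgit/ for the following error:

unable to start ssh-agent service, error :1058

In my case this happened because Git was not installed on the C: drive.

Then copy the public key and paste it into GitHub or GitLab. See GitHub Help; the command differs by operating system: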

pbcopy < ~/.ssh/id_rsa.pub # Mac
clip < ~/.ssh/id_rsa.pub # Windows
sudo apt-get install xclip # Linux
xclip -sel clip < ~/.ssh/id_rsa.pub # Linux

The key is now on the clipboard; the next step is simply to paste it. See GitHub Help.

$ ssh -T git@git.xxxcorp.com
$ ssh -T git@github.com
# check whether it worked
$ ssh -T git@github.com
Hi JiaxiangBU! You've successfully authenticated, but GitHub does not provide shell access.

Alternatively, once the key is set up you receive an email like this:

Hi xxx! A new public key was added to your account: title: If this key was added in error, you can remove it under SSH Keys — View it on GitLab. You’re receiving this email because of your account on git.xxxcorp.com. If you’d like to receive fewer emails, you can adjust your notification settings.

git2r::remote_rename(oldname = "origin", newname = "https")
git2r::remote_add(name = "origin", url = "git@github.com:{user}/{repo}.git")

git@github.com is the SSH host.
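
The relation between the HTTPS address and the SSH address of the same repository can be sketched in Python. This is an illustrative helper only: the name https_to_ssh is made up, and the sketch assumes the server accepts SSH logins as the user git, as GitHub does in the transcript above.

```python
from urllib.parse import urlparse


def https_to_ssh(url):
    """Turn https://host/user/repo.git into git@host:user/repo.git."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("expected an http(s) URL, got: " + url)
    # Drop the leading slash; the SSH form separates host and path with ':'.
    return "git@{}:{}".format(parts.hostname, parts.path.lstrip("/"))


print(https_to_ssh("https://github.com/JiaxiangBU/tutoring2.git"))
# git@github.com:JiaxiangBU/tutoring2.git
```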

$ ssh -T git@github.com

If it returns

Hi {user}! You've successfully authenticated, but GitHub does not provide shell access.

the setup succeeded.

After this change, the prompt in the RStudio Git pane no longer appears.

$ git push --set-upstream origin master
Everything up-to-date
Branch 'master' set up to track remote branch 'master' from 'origin'.

See https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/working-with-ssh-key-passphrases

Once a passphrase is set, it has to be entered each time the key is used; this is for security.

git SSH passphrase

With SSH keys, if someone gains access to your computer, they also gain access to every system that uses that key. To add an extra layer of security, you can add a passphrase to your SSH key. You can use ssh-agent to securely save your passphrase so you don’t have to reenter it.

So this is also for security, which makes it much like connecting over HTTPS.

The ssh-agent process will continue to run until you log out, shut down your computer, or kill the process.

Tip: If you want ssh-agent to forget your key after some time, you can configure it to do so by running ssh-add -t .

In short, it is all about security.

7.7 REMOTE HOST IDENTIFICATION HAS CHANGED

See https://juejin.im/post/5a30f319f265da43333e6597

$ git push origin tieniu2.0
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: POSSIBLE DNS SPOOFING DETECTED!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
The RSA host key for git.xxx.com has changed,
and the key for the corresponding IP address xx.xxx.xx.xxx
is unknown. This could either mean that
DNS SPOOFING is happening or the IP address for the host
and its host key have changed at the same time.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:xxx.
Please contact your system administrator.
Add correct host key in /c/Users/xxx/.ssh/known_hosts to get rid of this message.
Offending RSA key in /c/Users/xxx/.ssh/known_hosts:2
RSA host key for xxx has changed and you have requested strict checking.
Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Open ~/.ssh/known_hosts and delete the line for that host.
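
Deleting the line by hand can also be sketched in Python. The helper below is a minimal illustration that works on the file's text; the name drop_host is made up, and the sketch assumes plain (unhashed) entries whose first field is a comma-separated list of host names and addresses. OpenSSH also provides ssh-keygen -R hostname for the same purpose.

```python
def drop_host(known_hosts_text, host):
    """Return the known_hosts text without the entries that name `host`."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # First field lists the host names/addresses, separated by commas.
        names = fields[0].split(",") if fields else []
        if host not in names:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Made-up entries, for illustration only.
sample = (
    "github.com,192.0.2.1 ssh-rsa AAAA\n"
    "git.example.com ssh-rsa BBBB\n"
)
print(drop_host(sample, "git.example.com"), end="")
# github.com,192.0.2.1 ssh-rsa AAAA
```

Hashed entries (lines starting with |1|) are left untouched by this sketch, since the host name cannot be read from them.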

$ ssh -T git@git.xxxcorp.com
The authenticity of host 'git.xxxcorp.com (10.114.16.202)' can't be established.
ECDSA key fingerprint is xxx
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'git.xxxcorp.com,xx.xx.xx.xxx' (ECDSA) to the list of known hosts.
Welcome to GitLab, xxx!

Answer yes and the connection works.

The following error appeared on Windows 10.

$ ssh -T git@git.xxxcorp.com
CreateProcessW failed error:2
ssh_askpass: posix_spawn: No such file or directory
Host key verification failed.

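This means the connection failed, and many attempted fixes did not work. My eventual conclusion was that once the host's entry had been deleted from the local known_hosts, the local machine could no longer establish trust with the remote, hence the error. A command such as git clone is needed to trigger the prompt that saves the server's new host key to the local ~/.ssh/known_hosts. Afterwards, known_hosts contains a new line.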

$ git clone xxx
Cloning into 'xxx'...
The authenticity of host 'git.xxxcorp.com (xx.xxx.xx.xxx)' can't be established.
ECDSA key fingerprint is SHA256:lU3pxbSR9Xv/J6rhSovtqceu83EzAiAct152SjxBKas.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
Warning: Permanently added 'xxx,xxx' (ECDSA) to the list of known hosts.

Note that the Warning line confirms that the host was added.

7.8 Regularly cleaning up SSH keys and access tokens

Figure 7.7: Revoke all GitHub personal access tokens

That is what keeps the account safe.

Figure 7.8: Likewise on Gitee, keep all SSH keys under control.

Figure 7.9: Likewise on GitLab, keep all SSH keys under control.

7.9 Point-to-point communication

Open any file and click blame.

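Click the relevant line.

Hold the Shift key to select a range of lines.

Then copy the generated URL, http://xxx.log#L6300-6321

When other users open this link, they are taken straight to the lines you selected, which makes them easy to find.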
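
The URL pattern can be sketched as a small Python helper. It follows the fragment format shown in the example URL (#L<start>-<end>); the function name line_link is made up for illustration, and other hosting services may use a slightly different fragment syntax.

```python
def line_link(file_url, start, end=None):
    """Build a link to one line or a range of lines of a file page."""
    if start < 1 or (end is not None and end < start):
        raise ValueError("invalid line range")
    base = file_url.split("#", 1)[0]  # drop any existing fragment
    if end is None or end == start:
        fragment = "L{}".format(start)
    else:
        fragment = "L{}-{}".format(start, end)
    return "{}#{}".format(base, fragment)


print(line_link("http://xxx.log", 6300, 6321))
# http://xxx.log#L6300-6321
```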

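7.10 Notifications

Mentioning someone with @ in the body text does not send a notification, so it is safe to use there. If you do want the person to be notified, mention them in a comment.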

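7.11 Using the blame page

It shows the update history of a file.

Make a habit of overwriting files

There is no need to create a new file. Overwrite my file, and your update becomes the authoritative version. Your commit is preserved, and all later change history is kept as well, which makes review easier.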

7.12 Setting project members' permissions

7.13 Merge Request Minimal Example

fork

new dir

new file

merge request

add a member

see changes

comment on a line

on a single line of code in master

add issue

Summary

  1. starting from a merge request
  2. issue
  3. commit
  4. one line of code

7.14 Merge Request error

The merge request was already created earlier, so it does not need to be created again.

7.15 Listing projects

Following a Stack Overflow answer, I learned how to list projects through the GitLab API.

curl --header 'PRIVATE-TOKEN: <your_token>' 'https://gitlab.com/api/v4/projects?owned=true'
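
The same request can be sketched in Python with only the standard library. The function names below are made up for illustration; the sketch assumes the token is valid for the given server and reads the path_with_namespace field of each returned project. The API paginates its results, so this returns only the first page.

```python
import json
import urllib.request


def build_projects_request(token, base_url="https://gitlab.com"):
    """Prepare the same GET request as the curl command, without sending it."""
    url = base_url + "/api/v4/projects?owned=true"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})


def list_owned_projects(token, base_url="https://gitlab.com"):
    """Send the request and return the project paths from the first page."""
    with urllib.request.urlopen(build_projects_request(token, base_url)) as resp:
        return [project["path_with_namespace"] for project in json.load(resp)]


# Usage (needs network access and a real token):
# print(list_owned_projects("<your_token>"))
```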

7.16 Checking recent issue progress

Last Updated shows this week's progress.

7.17 python-gitlab

See https://python-gitlab.readthedocs.io/en/stable/install.html

pip install --upgrade python-gitlab

8 Gitee

8.1 Private option

Private projects

8.2 GitHub projects can be synced manually

$ git clone https://gitee.com/jiaxiangli/aRt.git
Cloning into 'aRt'...
remote: Enumerating objects: 357, done.
remote: Counting objects: 100% (357/357), done.
remote: Compressing objects: 100% (286/286), done.
remote: Total 357 (delta 63), reused 357 (delta 63)3 MiB/s
Receiving objects: 100% (357/357), 285.80 MiB | 4.37 MiB/s, done.
Resolving deltas: 100% (63/63), done.
Checking out files: 100% (256/256), done.

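The download is complete.

8.3 Consider Gitee for syncing projects

Let me explain the reason. This GitHub project is fairly large, and because GitHub's servers are hosted abroad, the download is slow; when it takes too long, the connection drops. Ordinary commits work fine; the problem appears only when everything has to be downloaded at once. I therefore also sync every commit to Gitee, which is hosted in China. The idea is to clone from Gitee and then change the push URL to GitHub, after which everything works normally.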
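
The two steps can be sketched as a list of Git commands generated in Python. This is an illustration under assumptions: the function names are made up, the GitHub URL is a hypothetical placeholder, and the sketch uses git remote set-url --push so that only the push address changes while fetching still goes to Gitee.

```python
def repo_dir_name(clone_url):
    """Directory that `git clone` creates by default for this URL."""
    name = clone_url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def gitee_then_github(gitee_url, github_url, remote="origin"):
    """Return (working directory, command) pairs for the two steps."""
    return [
        # Step 1: clone from the fast mirror.
        (".", ["git", "clone", gitee_url]),
        # Step 2: inside the clone, point pushes at GitHub.
        (repo_dir_name(gitee_url),
         ["git", "remote", "set-url", "--push", remote, github_url]),
    ]


steps = gitee_then_github(
    "https://gitee.com/jiaxiangli/aRt.git",
    "https://github.com/<user>/aRt.git",  # hypothetical placeholder
)
for cwd, cmd in steps:
    print("(in {}) {}".format(cwd, " ".join(cmd)))
```

Each pair could be passed to subprocess.run(cmd, cwd=cwd, check=True) to execute it.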

8.4 Permission management

Figure 8.1: Gitee permission management

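9 Using releases

Some students report that they cannot download data from GitHub or that downloads are slow, which hurts productivity. I plan to move everything to a repository host in China; the current choice is Gitee (码云).

Project invitation link: https://gitee.com/jiaxiangli/xxxxx/invite_link?invite=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The link above contains a token; open it and you are invited to the project.

After a test release at https://gitee.com/jiaxiangli/xxxxx/releases/test-upload, the data can be downloaded without problems.

The download link is https://gitee.com/jiaxiangli/xxxxx/attach_files/424999/download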

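10 Appendix

10.1 Cleaning Git history with BFG

Note that even after a file is deleted, the history still contains the sensitive information.

Clone a mirror of the project hosted on GitHub. A mirror clone is a bare repository: it contains only the Git data (what normally lives in .git), with no checked-out working files.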

$ git clone --mirror https://github.com/JiaxiangBU/test_bfg.git
Cloning into bare repository 'test_bfg.git'...
remote: Enumerating objects: 28, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 28 (delta 13), reused 26 (delta 11), pack-reused 0
Unpacking objects: 100% (28/28), done.

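Make a backup:

cp ... ...

Download BFG; see rtyley's repository on GitHub. The tool is written in Scala.

cd to the directory that contains the mirror, then run:

java -jar ~/Downloads/bfg-1.13.0.jar --delete-files i_want_to_delete_it.R test_bfg.git

Here ~/Downloads/bfg-1.13.0.jar is the path to the BFG jar; spell it out in full.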


$ java -jar ../bfg-1.13.0.jar --delete-files i_want_to_delete_it.R

Using repo : /Users/vija/Downloads/180805_folder_01/tmp_jli/trans/projIN/tmp_mirror/test_bfg.git

Found 6 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit f04d6c3d (protected by 'HEAD')

Cleaning
--------

Found 4 commits
Cleaning commits:       100% (4/4)
Cleaning commits completed in 89 ms.

Updating 1 Ref
--------------

    Ref                 Before     After   
    ---------------------------------------
    refs/heads/master | f04d6c3d | 71139798

Updating references:    100% (1/1)
...Ref update completed in 18 ms.

Commit Tree-Dirt History
------------------------

    Earliest      Latest
    |                  |
      .    .    D    m  

    D = dirty commits (file tree fixed)
    m = modified commits (commit message or parents changed)
    . = clean commits (no changes to file tree)

                            Before     After   
    -------------------------------------------
    First modified commit | 2aed1451 | b9c4b8dd
    Last dirty commit     | 2aed1451 | b9c4b8dd

Deleted files
-------------

    Filename                Git id         
    ---------------------------------------
    i_want_to_delete_it.R | 13792e77 (19 B)


In total, 3 object ids were changed. Full details are logged here:

    /Users/vija/Downloads/180805_folder_01/tmp_jli/trans/projIN/tmp_mirror/test_bfg.git.bfg-report/2018-12-31/10-41-59

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive


--
You can rewrite history in Git - don't let Trump do it for real!
Trump's administration has lied consistently, to make people give up on ever
being told the truth. Don't give up: https://www.aclu.org/
--

See rtyley.


$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
Counting objects: 15, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (15/15), done.
Total 15 (delta 4), reused 0 (delta 0)

git push

Looking at the original commits again, the record of i_want_to_delete_it.R is gone.

10.1.1 Deleting large files from the current repository

If an error like the following appears:

$ git push
Enumerating objects: 389, done.
Counting objects: 100% (385/385), done.
Delta compression using up to 4 threads
Compressing objects: 100% (372/372), done.
Writing objects: 100% (380/380), 142.65 MiB | 2.62 MiB/s, done.
Total 380 (delta 138), reused 1 (delta 0)
remote: Resolving deltas: 100% (138/138), completed with 2 local objects.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: c2d7da981f786ca2477897234efa8fd0
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File proj_archive/fcontest/xiaosong_sent_files/tojxg/round2.2_full_data_ipt_miss_wo_scale_part2.csv is 222.08 MB; this exceeds GitHub's file
 size limit of 100.00 MB
remote: error: File proj_archive/fcontest/xiaosong_sent_files/tojxg/round2.2_full_data_ipt_miss_wo_scale_part1.csv is 222.83 MB; this exceeds GitHub's file
 size limit of 100.00 MB
To https://github.com/JiaxiangBU/imp_rmd.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/xxx/xxx.git'

cd to the repository root and enter commands like the following:

java -jar ~/Downloads/bfg-1.13.0.jar --delete-files round2.2_full_data_ipt_miss_wo_scale_part1.csv .git
java -jar ~/Downloads/bfg-1.13.0.jar --delete-files round2.2_full_data_ipt_miss_wo_scale_part2.csv .git

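Note: do not include the leading path here, otherwise BFG reports:

Error: *** Can only match on filename, NOT path *** - remove '/' path segments

Just give the bare file name. This made me wonder what happens when two files share a name: because BFG matches on file name only, every file with that name, in any directory, is affected.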
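
Before running BFG, it may be worth checking whether a file name is shared by several paths. The following is a minimal sketch in Python; the function name duplicate_basenames is made up, and the list of paths is assumed to come from a command such as git ls-files (which lists only the current tree, not the whole history).

```python
import posixpath
from collections import defaultdict


def duplicate_basenames(paths):
    """Group repository paths by file name and keep names that occur more than once."""
    by_name = defaultdict(list)
    for path in paths:
        by_name[posixpath.basename(path)].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}


# Made-up paths, for illustration only.
paths = ["data/train.csv", "backup/train.csv", "data/test.csv"]
print(duplicate_basenames(paths))
# {'train.csv': ['data/train.csv', 'backup/train.csv']}
```

An empty result means that each --delete-files name targets exactly one path in the listed tree.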

10.1.2 Example 1

java -jar ~/Downloads/dmg/bfg-1.13.0.jar --delete-files act_train.csv
java -jar ~/Downloads/dmg/bfg-1.13.0.jar --delete-files act_test.csv
java -jar ~/Downloads/dmg/bfg-1.13.0.jar --delete-files people.csv
java -jar ~/Downloads/dmg/bfg-1.13.0.jar --delete-files sample_submission.csv

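For example, the pred4RedHat project contains data files from Kaggle that are larger than 50 MB; they can be removed in bulk with the commands above. The commands themselves can be generated in bulk, for instance with the fs package.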
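
Generating the commands in bulk can be sketched in Python instead of fs (a swapped-in technique, not the approach named above). The function names are made up, the 50 MB default mirrors the size mentioned in the text, and the sketch scans only the files currently in the working tree, not files that exist solely in history.

```python
import os
import shlex


def large_files(root, limit_mb=50.0):
    """Return the sorted basenames of files under `root` larger than `limit_mb`."""
    names = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own storage.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            if size > limit_mb * 1024 * 1024:
                names.add(name)
    return sorted(names)


def bfg_commands(names, jar="~/Downloads/bfg-1.13.0.jar"):
    """Build one `--delete-files` command per basename."""
    return ["java -jar {} --delete-files {}".format(jar, shlex.quote(n)) for n in names]


for command in bfg_commands(large_files(".")):
    print(command)
```

Only basenames are collected, because BFG matches on file name rather than path.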

10.1.3 Example 2

```bash
$ git push
Enumerating objects: 62, done.
Counting objects: 100% (62/62), done.
Delta compression using up to 4 threads
Compressing objects: 100% (46/46), done.
Writing objects: 100% (52/52), 193.73 MiB | 273.00 KiB/s, done.
Total 52 (delta 21), reused 0 (delta 0)
remote: Resolving deltas: 100% (21/21), completed with 6 local objects.
remote: warning: File model/zh/zh.bin.syn0.npy is 57.34 MB; this is larger than GitHub's recommended maximum
 file size of 50.00 MB
remote: warning: File model/zh/zh.bin.syn1neg.npy is 57.34 MB; this is larger than GitHub's recommended maxi
mum file size of 50.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.git
hub.com.
remote: error: Trace: 3675f52209d254c26786988926b674f8
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File model/zh/zh.tsv is 212.88 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/JiaxiangBU/doc2vec2cluster.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/JiaxiangBU/doc2vec2cluster.git'
```

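zh.bin.syn0.npy and zh.bin.syn1neg.npy need to be deleted.

  • Add D:\install\java\bin to the PATH environment variable.

  • Restart, after which java is available. Note that only the file's basename is passed to BFG.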

    $ java -jar /d/install/bfg-1.13.0.jar --delete-files zh.bin.syn0.npy
    $ java -jar /d/install/bfg-1.13.0.jar --delete-files zh.bin.syn1neg.npy
    
    Using repo : D:\work\doc2vec2cluster\.git
    
    Found 109 objects to protect
    Found 3 commit-pointing refs : HEAD, refs/heads/master, refs/remotes/origin/master
    
    Protected commits
    -----------------
    
    These are your protected commits, and so their contents will NOT be altered:
    
     * commit 9e3a3aee (protected by 'HEAD')
    
    Cleaning
    --------
    
    Found 7 commits
    Cleaning commits completed in 384 ms.
    
    Updating 1 Ref
    --------------
    
        Ref                 Before     After   
        ---------------------------------------
        refs/heads/master | 9e3a3aee | 4282da27
    
    ...Ref update completed in 22 ms.
    
    Commit Tree-Dirt History
    ------------------------
    
        Earliest      Latest
        |                  |
         .  .  . D  D  D  m 
    
        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)
    
                                Before     After   
        -------------------------------------------
        First modified commit | bbb26eb0 | 770f3119
        Last dirty commit     | c24fcc7e | 0e600810
    
    Deleted files
    -------------
    
        Filename             Git id            
        ---------------------------------------
        zh.bin.syn1neg.npy | 3dd97315 (57.3 MB)
    
    
    In total, 9 object ids were changed. Full details are logged here:
    
        D:\work\doc2vec2cluster.bfg-report\2020-02-20\19-40-13
    
    BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive
    
    
    --
    You can rewrite history in Git - don't let Trump do it for real!
    Trump's administration has lied consistently, to make people give up on ever
    being told the truth. Don't give up: https://www.theguardian.com/us-news/trump-administration
    --
    $ git reflog expire --expire=now --all && git gc --prune=now --aggressive | clip
    Enumerating objects: 163, done.
    Counting objects: 100% (163/163), done.
    Delta compression using up to 4 threads
    Compressing objects: 100% (149/149), done.
    Writing objects: 100% (163/163), done.
    Total 163 (delta 31), reused 89 (delta 0)
    git push

10.2 cheatsheets

10.3 Viewing commit records with the RStudio Git pane

I find it clearer than the view on GitHub.

Click History.

View the record of each commit.

Every file can be inspected here.

10.5 When the id shows up as unknown

Adding this line fixes it: use_git_config(user.name = "Jiaxiang Li", user.email = "alex.xxx@foxmail.com")

10.6 Partial updates

Use this to make changes a little at a time.

10.7 Choosing git2r

Because its push is stable.

10.8 Viewing files in RStudio

Figure 10.1: Viewing the commits of a single file

References

Baarsen, Jeroen van. 2014. GitLab Cookbook: Over 60 Hands-on Recipes to Efficiently Self-Host Your Own Git Repository Using GitLab. Packt Publishing.
Chacon, Scott, and Ben Straub. 2014. Pro Git. 2nd ed. 2014. Apress.
McGeary, Ryan. 2009. “How to List All the Files in a Commit?” Stack Overflow. 2009. https://stackoverflow.com/a/424142/8625228.
Wilson, Greg. 2017. “Introduction to Git for Data Science.” 2017. https://www.datacamp.com/courses/introduction-to-git-for-data-science.
夕小瑶. 2020. “Git从入门到进阶,你想要的全在这里.” 夕小瑶的卖萌屋. 2020. https://mp.weixin.qq.com/s/z_zFveiiLu9vLvWuBcsaIg.