Python Package Development Cookbook
2020-03-12
- These notes are stitched together with the RMarkdown `child` option.
- Notes combined this way are easier to review.
- Please report related problems as an Issue.
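1 Create the project from a template
Create the project from nbdev_template.
Generate the new repository from the template as private first, and make it public only after it is finished and checked.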
To check if a repository is available to use as a template, get the repository’s information using the `GET /repos/:owner/:repo` endpoint and check that the `is_template` key is `true`.
# https://github.com/fastai/nbdev_template
# is_template is a field of the response, not a request parameter
repo_info <- gh::gh("GET /repos/:owner/:repo", owner = "fastai", repo = "nbdev_template")
repo_info$is_template
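The same check can be sketched in Python. This is a minimal sketch rather than the gh package's method: it assumes the standard library `urllib` and the public `https://api.github.com` endpoint, and the helper names `fetch_repo` and `is_template_repo` are made up for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"  # assumption: public GitHub REST API


def fetch_repo(owner, repo, token=None):
    """GET /repos/:owner/:repo and return the decoded JSON (needs network)."""
    request = urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}")
    request.add_header("Accept", "application/vnd.github.baptiste-preview+json")
    if token:  # optional personal access token
        request.add_header("Authorization", f"token {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def is_template_repo(repo_info):
    """Return True only when the response carries is_template == true."""
    return repo_info.get("is_template") is True


# Usage (hypothetical, performs a network call):
# print(is_template_repo(fetch_repo("fastai", "nbdev_template")))
```

Keeping the parsing step separate from the network call makes it possible to check the logic against a saved response.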
See https://developer.github.com/v3/repos/#create-repository-using-a-repository-template
gh::gh("POST /repos/:template_owner/:template_repo/generate",
template_owner = "fastai", template_repo = "nbdev_template",
owner = "JiaxiangBU", name = "test-create-from-template",
description = "Test to create a GitHub repository from a template",
private = TRUE,
.send_headers = c(Accept = "application/vnd.github.baptiste-preview+json"))
gh::gh("POST /repos/:template_owner/:template_repo/generate",
template_owner = "fastai", template_repo = "nbdev_template",
owner = "JiaxiangBU", name = "doc2vec2cluster",
description = "Cluster the documents using doc2vec",
private = TRUE,
.send_headers = c(Accept = "application/vnd.github.baptiste-preview+json"))
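For readers working outside R, the generate call could look roughly like this in Python. It is a sketch under assumptions: the helper name `build_generate_request` is hypothetical, the endpoint and preview header are copied from the R calls above, and the request is only built, not sent.

```python
import json
import urllib.request


def build_generate_request(template_owner, template_repo, owner, name,
                           description="", private=True, token=None):
    """Build (but do not send) the POST .../generate request."""
    url = (f"https://api.github.com/repos/"
           f"{template_owner}/{template_repo}/generate")
    payload = {"owner": owner, "name": name,
               "description": description, "private": private}
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), method="POST")
    request.add_header("Accept", "application/vnd.github.baptiste-preview+json")
    request.add_header("Content-Type", "application/json")
    if token:  # a personal access token is needed to actually create a repo
        request.add_header("Authorization", f"token {token}")
    return request


request = build_generate_request(
    "fastai", "nbdev_template", "JiaxiangBU", "test-create-from-template",
    description="Test to create a GitHub repository from a template")
# Sending it requires a token:
# urllib.request.urlopen(request)
```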
2 Create an R project
Then run `create_project` so the project is easy to build and can be managed in RStudio.
Call the dev_history script.
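3 Copy files from the original project
Add the ipynb with Save As, because this does not carry over history information. Alternatively, when building inside the original project, put the ipynb files in the root directory.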
file_copy("xxx.ipynb", "model.ipynb")
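It is best to clean the notebook before adding it. After copying it over, run Restart and clear output right away. A manual check found no sensitive words, so the package can be published.
- Pull the latest code.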
- Copy the contents of `settings.ini` and the `setup.py` file.
- `nbdev_build_lib` raises an error when docs has not been copied:
$ nbdev_build_lib
Converted ch3-01-automation.ipynb.
Converted ch4-abtest.ipynb.
Traceback (most recent call last):
  File "d:\install\miniconda\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\install\miniconda\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\install\miniconda\Scripts\nbdev_build_lib.exe\__main__.py", line 9, in <module>
  File "d:\install\miniconda\lib\site-packages\fastscript\fastscript.py", line 42, in _f
    func(**args.__dict__)
  File "d:\install\miniconda\lib\site-packages\nbdev\cli.py", line 22, in nbdev_build_lib
    write_tmpls()
  File "d:\install\miniconda\lib\site-packages\nbdev\export2html.py", line 325, in write_tmpls
    write_tmpl(config_tmpl, 'user lib_name title copyright description', cfg, cfg.doc_path/'_config.yml')
  File "d:\install\miniconda\lib\site-packages\nbdev\export2html.py", line 319, in write_tmpl
    dest.write_text(outp)
  File "d:\install\miniconda\lib\pathlib.py", line 1218, in write_text
    with self.open(mode='w', encoding=encoding, errors=errors) as f:
  File "d:\install\miniconda\lib\pathlib.py", line 1186, in open
    opener=self._opener)
  File "d:\install\miniconda\lib\pathlib.py", line 1039, in _opener
    return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\work\\conversion_metrics\\docs\\_config.yml'
[DEFAULT]
# All sections below are required unless otherwise specified
lib_name = item_based_recommender
user = JiaxiangBU
description = Item based recommender
keywords = recommender
author = Jiaxiang Li
author_email = alex.lijiaxiang@foxmail.com
copyright = Jiaxiang Li
branch = master
version = 1.0.0
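Both failures in this section (the missing docs directory and, later, a missing setting) can be caught before running the build. The following is a minimal pre-flight sketch, not part of nbdev: the function names are hypothetical, and `EXPECTED_KEYS` simply lists the keys of the example settings.ini above, which may differ from what a given nbdev version actually checks.

```python
import configparser
from pathlib import Path

# Assumption: the keys of the example settings.ini; the list checked by
# the real setup.py may differ between nbdev versions.
EXPECTED_KEYS = ["lib_name", "user", "description", "keywords", "author",
                 "author_email", "copyright", "branch", "version"]


def missing_settings(ini_text, expected=EXPECTED_KEYS):
    """Return the expected keys absent from the [DEFAULT] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    defaults = parser.defaults()
    return [key for key in expected if key not in defaults]


def preflight(project_dir="."):
    """List the problems that would make the build fail early."""
    root = Path(project_dir)
    ini = root / "settings.ini"
    if not ini.exists():
        return ["settings.ini not found"]
    problems = [f"missing setting: {key}"
                for key in missing_settings(ini.read_text(encoding="utf-8"))]
    if not (root / "docs").is_dir():
        problems.append("docs/ directory not found")
    return problems
```

Calling `preflight(".")` in the project root before `nbdev_build_lib` returns an empty list when nothing obvious is missing.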
Change the license
usethis::use_apl2_license()
Add sample data
library(fs)
dir.create("output")
file_copy("../item_based_recommender/output/sample_df_refactored.csv", "output/")
3 Create the README
Copy the index page.
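4 Project migration
If the project is migrated from an older project (for example, in order to build a package), use the approach and commands below.
- Build the new project first.
- Then update the original project to explain the move.
- This is how the EX version is produced.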
$ cp ../wei_lda_debate/README.Rmd .
$ cp -R ../wei_lda_debate/docs .
$ cp ../wei_lda_debate/set* .
$ cp ../wei_lda_debate/*.ipynb .
$ mkdir data
$ cp ../wei_lda_debate/*.csv data/
$ cp ../wei_lda_debate/stopwords.txt data/
$ mkdir model
$ cp ../wei_lda_debate/Makefile .
$ cp ../wei_lda_debate/dtm-win64.exe refs/
$ cp ../wei_lda_debate/analysis/build-README.R analysis/
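On Windows without a Unix shell, the copy steps above can be approximated in Python. This is only a sketch: `copy_matching` is a hypothetical helper, and unlike plain `cp` it creates the destination directory when it is missing.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, pattern, dest_dir):
    """Copy files in src_dir matching a glob pattern into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # behaves like `mkdir -p`
    copied = []
    for path in sorted(Path(src_dir).glob(pattern)):
        if path.is_file():
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied


# Usage (paths taken from the commands above):
# copy_matching("../wei_lda_debate", "*.ipynb", ".")
# copy_matching("../wei_lda_debate", "*.csv", "data")
```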
Run all the ipynb notebooks.
If any of them call self-built functions along the way, add
6 Build the docs
Commit once the steps above are complete.
7 Enable GitHub Pages
See https://github.com/r-lib/gh/issues/107
gh::gh(
"POST /repos/:owner/:repo/pages",
owner = "JiaxiangBU",
repo = "test-gh",
source = list(branch = jsonlite::unbox("master"), path = jsonlite::unbox("")),
.send_headers = c(Accept = "application/vnd.github.switcheroo-preview+json")
)
So `path` accepts only two values: `""` and `"/docs"`.
{
"url": "https://api.github.com/repos/JiaxiangBU/test-gh/pages",
"status": {},
"cname": {},
"custom_404": false,
"html_url": "https://jiaxiangbu.github.io/test-gh/",
"source": {
"branch": "master",
"path": "/"
}
}
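The request body for this call is small enough to build and validate by hand. A minimal sketch, assuming the two accepted `path` values noted above; `pages_payload` is a hypothetical helper, and Python's `json.dumps` already emits scalars, which is what `jsonlite::unbox` achieves in the R call.

```python
import json

ALLOWED_PATHS = ("", "/docs")  # the two cases noted in the text


def pages_payload(branch="master", path=""):
    """Build the JSON body for POST /repos/:owner/:repo/pages."""
    if path not in ALLOWED_PATHS:
        raise ValueError(f"path must be one of {ALLOWED_PATHS}, got {path!r}")
    return json.dumps({"source": {"branch": branch, "path": path}})


print(pages_payload("master", "/docs"))
```

Rejecting other paths locally gives a clearer message than waiting for the API to refuse the request.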
The README.md is valid, https://jiaxiangbu.github.io/test-gh/README.md.
> dir.create("docs")
> file.edit("docs/index.html")
> git2r::add(path = "docs/")
> git2r::commit(message = "add index.html")
[0fa6c7b] 2020-02-06: add index.html
> git2r::push(name = 'origin', refspec = "refs/heads/master", cred = git2r::cred_token())
Generate the docs
8 make release does not work
NB: `make release` will automatically increment the version number in settings.py before pushing a new release to pypi. If you don’t want to do this, run `make pypi` instead.
The `make release` command does not exist.
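Without `make release`, the version number has to be bumped by hand before each upload. The sketch below shows one way to do that for a plain `major.minor.patch` string; `bump_patch` and `bump_settings` are hypothetical helpers, and the assumption is that the version lives on a `version = ...` line as in the settings.ini shown earlier.

```python
import re


def bump_patch(version):
    """Increment the last numeric component: '1.1.4' -> '1.1.5'."""
    parts = version.strip().split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"unsupported version string: {version!r}")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def bump_settings(ini_text):
    """Rewrite the first `version = ...` line of a settings.ini text."""
    def replace(match):
        return match.group(1) + bump_patch(match.group(2))
    pattern = r"(?m)^(version[ \t]*=[ \t]*)(\S+)[ \t]*$"
    return re.sub(pattern, replace, ini_text, count=1)


print(bump_settings("lib_name = pyks\nversion = 1.1.4\nbranch = master\n"))
```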
9 Release to PyPI
jupyter nbconvert index.ipynb --to markdown --output README.md
python setup.py sdist bdist_wheel
python -m twine upload dist/*
Otherwise `README.md` will be broken.
The upload then reports the project URL.
Commit to publish the new version.
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
(use "git push" to publish your local commits)
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: build/lib/pyks/__init__.py
modified: build/lib/pyks/_nbdev.py
modified: build/lib/pyks/ks.py
modified: pyks.egg-info/PKG-INFO
Untracked files:
(use "git add <file>..." to include in what will be committed)
dist/pyks-1.1.5-py3-none-any.whl
dist/pyks-1.1.5.tar.gz
no changes added to commit (use "git add" and/or "git commit -a")
10 Do not use Chinese in the docs
It produces garbled text.
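A quick scan can flag the characters that would end up garbled. This is a minimal sketch with a hypothetical helper name; it treats every character above code point 127 as suspect, which is stricter than necessary but simple.

```python
def find_non_ascii(text):
    """Return (line_number, character) pairs for every non-ASCII character."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for character in line:
            if ord(character) > 127:
                hits.append((line_number, character))
    return hits


sample = "# Title\nplain ascii line\n中文 heading\n"
print(find_non_ascii(sample))
```

To scan a notebook, read it with `encoding="utf-8"` and pass the text to `find_non_ascii`.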
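11 Other ways to write the ipynb
If you only need the Python package and not the docs, this step is not required.
It works like the YAML block of an Rmd document and is converted into the title and other metadata of the HTML file.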
Do not change `custom_sidebar = False` carelessly.
To update the sidebar, see https://nbdev.fast.ai/cli/#create_default_sidebar: set the `settings.ini` parameter `custom_sidebar = True` and edit `docs/sidebar.json`.
12 Multiple authors
<h4 align="center">**Code of Conduct**</h4>
<h6 align="center">Please note that the `learn_dev_py_pkg` project is released with a [Contributor Code of Conduct](https://github.com/JiaxiangBU/learn_dev_py_pkg/blob/master/CODE_OF_CONDUCT.md).<br>By contributing to this project, you agree to abide by its terms.</h6>
<h4 align="center">**License**</h4>
<h6 align="center">What license it uses © [Jiaxiang Li and Shuyi Wang](https://github.com/JiaxiangBU/learn_dev_py_pkg/blob/master/LICENSE.md)</h6>
13 Put todos into commits
Everything recorded in commits can be compiled into NEWS at the end!
14 Update NEWS
Take the package add2md as an example.
add2git:::commit2news(repo_path = ,min = ,max = )
- updated the function `checkbox` with the feature checked -> unchecked.
- updated the function `checkbox` with the feature `checked`.
- reformatted code.
- deprecated the function `append_archive`.
- fixed bugs and typos and encoding problems.
- translated non-ASCII characters into unicode, add `@params` args, delete packrat, ignore docs-building, add global variables.
- updated imported packages, and remove deprecated functions in vignettes document.
- added the function `add_wechat_portfolio`.
- updated the function `diagrammer_shortcut`.
- made the function `get_input` exported.
- made functions output visible.
- added arg `para_length` for the function `extract_firstline`.
- updated license.
Pull out all the commits and consolidate them. This is why commit messages should be written carefully.
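For a Python project without add2git, the same idea can be sketched with `git log`. This is not a port of `commit2news`; the function names are hypothetical, and the formatting rule (deduplicate, then one bullet per commit subject) is an assumption.

```python
import subprocess


def commit_subjects(repo_path="."):
    """Return commit subject lines, newest first (requires git)."""
    output = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%s"],
        check=True, capture_output=True, text=True, encoding="utf-8",
    ).stdout
    return output.splitlines()


def to_news(subjects):
    """Deduplicate subjects, keep their order, and format them as bullets."""
    seen = set()
    lines = []
    for subject in subjects:
        cleaned = subject.strip().rstrip(".")
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            lines.append(f"- {cleaned}.")
    return "\n".join(lines)


# Usage inside a git repository:
# print(to_news(commit_subjects(".")))
```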
15 release
A release requires upstream and local to be in sync, so the upstream has to be set again.
16 Publish a blog post
Use the release notes together with the readme.
17 Python naming rules
See https://stackoverflow.com/a/17487228/8625228
identifier ::= (letter|"_") (letter | digit | "_")*
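The grammar translates directly into a regular expression, which is handy for checking a candidate `lib_name`. A minimal sketch: `is_valid_name` is a hypothetical helper, it covers ASCII names only, and it additionally rejects reserved keywords.

```python
import keyword
import re

# Direct transcription of the grammar quoted above (ASCII letters only).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name):
    """True if name matches the grammar and is not a reserved keyword."""
    return bool(IDENTIFIER.fullmatch(name)) and not keyword.iskeyword(name)


for candidate in ["doc2vec2cluster", "dynamic_topic_modeling",
                  "dynamic-topic-modeling", "2vec", "class"]:
    print(candidate, is_valid_name(candidate))
```

The hyphenated form is fine as a PyPI distribution name but cannot be imported, which is why the importable package uses underscores.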
18 Create index.Rmd
---
output: github_document
bibliography: refs/add.bib
---
I join this competition [data-science-bowl-2019], which ends on January 15, 2020. For the data feature, I do some work on the series features, using word2vec, LDA and node2vec.
1. [wide and deep]
1. [node2vec]
1. [LDA]
The baseline feature engineering I forked from @Massoud_Hosseinali2019. However, it helps me focus on series features.
Also, I use an LSTM model to elaborate series features, which I forked from @Grecnik2019.
[data-science-bowl-2019]: https://www.kaggle.com/c/data-science-bowl-2019
[wide and deep]: https://github.com/JiaxiangBU/data-science-bowl-2019EX/blob/master/wide_and_deep.ipynb
[node2vec]: https://github.com/JiaxiangBU/data-science-bowl-2019EX/blob/master/node2vec.ipynb
[LDA]: https://github.com/JiaxiangBU/data-science-bowl-2019EX/blob/master/lda.ipynb
Then copy the output `index.md` into `index.ipynb` and edit `index.ipynb`.
Keep the introduction in `index.ipynb` consistent with the description in `settings.ini`.
19 lib_name
How to handle a package name that differs from the GitHub repository name
Open the file `setup.py` and modify
20 make public
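The project code can be made public.
21 Watch the path of self-built packages
The import fails because Python picks up the installed package, which does not contain this function. The package we want to test is the self-built one, and its path is different.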
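To see which copy of a package Python would load, inspect where the import system finds it. A minimal sketch with a hypothetical helper; the standard library module `json` stands in for the self-built package.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# `json` stands in for the self-built package; compare the printed path
# with the project directory to see which copy Python would load.
print(module_origin("json"))
```

If the path points into `site-packages` rather than the project directory, the installed release is shadowing the working copy.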
22 Copyright symbol: use the unicode escape
For the reason, see https://blog.csdn.net/github_35160620/article/details/53512967. Replacing the symbol here is enough.
## [1] "<U+00A9>"
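The `<U+00A9>` form printed by R can be reproduced in Python to find which characters need replacing. A minimal sketch; `to_u_escape` is a hypothetical helper that mimics the notation, not an R or Python built-in.

```python
def to_u_escape(text):
    """Replace each non-ASCII character with an <U+XXXX> marker."""
    return "".join(
        character if ord(character) < 128 else f"<U+{ord(character):04X}>"
        for character in text
    )


print(to_u_escape("\u00a9 Jiaxiang Li"))  # "\u00a9" is the copyright sign
```

Writing the symbol as the escape `\u00a9` in source files keeps the file itself pure ASCII.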
After the change, `[Jiaxiang Li](LICENSE.md)` was not displayed.
When it is entered that way, the text after it is indeed not displayed.
23 NameError
$ nbdev_build_docs
converting: D:\work\pyks\00_core.ipynb
An error occurred while executing the following cell:
------------------
show_doc(plot, default_cls_level=2)
------------------
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-83f9287125ad> in <module>
----> 1 show_doc(plot, default_cls_level=2)
NameError: name 'plot' is not defined
NameError: name 'plot' is not defined
See https://github.com/fastai/nbdev/issues/25 and https://github.com/fastai/nbdev/commit/f618197b5e32847e2718583ac439662b739364f2
The error `NameError: name 'plot' is not defined` has remained unresolved.
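24 nbdev has a new version
Thank-you, I was on 0.2.4 and upgraded to 0.2.7 . Now it works. Appreciate it! (Dustin 2020)
The `settings.ini` configuration was updated and now matches the latest version.
25 Changing the case of a folder name
This does not produce a change in git.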
26 Learning about the encoding declaration
See https://www.python.org/dev/peps/pep-0263/
As in a jupyter notebook, `!` selects which exe is launched.
If a source file uses both the UTF-8 BOM mark signature and a magic encoding comment, the only allowed encoding for the comment is ‘utf-8’. Any other encoding will cause an error.
Use ‘utf-8’.
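The rule quoted from PEP 263 can be observed with the standard library's `tokenize.detect_encoding`, which applies the same BOM and magic-comment logic when reading source files. A minimal sketch; `declared_encoding` is a hypothetical wrapper.

```python
import io
import tokenize

BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark


def declared_encoding(source_bytes):
    """Return the encoding Python would use to read this source."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


print(declared_encoding(b"# -*- coding: utf-8 -*-\nx = 1\n"))
print(declared_encoding(BOM + b"# -*- coding: utf-8 -*-\nx = 1\n"))
try:
    # A BOM combined with a non-utf-8 magic comment is rejected.
    declared_encoding(BOM + b"# -*- coding: latin-1 -*-\nx = 1\n")
except SyntaxError as error:
    print("rejected:", error)
```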
28 Figuring out gbk
rmarkdown::render
- ’’ produces ‘’
- but "" produces ""
29 Editable Installation
pip install --help
...
-e, --editable <path/url> Install a project in editable mode (i.e. setuptools
"develop mode") from a local project path or a VCS url.
pip install -e /srv/pkg
where /srv/pkg is the top-level directory where ‘setup.py’ can be found.
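30 Update the version number
31 Install the package temporarily
Use the latest state of the code without publishing a release.
Install it locally.
Or turn the steps into a make command.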
local:
	nbdev_build_lib
	python setup.py sdist bdist_wheel
	pip install dist/doc2vec2cluster-0.0.1.tar.gz
See https://blog.csdn.net/liuhongyue/article/details/52514706?locationNum=12&fps=1
$ pip install dist/doc2vec2cluster-0.0.1.tar.gz
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing d:\work\doc2vec2cluster\dist\doc2vec2cluster-0.0.1.tar.gz
Building wheels for collected packages: doc2vec2cluster
Building wheel for doc2vec2cluster (setup.py) ... done
Stored in directory: C:\Users\lijiaxiang\AppData\Local\pip\Cache\wheels\e4\c2\03\5d561f5b4c162473186652402
5bd31768b76c145037bc3ceb8
Successfully built doc2vec2cluster
Installing collected packages: doc2vec2cluster
Successfully installed doc2vec2cluster-0.0.1
Install from GitHub, but fix the README first.
pip install .
The local install fails because `.Rproj.user/` holds a lock.
See https://blog.csdn.net/weixin_34223655/article/details/85969556
ERROR: Complete output from command python setup.py egg_info:
ERROR: Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\LIJIAX~1\AppData\Local\Temp\pip-req-build-yrf_5m1d\setup.py", line 13, in <module>
for o in expected: assert o in cfg, "missing expected setting: {}".format(o)
AssertionError: missing expected setting: description
----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in C:\Users\LIJIAX~1\AppData\Local\Temp\p
ip-req-build-yrf_5m1d\
This also raises an error.
A published release also takes a while to show up; the new version is not available yet.
$ pip install dynamic-topic-modeling==1.1.0
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting dynamic-topic-modeling==1.1.0
ERROR: Could not find a version that satisfies the requirement dynamic-topic-modeling==1.1.0 (from versions: 1.0.0, 1.0.1, 1
.0.2)
ERROR: No matching distribution found for dynamic-topic-modeling==1.1.0
$ pip install git+https://github.com/JiaxiangBU/dynamic_topic_modeling.git
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting git+https://github.com/JiaxiangBU/dynamic_topic_modeling.git
Cloning https://github.com/JiaxiangBU/dynamic_topic_modeling.git to c:\users\lijiax~1\appdata\local\temp\pip-req-build-0_83_
npo
Running command git clone -q https://github.com/JiaxiangBU/dynamic_topic_modeling.git 'C:\Users\LIJIAX~1\AppData\Local\Temp\
pip-req-build-0_83_npo'
Building wheels for collected packages: dynamic-topic-modeling
Building wheel for dynamic-topic-modeling (setup.py) ... done
Stored in directory: C:\Users\LIJIAX~1\AppData\Local\Temp\pip-ephem-wheel-cache-s296m0mu\wheels\e4\8c\1a\cfd34a2f6db119fd7bf
eb9c8c2094f7a0790657b5998983037
Successfully built dynamic-topic-modeling
Installing collected packages: dynamic-topic-modeling
Found existing installation: dynamic-topic-modeling 1.0.2
Uninstalling dynamic-topic-modeling-1.0.2:
Successfully uninstalled dynamic-topic-modeling-1.0.2
Successfully installed dynamic-topic-modeling-1.1.0
UnicodeDecodeError: 'gbk' codec can't decode byte 0x99 in position 407: illegal multibyte sequence
In the end, fixing the readme solved it, so the readme really matters!
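One way to make the install less sensitive to the README contents is to read the file with an explicit encoding. This sketch assumes the `UnicodeDecodeError` comes from reading README.md with the platform default codec (gbk on a Chinese-locale Windows); the source does not confirm where the read happens, and `read_readme` is a hypothetical helper.

```python
from pathlib import Path


def read_readme(path="README.md"):
    """Read a README as UTF-8 regardless of the platform default codec."""
    return Path(path).read_text(encoding="utf-8")


# In a setup.py this would replace a bare open("README.md").read(), e.g.
# long_description = read_readme()
```

With the encoding fixed, characters such as ’ or © in the README no longer depend on the machine's locale.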
Appendix
References
Dustin. 2020. GitHub. 2020. https://github.com/fastai/nbdev/issues/52#issuecomment-576047224.