-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Typing] Paddle 的 CI 中引入 mypy 对于 API 中 docstring 的示例代码的类型检查 #63901
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
Update 20240505将 mypy 的检查引入 这里需要
配合检查 ~ math.py 中修改两个 api
另外,tools/mypy.ini 还只是最基础的配置,后面应该需要不断更新 ~ @SigureMo 请评审 ~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
其余暂时没什么大问题,等 CI 跑完我看下效果~
另外有一个小问题,我们现在是复用了示例代码的增量检测逻辑对吧(即 API change)
那么有一个问题就是未来升级 mypy 的话,如果 mypy 出现了一些变动导致现有示例类型检查挂掉的话,是很难全部检查到的,有没有办法一次性全检查到呢?因为我们之前引入过各种各样的 linter,升级时出现的一些新的 error 基本是不可避免的
一些个人想法:
在 CI 标题里加 [typing_checking=all]
(后缀形式),则检查全部示例文档
这个 PR 我们只讨论方案,具体实现不在本 PR~
另外有一个问题就是,现在这样的话,对于其他不知情的同学来说,在一个示例代码里改了一个 typo CI 是会通过的吗?是否会导致现存的示例代码就通不过呢?
paddle/scripts/paddle_build.sh
Outdated
@@ -3525,6 +3525,26 @@ function exec_samplecode_test() { | |||
fi | |||
} | |||
|
|||
function exec_type_hints() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
唔,这里检查的地方用 type check 吧,比如 exec_type_checking
tools/sampcd_processor_utils.py
Outdated
patArgSpec = re.compile( | ||
r'^(paddle[^,]+)\s+\(ArgSpec.*document\W*([0-9a-z]{32})' | ||
r'^(paddle[^,]+)\s+\((ArgSpec.*),.*document\W*([0-9a-z]{32})' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pat_arg_spec
或者使用 PATTERN_ARG_SPEC
放在文件开始作为常量吧
@@ -2046,7 +2050,7 @@ def test_timeout(self): | |||
|
|||
def test_bad_statements(self): | |||
docstrings_to_test = { | |||
'bad_fluid': """ | |||
'good_fluid': """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fluid 检查和单测是不是可以退场了?如果使用了 fluid 单测执行时就会挂掉
可以之后单独 PR 退场一下~
tools/type_hints.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同样,这个文件可以改成 type_checker.py
~
tools/test_type_hints.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
下个 PR 可以考虑将这些(含之前的 sample code)移到 test
目录,建一个 test/tools
子目录,不然这些单测没有 CI 的保护很容易失效的
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的!之前就想过这个问题,tools 里面的单测得统一管理!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
添加任务 #63597 任务 4-3
TITLE_CHECK_ALL=`curl -s https://github.com/PaddlePaddle/Paddle/pull/${GIT_PR_ID} | grep "<title>" | grep -i "type checking all" || true` | ||
|
||
if [[ ${TITLE_CHECK_ALL} ]]; then | ||
python type_checking.py --full-test; type_checking_error=$? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
关于这块的mypy_cache
有个小问题,如果我根据 shell 脚本来跑,这个路径将会是在tools/.mypy_cache
, 而不是${PADDLE_ROOT}/.mypy_cache
。是故意这么设计的嘛emmm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个有什么影响吗? pyproject.toml
可以配置
[tool.mypy]
python_version = "3.8"
cache_dir = ".mypy_cache"
另外,刚好有个问题请教一下!
我这里 paddle_build.sh
中的 run_type_checking
参考的 check_run_sot_ci
:
# use "git commit -m 'message, test=sot'" to force ci to run
COMMIT_RUN_CI=$(git log -1 --pretty=format:"%s" | grep -w "test=sot" || true)
# check pr title
TITLE_RUN_CI=$(curl -s https://github.com/PaddlePaddle/Paddle/pull/${GIT_PR_ID} | grep "<title>" | grep -i "sot" || true)
if [[ ${COMMIT_RUN_CI} || ${TITLE_RUN_CI} ]]; then
set -x
return
fi
但是,$(git log -1 --pretty=format:"%s" | grep -w "test=type_checking" || true)
没有效果,所以去掉了,只使用了 title 的判断条件 ~
我本地 $(git log -1 --pretty=format:"%s" | grep -w "test=type_checking" || true)
没问题 ~
这是咋回事儿?有啥办法?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个有什么影响吗?
pyproject.toml
可以配置[tool.mypy] python_version = "3.8" cache_dir = ".mypy_cache"
影响就是,这样就会生成和维护两份缓存了
mypy python/paddle/tensor/math.py # cache 位置: ${PADDLE_ROOT}/.mypy_cache
cd tools/
python type_checking.py --full-test # cache 位置: ${PADDLE_ROOT}/tools/.mypy_cache
另外,刚好有个问题请教一下!
可以把git log -1
改成 git log -10
或者更大,因为 ci 在运行之前会 git pull upstream develop
所以导致没有命中我们想要的 commit (sot那个我自己来修吧, 感谢大佬发现的问题)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
影响就是,这样就会生成和维护两份缓存了
😅 我草率了 ... ... 当时没想这么多 ... ...
那我在 type_checking.py
里面改为绝对路径吧 ~
感谢!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gouzil 已修改 ~
Update 20240510
另外,目前这个 PR-CI-Static-Check 是做的全量检查,日志可以保存一下以供参考 ~ @SigureMo 请评审 ~ |
Sorry to inform you that e7be07d's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually. |
Sorry to inform you that 618c3b9's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually. |
paddle/scripts/paddle_build.sh
Outdated
@@ -4536,7 +4634,20 @@ function main() { | |||
api_example) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个 api_example
应该是很久以前的 CI 流水线入口,我们现在检查示例代码在 Static-Check 流水线,对应上面 build_and_check_cpu
和 build_and_check_gpu
,这个入口没用了,直接删掉就好
paddle/scripts/paddle_build.sh
Outdated
|
||
if [[ ${run_tc} -eq 0 ]]; then | ||
summary_type_checking_problems $type_checking_code "$type_checking_info" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
示例代码和类型检查是否可以分别抽一个函数呢?CI 入口一般只做分发,不适合写太多逻辑~
paddle/scripts/paddle_build.sh
Outdated
summary_check_problems $[${example_code_gpu} + ${example_code}] "${example_info_gpu}\n${example_info}" | ||
|
||
run_type_checking | ||
run_tc=$? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run_tc=$? | |
type_checking_status=$? |
感觉这样语义会更明确些
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
啊,不对,我才发现是上面的 run_type_checking
的语义不对,上面的 run_type_checking
乍一看是已经开始跑类型检查了,但实际上只是检查了标题,是有点奇怪的,可以改成 check_need_type_checking
其结果为 need_type_checking
tools/type_checking.py
Outdated
config_file=( | ||
args.config_file | ||
if args.config_file | ||
else (base_path + '/../pyproject.toml') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议使用 os.path.join
,或者使用 pathlib
,避免直接字符串拼接
paddle/scripts/paddle_build.sh
Outdated
set +x | ||
|
||
# check pr title | ||
TITLE_CHECK=`curl -s https://github.com/PaddlePaddle/Paddle/pull/${GIT_PR_ID} | grep "<title>" | grep -i "type checking" || true` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TITLE_CHECK=`curl -s https://github.com/PaddlePaddle/Paddle/pull/${GIT_PR_ID} | grep "<title>" | grep -i "type checking" || true` | |
TITLE_CHECK=`curl -s https://github.com/PaddlePaddle/Paddle/pull/${GIT_PR_ID} | grep "<title>" | grep -iE "type checking|typing" || true` |
加一个 typing
吧,我比较喜欢用 [Typing]
前缀
Update 20240529
另外:
这两个问题我一起回复一下 :这里是临时的逻辑,后面会去掉 |
python/paddle/tensor/math.py
Outdated
@@ -322,7 +327,7 @@ def scale(x, scale=1.0, bias=0.0, bias_after_scale=True, act=None, name=None): | |||
return helper.append_activation(out) | |||
|
|||
|
|||
def stanh(x, scale_a=0.67, scale_b=1.7159, name=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
类型检查跳过了,符合预期~
这里可以恢复下了~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已经确认,可以在下个 PR 这么搞下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR Category
User Experience
PR Types
Others
Description
关联 PR : #63597
任务:1-4
在 CI 中引入 mypy 对 docstring 中示例代码的检查。与 xdoctest 的检查项共用一个 CI:
PR-CI-Static-Check
。如果后续有必要的话可以单独开一个 CI 。涉及文件:
paddle/scripts/paddle_build.sh
添加 mypy 检查入口python/unittest_py/requirements.txt
添加 mypy 的依赖,这里用最新的 1.10 版本tools/mypy.ini
配置文件,后面应该会不断更新tools/type_hints.py
使用 mypy 的 api 进行类型检查tools/test_type_hints.py
针对type_hinst.py
的单测tools/sampcd_processor_utils.py
中的get_api_md5
判断 api diff 时,考虑 signature 的变动tools/test_sampcd_processor.py
修改单测,包括get_api_md5
与 适应 0D 后的 xdoctest 检查用于 type hint 检查的文件:
python/paddle/py.typed
python/setup.py.in
setup.py
另外,修改一个文件用于验证:
python/paddle/tensor/math.py