[mthreads] bert_hf 1x8 #350

mingyuanw-mt · 2023-12-04T07:51:31Z

No description provided.

yuzhou03 · 2023-12-08T03:26:34Z

training/mthreads/bert_hf-pytorch/README.md

+| ------------------ | --------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| S4000 single node, single card (1x1) | fp32 | bs=20, lr=2.5e-05 |  |  |  |  |  | /48.0 |
+| S4000 single node, single card (1x1) | amp | bs=20, lr=2.5e-05 |  |  |  |  |  | 35.0/48.0 |
+| S4000 single node, 8 cards (1x8) | fp32 | bs=16 |  |  |  |  |  |  |


Please fill in the acc and mem data.

yuzhou03 · 2023-12-08T03:29:16Z

training/mthreads/bert_hf-pytorch/README.md

+| ------------------ | --------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| S4000 single node, single card (1x1) | fp32 | bs=20, lr=2.5e-05 |  |  |  |  |  | /48.0 |
+| S4000 single node, single card (1x1) | amp | bs=20, lr=2.5e-05 |  |  |  |  |  | 35.0/48.0 |
+| S4000 single node, 8 cards (1x8) | fp32 | bs=16 |  |  |  |  |  |  |


In the 1x8 config, the precision is amp.

mingyuanw-mt · 2023-12-11T05:21:57Z

Please take another look at training/mthreads/bert_hf-pytorch/config/environment_variables.sh, @yuzhou03 @shh2000, in case of communication overrun.

* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config (#346)
* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config
* [kunlunxin] modify tacotron2 test_config
* [kunlunxin] update tacotron2 readme
* [kunlunxin] modify tacotron2 torch.load()
* [iluvatar] swin_transformer-pytorch 1x1 2x8 (#340)
* update iluvatar/swin_transformer-pytorch
* update
* update
* update
* fix batch size mistake in readme
* correct val_loss to final acc1
* add finnal_acc1 and mem in readme
* correct readme mem

---------

Co-authored-by: 魏杰 <jie.wei@iluvatar.com>
Co-authored-by: 杨智超 <zhichao.yang@iluvatar.com>
Co-authored-by: clveryang <yangclver@gmail.com>

* fix get_system_info for iluvatar_monitor (#351)

Co-authored-by: zhouyu <zhouyu@baai.ac.cn>

* update iluvatar mobilenetv2 config (#356)

Co-authored-by: sen.li <sen.li@iluvatar.com>

* Update README.md (#357)
* Update README.md
* Update README.md
* [iluvatar] bertlarge inference case (#353)
* iluvatar bertlarge MLM inference case
* update ixrt readme

---------

Co-authored-by: 杨智超 <zhichao.yang@iluvatar.com>

* [mthreads] bert_hf 1x8 (#350)
* support bert_hf fp32/amp/bf16 training for mthreads
* update readme
* prevent overrun
* 1x1/2x8 not support
* 【mthreads】【block】resnet50 training (#246)
* support resnet50 training on mthreads
* fix typo
* support rn50 amp training on mthreads
* add test config (should revert this commit)
* update config & readme
* add get_system_info fn
* update
* 1x1/2x8 not support

---------

Co-authored-by: Zhou Yu <zycosmos@gmail.com>

* fix llama, add TFLOPS log (#358)
* fixllama
* add t/tflops
* [mthreads] deepspeed llama2
* update readme for sdpa

---------

Co-authored-by: jamesruio <44098605+jamesruio@users.noreply.github.com>
Co-authored-by: swish swish <62273738+cloud9wj@users.noreply.github.com>
Co-authored-by: 魏杰 <jie.wei@iluvatar.com>
Co-authored-by: 杨智超 <zhichao.yang@iluvatar.com>
Co-authored-by: clveryang <yangclver@gmail.com>
Co-authored-by: Zhou Yu <zycosmos@gmail.com>
Co-authored-by: zhouyu <zhouyu@baai.ac.cn>
Co-authored-by: forestlee95 <82379785+forestlee95@users.noreply.github.com>
Co-authored-by: sen.li <sen.li@iluvatar.com>
Co-authored-by: uuup <55571217+upvenly@users.noreply.github.com>
Co-authored-by: clveryang <50865584+clveryang@users.noreply.github.com>
Co-authored-by: mingyuanw-mt <130545727+mingyuanw-mt@users.noreply.github.com>
Co-authored-by: shh2000 <13820618441@163.com>

yuzhou03 reviewed Dec 8, 2023

yuzhou03 approved these changes Dec 8, 2023

shh2000 approved these changes Dec 11, 2023

mingyuanw-mt added 4 commits December 11, 2023 15:53

f6a6c10 support bert_hf fp32/amp/bf16 training for mthreads

9cdfcd9 update readme

ef99c73 prevent overrun

8716317 1x1/2x8 not support

mingyuanw-mt force-pushed the bert_hf branch from e51a68e to 8716317 on December 11, 2023 07:55

upvenly merged commit 5dfb723 into FlagOpen:main Dec 11, 2023
