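The network is a classification model. It runs successfully locally, but when submitted to the cluster it fails right after it starts running: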
............................
*** Aborted at 1500366428 (unix time) try "date -d @1500366428" if you are using GNU date ***
*** Aborted at 1500366428 (unix time) try "date -d @1500366428" if you are using GNU date ***
PC: @ 0x70d722 paddle::PrecisionRecallEvaluator::calcStatsInfo()
*** SIGFPE (@0x70d722) received by PID 15143 (TID 0x7fe0872d0880) from PID 7395106; stack trace: ***
    @ 0x7fe086ec0160 (unknown)
    @ 0x70d722 paddle::PrecisionRecallEvaluator::calcStatsInfo()
    @ 0x70f0f0 paddle::PrecisionRecallEvaluator::evalImp()
    @ 0x70ecbe paddle::Evaluator::eval()
    @ 0x745f98 paddle::CombinedEvaluator::eval()
    @ 0x7393b4 paddle::MultiGradientMachine::eval()
    @ 0x78d64a paddle::TrainerInternal::trainOneBatch()
    @ 0x787dcf paddle::Trainer::trainOnePass()
    @ 0x78b494 paddle::Trainer::train()
    @ 0x5c02e3 main
    @ 0x7fe08549bbd5 __libc_start_main
    @ 0x5cf9a1 (unknown)
PC: @ 0x70d722 paddle::PrecisionRecallEvaluator::calcStatsInfo()
*** SIGFPE (@0x70d722) received by PID 25723 (TID 0x7ff8b3aaf880) from PID 7395106; stack trace: ***
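The network configuration is below. The layers.py in the cluster version has no seq_reshape_layer, so I modified that file locally to add seq_reshape_layer and then submitted the job to the cluster: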
    data_word = data_layer(name="word", size=num_word)
    data_postag = data_layer(name="postag", size=num_postag)
    data_arc = data_layer(name="arc", size=num_arc)
    if not is_predict:
        data_label = data_layer(name="label", size=num_classes)

    word_attr = ParameterAttribute(initial_std=1/8.0, initial_mean=0.0)
    tag_attr = ParameterAttribute(initial_std=1/4.0, initial_mean=0.0)
    label_attr = ParameterAttribute(initial_std=1/4.0, initial_mean=0.0)

    embedding_word = embedding_layer(input=data_word, size=word_dim, param_attr=word_attr)
    srl_word = seq_reshape_layer(input=embedding_word, reshape_size=20*word_dim)
    embedding_postag = embedding_layer(input=data_postag, size=postag_dim, param_attr=tag_attr)
    srl_tag = seq_reshape_layer(input=embedding_postag, reshape_size=20*postag_dim)
    embedding_arc = embedding_layer(input=data_arc, size=arc_dim, param_attr=label_attr)
    srl_arc = seq_reshape_layer(input=embedding_arc, reshape_size=12*arc_dim)

    concat = concat_layer(input=[srl_word, srl_tag, srl_arc], act=LinearActivation())

    bias_attr = ParameterAttribute(initial_std=0., l2_rate=0.0001)
    w_attr = ParameterAttribute(initial_std=1e-4, initial_mean=0.0)
    hidden1 = fc_layer(input=concat, size=hidden_dim, act=ReluActivation(),
                       param_attr=w_attr, bias_attr=bias_attr)
    hidden2 = fc_layer(input=hidden1, size=hidden_dim, act=ReluActivation(),
                       param_attr=w_attr, bias_attr=bias_attr)
    output = fc_layer(input=hidden2, size=num_classes, act=SoftmaxActivation(),
                      param_attr=w_attr, bias_attr=bias_attr)

    if not is_predict:
        cls_loss = classification_cost(input=output, label=data_label,
                                       evaluator=[precision_recall_evaluator,
                                                  classification_error_evaluator])
        outputs(cls_loss)
    else:
        outputs(output)
Job link: http://yq01-idl-gpu-offline62.yq01.baidu.com:8880/output/list/9066
This is a floating-point exception, SIGFPE.
See #2563 (comment).
This is only one idea, but it may be worth trying.
I am not sure the problem is actually related to seq_reshape_layer.
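For illustration only, here is a minimal Python sketch of the kind of arithmetic a precision/recall evaluator performs, with guards for zero denominators. It is not Paddle's `PrecisionRecallEvaluator` implementation. The function name `precision_recall` and its count arguments are hypothetical. Whether the SIGFPE in this issue actually comes from a zero denominator is an assumption that the trace does not confirm.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts.

    Hypothetical helper, not Paddle code. Each division is guarded so
    that an empty class (all counts zero) yields 0.0 instead of failing.
    """
    predicted_positive = tp + fp
    actual_positive = tp + fn

    precision = float(tp) / predicted_positive if predicted_positive > 0 else 0.0
    recall = float(tp) / actual_positive if actual_positive > 0 else 0.0

    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # A class that never appears and is never predicted: all counts are zero.
    print(precision_recall(0, 0, 0))
    # A class with some correct and some wrong predictions.
    print(precision_recall(3, 1, 1))
```

Without the guards, the all-zero case raises a division-by-zero error in Python. Checking the per-class counts early in training is one way to see whether an evaluator is being fed degenerate statistics.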
After switching the optimization method to learning_method=AdamOptimizer(), it runs normally. Thanks for your help.