[megatron-bert-uncased-345m] fix conversion #16639
Merged
Fixes: #16638
The original conversion script assumed that all released `megatron-bert-*-345m` checkpoints had the same vocab, but https://huggingface.co/nvidia/megatron-bert-cased-345m/blob/main/vocab.txt and https://huggingface.co/nvidia/megatron-bert-uncased-345m/blob/main/vocab.txt are quite different.

This PR sets `config.vocab_size` to the actual size of the vocab dimension of one of the params.

I tested that both checkpoints mentioned above convert and load correctly:
both succeed.
Before this PR, only the former worked; the latter failed with:
29056 is the vocab size of `megatron-bert-cased-345m`.
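
For readers who want the idea in isolation, here is a minimal sketch of deriving the vocab size from a checkpoint parameter instead of assuming it. It is not the code from this PR: `FakeTensor`, `Config`, `EMBEDDING_KEY`, and `set_vocab_size_from_checkpoint` are hypothetical names, the hidden size of 1024 and the second vocab size of 30000 are made-up illustration values, and the sketch assumes the embedding weight is laid out as (vocab_size, hidden_size).

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass
class FakeTensor:
    """Stand-in for a checkpoint tensor; only its shape matters here."""
    shape: Tuple[int, ...]


@dataclass
class Config:
    """Hypothetical minimal config; a real model config has many more fields."""
    # Default mirrors the cased vocab size mentioned in the description.
    vocab_size: int = 29056


# Hypothetical parameter name, chosen for illustration only.
EMBEDDING_KEY = "word_embeddings.weight"


def set_vocab_size_from_checkpoint(
    config: Config,
    state_dict: Mapping[str, FakeTensor],
    key: str = EMBEDDING_KEY,
) -> Config:
    """Overwrite config.vocab_size with the embedding's first dimension."""
    # Assumption: the weight is shaped (vocab_size, hidden_size),
    # so the first dimension is the vocabulary size.
    config.vocab_size = state_dict[key].shape[0]
    return config


# A checkpoint whose vocab matches the assumed default.
cased_like = {EMBEDDING_KEY: FakeTensor(shape=(29056, 1024))}
# A checkpoint with a different vocab (30000 is a made-up size).
other = {EMBEDDING_KEY: FakeTensor(shape=(30000, 1024))}

cased_config = set_vocab_size_from_checkpoint(Config(), cased_like)
other_config = set_vocab_size_from_checkpoint(Config(), other)
```

Because the function only reads `.shape[0]`, the same logic would apply to any tensor-like object that exposes a `shape` tuple; the stand-in class is used here only to keep the sketch free of external dependencies.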
@LysandreJik, @sgugger