Dutch model does not seem to work out of the box #18

Open
tcbrouwer opened this issue Jan 26, 2024 · 0 comments
tcbrouwer commented Jan 26, 2024

I tried to run the Dutch model but cannot get it to work. What am I doing wrong?

Here is a minimal example with the resulting error message.

# The minimal example of diaparser is:

# >>> from diaparser.parsers import Parser
# >>> parser = Parser.load('en_ewt-electra')
# >>> dataset = parser.predict([['She', 'enjoys', 'playing', 'tennis', '.']], prob=True)

# Let's try this with Dutch

from diaparser.parsers import Parser
parser = Parser.load('nl_alpino_lassysmall.wietsedv')
dataset = parser.predict([['Zij', 'houdt', 'van', 'tennissen', '.']], prob=True)

# Note that using 'nl_alpino_lassysmall-wietsedv' as the model name seems to just load the English model, which yields bad results for Dutch.

Running the code prints the following warning and then fails with a RuntimeError:

Some weights of BertModel were not initialized from the model checkpoint at wietsedv/bert-base-dutch-cased and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[15], line 10
      1 # The minimal example of diaparser is:
      2 
      3 # >>> from diaparser.parsers import Parser
   (...)
      6 
      7 # Let's try this with dutch
      9 from diaparser.parsers import Parser
---> 10 parser = Parser.load('nl_alpino_lassysmall.wietsedv')
     11 dataset = parser.predict([['Zij', 'houdt', 'van', 'tennissen', '.']], prob=True)
     13 # Note the using 'nl_alpino_lassysmall-wietsedv' as model name seems to just load the english model, which yields bad results for dutch.

File ~/Projects/UDParserEvaluation/venv/lib/python3.10/site-packages/diaparser/parsers/parser.py:263, in Parser.load(cls, name_or_path, lang, cache_dir, **kwargs)
    261 model = cls.MODEL(**args)
    262 model.load_pretrained(state['pretrained'])
--> 263 model.load_state_dict(state['state_dict'], False)
    264 model.to(args.device)
    265 transform = state['transform']

File ~/Projects/UDParserEvaluation/venv/lib/python3.10/site-packages/torch/nn/modules/module.py:2152, in Module.load_state_dict(self, state_dict, strict, assign)
   2147         error_msgs.insert(
   2148             0, 'Missing key(s) in state_dict: {}. '.format(
   2149                 ', '.join(f'"{k}"' for k in missing_keys)))
   2151 if len(error_msgs) > 0:
-> 2152     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2153                        self.__class__.__name__, "\n\t".join(error_msgs)))
   2154 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for BiaffineDependencyModel:
	size mismatch for feat_embed.bert.embeddings.word_embeddings.weight: copying a param with shape torch.Size([30000, 768]) from checkpoint, the shape in current model is torch.Size([30073, 768]).

Is the model not intended to be run this way?
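
To make the mismatch easier to read, here is a small sketch that uses only the standard library to pull the two shapes out of the error line above. The helper name `parse_size_mismatch` is my own and is not part of diaparser or PyTorch; it only parses the message text and does not load any model.

```python
import re

# Error line copied verbatim from the traceback above.
MESSAGE = (
    "size mismatch for feat_embed.bert.embeddings.word_embeddings.weight: "
    "copying a param with shape torch.Size([30000, 768]) from checkpoint, "
    "the shape in current model is torch.Size([30073, 768])."
)


# Hypothetical helper for illustration (not part of diaparser or PyTorch).
def parse_size_mismatch(message):
    """Return (parameter name, checkpoint shape, model shape) from a size-mismatch line."""
    name = re.search(r"size mismatch for (\S+):", message).group(1)
    # The first torch.Size(...) is the checkpoint shape, the second the model shape.
    shapes = re.findall(r"torch\.Size\(\[([\d,\s]+)\]\)", message)
    checkpoint, model = (tuple(int(n) for n in s.split(",")) for s in shapes)
    return name, checkpoint, model


name, checkpoint_shape, model_shape = parse_size_mismatch(MESSAGE)
extra_rows = model_shape[0] - checkpoint_shape[0]
print(f"{name}: checkpoint {checkpoint_shape}, model {model_shape}, "
      f"{extra_rows} extra rows in the model")
```

The hidden size (768) agrees on both sides; only the number of rows in the word-embedding table differs, by 73. My guess, which I have not verified, is that the vocabulary of wietsedv/bert-base-dutch-cased downloaded today is larger than the one the checkpoint was trained with. Note also that `Parser.load` already passes `False` for `strict`, yet the error is still raised.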
