Commit
Fix incomplete abstract and title issue
In some cases the title and/or abstract obtained was incomplete (issue gijswobben#23). This happens when the text contains HTML markup tags. The most frequent ones seem to be (in descending order): <i>, <sub>, <sup>, <b>, <mml:*>, ...?

Example: PMID 31689885

<ArticleTitle>Gamma Irradiated <i>Rhodiola sachalinensis</i> Extract Ameliorates [...]</ArticleTitle>
Before the fix the returned title was just: 'Gamma Irradiated '

<AbstractText>The effect of <i>Rhodiola sachalinensis</i> Boriss extract irradiated [...]</AbstractText>
Before the fix the returned abstract was just: 'The effect of '

Fastest solution found: clean up the tags. This seems to fix gijswobben#23 correctly, at least for non-mml tags.

NB: cleaning of nested <mml:*> tags can result in multiple blank lines.
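The truncation described above can be reproduced with the standard library alone. The sketch below is illustrative, not the code from this commit: the helper names (`text_before_first_child`, `strip_inline_tags`, `full_text`) and the tag list are assumptions, and the sample string reuses the title from the example, elision included. It shows that an element's `.text` stops at the first child tag, that removing the inline tags before parsing restores the full string, and that `itertext()` is an alternative that leaves the XML untouched.

```python
import re
import xml.etree.ElementTree as ET

# Sample taken from the commit message; "[...]" is the original elision.
XML = (
    "<ArticleTitle>Gamma Irradiated <i>Rhodiola sachalinensis</i> "
    "Extract Ameliorates [...]</ArticleTitle>"
)


def text_before_first_child(xml: str) -> str:
    """Return only the text that precedes the first child element."""
    return ET.fromstring(xml).text or ""


def strip_inline_tags(xml: str, tags=("i", "sub", "sup", "b")) -> str:
    """Remove opening and closing inline tags (hypothetical tag list).

    Only bare tags such as <i> or </sub> are matched; tags carrying
    attributes or namespaces (e.g. <mml:*>) are left alone.
    """
    pattern = r"</?(?:%s)>" % "|".join(map(re.escape, tags))
    return re.sub(pattern, "", xml)


def full_text(xml: str) -> str:
    """Alternative: join the text of the element and all its descendants."""
    return "".join(ET.fromstring(xml).itertext())


if __name__ == "__main__":
    print(repr(text_before_first_child(XML)))
    print(repr(ET.fromstring(strip_inline_tags(XML)).text))
    print(repr(full_text(XML)))
```

Stripping tags with a regular expression is quick but only handles the tags it is told about; `itertext()` handles any nesting, though how it behaves on `<mml:*>` content would still need checking against real records.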