We released PyThaiNLP v5.1.0! This version adds new features, such as Thai Discourse Treebank (TDTB) part-of-speech tagging and conversion from Thai solar dates to Thai lunar dates, and fixes several bugs.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.1
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.1 Change Log: #900
What is new?
New features
- Add Thai Discourse Treebank postag #910
- Add Thai Universal Dependency Treebank postag #916
- Add Thai G2P v2 Grapheme-to-Phoneme model #923
- Add support for list of strings as input to sent_tokenize() #927
- Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
- Add conversion from Thai solar date to Thai lunar date #998
- Add Thai pangram text #1045
- Add pythainlp.llm #1043
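The console-printing item above addresses a common problem: printing Thai text to a console whose encoding cannot represent it raises UnicodeEncodeError. The following is a minimal sketch of the general idea, using only the standard library. It is not the implementation of pythainlp.tools.safe_print; the name safe_print_sketch and its stream parameter are assumptions made for illustration.

```python
import io
import sys


def safe_print_sketch(text, stream=None):
    """Print text; on UnicodeEncodeError, retry with unencodable characters replaced.

    Hypothetical helper for illustration only, not PyThaiNLP's safe_print.
    """
    stream = sys.stdout if stream is None else stream
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        # Fall back to the stream's own encoding, replacing what it cannot encode.
        encoding = getattr(stream, "encoding", None) or "ascii"
        fallback = text.encode(encoding, errors="replace").decode(encoding)
        print(fallback, file=stream)


# An ASCII-only stream stands in for a console with a limited encoding.
ascii_console = io.TextIOWrapper(io.BytesIO(), encoding="ascii", errors="strict")
safe_print_sketch("สวัสดี hello", stream=ascii_console)
ascii_console.flush()
printed = ascii_console.buffer.getvalue().decode("ascii")
```

In this sketch each Thai character that the stream cannot encode is written as a question mark, so the call completes instead of raising.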
Bug fixes
- Fix collate() to consider tonemark in ordering #926
- Fix maiyamok() expanding the wrong word #962
- Fix nlpo3.load_dict() never printing an error message on failure #979
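For readers unfamiliar with mai yamok (ๆ), the repetition mark tells the reader to repeat the preceding word, so expanding it means replacing the mark with a copy of that word. The sketch below illustrates the idea on an already tokenized list. It is a simplified illustration, not PyThaiNLP's implementation; the function name expand_maiyamok_sketch and the choice to skip whitespace tokens are assumptions.

```python
MAIYAMOK = "ๆ"


def expand_maiyamok_sketch(tokens):
    """Replace each mai yamok token with a copy of the nearest preceding word.

    Hypothetical helper for illustration only; whitespace-only tokens are
    skipped when looking for the word to repeat.
    """
    expanded = []
    for token in tokens:
        if token == MAIYAMOK:
            # Look backwards for the closest token that is not just whitespace.
            previous = next((t for t in reversed(expanded) if t.strip()), None)
            expanded.append(previous if previous is not None else token)
        else:
            expanded.append(token)
    return expanded


result = expand_maiyamok_sketch(["เด็ก", "ๆ", "ชอบ", "เล่น"])
# result is ["เด็ก", "เด็ก", "ชอบ", "เล่น"]
```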
Remove
- Remove clause_tokenize #1024
Deprecation and other API changes
- 5.1
  - pythainlp.util.is_native_thai, use instead pythainlp.morpheme.is_native_thai
- 5.2
  - pythainlp.cls, use instead pythainlp.classify
  - pythainlp.corpus.thai_synonym, use instead pythainlp.corpus.thai_synonyms
  - pythainlp.util.maiyamok, use instead pythainlp.util.expand_maiyamok
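Renames like the ones listed above are commonly handled by keeping the old name as a thin alias that emits a DeprecationWarning and forwards to the new function. The sketch below shows that general pattern with the standard warnings module. It is not PyThaiNLP's code: deprecated_alias, new_name, old_name, and the version strings "1.0" and "2.0" are all hypothetical.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name, deprecated_version, removal_version):
    """Return a wrapper that warns about old_name, then calls new_func.

    Hypothetical helper for illustration only.
    """

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated since {deprecated_version} and will be "
            f"removed in {removal_version}; use {new_func.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    return wrapper


def new_name(value):
    """Hypothetical replacement function."""
    return value * 2


# The old name keeps working but warns on every call.
old_name = deprecated_alias(new_name, "old_name", "1.0", "2.0")
```

Callers of the old name still get the same result, and running with warnings enabled (for example `python -W default`) shows which call sites need to be migrated.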
Improve
- Add more Thai political parties to the Thai dictionary 2252dee
- Fix inconsistency in newmm-safe engine by copilot #1063
- Update warn_deprecation to get deprecated and removal versions #1028
- Remove unnecessary enumerate in expand_maiyamok #1029
- Add SPDX FileType #1032
- Fix bug in Longest Matching tokenizer to preprocess spaces consistently #1062
- Add codemeta.json file to root directory #1053
Full Changelog: v5.0.0...v5.1.0
Contributors
Thanks to all the contributors. (Image made with contributors-img)
We build Thai NLP.
PyThaiNLP