PyThaiNLP v5.1.0 Released!

@wannaphong wannaphong released this 25 Feb 12:13
d88d971

We released PyThaiNLP v5.1.0! This version adds new features, such as Thai Discourse Treebank (TDTB) part-of-speech tagging and conversion from Thai solar dates to Thai lunar dates, and fixes several bugs.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.1 Change Log: #900

What is new?

New features

  • Add Thai Discourse Treebank postag #910
  • Add Thai Universal Dependency Treebank postag #916
  • Add Thai G2P v2 Grapheme-to-Phoneme model #923
  • Add support for list of strings as input to sent_tokenize() #927
  • Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
  • Add conversion from Thai solar date to Thai lunar date #998
  • Add Thai pangram text #1045
  • Add pythainlp.llm #1043
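
The idea behind a console-safe print, such as the pythainlp.tools.safe_print item above, can be sketched in plain Python. This is a minimal illustration of the general technique and not necessarily how the library implements it; the names to_console_safe and safe_print_sketch are hypothetical.

```python
import sys


def to_console_safe(text: str, encoding: str) -> str:
    """Drop every character that `encoding` cannot represent."""
    return text.encode(encoding, errors="ignore").decode(encoding)


def safe_print_sketch(text: str) -> None:
    """Print text; if the console encoding rejects it, print a reduced copy.

    Hypothetical helper for illustration only, not the library function.
    """
    try:
        print(text)
    except UnicodeEncodeError:
        # Fall back to the console's own encoding (or ASCII if unknown)
        # and silently drop the characters it cannot display.
        encoding = sys.stdout.encoding or "ascii"
        print(to_console_safe(text, encoding))
```

Dropping unsupported characters loses information, so a fallback like this is probably best reserved for log or status output rather than for data that will be processed further.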

Bug fixes

  • Fix collate() to consider tonemark in ordering #926
  • Fix maiyamok() expanding the wrong word #962
  • Fix nlpo3.load_dict() never printing an error message on failure #979
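
For context on the maiyamok fix: the Thai repetition mark ๆ (mai yamok) signals that the preceding word is repeated. A minimal sketch of that expansion over an already tokenized sentence might look like the following. It assumes ๆ arrives as its own token and is not the library's implementation; expand_maiyamok_sketch is a hypothetical name.

```python
from typing import List

MAIYAMOK = "ๆ"  # Thai repetition mark, U+0E46


def expand_maiyamok_sketch(tokens: List[str]) -> List[str]:
    """Replace each maiyamok token with a copy of the word before it.

    Illustrative only: assumes the mark is a separate token and that
    no whitespace tokens sit between the word and the mark.
    """
    expanded: List[str] = []
    for token in tokens:
        if token != MAIYAMOK:
            expanded.append(token)
        elif expanded:
            # Repeat the most recent word in the output.
            expanded.append(expanded[-1])
        # A leading maiyamok has nothing to repeat, so it is dropped.
    return expanded
```

For example, the tokens ["เด็ก", "ๆ", "ชอบ"] expand to ["เด็ก", "เด็ก", "ชอบ"].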

Remove

  • Remove clause_tokenize #1024

Deprecation and other API changes

  • 5.1
    • pythainlp.util.is_native_thai: use pythainlp.morpheme.is_native_thai instead
  • 5.2
    • pythainlp.cls: use pythainlp.classify instead
    • pythainlp.corpus.thai_synonym: use pythainlp.corpus.thai_synonyms instead
    • pythainlp.util.maiyamok: use pythainlp.util.expand_maiyamok instead

Improve

  • Add more Thai political parties to the Thai dictionary 2252dee
  • Fix inconsistency in newmm-safe engine by copilot #1063
  • Update warn_deprecation to get deprecated and removal versions #1028
  • Remove unnecessary enumerate in expand_maiyamok #1029
  • Add SPDX FileType #1032
  • Fix bug in Longest Matching tokenizer to preprocess spaces consistently #1062
  • Add codemeta.json file to root directory #1053
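
The warn_deprecation update listed above concerns a helper that reports both the version in which a name was deprecated and the version in which it will be removed. A rough sketch of such a helper, built on the standard warnings module, is shown here. The function name, parameters, and message wording are assumptions for illustration and may differ from the library's actual helper.

```python
import warnings


def warn_deprecation_sketch(
    deprecated_name: str,
    replacement: str = "",
    deprecated_version: str = "",
    removal_version: str = "",
) -> None:
    """Emit a DeprecationWarning that names both relevant versions.

    Hypothetical helper; every argument except the name is optional.
    """
    message = f"'{deprecated_name}' is deprecated"
    if deprecated_version:
        message += f" since version {deprecated_version}"
    if removal_version:
        message += f" and will be removed in version {removal_version}"
    message += "."
    if replacement:
        message += f" Use '{replacement}' instead."
    # stacklevel=2 attributes the warning to the caller of the
    # deprecated function rather than to this helper.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
```

A deprecated function would call the helper as its first statement, for instance warn_deprecation_sketch("old_func", "new_func", "1.0", "2.0"), where the names and version numbers are placeholders.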

Full Changelog: v5.0.0...v5.1.0

Contributors

Thanks to all the contributors. (Image made with contributors-img)

We build Thai NLP.

PyThaiNLP