Changelog

0.20:

experimental OpenAI support

0.19: this version number is not used

0.18:

bugs #49, #50, #51 fixed

0.17:

bug #47 fixed

0.16:

bug #46 fixed
improvements in #45 implemented

0.15:

XObject contents are supported
--translation-heuristic experimental option added

0.14:

--version option added
build system is migrated from setup.py to pyproject.toml

0.13:

new feature: use of metadata if it exists. It is not enabled by default.

0.12:

reorganized the project structure and files (see additional notes for v0.12 below)
fixes bug #31
pdfminer version updated
new feature: converts latin ligatures (ff, fi, fl, ffi, ffl, ft, st = Unicode FB00-FB06) to individual characters by default
started using the standard logging module, so log output goes to stderr

0.11:

functionally the same as 0.10, with some pylint fixes.

0.10:

--page-number argument added. Related issue is here.
a potential fix is implemented for some files having non-zero Trm[1] and Trm[2] elements. This change might cause different outputs than previous versions of pdftitle. This is related to the issue raised here.
verbose and error messages improved.
pdfminer version updated.

0.9:

retrieve_spaces function is made non-recursive.
eliot algorithm is implemented for this issue, test file is woo2019.pdf
eliot-tfs option is implemented for eliot algorithm.
the stack trace was previously printed only in verbose mode; it is now always printed when there is an error.

0.8:

option (-t) added to convert the title to title case using Python's title method.
pdfminer version updated.
algorithm flag (-a) added. The default is the original algorithm, so existing behavior is unchanged.
max2 algorithm is implemented for this issue, test file is paran2010.pdf.
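
The -t entry in 0.8 refers to Python's title method. The sketch below shows what str.title does to a string; the helper name to_title_case is hypothetical, and this is not necessarily how pdftitle applies it internally.

```python
def to_title_case(title: str) -> str:
    """Capitalize the first letter of every word and lowercase the rest."""
    return title.title()


if __name__ == "__main__":
    print(to_title_case("deep learning for pdf title extraction"))
```

Note that str.title lowercases every letter after the first in each word, so an acronym such as PDF comes out as Pdf.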

0.7:

changes and fixes for pylint based on Jakob Guldberg Aaes's recommendation.
no functional changes.

0.6:

option (-c) added to rename the file to its title. Contributed by Tommy Odland.
pdfminer version updated.

0.5:

fixed install problem with 0.4
pdfminer version updated.

0.4:

Merged #e4bb0d6 to detect and remove duplicate spaces in the returned title. Contributed by Jakob Guldberg Aaes (https://github.com/jakob1379).
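
For illustration only, removing duplicate spaces from a title can be sketched with a regular expression. The function name collapse_spaces is hypothetical and this is not the code from the merged commit.

```python
import re


def collapse_spaces(title: str) -> str:
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", title)


if __name__ == "__main__":
    print(collapse_spaces("Deep  Learning   for PDFs"))
```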

0.3:

Merged #f65ff4c and #f5c60c0 for identifying spaces when no space char is used. Contributed by Fabien Couthouis (https://github.com/Fabien-Couthouis).

0.2:

changed version string to major.minor format.
pdftitle can be used as a library in a project via the get_title_from_io method
added chardet as a dependency
algorithm changed, but there are problems with finding word boundaries