The goal of this project is to extract the "assumed rate of return" from about 3000 municipal financial reports for 2020 in the state of Michigan.
The extracted results are then compared against three reported figures:
- Actuarial Assumed Rate of Investment Return in "OPEB Summary and UA",
- Discount Rate in "OPEB Summary and UA", and
- Actuarial Assumed Rate of Investment Return in "Pension Summary and UA".
1. Put all PDF files in the folder MI2020, then run convert.sh to convert them into text files.
2. Use regular expressions to extract the figures of interest and the context surrounding them.
Run carf1.py to extract all numbers associated with "investment rate of return" and "discount rate".
The extracted results are written to the file output1.csv.
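The extraction step can be illustrated with a minimal sketch. This is not the code in carf1.py; the pattern, the 80-character search window, the function name `extract_rates`, and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (not taken from carf1.py): a target phrase followed,
# within a short window, by a number written as "7.25%" or "6.5 percent".
RATE_PATTERN = re.compile(
    r"(?P<phrase>investment\s+rate\s+of\s+return|discount\s+rate)"
    r"(?P<gap>[^%]{0,80}?)"
    r"(?P<rate>\d{1,2}(?:\.\d{1,2})?)\s*(?:%|percent)",
    re.IGNORECASE,
)


def extract_rates(text, context_chars=60):
    """Return a list of (phrase, rate, context) tuples found in text."""
    results = []
    for m in RATE_PATTERN.finditer(text):
        # Keep some text on both sides of the match as context.
        start = max(0, m.start() - context_chars)
        end = min(len(text), m.end() + context_chars)
        context = " ".join(text[start:end].split())
        # Normalize case and whitespace in the matched phrase.
        phrase = " ".join(m.group("phrase").lower().split())
        results.append((phrase, float(m.group("rate")), context))
    return results


# Made-up sample sentence, for illustration only.
sample = (
    "The long-term expected investment rate of return is 7.25%, "
    "net of expenses. A discount rate of 6.5 percent was used."
)
for phrase, rate, context in extract_rates(sample):
    print(phrase, rate)
```

Keeping the surrounding context next to each number makes it easier to check by hand whether a match really is the assumed rate or an unrelated percentage.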
3. Compare the extracted results with the reported data. This analysis is run in R.
Use the output1.csv file above to compare against the Actuarial Assumed Rate of Investment Return and the Discount Rate in "Michigan Pension and OPEB Assumption Data 2020.xlsx".
Run comparing_with_opeb2.Rmd inside the folder "comparing_extracted_OPEB_pension_RAnalysis". Knit the file to get the final comparison report: comparing_with_opeb2.html.
List of discrepancies between extracted data and both OPEB and Pension: not_matched_with_any
List of pairwise discrepancies between extracted data and OPEB: not_matched_with_opeb; between extracted data and Pension: not_matched_with_pension
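
The logic behind the three lists can be sketched as follows. The actual comparison is done in R in comparing_with_opeb2.Rmd; this Python sketch only illustrates one possible reading of it, in which a municipality "matches" a source when any extracted rate equals any rate reported there. The function names, the dictionary layout, and the unit names and rates are hypothetical.

```python
def matches(extracted_rates, reported_rates, tol=1e-9):
    """True if any extracted rate equals any reported rate within tol."""
    return any(
        abs(e - r) <= tol
        for e in extracted_rates
        for r in reported_rates
        if r is not None
    )


def find_discrepancies(extracted, opeb, pension):
    """Split municipalities into the three discrepancy lists.

    Each argument maps a municipality name to a list of rates (in percent).
    """
    not_opeb, not_pension, not_any = [], [], []
    for unit, rates in sorted(extracted.items()):
        opeb_ok = matches(rates, opeb.get(unit, []))
        pension_ok = matches(rates, pension.get(unit, []))
        if not opeb_ok:
            not_opeb.append(unit)
        if not pension_ok:
            not_pension.append(unit)
        if not opeb_ok and not pension_ok:
            not_any.append(unit)
    return {
        "not_matched_with_opeb": not_opeb,
        "not_matched_with_pension": not_pension,
        "not_matched_with_any": not_any,
    }


# Made-up toy data, for illustration only.
extracted = {"Unit A": [7.0, 6.5], "Unit B": [7.25], "Unit C": [6.0]}
opeb = {"Unit A": [7.0, 6.5], "Unit B": [6.75, 6.75], "Unit C": [7.0, 3.5]}
pension = {"Unit A": [7.0], "Unit B": [7.25], "Unit C": [7.35]}

result = find_discrepancies(extracted, opeb, pension)
for name, units in result.items():
    print(name, units)
```

Under this reading, not_matched_with_any is the intersection of the two pairwise lists.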