
Hot fixes to minimize and close idle database connections #547

Merged 9 commits on Jan 6, 2022


@aufdenkampe (Member) commented Jan 6, 2022

As described in #543 (comment), the old code that inserted data from device POSTs, based on a set of convoluted Django models, was leaving open an ever-increasing number of idle connections to the PostgreSQL database server. This resulted in ever-increasing CPU utilization that would max out in less than a week unless the database server was rebooted, as seen below:

AWS-CPU-2022-01-03

This general issue is well-described by:

This PR replaces most of the Django model code for data inserts with code based on SQLAlchemy. When deployed to production as a hot fix at around 16:48 UTC on Jan. 5, the base CPU load decreased and has stayed low, as shown here:
MicrosoftTeams-image (6)

Quick additional testing showed that:

  • API response times are about 1/3 of what they were with the Django-model-dependent code.
  • Database CPU utilization spikes were 1/4 to 1/2 of what they were previously immediately after a reboot, and stayed at that level for >24 hours rather than steadily increasing.

This PR fixes:

This implements a threaded approach to the datastream view, which should increase performance by inserting data in parallel.
Performance profiling showed that the Django models used in the view endpoint for 'api/data-stream' were acting as a bottleneck and did not support asynchronous operation. This commit is a patch that replaces those models with direct SQL, which should be more performant and also supports multithreading.
My previous commit (7612930) replaced Django models with customized queries. These queries leveraged the DO operation and some Postgres IF logic. While this executes fine in Postgres, it doesn't appear to function with SQLAlchemy. This commit replaces those queries with simpler logic that is supported by SQLAlchemy.
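
The general pattern behind these commits (plain parameterized SQL, run from a thread pool, with each connection released as soon as its insert finishes) can be sketched roughly as follows. This is not the PR's code: the PR uses SQLAlchemy against PostgreSQL, while this sketch swaps in the standard-library `sqlite3` module so it runs without a database server. The table name, column names, and helper functions are all hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing

# Hypothetical table and column names, for illustration only.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS timeseries_values (
    result_id INTEGER NOT NULL,
    value_datetime TEXT NOT NULL,
    data_value REAL NOT NULL
)
"""
INSERT_SQL = (
    "INSERT INTO timeseries_values (result_id, value_datetime, data_value) "
    "VALUES (?, ?, ?)"
)


def init_db(db_path):
    """Create the demo table if it does not exist yet."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(CREATE_SQL)
        conn.commit()


def insert_value(db_path, row):
    """Insert one row on a short-lived connection.

    closing() guarantees the connection is closed even if the insert
    raises, so no idle connection is left behind.
    """
    with closing(sqlite3.connect(db_path, timeout=30)) as conn:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_parallel(db_path, rows, max_threads=8):
    """Run the inserts from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # list() drains the iterator so any worker exception surfaces here.
        list(pool.map(lambda row: insert_value(db_path, row), rows))


def count_rows(db_path):
    """Return the number of rows currently stored in the demo table."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM timeseries_values").fetchone()[0]
```

With SQLAlchemy the equivalent of the short-lived connection would typically be a connection checked out of the engine's pool inside a `with` block and returned on exit; the point in both cases is that no code path keeps a connection open after its insert completes.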
@aufdenkampe (Member, Author) commented:

Note that these hot fixes required us to set Gunicorn to only have 1 worker with 8 threads, to avoid the potential problems described in Connection Pooling — SQLAlchemy 1.4 Documentation.

This simple workaround leaves us with plenty of room to further enhance performance by properly configuring SQLAlchemy to use multiprocessing.
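
For reference, the single-worker setting described above could be expressed in a Gunicorn configuration file along these lines. This is a sketch, not the project's actual deployment config: the file name and the use of a config file at all are assumptions, and only the two values (1 worker, 8 threads) come from this comment.

```python
# gunicorn.conf.py (hypothetical file; values taken from the comment above)

# One worker process, so only a single process ever uses the SQLAlchemy
# engine and its connection pool.
workers = 1

# Concurrency inside that one process comes from threads instead.
threads = 8
```

The same two settings can also be passed on the command line as `--workers 1 --threads 8`.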
