We’re investigating how to eliminate this class of problem from new code deployments. The approach we’re looking at is static code analysis that stops us from accidentally writing database migrations that add indexes in a way likely to cause this problem again.
Posted Apr 20, 2020 - 20:50 UTC
We deploy new code to the platform a few times per day. Sometimes those new code deployments include changes to the underlying database supporting the platform.
We deployed a new version of the application that included a new database index on a large table used to store signatures. Creating this index was slow, and the migration was not configured to allow database writes to continue while the index was being built. An index creation without the CONCURRENTLY flag, which allows simultaneous database writes, should have been caught during code review before it reached production. As a result, signature creations were temporarily blocked, and other page requests backed up behind them. The issue resolved once the database index operation completed successfully.
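To illustrate the kind of static check we mentioned investigating, here is a minimal sketch in Python. It is not our actual tooling: it assumes migrations are available as plain SQL text in PostgreSQL-style syntax, and the function name, index names, and column names are hypothetical. It splits naively on semicolons, so it would need hardening (comments, string literals) before real use.

```python
import re
from typing import List

# Matches CREATE [UNIQUE] INDEX and captures the CONCURRENTLY keyword if present.
CREATE_INDEX = re.compile(
    r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\s+(CONCURRENTLY\b)?",
    re.IGNORECASE,
)


def find_blocking_index_statements(migration_sql: str) -> List[str]:
    """Return each CREATE INDEX statement that lacks CONCURRENTLY.

    Hypothetical helper: assumes statements are separated by semicolons.
    """
    flagged = []
    for statement in migration_sql.split(";"):
        match = CREATE_INDEX.search(statement)
        # group(1) is None when the CONCURRENTLY keyword is absent.
        if match and match.group(1) is None:
            flagged.append(statement.strip())
    return flagged


# Example migration text; the index and column names are made up.
example_sql = (
    "CREATE INDEX idx_a ON signatures (petition_id); "
    "CREATE INDEX CONCURRENTLY idx_b ON signatures (created_at);"
)
flagged_statements = find_blocking_index_statements(example_sql)
```

A check like this could run against migration files before deployment, failing the build whenever the returned list is non-empty so the problem is caught even if code review misses it.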