Incident Summary:
At approximately 10:53 a.m. on Tuesday, November 28, 2017, the GeorgiaVIEW Online Learning QPROD environment experienced a service disruption caused by database blocking. No outage occurred, but performance was degraded for a small subset of users. The degradation persisted until approximately 11:13 a.m., when full functionality of the service was restored.
Because we recognize that interruptions of GeorgiaVIEW service impact institutions across the state, we are communicating this post-incident analysis of what occurred and the measures being taken to address the factors that led to this incident.
Incident Cause:
Upon investigation, ITS technical staff determined that application code related to dropbox and file submission was causing excessive database blocking. Blocking occurs when database processes wait their turn so that the database stays consistent while multiple changes are requested at the same time, and it is completely normal. Excessive blocking, however, can result in significant performance problems. It can occur for many reasons, including poor application or query design, stale indexes, and unexpected changes in how the application is used.
Incident Response Measures:
To correct the issue, ITS technical staff identified and terminated the so-called "lead blocker" process. To help prevent this situation from recurring, ITS staff forced a refresh of the statistics relevant to the blocking event and also added a job that will automatically refresh the same statistics on a more frequent and ongoing basis. (Microsoft SQL Server automatically keeps these internal statistics and uses them to help optimize database activity.)
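For readers unfamiliar with the term, the following is a minimal sketch of what identifying a "lead blocker" means. It is an illustration only and not the procedure ITS staff used. It assumes a snapshot that maps each session ID to the session it is waiting on (0 if it is not waiting); on Microsoft SQL Server, comparable information is exposed in the `blocking_session_id` column of `sys.dm_exec_requests`. The function name and the session IDs are hypothetical.

```python
def find_lead_blockers(blocked_by):
    """Group waiting sessions under the session at the head of their chain.

    blocked_by maps a session ID to the ID of the session it is waiting on,
    or to 0 (or None) when the session is not waiting on anyone.
    Returns a dict: lead blocker ID -> list of sessions stuck behind it.
    """
    victims_by_lead = {}
    for session, blocker in blocked_by.items():
        if not blocker:
            continue  # this session is not waiting, so it is not a victim

        # Follow the chain of waits until we reach a session that is
        # itself not waiting on anyone: that session is the lead blocker.
        seen = {session}
        current = blocker
        while blocked_by.get(current):
            if current in seen:
                current = None  # circular wait (a deadlock), no single lead
                break
            seen.add(current)
            current = blocked_by[current]

        if current is not None:
            victims_by_lead.setdefault(current, []).append(session)
    return victims_by_lead


# Hypothetical snapshot: 51 and 53 wait on 50, and 52 waits on 51.
snapshot = {50: 0, 51: 50, 52: 51, 53: 50, 60: 0}
print(find_lead_blockers(snapshot))  # {50: [51, 52, 53]}
```

In this hypothetical snapshot, session 50 holds up three other sessions while waiting on nothing itself, so ending that one session would let the rest proceed.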