Degraded performance of diary application in the UK/Europe Region
Incident Report for ResDiary
Postmortem

All times in this report are in BST.

At around 17:02 the ResDiary support team was notified that users were experiencing issues accessing their restaurant diary on the UK/Europe server. This was escalated to an engineer at 17:10, and after initial investigation the performance of the database was identified as the underlying issue.

At 17:35 the engineer made a configuration change to the database to improve its performance, and the system returned to normal shortly afterwards.

We’d like to sincerely apologise for any inconvenience caused and assure you that we are focused on identifying and implementing the changes required to prevent situations like this in the future.

How we are preventing this in the future

  • We are currently working on improving the monitoring of our SQL Server databases, and should have improved monitoring and alerting online within the next few weeks. This will allow us to react faster to situations like this, ideally solving them before they become visible to users.

  • We are working on identifying the parts of our application that cause the highest database usage, and will be making changes to make it less likely we have these kinds of incidents in the first place.

  • We will investigate the particular query that caused the problem on Sunday, and will aim to prevent it from causing us problems in future.

Posted Apr 01, 2019 - 11:45 UTC

Resolved
We have been monitoring the affected system over the last three hours and it has been performing normally. We will continue monitoring and provide a post-mortem as soon as we have more information.
Posted Mar 31, 2019 - 19:50 UTC
Monitoring
The performance of the database has improved and we are continuing to monitor.
Posted Mar 31, 2019 - 16:45 UTC
Identified
Engineers have identified a problem with one of our SQL Server databases. We have made a configuration change to try to resolve the issue.
Posted Mar 31, 2019 - 16:35 UTC
Investigating
We are experiencing an issue affecting one or more of our services.

An engineer is investigating the cause and will provide updates as they are available.
Posted Mar 31, 2019 - 16:10 UTC
This incident affected: ResDiary Application (UK/Europe).