System Status

Status of PlayFab services and history of incidents

Degraded performance of segmentation APIs and segment counts.
Incident Report for PlayFab
Postmortem

On March 18, 2024, between 4:45 PM and 10:40 PM UTC, some customers experienced errors and delays when using the Segmentation, Reports and Trends, and Title Overview features through PlayFab's API. A bug in the GDPR service generated a high volume of delete operations on the Azure Data Explorer (ADX) cluster that hosts the data for these features. The added load on the cluster triggered a throttling mechanism that aborted some queries. We resolved the issue by moving the queries to a secondary ADX cluster and fixing the bug in the GDPR service.

Action Items 

  • We fixed the bug in the GDPR service that caused missed deletes and a high volume of re-run deletes.
  • We increased the batch size and reduced the frequency of the delete operations submitted to the cluster, which lowers the load on it.
  • We stabilized the secondary cluster so that it exclusively processes queries from the Segmentation, Reports and Trends, and Title Overview features. GDPR delete operations now run only on the primary cluster, so they no longer affect feature queries.
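
As a rough illustration of the batching change described in the action items, the sketch below groups pending deletes into larger batches and spaces the submissions further apart. It is not PlayFab's implementation: the function names, record IDs, batch sizes, and intervals are all hypothetical, and the actual values used in production are not stated in this report.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_deletes(record_ids: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size record IDs, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for record_id in record_ids:
        batch.append(record_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def plan_submissions(
    record_ids: Iterable[str], batch_size: int, interval_seconds: float
) -> List[Tuple[float, List[str]]]:
    """Return (offset_seconds, batch) pairs, one submission per interval."""
    return [
        (index * interval_seconds, batch)
        for index, batch in enumerate(batch_deletes(record_ids, batch_size))
    ]


# Hypothetical workload: ten pending deletes.
pending = [f"player-{n}" for n in range(10)]

# Small batches submitted often versus large batches submitted rarely.
small = plan_submissions(pending, batch_size=2, interval_seconds=1.0)
large = plan_submissions(pending, batch_size=5, interval_seconds=30.0)

print(len(small), len(large))  # prints "5 2"
```

With the same ten records, the larger batch size cuts the number of delete operations submitted to the cluster from five to two, and the longer interval spreads them out in time.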
Posted Apr 23, 2024 - 17:10 PDT

Resolved
This incident has been resolved.
Posted Mar 19, 2024 - 09:35 PDT
Monitoring
A fix has been implemented and we are monitoring the status and performance.
Posted Mar 18, 2024 - 22:48 PDT
Investigating
We are investigating degraded performance and increased errors for Segmentation APIs and Segment and Player counts in Game Manager.
Posted Mar 18, 2024 - 21:10 PDT