System Status

Status of PlayFab services and history of incidents

Potential Segmentation Service Degradation
Incident Report for PlayFab
Postmortem

On April 17th, 2024, between 1:49 PM and 10:15 PM UTC, some customers experienced delays and errors when using PlayFab's Gaming Insights service, which provides reports, trends, and segmentation features for game developers. The incident was caused by a storage outage in the West US 2 region that throttled two of our Kusto clusters. We resolved the issue by scaling up the clusters while the storage outage was ongoing. 

Action Items 

To prevent similar incidents from happening again, we have taken the following actions: 

  • We enabled monitoring and alerting for the Kusto clusters, which had previously been disabled. 
  • We changed the autoscaling rules for both clusters to be more reactive to changes in CPU and cache utilization. 
  • We improved our communication with the Kusto team and subscribed to their outage notifications.
Posted May 08, 2024 - 10:45 PDT
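
As a rough illustration of the autoscaling action item above, the sketch below shows one way a scaling rule could react to both CPU and cache utilization. It is not PlayFab's or Kusto's actual configuration: the ClusterMetrics type, the desired_instances function, and every threshold and limit are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterMetrics:
    """Hypothetical snapshot of one cluster's load."""
    cpu_percent: float    # average CPU utilization, 0-100
    cache_percent: float  # cache utilization, 0-100
    instances: int        # current instance count


def desired_instances(
    m: ClusterMetrics,
    scale_out_at: float = 70.0,  # assumed threshold, not a real setting
    scale_in_at: float = 30.0,   # assumed threshold, not a real setting
    min_instances: int = 2,      # assumed floor
    max_instances: int = 10,     # assumed ceiling
) -> int:
    """Return the instance count the rule would ask for."""
    # React to whichever signal is under more pressure, so that a hot
    # cache can trigger a scale-out even when CPU still looks healthy.
    pressure = max(m.cpu_percent, m.cache_percent)

    if pressure >= scale_out_at:
        target = m.instances + 1
    elif pressure <= scale_in_at:
        target = m.instances - 1
    else:
        target = m.instances

    # Keep the result inside the allowed range.
    return max(min_instances, min(max_instances, target))


if __name__ == "__main__":
    snapshot = ClusterMetrics(cpu_percent=45.0, cache_percent=85.0, instances=4)
    print(desired_instances(snapshot))
```

In this sketch, lowering scale_out_at makes the rule more reactive, at the cost of scaling out more often under brief load spikes.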

Resolved
This incident has been resolved.
Posted Apr 18, 2024 - 11:04 PDT
Update
We are continuing to monitor for any further issues.
Posted Apr 18, 2024 - 08:11 PDT
Update
We are continuing to monitor for any further issues.
Posted Apr 17, 2024 - 21:41 PDT
Update
We are seeing minor delays in data processing, which may affect the performance of segmentation-related features and APIs.
Posted Apr 17, 2024 - 21:41 PDT
Monitoring
We're currently monitoring service degradation that could affect segmentation, potentially causing delays in data processing and impacting the performance or functionality of segmentation-related features.
Posted Apr 17, 2024 - 20:51 PDT
This incident affected: Analytics (Trends, Reports, Dashboard) and API (Data).