System Status

Status of PlayFab services and history of incidents

Potential Segmentation Service Degradation

Incident Report for PlayFab

Postmortem

On April 17th, 2024, between 1:49 PM and 10:15 PM UTC, some customers experienced delays and errors when using PlayFab's Gaming Insights service, which provides reports, trends, and segmentation features for game developers. The incident was caused by a storage outage in the West US 2 region that throttled two of our Kusto clusters. We resolved the issue by scaling up the clusters while the storage outage was ongoing. 

Action Items 

To prevent similar incidents from happening again, we have taken the following actions: 

  • We enabled monitoring and alerting for the Kusto clusters, which had previously been disabled.
  • We changed the autoscaling rules for both clusters to be more reactive to changes in CPU and cache utilization. 
  • We improved our communication with the Kusto team and subscribed to their outage notifications.
Posted May 08, 2024 - 10:45 PDT

Resolved

This incident has been resolved.
Posted Apr 18, 2024 - 11:04 PDT

Update

We are continuing to monitor for any further issues.
Posted Apr 18, 2024 - 08:11 PDT

Update

We are continuing to monitor for any further issues.
Posted Apr 17, 2024 - 21:41 PDT

Update

We are seeing minor delays in data processing, which may affect the performance of segmentation-related features and APIs.
Posted Apr 17, 2024 - 21:41 PDT

Monitoring

We're currently monitoring service degradation that could affect segmentation, potentially causing delays in data processing and impacting the performance or functionality of segmentation-related features.
Posted Apr 17, 2024 - 20:51 PDT
This incident affected: Analytics (Trends, Reports, Dashboard) and API (Data).