On August 31, 2025, between 11:50 AM and 7:53 PM PDT, some customers experienced a complete outage of PlayFab Matchmaking. The incident was caused by a regression in the authentication library, which led to unexpectedly high call rates to Entra ID for token acquisition, resulting in service throttling and failures. We resolved the issue by increasing authentication rate limits, updating the authentication code, recycling affected application instances, and isolating impacted titles into dedicated deployments.
During the outage, matchmaking services were unavailable for all affected titles, preventing players from joining matches. Notification failures also occurred, resulting in players being stuck in matchmaking and not receiving proper updates.
The root cause of the incident was a regression in the updated authentication library, which failed to cache tokens as expected. The updated library created a new instance for each authentication attempt, so previously acquired tokens were never reused and the caching mechanism was effectively bypassed. This led to an unexpected surge in token requests to Entra ID, causing throttling and authentication failures across the service.
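The failure mode described above can be illustrated with a minimal sketch. It is not the actual PlayFab or Entra ID code: `FakeTokenEndpoint` and `CachingCredential` are hypothetical stand-ins, and the sketch assumes the token cache lives on the client instance, which is what makes per-attempt instantiation defeat it.

```python
import time


class FakeTokenEndpoint:
    """Hypothetical stand-in for an identity provider; counts token requests."""

    def __init__(self):
        self.calls = 0

    def issue_token(self, scope):
        self.calls += 1
        # Tokens in this sketch are valid for one hour.
        return {"value": f"token-{self.calls}", "expires_at": time.monotonic() + 3600}


class CachingCredential:
    """Hypothetical client whose token cache is stored on the instance."""

    def __init__(self, endpoint):
        self._endpoint = endpoint
        self._cache = {}

    def get_token(self, scope):
        cached = self._cache.get(scope)
        # Reuse the cached token unless it expires within the next 60 seconds.
        if cached is not None and cached["expires_at"] - 60 > time.monotonic():
            return cached["value"]
        token = self._endpoint.issue_token(scope)
        self._cache[scope] = token
        return token["value"]


def calls_with_new_instance_per_attempt(attempts):
    """Regressed pattern: a fresh client (and an empty cache) on every attempt."""
    endpoint = FakeTokenEndpoint()
    for _ in range(attempts):
        CachingCredential(endpoint).get_token("example-scope")
    return endpoint.calls


def calls_with_shared_instance(attempts):
    """Expected pattern: one long-lived client whose cache is reused."""
    endpoint = FakeTokenEndpoint()
    credential = CachingCredential(endpoint)
    for _ in range(attempts):
        credential.get_token("example-scope")
    return endpoint.calls
```

In this sketch, the per-attempt pattern sends one token request to the endpoint for every authentication attempt, while the shared instance sends a single request and serves the rest from its cache. That difference in request volume is the kind of surge that can trigger throttling at the identity provider.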
To prevent similar incidents from happening again, we have taken the following actions: