Well, that was just awful: Details on yesterday’s font serving outage
Yesterday was a long day here at Typekit as we dealt with rolling outages from our storage service provider, and we know it was extremely frustrating for many of you. We’d like to apologize for the uncharacteristic hit to our service, and the subsequent effect on your websites.
As some of you may already know, we use Amazon S3 to store fonts and the kit configuration files. According to Amazon, early on the morning of August 10 there was a “misconfiguration,” resulting in an outage on S3 and other services. They reported that they mistakenly pursued the wrong problem at first, which is why the downtime was longer than usual.
Because of this outage, our servers were unable to retrieve fonts and kit configuration files from S3. Our automated systems notified us of the problem at 12:20 AM PDT, about 10 minutes after the outage started.
We thought the system had stabilized by 4:00 AM PDT, but were unpleasantly surprised by a second outage around 12:00 PM PDT and then a third one shortly after that. Amazon acknowledged the second outage but provided no further insight into what had happened on their end.
What you may have noticed on your website was a “Waiting for use.typekit.net” message, which eventually timed out and then loaded your default fonts instead. Lovely way to start your week, we know.
Outages at larger service providers like Amazon affect huge portions of the internet at once, which we had the misfortune of witnessing firsthand yesterday. We’re looking into using multiple edge locations for our S3 buckets so we can spread load across them. In plain speech, this means that if an outage like this happens again, we’ll be able to switch to different servers that are not affected.
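For the curious, the "switch to different servers" idea can be sketched in a few lines. This is only an illustration under assumptions, not our production code: the `Fetcher` type, the `fetchWithFailover` function, and the location names are all hypothetical. It tries each storage location in order and returns the first successful response.

```typescript
// Hypothetical sketch: try each storage location in order until one responds.
// `Fetcher` and the location names are illustrative, not a real configuration.
type Fetcher = (location: string, key: string) => Promise<string>;

async function fetchWithFailover(
  locations: string[],
  key: string,
  fetcher: Fetcher,
): Promise<{ location: string; body: string }> {
  const errors: string[] = [];
  for (const location of locations) {
    try {
      // The first location that answers wins; later ones are never contacted.
      const body = await fetcher(location, key);
      return { location, body };
    } catch (err) {
      // Record the failure and fall through to the next location.
      errors.push(`${location}: ${String(err)}`);
    }
  }
  // Only reached when every location failed.
  throw new Error(`all locations failed for ${key}: ${errors.join("; ")}`);
}
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular storage client, and makes it easy to simulate an outage at one location.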
There are a few things you can do to help protect your website from becoming collateral damage during larger service outages.
Last week, we made an update so that new kits will load asynchronously by default. Your current kits may not do this, but it’s easy to get the new version from the kit editor. The result is that if there’s a service outage on our side (or our service provider’s), your fallback fonts will load right away — instead of having to go through the server timeout song and dance.
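To illustrate why non-blocking loading helps during an outage, here is a minimal sketch. It is not the actual kit embed code: `loadFontsOrFallBack`, `FontOutcome`, and the timeout value are hypothetical names chosen for the example. The page keeps rendering with fallback fonts while the load is pending, and the helper stops waiting once the timeout passes or the load fails.

```typescript
// Hypothetical helper: report "fallback" if the font load fails or takes too long.
type FontOutcome = "webfonts" | "fallback";

function loadFontsOrFallBack(
  loadFonts: () => Promise<void>,
  timeoutMs: number,
): Promise<FontOutcome> {
  return new Promise<FontOutcome>((resolve) => {
    // Stop waiting after timeoutMs so the page never hangs on the font host.
    const timer = setTimeout(() => resolve("fallback"), timeoutMs);
    // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
    Promise.resolve()
      .then(loadFonts)
      .then(
        () => {
          clearTimeout(timer);
          resolve("webfonts");
        },
        () => {
          clearTimeout(timer);
          resolve("fallback");
        },
      );
  });
}
```

A caller might pass a function that injects the kit script and resolves when it finishes loading, then use the outcome to decide which styles to apply.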
Speaking of fallback fonts, it’s never a bad idea to put a little thought into what you specify for fallbacks. They’re not just for outages, after all: You never know when someone’s connection speed or custom browser setup might impact the way your site loads or appears. Platform defaults don’t offer quite the typographic palette you might want (that’s probably why you’re using Typekit), but choosing them carefully still gives you a degree of design control. Check our Help documentation for a little detail about styling your fallback fonts using Font Events.
Many of our largest customers who use Typekit Enterprise weren’t affected by this outage, because they’re hosting the entire Typekit solution in their own CDN environment. We’re looking into ways to bring this self-hosting capability to more of our customers in the future. In the meantime, if you’re interested in learning more about Typekit Enterprise, you can contact our Enterprise sales team.
We’re always looking for ways to improve Typekit, and this episode was a powerful reminder of the responsibility we have to demand the highest possible uptime from our partners and vendors, and also to communicate in a timely and effective manner when issues do arise.
In general, if you notice performance issues with the fonts on your site, check our status blog for quick updates on the web font network and any degraded service reports. And as always, feel free to send an email to firstname.lastname@example.org if you have any questions.