Amazon S3/EC2/AWS outage take II
A few days ago I wrote about the Amazon outage of the popular S3/EC2/AWS services. Yesterday I received some more detailed information via Shahram that tried to explain what had happened. The messages below were posted on an Amazon bulletin board where they kept track of the issues.
And for those of you who don’t want to read through all of the stuff: they had the nicest problem one can have, a success disaster. Too many people were using the service, beyond its capacity. In this particular case it seems that a cryptographic sub-system could not handle all the requests that were thrown at it.
First message from an Amazon employee:
Quick note to keep everyone up to date. The team continues to be heads down focused on getting to root cause on this morning’s problem. One of our three geographic locations for S3 was unreachable beginning at 4:31 a.m. PST and was back to near normal performance at 6:48 a.m. PST (a small number of customers experienced intermittent issues for a short period thereafter). Though we’re proud of our uptime track record over the past two years with this service, any amount of downtime is unacceptable and we won’t be satisfied until it’s perfect. We will be providing additional information on this thread as soon as we have it.
Second message from an Amazon employee:
Here’s some additional detail about the problem we experienced earlier today.
Early this morning, at 3:30am PST, we started seeing elevated levels of authenticated requests from multiple users in one of our locations. While we carefully monitor our overall request volumes and these remained within normal ranges, we had not been monitoring the proportion of authenticated requests. Importantly, these cryptographic requests consume more resources per call than other request types.
Shortly before 4:00am PST, we began to see several other users significantly increase their volume of authenticated calls. The last of these pushed the authentication service over its maximum capacity before we could complete putting new capacity in place. In addition to processing authenticated requests, the authentication service also performs account validation on every request Amazon S3 handles. This caused Amazon S3 to be unable to process any requests in that location, beginning at 4:31am PST. By 6:48am PST, we had moved enough capacity online to resolve the issue.
As we said earlier today, though we’re proud of our uptime track record over the past two years with this service, any amount of downtime is unacceptable. As part of the post mortem for this event, we have identified a set of short-term actions as well as longer term improvements. We are taking immediate action on the following: (a) improving our monitoring of the proportion of authenticated requests; (b) further increasing our authentication service capacity; and (c) adding additional defensive measures around the authenticated calls. Additionally, we’ve begun work on a service health dashboard, and expect to release that shortly.
And a non-Amazon party (a company that uses the service) reported this:
What caused the problem however was a sudden unexpected surge in a particular type of usage (PUT’s and GET’s of private files which require cryptographic credentials, rather than GET’s of public files that require no credentials). As I understand what Kathrin said, the surge was caused by several large customers suddenly and unexpectedly increasing their usage. Perhaps they all decided to go live with a new service at around the same time, although this is not clear. What is clear however is that S3 was the momentary victim of its own success, but the problem was quickly rectified.
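The monitoring gap described in the second message can be illustrated with a small sketch. The cost weights, request counts and alert threshold below are made-up numbers for illustration, not Amazon's figures, and the names are hypothetical. The only point is that total request volume can stay flat while the share of authenticated (more expensive) requests climbs.

```python
from dataclasses import dataclass

# Hypothetical relative costs per request; the real figures are not public.
COST_ANONYMOUS = 1.0
COST_AUTHENTICATED = 5.0


@dataclass
class Sample:
    """Request counts observed in one monitoring interval."""
    anonymous: int
    authenticated: int

    @property
    def total(self) -> int:
        return self.anonymous + self.authenticated

    @property
    def authenticated_share(self) -> float:
        # Fraction of all requests that needed cryptographic checks.
        return self.authenticated / self.total if self.total else 0.0

    @property
    def weighted_load(self) -> float:
        # Rough load estimate: authenticated calls count more per call.
        return (self.anonymous * COST_ANONYMOUS
                + self.authenticated * COST_AUTHENTICATED)


def should_alert(sample: Sample, max_share: float = 0.5) -> bool:
    """Alert on the proportion of authenticated requests, not just volume."""
    return sample.authenticated_share > max_share


if __name__ == "__main__":
    normal = Sample(anonymous=8000, authenticated=2000)
    surge = Sample(anonymous=3000, authenticated=7000)
    for name, s in (("normal", normal), ("surge", surge)):
        print(name, s.total, s.weighted_load, should_alert(s))
```

Both samples have the same total volume, so a volume-only monitor sees nothing unusual, while the weighted load roughly doubles in the surge case.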
Tasty planner
I have a tendency to go grocery shopping about 2-3 times a week. That’s way too often (in my opinion) and shows that I have no plan. I usually think 1-2 dinners ahead, but I’m certainly not in a position where I have the whole week planned. I need help! Instead of getting a personal chef, I decided to get things organized differently. Tasty Planner to the rescue!
Tasty planner seems to be pretty new (judging from the few blog entries). They already have a nice recipe collection, though by far not as many recipes as other recipe sites out there. What differentiates them is how easy they make it to generate a weekly plan and the accompanying shopping list. Browse recipes, assign them to days of the week and, voila, you get a shopping list with all the ingredients for the whole week. The one feature that really distinguishes them from the rest of the crowd: an iPhone interface for the shopping list. Whatever ingredients you selected for the week, the list is readily available on the iPhone while you are shopping. You can even check off items as they go into your shopping cart. Brilliant!
I’ll let you know how things are going once I have used them for a while.
On the wishlist for the service are:
- automatic sorting of grocery items (group all veggies, fruit, etc.)
- nutritional information for recipes (wouldn’t it be great to actually have a service where you can provide a calorie count and it would select suitable recipes automatically?)
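For what it's worth, the first wish does not look hard to build. Below is a minimal sketch of how a weekly plan could be turned into a shopping list grouped by category. Everything in it is an assumption for illustration: the category table, the function name and the data layout are invented, and it ignores units entirely. It says nothing about how Tasty Planner actually works.

```python
from collections import defaultdict

# Hypothetical category lookup; a real service would need a far larger table.
CATEGORIES = {
    "carrot": "vegetables",
    "onion": "vegetables",
    "apple": "fruit",
    "milk": "dairy",
}


def build_shopping_list(week_plan):
    """Sum ingredient quantities over all planned recipes, grouped by category.

    week_plan maps a day name to a recipe, and a recipe maps an
    ingredient name to a quantity.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for recipe in week_plan.values():
        for ingredient, quantity in recipe.items():
            category = CATEGORIES.get(ingredient, "other")
            totals[category][ingredient] += quantity
    # Sort categories and items so the list reads group by group.
    return {
        category: dict(sorted(items.items()))
        for category, items in sorted(totals.items())
    }


if __name__ == "__main__":
    plan = {
        "Monday": {"carrot": 2, "milk": 1},
        "Tuesday": {"carrot": 1, "apple": 3, "rice": 1},
    }
    for category, items in build_shopping_list(plan).items():
        print(category, items)
```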
Overall, the site is easy on the eye and well done.
Footprints? Footprints!
What’s pictured above are some footprints right next to the stairs at the back of the house. The problem with those: they are fresh (max a few days old), they lead up the stairs and down again and they are not mine! I went on the terrace to clean Pia’s boots and while brushing them I noticed the prints in the mud/sand next to the stairs. I tried to recall whether I (or somebody else I know) could have been there. Looks like a size 10-11 to me and I’m much closer to a 13. I have to assume that somebody I don’t know was walking around in the back of the house.
You ask why this is worth mentioning? If you saw the gaping hole in a glass door of my neighbor’s home (unoccupied) you would have to agree that this is indeed worth posting. His house got broken into about 2 weeks ago. The thieves only took a few items, because there was not a lot to be had there.
I’m so happy that I work out of my home and that I’m in fact home most of the time. My dogs will also get some extra treats later on. And it’s time to extend my surveillance strategy: inside the house I have a few wireless cameras. Whenever I’m gone those cameras switch into motion detection mode and automatically email pictures to me when motion is detected. I think I’ll have to extend that to work outside as well … sigh!
No power and a missile range
Friday’s dinner at Santa Cafe was quite interesting. When we arrived there we were greeted with a candlelit bar - only candles and nothing else. The bar lady told us that a prior power outage had left parts of the restaurant without power, including the bar. Soon after we ordered our dinner, the power went out completely. As we had ordered our entrees already, we were assured that the kitchen would complete all placed orders, but no new orders were accepted. Now imagine a very busy and small kitchen operating without exhausts. All doors to the kitchen were opened and windows all over the restaurant were used to try to get the smoke out of the restaurant. It almost felt like having dinner on the patio, only the temperature was a bit lower than expected. Like the Ore House, quoted in the article above, I’m sure tons of other businesses in town lost a lot of money that evening.
Then yesterday afternoon I made the mistake of going on Guadalupe Street after doing some grocery shopping. I’m talking about the section between Cerrillos Road and Paseo de Peralta. I’m sure you have heard about the dismal state of Guadalupe since the pavement was stripped just before Christmas. If not, then just read these letters to the editor of the Santa Fe New Mexican.
Things have gotten 10 times worse since then. Driving at 4-5 mph I was rocked as if on a roller coaster. The whole area feels like a missile test range with impact craters all over the place. If you love your car, don’t go on this road.
Won’t take long before the area around Agua Fria looks like this:

NPR: Dissecting People’s ‘Predictably Irrational’ Behavior
NPR: Dissecting People’s ‘Predictably Irrational’ Behavior - this was making the rounds in the office earlier today. A fascinating view into the way the human mind works when it comes to making economic decisions. I hope that not a lot of marketing people are listening to that program …
andLinux: a new breed of Linux distribution
When people are asked why they haven’t tried out Linux yet, quite often you hear arguments like: “I don’t want to install another disk drive”, “I don’t want to mess with my Windows partition”, etc. To a certain extent those are valid arguments, especially if you are scared of messing with your computer’s hardware or some of the more geeky aspects of the software environment.
While there was always the possibility to “get your feet wet” using virtualization software like VMWare, where the software simulates a virtual machine that is completely independent from the host operating system (think “computer inside a computer”), now there’s an even easier method.
Let me introduce andLinux. andLinux is based on the work that went into the “Cooperative Linux” project. Instead of relying on virtualization technologies, coLinux managed to compile the Linux kernel in a way that allows it to run side-by-side with the host OS. No virtualization required, no messing with your disk layout, no additional software required. andLinux installs just like any other Windows application. All you need is a nice chunk of disk space. Unlike coLinux, andLinux takes care of some of the scary parts of the coLinux configuration. It will automatically configure network connections correctly and even share your Windows filesystem with the Linux portion. The latter allows you to access all your Windows files from Linux.
I tried the KDE-version of andLinux the other day and just wanted to write a bit about my experience.
After downloading some 665MB I ended up with a single installer executable.
The installation process was painless and I quickly answered the few questions that came up during the installation: you tell it where to install it, you tell it how much memory you want to dedicate to your Linux “computer”, if and how you want to share the Windows filesystem (I picked the easier coFS option) and off you go.
During the installation a new network driver was copied to the system (TAP-Win32), which required a restart once the installation completed. The network driver allows communication between the Windows OS and Linux OS as if those systems resided on a different network and were connected via Ethernet-cables.

The installer left three shortcuts on the desktop and two quickstart icons in the taskbar (pictured above, I dragged the two items from the taskbar [right-most] on the desktop).
After the restart I also found a new item in my system tray (pictured on the right). Those items in the systray-menu allow you to execute a bunch of the most often used applications directly from there. Once andLinux is running on your system, you right-click on the tray-icon, select the application and it will be run under Linux.
So I finally went ahead and started andLinux via the “Start andLinux” shortcut on the desktop. A new console window (labelled “andServer (CoLinux)”) opened and I saw messages scrolling by that documented the startup process (you’ll see that console window further down in this post).
From the startup messages you can see that I opted to reserve up to 384MB of my memory for andLinux. The documentation suggests that you should not use less than 256MB, however under certain circumstances you might even get away with 128MB (expect the performance to become sluggish at that value).
Linux’ root file system is in a single file called “base.drv”. Extra swap space is created in a separate file called “swap.drv”. During the startup process andLinux will attach to those files and treat each of them like a “disk”.
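If the idea of a file acting as a disk sounds odd, here is a toy sketch of the concept. It is not how coLinux implements its block devices; the class name, the 512-byte block size and the file name are assumptions chosen for illustration. It only shows that fixed-size blocks inside an ordinary file can be addressed by number, which is all a "disk" really has to offer.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this illustration


class FileBackedDisk:
    """Expose a regular file as an array of fixed-size blocks."""

    def __init__(self, path, num_blocks):
        self.path = path
        self.num_blocks = num_blocks
        # Create the image file at its full size if it does not exist yet.
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.truncate(num_blocks * BLOCK_SIZE)

    def _check(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError(f"block {index} out of range")

    def read_block(self, index):
        self._check(index)
        with open(self.path, "rb") as f:
            f.seek(index * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)

    def write_block(self, index, data):
        self._check(index)
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        with open(self.path, "r+b") as f:
            f.seek(index * BLOCK_SIZE)
            # Pad short writes with zero bytes so blocks stay aligned.
            f.write(data.ljust(BLOCK_SIZE, b"\0"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        disk = FileBackedDisk(os.path.join(folder, "demo.drv"), num_blocks=8)
        disk.write_block(3, b"hello")
        print(disk.read_block(3)[:5])
```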
As you can see from the startup messages, andLinux also configures two network adapters (eth0 and eth1). The first one is used to “bridge” Linux networking with your real network adapter on Windows - this allows Linux to reach the Internet. The second one is used as a communication device between Windows and Linux and that’s the place where the above mentioned TAP-Win32 driver comes into play.

With the andLinux server running I clicked on the “KHomeFolder” shortcut and within a second I had KDE’s Konqueror window with root’s home folder on my screen.

I tried a number of other applications and the performance was very nice. Overall the applications felt very responsive and I never got the impression that they were anything other than native Windows applications.
Behind the scenes a special version of the X11 server, Xming, is used to accomplish that. Xming is launched when the andLinux server starts up. It creates an X11 screen that coexists with the Windows desktop. There’s no switching between virtual screens; everything looks like it’s part of your standard Windows installation.
After starting and closing a number of applications, I was curious what Linux’ memory consumption looked like. To my surprise I found the following:

I still had 280MB (from the total of 384MB) physical memory available - nice!
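If you want to check numbers like these yourself from inside the Linux side, one option is to read /proc/meminfo. The sketch below is just one way to do it and not necessarily the tool behind the figures in this post; the function names are made up for the example. Note that the kernel usually reports a total slightly below the configured amount.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def free_megabytes(text):
    """Return (total, free) memory in whole megabytes."""
    info = parse_meminfo(text)
    return info["MemTotal"] // 1024, info["MemFree"] // 1024


if __name__ == "__main__":
    # Only works on a Linux system, where /proc/meminfo exists.
    with open("/proc/meminfo") as f:
        total_mb, free_mb = free_megabytes(f.read())
    print(f"{free_mb} MB free of {total_mb} MB")
```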
Network connectivity worked right after the start. Where I previously had to tweak and configure settings in coLinux, andLinux made that portion a snap and configured everything for me.

Internet access from Linux applications worked as expected and my Windows file-system was automatically mounted (you can see it above in the Konqueror screenshot).
All in all a more than pleasant experience.
I guess there’s no more excuse now not to give Linux a try. The only thing that makes you scratch your head: why would you want to run a more stable operating system under a less stable one? But, I guess, that’ll be subject of another post.
Convert those HD-DVDs to Blu-Ray
Now that Toshiba has officially announced the death of HD-DVD, what do you do with all those HD-DVD discs you bought already? Fear not! Venture out, get a Blu-Ray player and then use Wired’s (doom9-lifted) instructions to convert those movies over to Blu-Ray. And the discs can be used as decorative coasters.
“To my mind Adobe is quickly passing Google in both the beauty and usefulness of its online apps.”
From Scott Fitzgerald Johnson’s blog - I’m taking off for the rest of the day (it’s time to pick up Pia anyway).