Load spike

Posted on October 29, 2008
Filed Under Incidents

Earlier this afternoon, Melvin experienced a slight load spike due to an exorbitant number of Apache processes running on the server. During this time, visitors to hosted domains received a 500 Internal Server Error when accessing the website(s). We restarted the Apache process and the load returned to normal.

While websites were not “down” in a traditional sense, there was a period of about 30 minutes where users would experience the above error.

No data was lost.

About that downtime…

Posted on May 15, 2008
Filed Under Incidents, Updates

Before I get started with the whys and wherefores, let me take this opportunity to personally apologize to all of you who’ve been affected by our recent downtime. And let me make it clear that the purpose of this item is not to point fingers, but to give a transparent explanation of what happened and why, and how we’re going to stop this sort of problem from happening in the future.

Background

On 12 May 2008, we initiated a request to our datacenter to upgrade the operating system on our primary server (melvin.smilingpeanut.com) to Fedora Core 8. (We had previously been running FC4.) We were told this would be routine and that backups and transfers of data would be taken care of with minimal involvement on our part. That said, we made sure to back up core data such as user files, databases and mail on our own just in case. The update was to commence on 14 May 2008 in the very early hours of the morning so as to impact our customers as little as possible.

Read more

26 hours later…

Posted on May 15, 2008
Filed Under Incidents, Updates

And we’re back. As I write this, techs in two states are restoring access to the websites hosted on melvin.smilingpeanut.com one at a time. We’ll have more updates as things progress.

UPDATE @ 0225 MST: We have verified that all websites hosted on Melvin are functional.  For those interested, we will post a timeline as well as a thorough explanation of the day’s events later today.  But for now, we need some rest.

Slight downtime on the morning of 10/3

Posted on October 3, 2007
Filed Under Incidents

Early this morning, Melvin experienced approximately 51 minutes of downtime, starting at 6:52am MST. As soon as our monitoring service notified us, we sent a request to the data center to reboot the server.

After the reboot, all services returned to normal. No data loss was reported.

Slight downtime this morning

Posted on May 17, 2007
Filed Under Incidents

After more than one hundred days of uptime, our primary server, Melvin, experienced 58 minutes of downtime starting at 2:57am on 17 May 2007. As soon as our monitoring service notified us, we sent a request to the data center to reboot the server.

After the reboot, all services returned to normal. No data loss was reported.
