lord-inflight

06 Sep 2008

MaePPV Incident Summary

Site News

As some of you have noticed, my SecondLife “web 2.0”-style video sharing site, MaePPV, has been down for a few days.
Here’s my summary of the incident.

What happened?
My host, Velcom, decided to relocate the server. They sent this e-mail (which I never got, but we’ll talk about that later):

Dear Loyal Customer,

On August 26, 2008 (Tuesday) the following virtual servers will be migrated to new and improved server farm:

da1.velcom.com (209.67.60.133)

As the result there will be a disruption in service. Expected downtime is 16-24 hours. Throughout the maintenance window your site(s), mail and other related services will be inaccessible.

All your domains pointed to Velcom DNS servers, i.e. ns1.velcom.com and ns2.velcom.com, will be automatically transferd to the new IP address. If you have control over the DNS zone yourself you will be required to update the IP address in your IN A record. Also, if your website utilizies SSL and is bound to a Static IP, please contact our technical support team for more information on your new IP address. This maintenance is strongly required to improve future service and prevent failure.

We do apologize for the inconvenience and thank you for your continued support and patience.

I didn’t find out about this until the 27th, when I filed a support ticket. I never got the e-mail, because it seems their support mailserver is in a spam blocklist. Anyhow, bit of downtime, no big deal. I told them about the spam blocklist and asked to be informed when the IP changed. They got themselves removed and promised to let me know.
A few days later, on the 31st, I checked back with my Velcom control panel and noticed the IP had changed. No e-mail about it… because apparently they’re back on the blocklist already. Good stuff. So, I go and update my DNS records, flush my caches, and connect… to a 404.
Wait, what? So I log in by SSH.

Could not chdir to home directory /home/maeppv: No such file or directory

Uh oh.
I never got an answer as to exactly what happened, and I’m not sure Velcom even knows. But apparently, somewhere in transit, my site fell off the server. By the time the move was finished, it was completely gone.
Their response to this problem was as follows:

We are really sorry, but we don’t have any backups for your account. We did our best, but we were not able to find it.

Now your account is active and you may start uploading content for your site.

Please take our sincere apologies for this inconvenience and please let us know if we can be of any further assistance. We will be glad to help you!

What does this mean for MaePPV?
Unfortunately, it’s bad. Very bad. I’m on a limited-bandwidth connection, and can’t afford to make regular backups of the whole site. I have (most of) the code, and an old copy of the database from when I was moving from Hostmonster to Velcom, but that’s it.
Not a single movie file survived.

So what now? Roll over and die?
Not likely! Since I’m doing a lot of stuff over, I can do a better job of it this time. I’m rebuilding the site with a better, more reliable architecture, something that couldn’t be done normally because it would have meant losing all the movies.
The movies are now mirrored on two redundant servers, one in Utah and one in Toronto. I plan to add a feature to select the one closer to you for better speed, but that’s not done as of this writing.
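Since that feature isn’t written yet, here is only a rough sketch of one way it might work: time a TCP connection to each mirror and pick the quicker one. The hostnames are placeholders, and `tcp_connect_time` and `pick_fastest_mirror` are names made up for this example, not anything in the real site. The timing also has to be taken from the viewer’s side to mean anything, which this sketch doesn’t deal with.

```python
import socket
import time


def tcp_connect_time(host, port=80, timeout=2.0):
    """Return the seconds taken to open a TCP connection, or infinity on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Unreachable mirrors sort last instead of raising.
        return float("inf")


def pick_fastest_mirror(mirrors, probe=tcp_connect_time):
    """Return the label of the mirror whose host answers fastest.

    `mirrors` maps a label (e.g. "utah") to a hostname. `probe` is any
    callable that takes a hostname and returns a time in seconds.
    """
    if not mirrors:
        raise ValueError("no mirrors configured")
    timings = {label: probe(host) for label, host in mirrors.items()}
    return min(timings, key=timings.get)


# Hypothetical hostnames, for illustration only.
MIRRORS = {
    "utah": "utah.example.invalid",
    "toronto": "toronto.example.invalid",
}
```

Calling `pick_fastest_mirror(MIRRORS)` would return `"utah"` or `"toronto"` depending on which one connects faster from wherever the code runs.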
There’s also a third “manager” server which has the site and DB, but doesn’t store movies. I’m running this on a VDS and managing it myself, with daily backups. Uploads go to this server, which checks them, encrypts them, and pushes them out to the view servers. Potentially, it could also convert common non-SL-compatible formats, something a lot of you have been wanting. This is also high on my to-do list.
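To make the upload flow a bit more concrete, here is a minimal sketch of the check, encrypt, push sequence. It is not the site’s actual code. The size limit, the function names, and the use of local directories as stand-ins for the two view servers are all assumptions, and the encryption step is passed in as a callable because the real cipher isn’t described here.

```python
import hashlib
from pathlib import Path

MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # hypothetical limit, not the real one


def check_upload(data):
    """Reject empty or oversized uploads (a stand-in for the real checks)."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("upload too large")


def process_upload(name, data, encrypt, mirror_dirs):
    """Check an upload, encrypt it, and write a copy to every mirror.

    `encrypt` is any callable mapping bytes to bytes. `mirror_dirs` are
    directories standing in for the view servers. Returns the SHA-256
    hex digest of the encrypted bytes.
    """
    check_upload(data)
    encrypted = encrypt(data)
    digest = hashlib.sha256(encrypted).hexdigest()
    safe_name = Path(name).name  # drop any directory part of the name
    for mirror in mirror_dirs:
        target = Path(mirror) / safe_name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(encrypted)
        # Read the copy back so a bad write is caught right away.
        if hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            raise OSError(f"copy on {mirror} does not match")
    return digest
```

Checking each copy against the digest means a bad write to one mirror raises an error right away, rather than going unnoticed until that mirror is the only copy left.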

Do I get anything?
– I’ve given every account that was restored 10,000 free credits.
– Because a lot of accounts (any created within the last few months) got lost, for a limited time, all new accounts start with 10,000 free credits instead of 1,000.
– To encourage repopulating, uploads are temporarily free.

As of this writing, the new site is mostly coded and being tested. I’m writing this as I’m waiting for an upload to finish. Assuming nothing blows up horribly, expect it to be public again within a couple of days.

This entry was posted on Saturday, September 6th, 2008 at 2:10 AM and is filed under Site News.
