DFF Updates for Hurricane IKE Aftermath in Houston
(update 9/24/2008 10:15pm CST)
Finally, full power has been restored to our office building. The building's main AC units are on, and the temperature is back to a cool 70 degrees and still going down.
So far we have had 2 hard drive failures, 1 hard drive about to fail, 1 power supply failure and 1 motherboard failure, all due to heat.
I have restored minimal service as quickly as possible today.
What you can expect in the next 3 days:
1. We will replace our firewall hard drive, which is about to fail (about 10 minutes of downtime at most).
2. We will upgrade our bandwidth to 6 Mbps, burstable to 10 Mbps, within 3 days (about 2 - 3 hours of downtime at most).
3. Data will be refreshed within 24 - 48 hours with the new English-language-aware search-vector search.
4. While we repair servers (replacing motherboards, power supplies, etc.) there will be no downtime, as we will rotate in spare servers.
Many of you have offered to host DFF while we were down. We really do appreciate the offers; however, the DFF server deployment is quite large.
I would like to take this opportunity to explain a bit about our structure for future reference. Our basic infrastructure for the "DFF US project" alone consists of:
- 3 web servers
- 1 load balancing server
- 3 memcache servers
- 8 active search PostgreSQL servers
- 8 spare / next data generating PostgreSQL servers
- 1 raw data PostgreSQL server
- 2 main DFF database PostgreSQL servers
- 2 product image servers (now offsite; will soon be brought inside our facility)
- 1 file server
Aside from the servers above, we also have the firewall, a couple of workstations running cron / scheduled tasks, and servers for other web sites.
As you can see, DFF is a large deployment and cannot be deployed in a colocation / shared hosting environment. But we do appreciate the offers.
I have learned a lot from IKE, and we will use this experience to improve our future emergency plan.
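For a rough sense of scale, the server list above can be tallied with a short Python sketch. The counts are copied from the list; the dictionary keys and function names are shortened labels made up for illustration, and the firewall, workstations and servers for other web sites are not included.

```python
# Server counts copied from the "DFF US project" list above.
# Role names are shortened, illustrative labels (an assumption, not official names).
DFF_US_SERVERS = {
    "web": 3,
    "load balancer": 1,
    "memcache": 3,
    "active search postgresql": 8,
    "spare / next data generating postgresql": 8,
    "raw data postgresql": 1,
    "main DFF database postgresql": 2,
    "product image": 2,
    "file": 1,
}


def total_servers(inventory):
    """Return the total number of servers across all roles."""
    return sum(inventory.values())


def postgresql_servers(inventory):
    """Return how many of the servers are PostgreSQL servers."""
    return sum(count for role, count in inventory.items() if "postgresql" in role)


if __name__ == "__main__":
    print("Total servers:", total_servers(DFF_US_SERVERS))
    print("PostgreSQL servers:", postgresql_servers(DFF_US_SERVERS))
```

By this count the project runs on 29 servers, 19 of them PostgreSQL servers.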
Okay... enough about this IKE aftermath... BACK TO WORK!!! :-)
(update 9/23/2008 2:40pm CST)
It has been 1.5 weeks since Hurricane IKE hit Houston. About half of greater Houston is still without power. Our building suffered major damage and only has partial power.
We have been very worried about the temperature in the server room, which is now at 100+ degrees Fahrenheit.
We experienced another server failure just a few minutes ago.
We have decided to shut down our servers temporarily in order to prevent further damage due to high temperature.
We do not know when FULL POWER or FULL AIR CONDITIONING power will be restored in our building.
I know power and AC may be restored any day now, as Centerpoint Energy is doing all they can...
We apologize for this "TEMPORARY" interruption of service. We will restore service as soon as we have better news.
(update 9/22/2008 11:24am CST)
We are still doing just okay. However, the office building we are in has suffered MAJOR damage and is still not ready.
The building / data center is underpowered, with only enough electricity to keep minimal power on.
There is not enough to power the air conditioning for the entire building.
As a result, the servers have been hot as an oven for 3 - 4 days already. NOT GOOD!!!!
We are still waiting on the power repair people, who are the busiest / most important people in Houston right now.
(update 9/17/2008 4:45pm CST)
Thank you, everyone, for your comments and support.
We were able to work on the servers (on-site) for a couple more hours. In-person access to the building is still limited until Friday.
Anyway, we found many servers in bad shape: the electricity going on-off-on-off created fragmentation on the hard drives.
We have a couple of database servers we have to switch out for new, faster hardware. We may as well take this opportunity to do so.
We have 1 web server down due to a corrupted partition; we should be able to get it back up by Friday afternoon.
All in all we still feel lucky. Many offices have blown-out windows, while our windows and doors have only minor water leaks from the outside wind and water pressure: just signs of water seeping through, nothing more.
(update 9/16/2008 12:45pm CST)
First, we would like to apologize for the interruption of service at DFF.
Hurricane Ike hit Houston, Texas dead on at 4:00am on Sept 13. The hurricane was a category 2 - 3 with 100+ mph winds.
The City of Houston suffered major damage. The Hurricane Ike tragedy has been declared the 3rd most costly disaster, behind hurricane Katrina 3 years ago.
Even after 3.5 days, half of Houston is still without power and there is a shortage of gasoline everywhere.
Also hit hard was Baytown, Texas, where some of the nation's largest oil refineries are located. YES, you will see increased gas prices all throughout the nation.
Galveston Island ... well let's just say, they have my deepest sympathy. I know many people, friends and businesses there. Galveston suffered tremendous damage.
Our Building and Data Center
The office building we are in suffered MAJOR damage. Building authorities reported that "almost all equipment on the roof is either gone or damaged," which includes the air conditioning equipment.
There are many blown-out windows and doors, and debris is everywhere. In summary, the building is not in good shape at all.
Our office is on the 8th floor of this 20-story office building. I was given special access to enter the building along with some other repair personnel and technicians, just for a short period of time.
All our equipment is intact and there was no visible damage. I was able to restore power to as many servers as possible that had no immediate issues. Some servers are still down and need to be worked on because power has been erratic during and after the storm.
We will do our best to restore service as quickly as possible working remotely today. I hope to be able to fully access the servers tonight / tomorrow to restore the rest of our servers.
Thank you for your patience as we restore our service.