I suppose I'll Weasyl crosspost...
12 years ago
General
https://www.weasyl.com/profile/lynceusglaciermaw
Frankly I'm rolling my eyes at everyone who's suddenly jumping ship to run to Weasyl, but I created an account there for crossposting and to keep tabs on the artists I watch who are running away from here. I can't say I know much about web hosting, but I am an IT tech and I can tell you: Hardware fails. And it can take time for new hardware to be put in place.
One power failure once resulted in 72 hours of downtime for my work's e-mail server. And that was just recovering corrupted databases and restarting the servers; no hardware was replaced. So that was an EASY one.
Calm down, people.

I'm an engineer for the leading web application delivery controller (load balancer) company. I specialize in web app security but also deal with optimization on all layers, and I will be working on acceleration early next year.
FurAffinity's problem does not hinge primarily on hardware failures. It should be noted that the hard disks that failed were on the backup DB server, which was unused until this week, and not on the main DB server. The issue is mostly the scalability of a single notification table: a user with a large following creates tens of thousands of notification rows every time they post anything, and the table has grown to billions of rows and some hundreds of gigabytes, which makes effective indexing, caching and even writing data difficult.
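To make the scaling point concrete, here is a rough sketch of what "one row per watcher per post" looks like. FA has not published its schema, so everything here is an assumption for illustration: the `Notification` fields, the `fan_out_on_write` helper, and the figure of 20,000 watchers are all made up.

```python
from collections import namedtuple

# Hypothetical row shape; the real table's columns are not public.
Notification = namedtuple("Notification", ["recipient_id", "submission_id"])


def fan_out_on_write(table, submission_id, watcher_ids):
    """Append one notification row per watcher to a single shared table."""
    for watcher_id in watcher_ids:
        table.append(Notification(watcher_id, submission_id))
    return len(watcher_ids)


# One shared table for every user on the site (a plain list stands in for it).
table = []

# Assumed audience size for a popular artist; not a real FA figure.
watchers = range(20_000)

written = fan_out_on_write(table, submission_id=1, watcher_ids=watchers)
print(f"one post wrote {written} rows; table now holds {len(table)} rows")
```

Every post by a popular user costs as many writes as that user has watchers, and all of those writes land in the same table and the same indexes as everyone else's.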
The solution, in theory (backed by proven use of this kind of technology at Google, Twitter, Facebook, etc.), would be to use a non-atomic, non-relational database to store notifications. This will require a rewrite of the notification code. I estimate that to involve about 3k LoC, and it should take one to three weeks with a competent team. Unfortunately FA only has Yak, and he has little time and little experience with non-relational databases.
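As a very rough sketch of the idea, notifications could be keyed by recipient so that each user's inbox is read, written and cleared on its own, with no cross-user transaction. This is only an in-memory stand-in under my own assumptions, not FA's code and not any particular database's API: `NotificationStore`, its methods and the per-user cap are all hypothetical.

```python
from collections import defaultdict, deque


class NotificationStore:
    """In-memory stand-in for a key-value store keyed by recipient id."""

    def __init__(self, max_per_user=1000):
        # Assumed policy: keep only the newest max_per_user items per inbox.
        self._inboxes = defaultdict(lambda: deque(maxlen=max_per_user))

    def push(self, user_id, item):
        """Add one notification to a single user's inbox."""
        self._inboxes[user_id].append(item)

    def fan_out(self, watcher_ids, item):
        """Deliver one item to many inboxes; each write is independent."""
        for user_id in watcher_ids:
            self.push(user_id, item)

    def inbox(self, user_id):
        """Return a user's notifications, oldest first, touching only their key."""
        return list(self._inboxes.get(user_id, ()))

    def clear(self, user_id):
        """Drop a user's whole inbox in one operation."""
        self._inboxes.pop(user_id, None)


store = NotificationStore(max_per_user=2)
store.fan_out([1, 2], "submission 10")
store.push(1, "submission 11")
store.push(1, "submission 12")
print(store.inbox(1))
print(store.inbox(2))
```

The write cost per post is still one entry per watcher, but reads and deletes only ever touch one user's key, so there is no single giant table or index for every request to contend on.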
On the hardware side, it is distressing that FA is using drives that are being killed by DB load. IIRC they are either not using the right type of hard drives for the job or they have been shipped bad drives. Normally this is manageable, but apparently it has become such a problem that they don't have sufficient backup hardware available. Trapa, a storage engineer I know, has offered to help and has been turned down (as have I). Because FA does not reveal much in the way of technical details, the exact cause of these problems must be inferred from things like the code that was leaked in 2007, some from 2009, and other odd sources.
A third point I have is that people need to diversify, lest they get caught in the next Yerf or VCL.