Server Poopage
10 years ago
General
Just wanted to give an update on what happened last night.
We've been making a lot of tweaks and improvements to our systems' back end to support higher resolution uploads and lift the file data cap (which will help musicians). We've been migrating the site's assets over to a new server, and were planning on switching to the new system (a 48TB file storage system) some time today.
While moving the data over AND keeping the site live, we ran into a file system issue that caused the system to become unresponsive under the added load. My assumption is that one of the drives in the system started experiencing a fault but wasn't reporting it. The drive kept trying to recover, and the system kept waiting for it to become responsive again until it eventually timed out.
This caused the server itself to become unresponsive, and I had to drive out to the colo facility and reset it. This was an older server (one of our last) without dedicated remote management, which compounded things a bit. Attempts to reboot the server remotely failed because the hardware issue had left FreeBSD unresponsive.
Our plan was to move the data over, then retrofit the older server with higher capacity drives to act as an updated mirror. I got home at 10am this morning and crashed on the spot.
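For the curious, this is roughly the kind of check I'd like the new setup running on a schedule, so a silently faulting drive gets flagged before it hangs the whole box. Just a rough sketch in Python, assuming smartmontools is installed; the device names and timeout below are placeholders, not our actual layout.

#!/usr/bin/env python3
# Rough sketch: poll SMART health so a quietly failing drive gets noticed
# before it hangs the system. Device names are placeholders.
import subprocess

DRIVES = ["/dev/ada0", "/dev/ada1"]  # hypothetical FreeBSD device names
TIMEOUT = 30                         # seconds; a healthy drive answers almost instantly

def check_drive(dev):
    """Return True if smartctl reports the drive as healthy."""
    try:
        # smartctl -H prints the overall SMART health assessment; a non-zero
        # exit code or a hang is treated as a failing/unresponsive drive.
        result = subprocess.run(["smartctl", "-H", dev],
                                capture_output=True, text=True, timeout=TIMEOUT)
        return result.returncode == 0 and "PASSED" in result.stdout
    except subprocess.TimeoutExpired:
        return False  # the drive is hanging, which is exactly the failure mode we hit

if __name__ == "__main__":
    for dev in DRIVES:
        status = "OK" if check_drive(dev) else "SUSPECT, check/replace"
        print(dev + ": " + status)

In practice something like this would run from cron and alert someone the moment a drive stops answering, instead of us finding out when the whole server locks up.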

What brand of drives did you go with? I read the study by Backblaze and it was interesting about Seagate. https://www.backblaze.com/blog/hard.....ility-q3-2015/
10GbE SANs with Red or RE drives would be a nice feat.
I stick to the theory that your breadporn broke FA.
Thanks.
Thanks for working so hard for us c:
I think supporting webm video would be the best route. Unfortunately, I don't know what to do as an alternative for interactive Flash.
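Not that I know FA's upload pipeline, but the webm side would mostly be a transcode step at upload time; something along these lines, assuming ffmpeg is available on the server (the filenames and quality settings here are just placeholders):

#!/usr/bin/env python3
# Placeholder sketch: transcode an uploaded clip to webm (VP9 video + Opus audio).
# Assumes ffmpeg is on the PATH; paths and quality settings are made up.
import subprocess

def to_webm(src, dst):
    """Transcode src to a webm file using VP9 and Opus."""
    subprocess.run(["ffmpeg", "-y", "-i", src,
                    "-c:v", "libvpx-vp9", "-crf", "32", "-b:v", "0",
                    "-c:a", "libopus", "-b:a", "96k",
                    dst],
                   check=True)

if __name__ == "__main__":
    # Hypothetical filenames; in practice this would run on each upload.
    to_webm("upload.mov", "upload.webm")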
The hard drives will also need to be upgraded to accommodate the changes. It comes down to whether it's cheaper to upgrade the existing server or to buy a new one to mirror the main file server.
Stop being damn lazy, turn FA off for a good few months or so, and update every little node of tech FA really needs. Maybe then FA will finally stop having all these issues due to negligence.
Shutting the site down for a few months (and screwing over EVERYONE in the process) won't make new code, or new servers, appear any faster. You're literally suggesting we tell the entire community to screw off (my words, not yours) to fix a bunch of issues, then turn it back on with the magic assumption that everyone's just going to wait patiently by the wayside while we shut down the entire site.
FA has problems, but we're working to resolve those problems. We're actively re-coding the site, we're fixing issues with the UI, and we're going to keep the site up and running while we do it. Yes, there will probably be some downtime once we have to transition to the new services when they're ready, but to shut down a community that has several hundred thousand active users is just a slap in the face to our users.
FA has problems, but shutting down the entire community would be like setting your house on fire WHILE you remodel.
Example: FA's Phoenix Update was unnecessary, dreadful-looking, and lacking in any real content that would be considered "needed".
During its development, you could have used that time, money, and effort to fix the systems and hardware to help FA in the long run. Because remember what our moms and dads told us when we were kids? "Your needs should always take priority over your wants."
If FA were to shut down for even just a month, sure, people would whine at first. But at least when they came back, FA would be in top shape, with many bugs ironed out and the hardware stabilized.
You don't have to like me, but I have to put our users' needs before mine. I'd love to reboot FA from the ground up, and that's what we're doing, but I'm not going to screw over every last user of this site to do it.
...you're going to "reboot" FA from the ground up, and that's what you're going to do? Now you're contradicting yourself. So let me get this straight:
You'd take the long, excruciating path of wiping the whole site clean and starting fresh from the ground up, over simply fixing what NEEDS to be fixed and having a happy ending?
You really need to get your act together, 'Neer. Now you're just screwing us over, which is exactly what you just said you wanted to avoid.
A) Stop being lazy and fix FA's hardware with the rest of the site's funding.
B) Hire/commission more tech members, legitimately and without scamming said tech member(s) out of funding.
C) Run maintenance every 2 weeks to ensure security and programming stability, instead of running it only after a problem has turned up and caused damage.
I wanted to ask something. I know everything submitted in the past couple of days is getting transferred, but what about any PMs that were sent during that time? I felt like I was supposed to get a few from some people today regarding commissions, so the fact that they haven't gotten through yet has me a bit worried.
FA stores its data separately. PMs are stored in a MySQL database, which is on a separate server. The 48TB server is just a network-attached storage server, primarily for storing submission files and maybe some backups.
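In other words, roughly this kind of split. This is purely an illustration: the actual schema, hostnames, and paths are whatever FA uses, not these.

#!/usr/bin/env python3
# Illustration only: PMs go into MySQL, submission files go onto the NAS.
# Connection details, table name, and paths are all hypothetical.
import shutil
import mysql.connector  # assumes the mysql-connector-python package

def store_pm(sender, recipient, body):
    # Messages live in the MySQL database on its own server.
    db = mysql.connector.connect(host="db.example", user="fa",
                                 password="secret", database="fa")
    cur = db.cursor()
    cur.execute("INSERT INTO pms (sender, recipient, body) VALUES (%s, %s, %s)",
                (sender, recipient, body))
    db.commit()
    db.close()

def store_submission(tmp_path, filename):
    # Submission files just get copied onto the big NAS mount.
    shutil.copy(tmp_path, "/mnt/nas/submissions/" + filename)

That's why submissions can be affected by the file server move while PMs keep flowing normally (or vice versa).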
I hope this finishes soon so I can keep looking at my friends' pics and keep working on their gifts.
https://i.imgur.com/peBiaTq.gif
Luckily for us, the drive that died only had unimportant VMs on it (VMs we could easily rebuild again and again). Had it been our domain controller, we'd have been royally fucked over.
I picture the "Colorado facility" as a tin shed somewhere out on the high plains...