Server Poopage
10 years ago
General
Just wanted to give an update on what happened last night.
We've been making a lot of tweaks and improvements to our systems' back end to support higher resolution uploads and lift the file data cap (which will help musicians). We've been migrating the site's assets over to a new server, and were planning on switching to the new system (a 48TB file storage system) some time today.
While moving the data over AND keeping the site live, we ran into a file system issue that caused the system to become unresponsive under the added load. My assumption is that one of the drives in the system started experiencing a fault but wasn't reporting it. The drive kept trying to recover, and the system kept waiting for it to become responsive again until it eventually timed out.
This caused the server itself to become unresponsive, and I had to drive out to the colo facility and reset it. This was an older server (one of our last) without dedicated remote management, which compounded things a bit. Attempts to reboot the server remotely failed because the hardware issue had left FreeBSD unresponsive.
Our plan was to move the data over, then retrofit the older server with higher capacity drives to act as an updated mirror. I got home at 10am this morning and crashed on the spot.
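For the curious, this is roughly the kind of check I'd like the new setup running on a schedule, so a silently faulting drive gets flagged before it hangs the whole box. Just a rough sketch in Python, assuming smartmontools is installed; the device names and timeout below are placeholders, not our actual layout.

#!/usr/bin/env python3
# Rough sketch: poll SMART health so a quietly failing drive gets noticed
# before it hangs the system. Device names are placeholders.
import subprocess

DRIVES = ["/dev/ada0", "/dev/ada1"]  # hypothetical FreeBSD device names
TIMEOUT = 30                         # seconds; a healthy drive answers almost instantly

def check_drive(dev):
    """Return True if smartctl reports the drive as healthy."""
    try:
        # smartctl -H prints the overall SMART health assessment; a non-zero
        # exit code or a hang is treated as a failing/unresponsive drive.
        result = subprocess.run(["smartctl", "-H", dev],
                                capture_output=True, text=True, timeout=TIMEOUT)
        return result.returncode == 0 and "PASSED" in result.stdout
    except subprocess.TimeoutExpired:
        return False  # the drive is hanging, which is exactly the failure mode we hit

if __name__ == "__main__":
    for dev in DRIVES:
        status = "OK" if check_drive(dev) else "SUSPECT, check/replace"
        print(dev + ": " + status)

In practice something like this would run from cron and alert someone the moment a drive stops answering, instead of us finding out when the whole server locks up.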

What brand of drives did you go with? I read the study by Backblaze and it was interesting about Seagate. https://www.backblaze.com/blog/hard.....ility-q3-2015/
10GbE SANs with Red or RE drives would be a nice feat.
I stick to the theory that your breadporn broke FA.
Thanks.
Thanks for working so hard for us c:
I think supporting webm video would be the best route. Unfortunately, I don't know what to do as an alternative for interactive Flash.
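Not that I know FA's upload pipeline, but the webm side would mostly be a transcode step at upload time; something along these lines, assuming ffmpeg is available on the server (the filenames and quality settings here are just placeholders):

#!/usr/bin/env python3
# Placeholder sketch: transcode an uploaded clip to webm (VP9 video + Opus audio).
# Assumes ffmpeg is on the PATH; paths and quality settings are made up.
import subprocess

def to_webm(src, dst):
    """Transcode src to a webm file using VP9 and Opus."""
    subprocess.run(["ffmpeg", "-y", "-i", src,
                    "-c:v", "libvpx-vp9", "-crf", "32", "-b:v", "0",
                    "-c:a", "libopus", "-b:a", "96k",
                    dst],
                   check=True)

if __name__ == "__main__":
    # Hypothetical filenames; in practice this would run on each upload.
    to_webm("upload.mov", "upload.webm")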
The hard drives will also need to be upgraded to accommodate the changes. It comes down to whether it's cheaper to upgrade the existing server or to buy a new one to mirror the main file server.
Stop being damn lazy, turn FA off for a good few months or so, and update every little node of tech FA really needs. Maybe then FA will finally stop having all these issues due to negligence.
Shutting the site down for a few months (and screwing over EVERYONE in the process) won't make new code, or new servers, appear any faster. You're literally suggesting we tell the entire community to screw off (my words, not yours) to fix a bunch of issues, then turn it back on with the magic assumption that everyone's just going to wait patiently by the wayside while we shut down the entire site.
FA has problems, but we're working to resolve those problems. We're actively re-coding the site, we're fixing issues with the UI, and we're going to keep the site up and running while we do it. Yes, there will probably be some downtime once we have to transition to the new services when they're ready, but to shut down a community that has several hundred thousand active users is just a slap in the face to our users.
FA has problems, but shutting down the entire community would be like setting your house on fire WHILE you remodel.
Example: FA's Phoenix Update was unnecessary, dreadful-looking, and lacking in any real content that would be considered "needed".
During its development, you could have used that time, money, and effort to fix the systems and hardware to help FA in the long run. Because remember what our moms and dads told us when we were kids? "Your needs should always take priority over your wants."
If FA were to shut down for even just a month, sure, people would whine at first. But at least when they came back, FA would be in top shape, with many bugs ironed out and the hardware stabilized.
You don't have to like me, but I have to put our users' needs before mine. I'd love to reboot FA from the ground up, and that's what we're doing, but I'm not going to screw over every last user of this site to do it.
...you're going to "reboot" FA from the ground up, and that's what you're going to do? Now you're contradicting yourself. So let me get this straight:
You'd take the long, excruciating path of wiping the whole site clean and starting fresh from the ground up, over simply fixing what NEEDS to be fixed and having a happy ending?
You really need to get your act together, 'Neer. Now you're just screwing us over, which is exactly what you just said you wanted to avoid.
A) Stop being lazy and fix FA's hardware with the rest of the site's funding.
B) Hire/commission more tech members, legitimately and without scamming said tech member(s) out of funding.
C) Run maintenance every 2 weeks to ensure security and programming stability, instead of running it only after a problem has turned up and caused damage.
I wanted to ask something. I know everything submitted in the past couple of days is getting transferred, but what about any PMs that were sent during that time? I felt like I was supposed to get a few from some people today regarding commissions, so the fact that they haven't gotten through yet has me a bit worried.
FA stores its data separately. PMs are stored in a MySQL database, which is on a separate server. The 48TB server is just a network-attached storage server, primarily for storing submission files and maybe some backups.
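In other words, roughly this kind of split. This is purely an illustration: the actual schema, hostnames, and paths are whatever FA uses, not these.

#!/usr/bin/env python3
# Illustration only: PMs go into MySQL, submission files go onto the NAS.
# Connection details, table name, and paths are all hypothetical.
import shutil
import mysql.connector  # assumes the mysql-connector-python package

def store_pm(sender, recipient, body):
    # Messages live in the MySQL database on its own server.
    db = mysql.connector.connect(host="db.example", user="fa",
                                 password="secret", database="fa")
    cur = db.cursor()
    cur.execute("INSERT INTO pms (sender, recipient, body) VALUES (%s, %s, %s)",
                (sender, recipient, body))
    db.commit()
    db.close()

def store_submission(tmp_path, filename):
    # Submission files just get copied onto the big NAS mount.
    shutil.copy(tmp_path, "/mnt/nas/submissions/" + filename)

That's why submissions can be affected by the file server move while PMs keep flowing normally (or vice versa).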
I hope this finishes soon so I can keep looking at my friends' pics and keep working on their gifts.
https://i.imgur.com/peBiaTq.gif
Luckily for us, the drive that died only had unimportant VMs on it (VMs we could easily rebuild again and again). Had it been our domain controller, we'd have been royally fucked over.
I picture the "Colorado facility" as a tin shed somewhere out on the high plains...