A Circuitous Route -- Kerosel's Journal -- Fur Affinity [dot] net

A Circuitous Route

12 years ago General

UPDATE: I didn't do adequate research and this tale -- as engaging as it is -- is only a half-truth. The big payoff is pretty much a bust. Check out the real story here, in a much less dramatic, much more defeatist style.

I installed Shorewall on my server over the weekend, and in the documentation, there was an option to make traffic return by the same route it entered. Because (it said) traffic does not have to follow the same path it took to get to you: it follows the routing tables. "Huh," I said. Interesting, but I was at a loss for how exactly that might happen. Of course life being what it is... I would get a personal example two days later.

I was tasked with transferring our website to new hardware. After more than a week of practice, development runs, testing and many e-mail exchanges with the contractor who keeps watch over our servers, I was ready. Both servers were connected to the public network, so the contractor could access them (and the current webserver would keep working), and to the private network, so I could copy files between them at gigabit speeds. By the way, SCP has become my new favorite tool. I don't know how I ever got along without it, but I digress...

Okay, so... maintenance window opens. Down the public interfaces, stop the web service, stop the database service and start copying the web-related files. That was a 10-minute affair and I kept thinking, "Tick tock tick tock..." Set the new server to use the same public IP address as the old (praying there wouldn't be ARP-cache issues). Copy the database, fix up permissions, bring up the public interface on the new server, start the database service, start the web service and... it lives! Both via the private address and the public URL. Total downtime: 20 minutes.

I told my boss and he suggested I try it from one of the newsroom iPads, which are on the open wireless. It's a physically separate network from the building network through a different ISP. And... it didn't load! Damn! Tried it on my phone (same wireless connection), no dice. Ping it from my phone... no answer. It's down! But how can it be down when I can hit it from my desk -- using the public URL/address -- and get the site?

I checked a few things on the server before trying to load Google's website and got an immediate "can not connect" notice. Could it be... why yes, the default route wasn't set. I had apparently used a blunt-force tool to bring up the interface and it had done just that... and nothing more. I added the default gateway to the routing table and no sooner had I pressed enter than the website popped up on the iPad. Fixed! And I had to bail for a dental appointment, so I called it good and dashed out the door.
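For the curious, here is a minimal sketch of how a missing default route could be spotted programmatically. It assumes a Linux server (Shorewall is Linux-only) on a little-endian machine, where the kernel exposes the routing table as text in /proc/net/route. The function name `default_gateways`, the interface names and all the addresses are made up for illustration; the sample tables are hand-written, not copied from the real server.

```python
import socket
import struct


def default_gateways(route_table_text):
    """Return (interface, gateway_ip) pairs for every default route found in
    text laid out like /proc/net/route (hex fields, little-endian host)."""
    found = []
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest_hex, gw_hex, mask_hex = fields[0], fields[1], fields[2], fields[7]
        # A default route has destination 0.0.0.0 and netmask 0.0.0.0.
        if dest_hex == "00000000" and mask_hex == "00000000":
            gateway = socket.inet_ntoa(struct.pack("<L", int(gw_hex, 16)))
            found.append((iface, gateway))
    return found


HEADER = "Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT\n"

# Hypothetical table right after the interface came up: only the two
# directly connected networks (203.0.113.0/24 and 10.0.0.0/24).
BROKEN_TABLE = HEADER + (
    "eth0 007100CB 00000000 0001 0 0 0 00FFFFFF 0 0 0\n"
    "eth1 0000000A 00000000 0001 0 0 0 00FFFFFF 0 0 0\n"
)

# Same table after adding a default route via the assumed gateway 203.0.113.1.
FIXED_TABLE = BROKEN_TABLE + "eth0 00000000 017100CB 0003 0 0 0 00000000 0 0 0\n"

print(default_gateways(BROKEN_TABLE))
print(default_gateways(FIXED_TABLE))
```

On a real Linux box the same function could be fed the contents of /proc/net/route directly. An empty result is the situation described above: the interface is up, but anything outside the connected networks is unreachable.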

But by the afternoon, I was getting some fridge-logic. Why had it worked from my computer? And from the Director of New Media's computer, too? The networking on the server was obviously broken: it couldn't even contact the internet without the default gateway. Why could I interact with it via the public address? By the evening, I had that light bulb moment and realized the strange truth.

Okay, so I request the website via the URL on my machine. It asks the domain's DNS server for the address; that server doesn't know it, so it asks our public name server, which does, and my machine gets back the public IP. Of course the public IP isn't on the private network and the request gets booted upstairs (literally) to the default gateway.

Puzzle-piece #1: Our gateway is a Cisco Catalyst managed switch that handles and is aware of both the private and public networks. When it gets the request for the web server on the public network, it simply routes the packets from the private to the public network. It seems strange, but traffic from our workstations going to our public-facing servers never gets onto the internet. It never even leaves the building.

Okay, so the network traffic goes through the gateway and arrives at the server on the public interface. The web server processes the request and attempts to send packets back. But there's no default gateway.

Puzzle-piece #2: Because the traffic never leaves the building, it doesn't need to undergo any sort of NAT or masquerading. The reply address is still the private network IP of my workstation.

Puzzle-piece #3: The web server has an address on the private network. It already knows how to get the traffic back to the private address: directly out the interface on the private network! No default gateway necessary. And that's what it does... creating a crazy triangular path from my workstation to the gateway, to the server... and directly back to my workstation.

Exactly what the documentation to Shorewall said: outbound traffic doesn't have to follow the same route as inbound traffic. Who knew?
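The three puzzle pieces boil down to one routing-table lookup on the server, which can be sketched in a few lines. This is only a toy model of a most-specific-match lookup, not how any real kernel implements it. The helper `pick_route`, the interface names and every address here are hypothetical (the post never gives the real ones): 10.0.0.0/24 stands in for the private network and 203.0.113.0/24 for the public one.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the most specific (network, interface, gateway) route that
    matches destination, or None if nothing in the table matches."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


# The server's table as described: two connected networks, no default route.
server_routes = [
    ("10.0.0.0/24", "eth1", None),     # private network, directly connected
    ("203.0.113.0/24", "eth0", None),  # public network, directly connected
]

workstation = "10.0.0.57"         # my desk, still carrying its private address
outside_client = "198.51.100.20"  # the iPad, arriving through another ISP

# The reply to the workstation goes straight out the private interface.
reply_to_workstation = pick_route(server_routes, workstation)
# The reply to the iPad has nowhere to go.
reply_to_outside = pick_route(server_routes, outside_client)

# Adding the default gateway gives outside replies a way out.
fixed_routes = server_routes + [("0.0.0.0/0", "eth0", "203.0.113.1")]
reply_after_fix = pick_route(fixed_routes, outside_client)

print(reply_to_workstation)
print(reply_to_outside)
print(reply_after_fix)
```

The request arrived on the public interface, but the reply to the workstation matches the private network's route and leaves on the other interface. The reply to the outside client matches nothing until the default route is added.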

The first couple of copy rehearsals between servers took hours, even over the gigabit link. Why? Because there was a directory of thumbnails... with over a million JPEGs in it. I wish I was kidding! None of them were much over a few kilobytes, but the sheer number of files! The overhead of each file made the transfer speed slow to a crawl and it just took forever! Thank the silicon gods we found out that directory didn't need to be moved to the new server.
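A back-of-the-envelope model shows why a million tiny files can take hours on a gigabit link. Every number below is an assumption for illustration, not a measurement from those rehearsals: 4,000 bytes per thumbnail, a perfectly used 1 Gbit/s link, and a flat 10 ms of per-file cost (opening the file, protocol chatter, writing metadata on the far side).

```python
def transfer_seconds(file_count, avg_file_bytes, link_bits_per_sec, per_file_overhead_sec):
    """Split the estimated copy time into (payload_seconds, overhead_seconds)."""
    payload = file_count * avg_file_bytes * 8 / link_bits_per_sec
    overhead = file_count * per_file_overhead_sec
    return payload, overhead


# Assumed figures, chosen only to show the shape of the problem.
payload_s, overhead_s = transfer_seconds(
    file_count=1_000_000,
    avg_file_bytes=4_000,
    link_bits_per_sec=1_000_000_000,
    per_file_overhead_sec=0.010,
)

print(f"moving the bytes:  {payload_s:.0f} s")
print(f"per-file overhead: {overhead_s / 3600:.1f} h")
```

Under these assumptions the data itself needs about half a minute of wire time, while the per-file cost adds up to hours. That is why bundling small files into a single archive before copying is a common workaround.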

The thumbnail directory on our sister station's web server is approaching that same milestone. We need to find a way to keep those directories cleaned up!

2 Comments

vylbird ~vylbird

12 years ago

A million files of a few kilobytes? That's what BLOB is for. For best results, software should be written to address them by hash - that way you get a natural primary key.

Kerosel ~kerosel

Watcher

12 years ago

Take it up with Wordpress. I have no control over it.
