Technology
Frustrating Day
by balleman on Jul.12, 2006, under Happenings, Linux
Unfortunately, it seems that anything requiring cooperation and coordination is hard to do. I spent the day investigating a likely impossible solution to a problem that doesn’t end up needing solved, while still not being able to work on the problem that really needs solved. And no documentation or plan has been given to me for the overarching project responsible for the whole situation, other than third-party verbal snippets. Sigh.
Luckily, other than this blog, I’ve been pretty successful at not letting the day ruin my evening. I got to visit with Chris a bit, following his trip to Illinois.
And one more note… recent kernel upgrades for FC5 require a new firmware image for the ipw2200 wireless driver, so remember that.
Unfreakin’ Believable
by balleman on May.16, 2006, under Networking
There are good networking problems. These are the kind where something happens that you don’t understand. You mull it over, eventually diagram out the different layers of the various protocol stacks, and discover that things are working exactly as designed, just not as you expected. Ian and I uncovered an example of this kind of problem a few years back at CTI. There was a periodic surge in broadcast traffic on our LAN destined for a specific box. After ruling out all of the various NAT rules and such going on, we eventually found the simple cause. The box was not sending any packets of its own. So, the switch had no forwarding entry for it. So it did what switches are supposed to do in that case: flood the packet out of every port, just like a broadcast. Coming to those kinds of conclusions is fun.
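For the curious, here is a minimal sketch of that switch behavior in Python. It is a toy model, not any vendor's implementation: the class name, the port numbers, the host names, and the 300 second aging time (a commonly used default, assumed here) are all made up for illustration.

```python
class LearningSwitch:
    """Toy model of a switch's MAC learning (forwarding) table."""

    def __init__(self, ports, max_age=300.0):
        self.ports = list(ports)
        self.max_age = max_age  # seconds an entry survives without traffic (assumed default)
        self.table = {}         # address -> (port, time last seen as a source)

    def receive(self, in_port, src, dst, now):
        """Process one frame; return the list of ports it is sent out of."""
        # Learn or refresh the sender's location from the source address.
        self.table[src] = (in_port, now)
        # Age out hosts that have been silent for too long.
        self.table = {
            mac: (port, seen)
            for mac, (port, seen) in self.table.items()
            if now - seen <= self.max_age
        }
        entry = self.table.get(dst)
        if entry is None:
            # Unknown destination: flood out of every port except the ingress.
            return [p for p in self.ports if p != in_port]
        if entry[0] == in_port:
            # Destination lives on the ingress port: nothing to forward.
            return []
        return [entry[0]]


# Hypothetical hosts: a router on port 1 and a box on port 3 that rarely talks.
sw = LearningSwitch(ports=[1, 2, 3, 4])
flooded_first = sw.receive(1, "router", "quiet-box", now=0.0)    # box unknown
reply = sw.receive(3, "quiet-box", "router", now=1.0)            # box speaks once
forwarded = sw.receive(1, "router", "quiet-box", now=2.0)        # entry is fresh
flooded_again = sw.receive(1, "router", "quiet-box", now=400.0)  # entry aged out
```

As long as the quiet box keeps sending something, its entry stays fresh and traffic for it goes out of one port. Once it goes silent for longer than the aging time, everything addressed to it gets flooded again.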
There are at least three kinds of bad networking problems. The first is when something is evil by design. An infamous example of this would be the error measurements in the DS1-MIB. Absolutely useless when it comes to making time-series graphs, or setting event thresholds. The design seems to have been one of convenience, since the quantities represented date back to the first of the DS1 channel banks. Anachronistic design sucks.
Another type of evil networking problem is one that is seemingly random. Everyone knows that one of the first steps in troubleshooting is knowing what steps are needed to reproduce the problem. With random events, you’re essentially screwed. Turn up the debugging until it gets painful and wait for the event to recur, and pray you captured something useful when and if it does. DDoS attacks and the like can fall into this category, especially for those of us without the ability to do meaningful flow logging. Tracking down random problems is evil.
The third category of networking problems I feel like discussing this evening is the unreasonable problem. This is the non sequitur, the problem that makes you yell futilely at your terminal or coworker, from the complete and utter ridiculousness of it. If you manage to solve one of these, you might end up with a good networking problem, as described above. Or you might want to take a sledgehammer to a piece of hardware. Let’s explore some examples:
Near the end of my CTI experience, a certain Astrocom CSU/DSU was observed having a most unlikely problem. If I remember correctly, it somehow would drop packets over a certain size. A most improbable feat considering that your average CSU/DSU should not really have any concept of what a packet is, let alone be able to drop one. Patrick offered a bounty on the problem, but as far as I know, it was never solved.
The last example is the reason I am writing tonight. Last summer at Doug’s LAN party, I had difficulties copying large files to my desktop machine. I eventually blamed it on my patch cord, but by that time we were packing things up, so this was never really tested. Even before this, I was having trouble sending large print jobs to my printer. I quickly blamed this on the network card in the printer, or some PostScript oddities, but never came to a solution.
Later, when trying to use my desktop as a file server for a CentOS install, I realized that my network issue with my desktop was ongoing. Traffic analysis indicated that during a high speed transfer coming from my desktop to another machine, the connection would stop passing packets. Packets on other TCP connections between the same machines were unaffected, but subsequent retries of packets associated with the dead connection were getting dropped somewhere. Since this seemed to be connection related, firewall settings were verified and found to be fine. I let the problem fester, as it was not causing any day-to-day difficulties, since my desktop isn’t ordinarily sending large amounts of data, just receiving.
So, this evening I was talking with Doug about printer stuff, and he made a connection that I had been missing. Was my printer problem related to my network problem? And what about all of those problems with NFS in the recent past? Yes, they all sound like candidates. With that late realization, I delved into the annoying network problem. I replaced the network card. The problem persisted. The cabling was ruled out. And the problem was narrowed down to a switch. A specific switch port, to be precise. Now, I’m not exactly sure how one of my NetGear GS506 gigabit switches is managing to drop packets belonging to a specific connection when that connection begins spouting lots of traffic in a certain direction, but that’s exactly what it appears to be doing. And it’s reproducible. So yes, a problem as insidious as this one should be solved with the sledge, but some tape over port 1 of the switch should do. Thanks for the insight, Doug! And if you see a problem like this, check the switch, even though it doesn’t make any sense.
Computer Stuff
by balleman on Mar.18, 2006, under Computers, Linux
The storage capacity upgrade and RAID5-ification has been completed, following a week of computers and their components strewn throughout the house. Having a RAID5 include a linear md as a component was a bit challenging… had to make the kernel not try to assemble the RAID5 automagically, but wait for the mdadm.conf to do it. Unfortunately, that wasn’t the end of the computer fun this week. Chris had a drive fail in his firewall, and the machine employed various means to make it impossible to install an OS on another drive. Still not sure what its problems are, having spent the afternoon in futile efforts to fix it.
[balleman@oak ~]$ df -h /storage
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/storage0-lstorage0  1.5T  739G  729G  51% /storage
The Future?
by balleman on Feb.09, 2006, under Networking
Internet connectivity in the US, particularly rural areas, is awful. And there is really no excuse for it. The Paradox of the Best Network provides some insight into this (thanks to Patrick for the link, via his blog). I also envision general office buildings in the future that provide telecommuters a way to get out of the house and have access to shared resources. And everyone working in the building might be working for a distinct company across the globe. In the meantime, my 1 Mbit connection to Kuhn, with extra evil shaping, and my 11 Mbit wireless link to Chris will have to suffice.
OpenSky here we come…
by balleman on Dec.10, 2005, under Radio
Several years ago, the PA state government began deploying a proprietary trunked communications system… and encouraged county governments to use it. Since it’s proprietary, the governments are locked into the price structure of a single vendor (brilliant! bravo!) and radio enthusiasts like me won’t be able to use our scanners to listen in (at least, not anytime soon). I’m a big fan of government transparency and openness… and this is moving in totally the wrong direction. All isn’t immediately lost, however. Even if fire and ambulance dispatch is handled with OpenSky starting on June 1st, 2006, as this article says, it sounds as if paging will still be simulcast on traditional FM frequencies so that voice pagers, sirens, and scanners can function without an upgrade. It seems that radio geeks aren’t the only ones concerned… just listen to those fighting fires.
And while I’m ranting… Since this system is IP based, how long will it take before a laptop in a state cop’s car contracts a virus because he installed an unauthorized Wi-Fi card, and cripples communications state-wide? Consultants won’t go hungry cleaning it up, that’s for sure.