Class of the Day: System.IO.FileInfo

Today I learned about a class in the .NET framework that I've never used: System.IO.FileInfo. I came across it while looking up the appropriate method to determine the size of a file for a unit test. I had expected that there would be a method in the System.IO.File class for this, but I was mistaken. It turns out the proper way is to use the Length property of a FileInfo instance.

FileInfo is kind of a weird class. In fact, I'm not really sure what the design impetus behind it was. It actually duplicates a lot of the behaviour of the File class, which I use all the time. In fact, it duplicates so much that I would have expected the two to be a single class. It appears that the distinction is primarily that File is static while FileInfo is the instance equivalent, with some of the methods moved to properties. Although it's still not clear to me why they couldn't just have it all in one class, with methods overloaded for static and instance calls.

By way of summary, the FileInfo class inherits from FileSystemInfo. Some of the interesting properties include Directory, Exists, and Length, as well as both "regular" and UTC variants of properties for file access and creation time. Interesting methods include Encrypt() and Decrypt(), CopyTo() and MoveTo(), and SetAccessControl() to set ACLs on the file. I wonder how Mono implements that last one. It's supported, but undocumented.

PHP bug of the day

I came across an annoying little bug in PHP this afternoon. Nothing I can't work around, but it's another example of the general suckiness of object-oriented programming in PHP.

Here's the situation. I've been slowly refactoring LnBlog over the last few months. I'm trying to make the design more object-oriented, easier to maintain, and just generally less messy and ad hoc. I'm adding in unit tests with SimpleTest and, at least today, I was actually working with a copy of Martin Fowler's Refactoring open in front of me.

The particular problem that popped up was with a class called Path that manages the building and converting of filesystem paths. Two of its methods are get() and getPath(). Both simply join a list of path components into a single path string. The difference is that get() is an instance method and works on instance variables. The getPath() method, on the other hand, is a static method where you just pass in the path components as parameters. Since these two methods do essentially the same thing, I thought that it would make sense to combine them.

In a language like C#, I would simply do this by overloading the get() method and having an instance get() with no parameters and a static get() with a parameter list. However, there is no overloading in PHP. The typical method is simply to fake it with optional parameters. Ugly, but it works.
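As a rough illustration of what faking an overload with an optional parameter looks like, here is a minimal sketch. The join_path() function and its parameter names are hypothetical and are not part of LnBlog.

```php
<?php
// Hypothetical helper, for illustration only (not LnBlog code).
// One function stands in for two "overloads":
//   join_path($parts)        joins with the platform separator
//   join_path($parts, $sep)  joins with a caller-supplied separator
function join_path($parts, $sep = DIRECTORY_SEPARATOR) {
    // implode() glues the array elements together with $sep between them.
    return implode($sep, $parts);
}

echo join_path(array('usr', 'local')), "\n";      // separator depends on the platform
echo join_path(array('usr', 'local'), '/'), "\n"; // always uses a forward slash
```

The default value does the work an overload would do in C#: callers that don't care about the separator simply leave it off.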

Well, today I thought I'd be clever. I reasoned that, when a method is called statically, there is no instance of the class and hence the $this variable isn't set. So I tried something like the following:

    function get() {
        if (isset($this)) {
            return $this->implodePath($this->sep, $this->path);
        } else {
            $args = func_get_args();
            return Path::implodePath(DIRECTORY_SEPARATOR, $args);
        }
    }

The problem with this is that it only sorta, kinda works. More specifically, it only works when you don't call it statically inside another class method. In that case $this is still set; it just refers to the calling object instead of a Path.

If you're familiar with bug 30355, this shouldn't come as a surprise. Turns out this behavior is actually for backward compatibility with various badly broken code. I can only hope that this is a bug that got grandfathered in rather than something that was originally done by design.

At any rate, the workaround was quite simple. I should have done it the first time - just replace that isset($this) with func_num_args() == 0. Problem solved.
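For reference, the func_num_args() version might look roughly like the sketch below. This is a cut-down, hypothetical stand-in for the Path class: the constructor and property defaults are assumptions, and only get() and implodePath() come from the description above. Note also that PHP 8 no longer allows static calls to non-static methods, so this sketch calls get() on an instance in both cases.

```php
<?php
// Hypothetical, cut-down stand-in for LnBlog's Path class.
// The constructor and the property defaults are assumptions.
class Path {
    public $sep = DIRECTORY_SEPARATOR;
    public $path = array();

    public function __construct() {
        // Keep whatever path components were passed in.
        $this->path = func_get_args();
    }

    public static function implodePath($sep, $parts) {
        return implode($sep, $parts);
    }

    public function get() {
        if (func_num_args() == 0) {
            // No arguments: build the path from the instance's own state.
            return self::implodePath($this->sep, $this->path);
        }
        // Arguments given: ignore instance state and join them instead.
        $args = func_get_args();
        return self::implodePath(DIRECTORY_SEPARATOR, $args);
    }
}

$p = new Path('var', 'www');
echo $p->get(), "\n";               // joins the stored components
echo $p->get('tmp', 'cache'), "\n"; // joins the arguments passed in
```

The point of the change is that the branch depends only on how the method was called, not on whether $this happens to be set in the calling context.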

Back home

I'm finally back home, sitting on my back porch with the laptop, sipping a diet W-up (the Wegman's-brand version of 7-up).

My class in Hartford finished up early on Wednesday afternoon, so Sarah and I spent the rest of the day walking through the rose garden in Elizabeth Park. On Thursday morning, we took the hour drive down to Mystic. We spent the morning and early afternoon at the aquarium and then browsed around the shops at the Old Mystic Village. My two big purchases were my first ever chocolate-covered frozen banana (yum!) and a decorative bokuto.

After a very nice dinner at the Steak Loft, we spent the night at the local Comfort Inn. I found it somewhat annoying that the $100 per night room at the Comfort Inn was actually much nicer than the $150 per night room at the Hartford Crowne Plaza. The Comfort Inn had more storage space, better TV channels, complimentary breakfast, and didn't charge for internet. The worst part is that I didn't find this all that surprising. For some reason, the fancy, expensive hotels always nickel-and-dime you to death. Apparently they can't scrape by on the higher room charges and over-priced room service.

On Friday, we wandered around downtown Mystic and then went to the Mystic Seaport. The seaport is actually one big museum. It's a recreation of a 19th-century coastal town and has a rather active program to preserve period ships. It was actually very interesting.

After that, we headed back to New York. We stopped at my parents' house for the night, since it's on the way, and came home this morning. Of course, we had to make a stop to pick up the latest Harry Potter book. By now, Sarah's probably about half-way through it. The speed at which she goes through novels is just disgusting.

Last day of ESRI class

The ESRI class is almost over. We're down to the last 2 of 16 lessons.

Today has been pretty boring for me. The instructor has been doing a great job, it's just that we're getting deeper into the details of managing a geodatabase. Since this is the first time I've ever actually worked with a geodatabase, the finer points are more or less lost on me. I can understand the concepts, but I have no frame of reference for applying them. In terms of practical knowledge, I don't even know enough to ask an intelligent question.

By way of contrast, the people from the host agency, the City of Hartford, really know their stuff. They've been to several ESRI classes, gone to the ArcGIS user conference, and work with an ArcGIS database regularly. It's clear that they have a good handle on this stuff and are really getting something out of this.

All in all, I file this experience under "pointless waste of time and money." I had a better time than I anticipated, and I now know something about ArcGIS geodatabases, but I really had no business attending this class in the first place. They should have sent someone with at least a basic knowledge of ArcGIS instead - or, at the very least, someone who isn't looking to jump ship at the first opportunity (and my supervisor does know I'm looking for work - I'm not trying to hide it).

So, to sum up, I did get something out of this experience and it was very nice to get out of the office for several days. But the benefit to my employer won't even come close to justifying the $1400 registration fee plus travel expenses. I really don't know what they were thinking.

ESRI class day 2 - boredom and web access

More live-blogging today. Unfortunately, it turns out that's the only kind I'll be doing until the class is over. I'm staying at the Crowne Plaza hotel, and while they offer WiFi service, they charge $10/day for it! That's actually worse than the $4/2-hour block that they charge for the AT&T WiFi at the Barnes & Noble back home. At that price, I could get 5 hours of service, and there's no way I'm going to be online that long here. Fortunately, the PCs in the training room for the class have web access, so I can at least check my e-mail.

Yesterday afternoon and this morning we got more into the details of managing ArcGIS geodatabases. Things like connection methods, authentication, data loading, management tools, "gotchas", and so forth. Basically, the stuff I will probably never need to know.

I'm actually a little ambivalent about this class so far. On the one hand, it's absolutely mind-numbing at times. It's not that the class is bad, it's just that it's getting into details that have absolutely no relevance for me.

On the other hand, it's actually very interesting in an academic sense. After all, we're talking about a system that scales up to clustered, multi-terabyte databases. The ArcGIS server runs under Windows, UNIX, and Linux and supports pretty much all the major DBMSs - Oracle, SQL Server, DB2, Informix, and there was even some talk of people using Postgres and Sybase. So we're really getting a closer look at the architecture of a very complex, high-end system. Plus our instructor has been around long enough that he can talk about how things have evolved over the years and the direction the architecture has taken. We're not just getting the tedious technical details, but some insight into the layers of the system, the APIs involved, and how everything interacts on various levels.

So as a case study of a major information system, this class is actually quite interesting. However, it's really a class on managing geodatabases, not a case study on ArcGIS. So while the concrete details are putting me to sleep, the high-level stuff was definitely worth hearing. As a programmer, you tend to look at and read about things on more of a code-level. It's good to see how the "big boys" handle complicated design issues.

At the ESRI class

Well, here I am at Constitution Plaza in Hartford, Connecticut. I'm actually here on a business trip. My employer sent me to a training course with ESRI entitled Data Management in the Multiuser Geodatabase.

There are a few things to note about the very fact that I'm here. In the 6 years I've been with my current organization (which is far too long), this is the first time I've been on an out-of-state trip. In fact, it's the first time I've been on a trip that lasted longer than a day. It's also the first off-site paid training class I've been sent to. Kind of seems like a waste, given that if things go well, I'll be able to get the hell out of here before I have a chance to put any of this to use.

As for the class itself, I have no real complaints so far. It's actually hosted by the City of Hartford IT Department, which has 2 people attending. The instructor, Jim, has been in the GIS business forever and has been working for ESRI for 14 years. He's really nice and seems pretty knowledgeable, so I'm actually enjoying the class so far. In particular, learning about the system architecture was kind of cool. Of course, I'll probably never put any of the stuff he's teaching us to use, but at least this gets me out of the office for a week.

One cool thing I've discovered is that Python is apparently one of the favored languages for automating ESRI's system. In fact, the lab machines we're using for the class actually have Python 2.4 interpreters installed on them. I don't know if we'll do any scripting in the class at all, but I just found that to be really cool.

Unhelpful programming tools

You ever have one of those annoying programming moments where you know something isn't working, but for the life of you, you can't figure out why? And then, after spending way too much time on it, it turns out to be a stupid little typo or something like that? And then you're not sure whether to curse yourself for being so stupid or the compiler/interpreter/whatever for not providing a half-way meaningful error message?

I had one of those the other day. I was writing some PHP code to update a MySQL database using the object-oriented interface to the MySQL Improved extension. My problem was that my call to the $connection->prepare($sql) method was failing for no apparent reason.

For those who aren't PHP programmers, PHP 5 introduced MySQL Improved, or mysqli, a new extension for accessing MySQL databases. This includes support for prepared SQL statements. That basically means that you can parameterize your SQL: the statement is compiled first, and the values from your code are then injected into the syntax- and type-checked statement. This is a huge improvement over the old, hideously insecure method of just building up a string in code and passing it wholesale to the database.

Anyway, my problem was with the method that prepares an SQL statement. In particular, its return value - it returns a prepared statement object on success, false on failure. In my case, the method was returning false, which meant that MySQL couldn't prepare my statement because...well, it just couldn't. No exception, no message saying what was wrong with the query, just a failure code.

My question is: why? Wouldn't it make more sense to at least raise a warning to give you some idea of why the prepare failed? Or at least set the last error message for the connection? MySQL presumably has some idea why the prepare failed. Why can't the mysqli extension pass that on? After all, machines are good at finding typos. Much better than the developers who put them there in the first place.

Screw encryption!

On Friday, I said I was finally going to secure my wireless LAN. As you can probably tell from the title of this post, that didn't go so well. As of this writing, I am still running an open system because that's the only configuration I can get to work with all three of my computers.

I've spent several hours messing with this today, and it's put me in a really foul mood. There was a time when I enjoyed messing around with my system configuration, but I just can't do it anymore. I don't care that much about networking. I have too many other things I want to spend my time on. I just want my damn network to function and not let anyone who drives by eavesdrop on all my traffic. Is that too much to ask?

My upgrade process started with a firmware update to my D-Link DI-524 C wireless router. This update included WPA2 support, which was a nice bonus. So my encryption options were now: nothing, WEP, WPA, WPA2, and something called WPA2-auto. On the down side, it included no additional documentation, so I have no clue what this "WPA2-auto" is supposed to be. But "auto" sounded promising, so I decided to go with that mode.

Turns out this was a bad idea. According to this forum thread, WPA2-auto doesn't seem to work consistently. Unfortunately, I didn't discover this until I had spent a considerable amount of time trying to get my PC configuration right. You see, I was misled because my laptop was able to connect one time while the router was in WPA2-auto mode. That led me to assume that the problem was with my PCs, not the router. Guess I should have Googled first.

So, eventually, I ended up going with plain-old WPA. The client configuration was a bit tricky for this. You see, my laptop uses NDISwrapper, so I could just use KNetworkManager to enter the pre-shared key. However, my desktops both have RaLink cards and use the rt2500 driver. This driver does not use the Linux wireless extensions and hence does not work with NetworkManager. To configure these cards, you need to add some lines to your /etc/network/interfaces file, as described here. It works, but the down side is that it breaks NetworkManager. However, since these are desktop PCs with 1 WiFi card connecting to 1 access point, that's not really a big deal.

While the desktops weren't that difficult (once I got the right router settings, that is), the laptop was another story. I still haven't figured that one out yet. Of course, I was out of energy by the time I got around to it, so I wasn't exactly in peak form.

The laptop has an integrated Broadcom card which, as I said before, is configured to use NDISwrapper. This means it works with KNetworkManager. However, I couldn't get KNetworkManager to connect to the access point with WPA enabled. I selected the encryption mode, entered the pre-shared key, and then the connection progress bar would hang at 28%. The iwconfig output said that the card was associated with my access point, but I never got an IP address.

My current suspicion is that the laptop is using stale configuration data from my failed WPA2-auto attempt. I had some problem with stale configuration on the desktops too. For those, I just did a /etc/init.d/networking stop and then unloaded the driver module, then reloaded and restarted. That cleared everything up. In this case, however, I'm thinking it's the data stored by KNetworkManager. The only problem is, I have no clue whatsoever where I would look to find out. The interface is really spartan and there's no obvious way to delete stale configurations.

There is still one big functionality question I'm left with: how do I get NetworkManager to centrally configure an access point for all users? Both Sarah and I have our own accounts on the laptop, and I'd really like NetworkManager to automatically detect when our home network is present and connect to the access point at system start-up. I'm thinking there must be a way to do that, but there's nothing obvious in any of the configuration tools.

My wireless insecurity

I have a confession to make: it's 2007 and I still haven't set up any encryption on my wireless router. <Wince>

I know I should. I've known I should since I bought the thing two years ago. Every now and then I think I'll set it up, but then I just never get around to it. There just never seems to be a good time. However, as recent events have shown, even the best of us can have security problems, so it's time for me to get my butt in gear.

The real problem is two-fold. First, I'm a programmer, not a network admin. I know the basics of how networking works, and I can set up a basic home LAN without any problems, but I'm hardly an expert. I also know very little about WiFi, so I don't exactly have a lot of confidence that I'll get it right the first time.

This is a problem because I'm married to a lovely woman with absolutely no interest in geeky computer stuff. She also has a very low tolerance for things being temporarily broken. This goes double for "the internet," since web browsing and e-mail are 90% of what she does. This means that my only window of opportunity to mess with our LAN is when she isn't home. However, I usually have other chores to do at those times, so it never gets to the top of the priority list.

The second problem is technical. I wanted to set up encryption when I first installed my wireless hardware. However, the WiFi cards in my desktops are RaLink 2500 chipsets, and at the time, the drivers had only recently been open-sourced. The upshot is that I started out using beta releases of the community-supported driver and I was lucky to get it working at all. Encryption isn't a high priority when the network interface is just barely functional.

On top of the driver, both the router and my software were little help. My router is a D-Link DI-524 C1, which only included WPA-PSK support in a firmware upgrade (I'll be damned if I'm going to set up a Radius server) and included little to no documentation on it. And at the time I was using Slackware 10 and Xandros 3, neither of which had much in the way of helpful wireless configuration tools. So setting it up would have been a command-line, /etc/*, and vi hackathon of the type for which I have long since lost my enthusiasm.

Today, however, I'm using Kubuntu 7.04. It includes a nice, stable RaLink driver and the KNetworkManager utility, which allows you to easily connect to any number of wireless networks. I essentially have no excuse anymore.

So, tomorrow afternoon, I'm going to give it a try. I've already downloaded the latest firmware upgrade from D-Link in preparation. I'll just need to dig out my 25-foot network cable, wire up one of my computers, and go to work. With any luck, it will go smoothly and I'll be done in half an hour. If I'm not so lucky...I may still be running an open network next week.

Agile, shmagile

There was an interesting article over on the Typical Programmer the other day. It consisted of a criticism of an anti-agile development paper entitled Agile Methods & Other Fairy Tales, by David Longstreet. While the criticism was interesting in its own right, I found it even more so after reading the actual paper.

The main criticism the Typical Programmer leveled against the paper was that it was attacking a straw man. Specifically, he claims that many of the practices Longstreet describes are not actually agile, but simply broken. They just look agile because they don't involve much in the way of design, documentation, and so forth. Real agile methods, however, do not scorn those things when they are necessary.

I don't claim to be an expert on agile methods. However, I found Longstreet's paper to be good in some places. It certainly wasn't as intellectually bankrupt as the critique would lead us to believe.

Thinking about all this raised a few questions in my mind.
1) Couldn't the same complaint be leveled at the agile people? That is, might one not claim that they're not reacting against waterfall-style methods, but rather a caricature of them? After all, you can dress some broken methods up as waterfall just as easily as you can dress others up as agile.
2) Just what are real agile methods anyways? How do you differentiate them from "fake" agile? Viewed from the outside, it seems like agile methods could very easily be mistaken for cowboy coding. How do you decide where the line is?
3) Is it just my imagination, or do agile methods get less controversial every time I read about them? For instance, I remember when everyone was talking about the pair programming advocated by eXtreme Programming. Now, it seems like nobody cares about that anymore. (Actually, it almost seems like people have come to a silent agreement that it was kind of a silly idea.) These days, I read about the de-emphasis on documentation and other non-code artifacts. But then, when pressed, the agile people say they're only against the useless documentation.
4) Isn't "requires producing useless documentation" almost the definition of a broken development method? Yeah, sometimes it's demanded by the customer, but in other situations, does anyone ever think this is a good idea? It seems to me that if you're writing design documents that you never use, then you're missing the whole point of writing them in the first place.

Brain dump completed. Time to do some more reading on agile methods.

Yes Virginia, there IS software engineering

We've all heard the phrase "software engineering." The question is, does such a thing actually exist? Many programmers say it doesn't. I disagree.

Steve McConnell recently wrote a couple of very interesting blog posts on this very topic. The second was in response to this post. Eric echoes many of the typical "software development isn't engineering, it's a craft/art/form of black magic" arguments. Steve provides what I think is a good rebuttal.

I have a few problems with the typical anti-software engineering arguments. For starters, the "building software isn't like building bridges" line really misses the point. The idea is that building bridges involves repeatable and predictable patterns that allow engineers to accurately estimate project progress, whereas in software, every project is different and hardly anyone ever knows for sure how long things are going to take.

There may be some truth to that, but there's no reason to assume it's due to any inherent difference between civil engineering and software development. Part of it is simply that most programmers don't develop a lot of very similar systems because they don't have to. When you need a new bridge, you have no choice but to build it, but with software, you can just copy and customize.

But really, the reason most software estimates are just wild guesses is not because of the nature of software development, but because most development organizations are miserably primitive and unorganized. To form reasonable estimates, you need historical data. That means you need to track how long each phase of a project takes. Then you can run the numbers for similar projects and come up with a reasonable estimate for a new project. It's not rocket science. It's just that many organizations simply don't bother to do it.

Of course, as Eric mentioned, there is the issue of varying user requirements to consider. The whole reason we have custom software is that users have differing requirements, and so every application is unique in some sense. But if we're going to be honest, that only goes so far. Every custom business application may be different, but there are still a lot of similarities. You have your database layer, your business rules, your user interface, your reporting, and so on. They may never be exactly the same from application to application, but they're usually not fundamentally different. Think steel bridge over a river compared to a concrete bridge over a ravine. They're not exactly the same, but they're still comparable on some level.

Bottom line, software engineering, like "regular" engineering, is about process, not product. Just because you're building something doesn't mean you're doing engineering. A rigorous software development process is the exception in our industry, not the norm. So it's not that software engineering doesn't exist, it's just that the vast majority of the people writing software are not doing it. If you want engineering, head to someplace like Praxis Critical Systems, not your run-of-the-mill .NET contractor.