Installing PHPStorm under WSL2

The other week I tried to install PHPStorm under WSL2.  Because that's a thing you can do now (especially since Linux GUI apps now work in recent Windows 10 updates).  The installation process itself was pretty simple.

  • Download PHPStorm for Linux from JetBrains website.
  • Now extract the tarball and run the bin/phpstorm.sh script.
  • PHPStorm should start up.

The next step is to configure your license.  In my case, I was using a corporate license server.  The issue with this is that you need to log into JetBrains' website using a special link to activate the license.  Unfortunately:

  • By default, WSL doesn't have a browser installed.
  • Firefox can't be installed because the default build uses a snap image, and WSL apparently doesn't support snap.
  • PHPStorm doesn't appear to be able to properly deal with activating via a Windows browser (I tried pointing it to the Windows Chrome executable and got an error page that points to a port on localhost).

So how do we get around this?  Well, we need to install a browser in WSL and configure PHPStorm to use it.  So here's what we do:

  • Skip the registration for now by starting a trial license.
  • Download the Vivaldi for Linux DEB package from Vivaldi's website.  You could use a different browser, but I like Vivaldi and it offers a convenient DEB package, so I used that.
  • Install the Vivaldi DEB.  WSL will be missing some packages, so you have to run apt install --fix-broken after installing it.
  • Go into the PHPStorm settings and configure your web browsers to include Vivaldi and set it as the default browser.
  • Go back to the registration dialog and try again.  This time, PHPStorm should start up Vivaldi and direct you to the appropriate link.
  • Log into your JetBrains account and follow the instructions.  The web-based portion should succeed and registration should complete when you click "activate" in PHPStorm again.

There we go - PHPStorm is registered and works.  Mildly annoying setup, but not actually that bad.

Nextcloud session annoyances

This is a note to my future self about an annoyance with Nextcloud.  If you're not aware of it, Nextcloud is basically a fork of ownCloud, which is a "self-hosted cloud" platform.  Basically, they both provide a bunch of cloud-based services, like file sync and share, calendar, contacts, and various other things.  I switched to Nextcloud last year because ownCloud was lagging way behind in its support for newer PHP versions.

Anyway, I noticed a rather annoying issue where Nextcloud was leaving hundreds of stale auth tokens in the database.  Apparently, I'm not the only person this has happened to.

While Nextcloud has a menu item to revoke and remove stale sessions on the settings page, it only works on one session at a time.  So if you have hundreds of stale sessions, the only way to remove them through the UI is to go through, one by one, and click the menu and select the "revoke" option.  Needless to say, this is terrible.

The less annoying solution is to just go straight into the database and delete them there.  You can just run something like:
DELETE FROM oc_authtoken WHERE last_activity < <whatever_timestamp>;
That might be ugly, but at least it doesn't take forever.

It's important to note that, in addition to being annoying, this is evidently also a performance problem.  From what I've read, it's the reason that authenticating to my Nextcloud instance had gotten absurdly slow.  The app responded fine once I was logged in, but the login process itself took forever.  It also seems to be the reason why my hosting provider's control panel has been showing I'm way over my allotted MySQL execution time.  After deleting all those stale sessions, not only is login nice and snappy again, but my MySQL usage dropped off a ledge.  Just look at this graph:

Graph of my monthly MySQL execution time, before and after deleting the stale sessions

As you can see, January is a sea of red, and then it drops off to be comfortably under the limit after I deleted the old sessions.  The Nextcloud team really needs to fix this issue.

Finally switching to NextCloud

It's the end of an era. (Cue overly dramatic music.)  I've been using ownCloud as my personal file/caldav/carddav server for years.  This week, I finally decided to switch to NextCloud.  This is my story.

The thing is, I actually remember when NextCloud split from ownCloud.  At the time, I was working on a (now-defunct) product that involved ownCloud.  Basically, my company's core business at the time was data backup, so we had a lot of servers with big disks and were looking for a way to monetize that extra space.  The idea at the time was to do that by integrating a "file sync and share" product into our offerings, and that product was a rebranded ownCloud Enterprise.  Of course, the "file sync and share" space was already pretty crowded, so that product never gained much traction, but it did help me get more into ownCloud and the company even paid to send me to their user conference in Berlin, where I got to meet their team (who, at the time, seemed not-very-impressed with the whole "NextCloud" thing) and see some sights.  So it was actually a great experience, even if the product didn't pan out.

Anyway, despite my affection for ownCloud, my motivation for this change was actually pretty simple and prosaic - I was upgrading my home server (that'll be another post), and I didn't want to downgrade shit.  See, I actually run two ownCloud instances - one on my local network for accessing various media files, and another in my web hosting, for caldav/carddav and files that I want to be highly available.  For my home instance, I was doing a fresh install of the latest Ubuntu MATE on  a brand-new box.  This shouldn't be an issue, except that MATE comes with PHP 8.1, but for some reason, ownCloud only supports PHP 7.4.

Yes, you heard that right - 7.4.  That's the newest version that's officially supported.  The last 7.x release.  The one that's no longer actively supported and has less than six months of security updates left.  That one.  That's what they still expect me to use.

For my previous home box, I believe I'd actually hacked up the source a bit to make it work (since I don't think I depended on anything that didn't work in 8.x), but that week I was sick and I just didn't feel like it.  Depending on a version that's about to lose security fixes is crazy anyway.  So I figured I'd "upgrade" to NextCloud, since they actually recommend PHP 8.1.

For my home server, I just did a fresh install, which is fairly straight-forward.  The only annoying part was the Apache configuration, and that was only annoying because I was running NextCloud on a non-standard port and forgot to add a "Listen" directive. 🤦‍♂️ For this instance, there was no real need to do any migration, because the only data I had in there was the (very small) list of users - the rest was just files, which can be trivially re-indexed.

Upgrading the instance on my web hosting was another story.  Since that had my carddav and caldav data, I really did need to migrate that.  I was also already several versions behind on my updates - it was running ownCloud 10.3, whereas 10.8 was current.  However, this turned out to be a blessing in disguise.

You see, NextCloud includes support for migrating from an ownCloud instance.  The thing is, they only support specific migrations.  In my case, the relevant path was from exactly ownCloud 10.5 to NextCloud 20.  Sadly, it took me a couple of tries to realize that the versions in the migration matrix are exact, so there was no way to migrate directly from ownCloud 10.3 to NextCloud.  Instead, I had to use the auto-updater to update ownCloud 10.3 to 10.4, and then manually update ownCloud 10.4 to 10.5 (because the auto-updater wanted to go all the way to 10.8).  Then I could follow the migration process and manually update to NextCloud 20.  From there, I was able to run the NextCloud auto-updater four times to get up to the current version.

So the upgrade process was...tedious.  Not really "hard", but definitely tedious.  The directions are pretty clear and simple, it's just a lot of steps to get to a current version of NextCloud.  But at least none of the steps were particularly complicated or prone to error.  As data migrations go, it could be much worse.  And the best part is that it maintained URLs and credentials, so I didn't even have to reconfigure my caldav/carddav clients.

As far as NextCloud itself goes, it seems...pretty much like ownCloud, but nicer.  They've made the UI prettier (both for the web interface and the client app), added a nice dashboard landing page, and made some other cosmetic improvements.  They also seem to have a wider range of installable apps, which is nice.  I haven't had all that long to play with it yet, but so far it seems like a distinct upgrade.

Docblocks in Vim

As you may or may not know, I've become an avid Vim user.  I use it for work and home, having given up on PHPStorm a couple of years ago.

But one of the things that PHPStorm did automatically, which was quite handy, was to add PHPDoc comments to functions automatically.  This is kinda nice because, let's face it, unless you're writing a long description, most of a docblock is just typing.  You duplicate the parameters and return signature and, if the names and types are pretty obvious (which they should be), then there's not really much to say.  But having them is part of the coding standard, so you can't just skip them, even though they don't add much.
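
To illustrate what I mean, here's the sort of boilerplate in question (a made-up function, not from any real project):

/**
 * Save the given content to a file.
 *
 * @param string $path      Destination file path
 * @param string $contents  Data to write
 * @param bool   $overwrite Whether to replace an existing file
 *
 * @return bool True on success, false otherwise
 */
function save_entry(string $path, string $contents, bool $overwrite = false): bool
{
    if (!$overwrite && file_exists($path)) {
        return false;
    }
    return file_put_contents($path, $contents) !== false;
}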

Fortunately, Vim has a plugin for that, known as PDV.  It will read the declaration of a function (or class, or a few other things) and auto-generate a docblock for you.  This is nice, but the extension was a little out of date - it hadn't been updated to support return type annotations.  There was a pending pull request to add that, but it hadn't been merged.  I'm not sure why - apparently that repo is dead.

So I decided to just create my own fork and merge the outstanding pull requests.  Now I have a version that supports modern type annotations, which is nice.  While I was at it, I also added an alternative set of templates for NaturalDocs doc comments.  I use NaturalDocs for LnBlog, so I figured it would be nice to be able to auto-generate my docblocks there too.  All I needed to do was add a line to my Sauce config to change the PDV template path.

PHP documentation and sockets

PHP's documentation gets way too much credit.  I often hear people rave about how great it is.  Many of them are newbies, but I hear the same thing from experienced developers who've been writing PHP code for years.

Well, they're wrong.  PHP's documentation sucks.  And if you disagree, you're just plain wrong.

Actually, let me add some nuance to that.  It's not that the documentation sucks per se, it's that it sucks as documentation.

You see, a lot of PHP's documentation is written with an eye to beginners.  It has lots of examples and it actually does a very good job of showing you what's available and giving you a general idea of how to use it.  So in terms of a tutorial on how to use the language, the documentation is actually quite good.

The problem is that, sometimes, you don't need a tutorial.  You need actual documentation.  By that, I mean that sometimes you care less about the generalities and more about the particulars.  For instance, you might want to know exactly what a function returns in specific circumstances, or exactly what the behavior is when you pass a particular argument.  Software is about details, and these details matter.  However, PHP frequently elides these details in favor of a more tutorial-like format.  And while that might pass muster for a rookie developer, it's decidedly not OK from the perspective of a seasoned professional.

Case in point: the socket_read() function.  I had to deal with this function the other day.  The documentation page is rather short and I was less than pleased with what I found on it. 

By way of context, I was trying to talk to the OpenVPN management console, which runs on a UNIX domain socket.  We had a small class (lifted from another project) that basically provided a nice facade over the socket communication functions.  I'd noticed that, for some reason, the socket communication was slow.  And I mean really slow.  Like, a couple of seconds per call slow.  Remember, this is not a network call - this is to a domain socket on the same box.  It might not be the fastest way to do IPC, but it should still be reasonably quick.

So I did some experimentation.  Nothing fancy - just injecting some microtime() and var_dump() calls to get a general idea of how long things were taking.  Turns out that's all I needed.  It quickly became obvious that each call to the method to read from the socket was taking about 1 second, which is completely absurd.

For context, the code in that method was doing something like this (simplified for illustration):

$timeoutTime = time() + 30;
$message = '';
while (time() < $timeoutTime) {
    $character = socket_read($this->socket, 1);
    if ($character === '' || $character === false) {
        break;  // We're done reading
    }
    $message .= $character;
}

Looks reasonable, right?  After all, the documentation says that socket_read() will return the number of characters requested (in this case one), or false on error, or the empty string if there's no more data.  So this seems like it should work just fine. 

Well...not so much.

The problem is with the last read.  It turns out that the documentation is wrong - socket_read() doesn't return the empty string when there's no more data.  In fact, I couldn't get it to return an empty string ever.  What actually happens is that it goes along happily until it exhausts the available data, and then it waits for more data.  So the last call just hangs until it reaches a timeout that's set on the connection (in our case, it was configured to 1 second) and then returns false.

So because we were relying on that "empty string on empty buffer" behavior to detect the end of input, calling that method always resulted in a one-second hang.  This was fairly easily fixed by just reading the data in much larger chunks and checking how much was actually returned to determine if we needed another read call.  But that's not the point.  The point is that we relied on what was in the documentation, and it was just totally wrong!
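
For the record, the fixed version looked something like this - a simplified sketch of the approach rather than our exact code:

$message = '';
do {
    // Read a large chunk; getting back less than we asked for means
    // the buffer is (probably) drained and we can stop.
    $chunk = socket_read($this->socket, 4096);
    if ($chunk === false) {
        break;  // Error or timeout
    }
    $message .= $chunk;
} while (strlen($chunk) === 4096);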

And it's not like this is the first time I've been bitten by the PHP docs.  Historically, PHP has been very bad about documenting edge cases.  For example, what happens if a particular parameter is null?  What's the exact behavior if the parameters do not match the expected preconditions?  Or what about that "flags" parameter that a bunch of functions take?  Sometimes the available flags are well documented, but sometimes it's just an opaque one-line description that doesn't really tell you what the flag actually does.  It's a crap shoot.

To be fair, the PHP documentation is not the worst I've ever seen.  Not even close.  And it really is very good about providing helpful examples.  It's just that it errs on the side of being light on details, and software is details.

Composer autoload annoyances

Note to self: Sometimes you need to run composer du -o (i.e. composer dump-autoload --optimize) to make things work.

I'm not entirely sure why.  But it's a pain in the butt.

This has come up a couple of times in working on one of our projects for work.  This particular one is an internal command-line application written in PHP and using the command-line components from the Symfony framework.  I won't get into the details, but the point is that it glues together the various steps required to configure and run certain things on Linux-based servers.  So to test it, I have to put the code on my test server and make sure it works there.

The problem that I ran into today was that I tried to add a new command class to the application and it barfed all over itself.  The Symfony DI container complained that it couldn't find a certain class name in a certain file. The PSR-4 autoloader requires that the class name and namespace match the filesystem path, so usually this indicates a typo in one of those.  But in this case, everything was fine.  The app worked fine on my laptop, and if I deleted the new command, it worked again.
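
(For context, all PSR-4 really asks is that the namespace and class name mirror the file path.  Something like this, with hypothetical names:)

// With a PSR-4 mapping of "App\" => "src/" in composer.json, this class
// has to live at src/Command/SyncServersCommand.php.  If the namespace,
// the class name, or the path don't line up, the autoloader can't find it.
namespace App\Command;

class SyncServersCommand
{
    // ...
}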

Well, it turns out that running composer du -o fixed it.  I suspect, based on the composer documentation, that the issue was that the class map was being cached by the opcache.  The Symfony cache was empty, so that's about the only one left.  Unfortunately, this is pretty opaque when it comes to trouble-shooting.  I would have expected it to fall back to reading the filesystem, but apparently there's more to it than that.  Perhaps it's related to how Symfony collects the commands - I haven't put in the time to investigate it.

But in any case, that's something to look out for.  Gotta love those weird errors that give you absolutely no indication of the solution.

Stupid PHP serialization

Author's note: Here's another old article that I mostly finished, but never published - I'm not sure why.  This one is from way back on August 23, 2013.

I was working for deviantART.com at the time.  That was a very interesting experience.  For one, it was my first time working 100% remote, and with a 100% distributed team, no less.  We had people in eastern and western Europe, and from the east coast to the west coast of the US.  It was also a big, high-traffic site.  I mean, not Google or Facebook, but I believe it was in the top 100 sites on the web at the time, according to Alexa or Quantcast (depending on how you count "top 100").

It also had a lot of custom tech.  My experience up to that point had mostly been with fairly vanilla stuff, stitching together a bunch of off-the-shelf components.  But deviantART was old enough and big enough that a lot of the off-the-shelf tools we would use today weren't wide-spread yet, so they had to roll their own.  For instance, they had a system to do traits in PHP before that was actually a feature of PHP (it involved code generation, in case you were wondering).

Today's post from the archives is about my first dealings with one such custom tool.  It nicely illustrates one of the pitfalls of custom tooling - it's usually not well documented, so spotting and resolving issues with it isn't always straight-forward.  This was a case of finding that out the hard way.  Enjoy!


Lesson learned this week: object serialization in PHP uses more space than you think.

I had a fun problem recently.  And by "fun", I mean "WTF?!?"  

I got assigned a data migration task at work last week.  It wasn't a particularly big deal - we had two locations where users' names were being stored and my task was to consolidate them.  I'd already updated the UI code to read and save the values, so it was just a matter of running a data migration job.

Now, at dA we have this handy tool that we call Distributor.  We use it mainly for data migration and database cleanup tasks.  Basically, it just crawls all the rows of a database table in chunks and passes the rows through a PHP function.  We have many tables that contain tens or hundreds of millions of rows, so it's important that data migrations can be done gradually - trying to do it all at once would hammer the database too hard and cause problems.  Distributor allows us to set how big each chunk is and configure the frequency and concurrency level of chunk processing.

There are three parts to Distributor that come into play here: the distributor runner (i.e. the tool itself), the "recipe" which determines what table to crawl, and the "job" function which performs the actual migration/cleanup/whatever.  We seldom have to think about the runner, since it's pretty stable, so I was concentrating on the recipe and the job.

Well, things were going well.  I wrote my recipe, which scanned the old_name table and returned rows containing the userid and the name itself.  And then I wrote my migration job, which updated the new_name table.  (I'm fabricating those table names, in case you didn't already figure that out.)  Distributor includes a "counter" feature that allows us to trigger logging messages in jobs and totals up the number of times they're triggered.  We typically make liberal use of these, logging all possible code paths, as it makes debugging easier.  It seemed pretty straight-forward and it passed through code review without any complaints.

So I ran my distributor job.  The old_name table had about 22 million rows in it, so at a moderate chunk size of 200 rows, I figured it would take a couple of days.  When I checked back a day or two later, the distributor runner was reporting that the job was only 4% complete.  But when I looked at my logging counters, they reported that the job had processed 28 million rows.  WTF?!?  

Needless to say, head-scratching, testing, and debugging followed.  The short version is that the runner was restarting the job at random.  Of course, that doesn't reset the counters, so I'd actually processed 28 million rows, but most of them were repeats.  

So why was the runner resetting itself?  Well, I traced that back to a database column that's used by the runner.  It turns out that the current status of a job, including the last chunk of data returned by the crawler recipe, is stored in the database as a serialized string.  The reason the crawler was restarting was because PHP's unserialize() function was erroring out when trying to deserialize that string.  And it seems that the reason it was failing was that the string was being truncated - it was overflowing the database column!

The source of the problem appeared to be the fact that my crawler recipe was returning both a userid and the actual name.  You see, we typically write crawlers to just return a list of numeric IDs.  We can look up the other stuff at run-time.  Well, that extra data was just enough to overflow the column on certain records.  That's what I get for trying to save a database lookup!
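
If you're wondering how much overhead serialization adds, it's more than you'd think.  A quick illustration (with made-up row data, not dA's actual schema):

$row = ['userid' => 1234567, 'name' => 'somedeviantname'];
echo serialize($row), "\n";
// a:2:{s:6:"userid";i:1234567;s:4:"name";s:15:"somedeviantname";}
echo strlen(serialize($row)), "\n";
// 63 bytes to represent about 22 bytes of actual data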

LnBlog Refactoring Step 3: Uploads and drafts

It's time for the third, slightly shorter, installment of my ongoing series on refactoring my blogging software.  In the first part, I discussed reworking how post publication was done and in the second part I talked about reworking things to add Webmention support.  This time, we're going to talk about two mini-projects to improve the UI for editing posts.

This improvement is, I'm slightly sad to say, pretty boring.  It basically involves fixing a "bug" that's really an artifact of some very old design choices.  These choices led to the existing implementation behaving in unexpected ways when the workflow changed.

The Problem

Originally LnBlog was pretty basic and written almost entirely in HTML and PHP, i.e. there was no JavaScript to speak of.  You wrote posts either in raw HTML in a text area box, using "auto-markup", which just automatically linkified things, or using "LBCode", which is my own bastardized version of the BBCode markup that used to be popular on web forums.  I had implemented some plugins to support WYSIWYG post editors, but I didn't really use them and they didn't get much love.

The old LnBlog post editor

Well, I eventually got tired of writing in LBCode and switched to composing all my posts using the TinyMCE plugin.  That is now the standard way to compose your posts in LnBlog.  The problem is that the existing workflow wasn't really designed for WYSIWYG composition.

In the old model, the idea was that you could compose your entire post on the entry editing page, hit "publish", and it would all be submitted to the server in one go.  There's also a "review" button which renders your post as it would appear when published and a "save draft" button to save your work for later.  These also assume that submitting the post is an all-or-nothing operation.  So if you got part way done with your post and decided you didn't like it, you could just leave the page and nothing would be saved to the server.

At this point it is also worth noting how LnBlog stores its data.  Everything is file-based and entries are self-contained.  That means that each entry has a directory and that directory contains all the post data, comments, and uploaded files that belong to that entry.

What's the problem with this?  Well, to have meaningful WYSIWYG editing, you need to be able to do things like upload a file and then be able to see it in the post editor.  In the old workflow, you'd have to write your post, insert an image tag with the file name of your picture (which would not render), add your picture as an upload, save the entry (either by saving the draft or using the "preview", which would trigger a save if you had uploads), and then go back to editing your post.  This was an unacceptably clunky workflow.

On top of this, there was a further problem.  Even after you previewed your post, it still wouldn't render correctly in the WYSIWYG editor.  That's because the relative URLs were inconsistent.  The uploaded files got stored in a special, segregated draft directory, but the post editor page itself was not relative to that directory, so TinyMCE didn't have the right path to render it.  And you can't use an absolute URL because the URL will change after the post is published.

So there were two semi-related tasks to fix this.  The first was to introduce a better upload mechanism.  The old one was just a regular <input type="file"> box, which worked but wasn't especially user-friendly.  The second one was to fix things such that TinyMCE could consistently render the correct URL for any files we uploaded.

The solution - Design

The actual solution to this problem was not so much in the code as it was in changing the design.  The first part was simple: fix the clunky old upload process by introducing a more modern JavaScript widget to do the uploads.  So after looking at some alternatives, I decided to implement Dropzone.js as the standard upload mechanism.

The new, more modern LnBlog post editor.

The second part involved changing the workflow for writing and publishing posts.  The result was a somewhat simpler and more consistent workflow that reduces the number of branches in the code.  In the old workflow, you had the following possible cases when submitting a post to the server:

  1. New post being published (nothing saved yet).
  2. New post being saved as a draft (nothing saved yet).
  3. Existing draft post being published.
  4. Existing draft post being saved.
  5. New (not yet saved) post being previewed with attached files.
  6. Existing draft post being previewed with attached files.

This is kind of a lot of cases.  Too many, in fact.  Publishing and saving were slightly different depending on whether or not the entry already existed, and then there were the preview cases.  These were necessary because extra processing was required when an entry was previewed with new attachments because, well, if you attached an image, you'd want to see it.  So this complexity was a minor problem in and of itself.

So the solution was to change the workflow such that all of these are no longer special cases.  I did this by simply issuing the decree that all draft entries shall always already exist.  In other words, just create a new draft when we first open the new post editor.  This does two things for us:

  1. It allows us to solve the "relative URL" problem because now we can make the draft editing URL always relative to the draft storage directory.
  2. It eliminates some of those special cases.  If the draft always exists, then "publish new post" and "publish existing draft" are effectively the same operation.  When combined with the modern upload widget, this also eliminates the need for the special "preview" cases.

The implementation - Results

I won't get into the actual implementation details of these tasks because, frankly, they're not very interesting.  There aren't any good lessons or generalizations to take from the code - it's mostly just adapting the idiosyncratic stuff that was already there.

The implementation was also small and went fairly smoothly.  The upload widget was actually the hard part - there were a bunch of minor issues in the process of integrating that.  There were some issues with the other part as well, but less serious.  Much of it was just integration issues that weren't necessarily expected and would have been hard to foresee.  You know, the kind of thing you expect from legacy code.  Here's some stats from Process Dashboard:

Project                        File Upload    Draft always exists
Hours to complete (planned)    4:13           3:00
Hours to complete (actual)     7:49           5:23
LOC changed/added (planned)    210            135
LOC changed/added (actual)     141            182
Defects/KLOC (found in test)   42.6           27.5
Defects/KLOC (total)           81.5           44.0

As you can see, my estimates here were not great.  The upload part involved more trial and error with Dropzone.js than I had expected and ended up with more bugs.  The draft workflow change went better, but I ended up spending more time on the design than I initially anticipated.  However, these tasks both had a lot of unknowns, so I didn't really expect the estimates to be that accurate.

Take Away

The interesting thing about this project was not so much what needed to be done but why it needed to be done. 

Editing posts is obviously a fundamental function of a blog, and it's one that I originally wrote way back in 2005.  It's worth remembering that the web was a very different place back then.  Internet Explorer was still the leading web browser; PHP 5 was still brand new; it wasn't yet considered "safe" to just use JavaScript for everything (because, hey, people might not have JavaScript enabled); internet speeds were still pretty slow; and browsing on mobile devices was just starting to become feasible.  In that world, a lot of the design decisions I made at the time seemed pretty reasonable.

But, of course, the web evolved.  The modern web makes it much easier for the file upload workflow to be asynchronous, which offers a much nicer user experience.  By ditching some of the biases and assumptions of the old post editor, I was more easily able to update the interface.

One of the interesting things to note here is that changing the post editing workflow was easier than the alternatives.  Keeping the old workflow was by no means impossible.  I kicked around several ideas that didn't involve changing it.  However, most of those had other limitations or complications and I eventually decided that they would ultimately be more work.  

This is something that comes up with some regularity when working with an older code-base.  It often happens that the assumptions baked into the architecture don't age well as the world around the application progresses.  Thus, when you need to finally "fix" that aspect of the app, you end up having to do a bit of cost-benefit analysis.  Is it better to re-vamp this part of the application?  Or should you shim in the new features in a kinda-hacky-but-it-works sort of way?

While as developers, our first instinct is usually to do the "real" fix and replace the old thing, the "correct" answer is seldom so straight-forward.  In this case, the "real" fix was relatively small and straight-forward.  But in other cases, the old assumptions are smeared through the entire application and trying to remove them becomes a nightmare.  It might take weeks or months to make a relatively simple change, and then weeks or months after that to deal with all the unforeseen fallout of that change.  Is that worth the effort?  It probably depends on what the "real" fix buys you.

I had a project at work once that was a great example of that.  On the surface, the request was a simple "I want to be able to update this field", where the field in question was data that was generally but not necessarily static. In most systems, this would be as simple as adding a UI to edit that field and having it update the datastore.  But in this case, that field was used internally as the unique identifier and was used that way across a number of different systems.  So this assumption was everywhere.  Everybody knew this was a terrible design, but it had been that way for a decade and was such a huge pain to fix that we had been putting it off for years.  When we finally bit the bullet and did it right, unraveling the baked-in assumptions about this piece of data took an entire team over a month.  At an extremely conservative estimate, that's well over $25,000 to fix "make this field updatable".  That's a pretty hefty price tag for something that seems so trivial.

The point is, old applications tend to have lots of weird, esoteric design decisions and implementation-specific issues that constrain them.  Sometimes removing these constraints is simple and straight-forward.  Sometimes it's not.  And without full context, it's often hard to tell which it will be.  So whenever possible, try to have pity on the future maintenance programmer who will be working on your system and anticipate those kinds of issues.  After all, that programmer might be you.

LnBlog Refactoring Step 2: Adding Webmention Support

About a year and a half ago, I wrote an entry about the first step in my refactoring of LnBlog.  Well, that's still a thing that I work on from time to time, so I thought I might as well write a post on the latest round of changes.  As you've probably figured out, progress on this particular project is, of necessity, slow and extremely irregular, but that's an interesting challenge in and of itself.

Feature Addition: Webmention

For this second step, I didn't so much refactor as add a feature.  This particular feature has been on my list for a while and I figured it was finally time to implement it.  That feature is webmention support.  This is the newer generation of blog notification, similar to Trackback (which I don't think anyone uses anymore) and Pingback.  So, basically, it's just a way of notifying another blog that you linked to them and vice versa.  LnBlog already supported the two older versions, so I thought it made sense to add the new one.

One of the nice things about Webmention is that it actually has a formal specification that's published as a W3C recommendation.  So unlike some of the older "standards" that were around when I first implemented LnBlog, this one is actually official, well structured, and well thought out.  So that makes things slightly easier.

Unlike the last post, I didn't follow any formal process or do any time tracking for this addition.  In retrospect I kind of wish I had, but this work was very in and out in terms of consistency and I didn't think about tracking until it was too late to matter.  Nevertheless, I'll try to break down some of my process and results.

Step 1: Analysis

The first step, naturally, was analyzing the work to be done, i.e. reading the spec.  The webmention protocol isn't particularly complicated, but like all specification documents it looks much more so when you put all the edge cases and optional portions together.  

I actually looked at the spec several times before deciding to implement it.  Since my time for this project is limited and only available sporadically, I was a little intimidated by the unexpected length of the spec.  When you have maybe an hour a day to work on a piece of code, it's difficult to get into any kind of flow state, so large changes that require extended concentration are pretty much off the table.

So how do we address this?  How do you build something when you don't have enough time to get the whole thing in your head at once?

Step 2: Design

Answer: you document it.  You figure out a piece and write down what you figured out.  Then the next time you're able to work on it, you can read that and pick up where you left off.  Some people call this "design".

I ended up reading through the spec over several days and eventually putting together UML diagrams to help me understand the flow.  There were two flows, sending and receiving, so I made one diagram for each, which spelled out the various validations and error conditions that were described in the spec.

Workflow for sending webmentions
Workflow for receiving webmentions

That was really all I needed as far as design for implementing the webmention protocol.  It's pretty straight-forward and I made the diagrams detailed enough that I could work directly from them.  The only real consideration left was where to fit the webmention implementation into the code.

My initial thought was to model a webmention as a new class, i.e. to have a Webmention class to complement the currently existing TrackBack and Pingback classes.  In fact, this seemed like the obvious implementation given the code I was working with.  However, when I started to look at it, it became clear that the only real difference between Pingbacks and Webmentions is the communication protocol.  It's the same data and roughly the same workflow and use-case.  It's just that Pingback goes over XML-RPC and Webmention uses plain-old HTTP form posting.  It didn't really make sense to have a different object class for what is essentially the same thing, so I ended up just re-using the existing Pingback class and just adding a "webmention" flag for reference.
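
To give a sense of how lightweight the protocol is: once you've discovered the target's advertised endpoint (via an HTTP Link header or a <link rel="webmention"> tag), sending a webmention is just a form-encoded POST.  Here's a minimal sketch, not LnBlog's actual code:

function send_webmention(string $endpoint, string $source, string $target): bool
{
    // The spec only requires a form-encoded POST with "source" and "target".
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => http_build_query(['source' => $source, 'target' => $target]),
        ],
    ]);
    // file_get_contents() returns false on 4xx/5xx responses by default,
    // so a non-false result means the receiver accepted (or queued) the mention.
    return file_get_contents($endpoint, false, $context) !== false;
}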

Step 3: Implementation

One of the nice things about having a clear spec is that it makes it really easy to do test-driven development because the spec practically writes half your test cases for you.  Of course, there are always additional things to consider and test for, but it still makes things simpler.

The big challenge was really how to fit webmentions into the existing application structure.  As I mentioned above, I'd already reached the conclusion that creating a new domain object for them was a waste of time.  But what about the rest of it?  Where should the logic for sending them go?  Or receiving?  And how should sending webmentions play with sending pingbacks?

The first point of reference was the pingback implementation.  The old pingback implementation for sending pingbacks lived directly in the domain classes.  So a blog entry would scan itself for links, create a pingback object for each, ask the pingback if its URI supported pingbacks, and then the entry would send the pingback request.  (Yes, this is confusing.  No, I don't remember why I wrote it that way.)  As for receiving pingbacks, that lived entirely in the XML-RPC endpoint.  Obviously none of this was a good example to imitate.

The most obvious solution here was to encapsulate this stuff in its own class, so I created a SocialWebClient class to do that.  Since pingback and webmention are so similar, it made sense to just have one class to handle both of them.  After all, the only real difference in sending them was the message protocol.  The SocialWebClient has a single method, sendReplies(), which takes an entry, scans its links and for each detects if the URI supports pingback or webmention and sends the appropriate one (or a webmention if it supports both).  Similarly, I created a SocialWebServer class for receiving webmentions with an addWebmention() method that is called by an endpoint to save incoming mentions.  I had originally hoped to roll the pingback implementation into that as well, but it was slightly inconvenient with the use of XML-RPC, so I ended up pushing that off until later.
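
The receiving side is a bit more involved, since the spec wants you to validate and verify a mention before you store it.  Roughly speaking, the flow from my diagram looks like this (a sketch of the protocol steps, not the actual SocialWebServer code):

function receive_webmention(string $source, string $target): bool
{
    // The source and target must be valid, distinct URLs.
    if (!filter_var($source, FILTER_VALIDATE_URL)
            || !filter_var($target, FILTER_VALIDATE_URL)
            || $source === $target) {
        return false;
    }
    // The target must also be a URL that this blog actually serves (check elided).
    // Fetch the source document and verify that it really does link to the target.
    $html = @file_get_contents($source);
    if ($html === false || strpos($html, $target) === false) {
        return false;
    }
    // The mention checks out - store it against the target entry (storage elided).
    return true;
}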

Results

As I mentioned, I didn't track the amount of time I spent on this task.  However, I can retroactively calculate how much code was involved.  Here's the lines-of-code summary as reported by Process Dashboard:

Base:  8057
Deleted:  216
Modified:  60
Added:  890
Added & Modified:  950
Total:  8731

For those who aren't familiar, the "base" value is the lines of code in the affected files before the changes, while the "total" is the total number of lines in the affected files after the changes.  The magic number here is "Added & Modified", which is essentially the "new" code.  So all in all, I wrote about a thousand lines for a net increase of about 700 lines.

Most of this was in the new files, as reported by Process Dashboard below.  I'll spare you the 31 files that contained assorted lesser changes (many related to fixing unrelated issues), since none of them had even 100 lines changed.

Files Added: Total
lib\EntryMapper.class.php 27
lib\HttpResponse.class.php 60
lib\SocialWebClient.class.php 237
lib\SocialWebServer.class.php 75
tests\unit\publisher\SocialWebNotificationTest.php 184
tests\unit\SocialWebServerTest.php 131

It's helpful to note that of the 717 lines added here, slightly less than half (315 lines) is unit test code.  Since I was trying to do test-driven development, this is to be expected - the rule of thumb is "write at least as much test code as production code".  That leaves the meat of the implementation at around 400 lines.  And of those 400 lines, most of it is actually refactoring.

As I noted above, the Pingback and Webmention protocols are quite similar, differing mostly in the transport protocol.  The algorithms for sending and receiving them are practically identical.  So most of that work was in generalizing the existing implementation to work for both Pingback and Webmention.  This meant pulling things out into new classes and adjusting them to be easily testable.  Not exciting stuff, but more work than you might think.

So the main take-away from this project was: don't underestimate how hard it can be to work with legacy code.  Once I figured out that the implementation of Webmention would closely mirror what I already had for Pingback, this task should have been really short and simple.  But 700 lines isn't really that short or simple.  Bringing old code up to snuff can take a surprising amount of effort.  But if you've worked on a large, brown-field code-base, you probably already know that.

Global composer

Nice little trick I didn't realize existed: you can install Composer packages globally.

Apparently you can just do composer global init ; composer global require phpunit/phpunit and get PHPUnit installed in your home directory rather than in a project directory, where you can add it to your path and use it anywhere.  It works just like with installing to a project - the init creates a composer.json and the require adds packages to it.  On Linux, I believe this stuff gets stored under ~/.composer/, whereas on Windows, they end up under ~\AppData\Roaming\Composer\.

That's it.  Nothing earth-shattering here.  Just a handy little trick for things like code analyzers or other generic tools that you might not care about adding to your project's composer setup (maybe you only use them occasionally and have no need to integrate them into your CI build).  I didn't know about it, so I figured I'd pass it on.

LnBlog Refactoring Step 1: Publishing

The first part of my LnBlog refactoring is kind of a large one: changing the publication logic.  I'll start by giving an overview of the old design, discussing the new design, and then looking at some of the actual project data.

History

LnBlog is a very old system.  I started it back in 2005 and did most of the work on it over the next two years.  It was very much an educational project - it was my first PHP project (in PHP 4, no less) and only my third web-based project.  So I really didn't know what I was doing.

As you might expect, the original design was very simple.  In the first version, "publishing an entry" really just meant creating a directory and writing a few files to the disk.  As the system grew more features, more and more steps were added to that process.  The result was a tangle of code where all of the logic lived in the "controller" (originally it was a server-page) with some of the actual persistence logic encapsulated in the domain objects.  So, roughly speaking, the BlogEntry object knew how to write its metadata file to the appropriate location, but it didn't know how to properly handle uploaded files, notifications, or really anything else.

Originally, LnBlog used a server-page model, with each URL being a separate file and code being shared by including common configuration files and function libraries all over the place.  Shortly prior to starting this project, I had consolidated the logic from all those pages into a single, massive WebPages class, with one public method for each endpoint - a monolithic controller class, for all intents and purposes.  This still isn't great, but it gave me a common entry point to funnel requests through, which means I can better control common setup code, handle routing more easily, and generally not have the pain of dealing with a dozen or more endpoints.

Anyway, this WebPages class served up and edited entries by directly manipulating domain objects such as BlogEntry, Blog, BlogComment, etc.  The domain objects knew how to save themselves to disk, and a few other things, but the majority of the logic was right in the WebPages class.  This worked fine for the main use-case of creating an entry from the "edit entry" page, but it made things very awkward for the less-used locations, such as the Metaweblog API endpoint or scheduled publication of drafts.

Furthermore, the publication logic in the WebPages class was very hairy.  All types of publication flowed through a single endpoint and used a common code path.  So there were flags and conditions all over the place, and it was very difficult to tell what needed to be updated when making changes.  Bugs were rampant and since there was no test automation to speak of, testing any changes was extremely laborious and error prone.

The New Design

There were two main goals for this refactoring:

  1. Create a new Publisher class to encapsulate the publication logic.  The idea was to have a single entity that is responsible for managing the publication state of blog entries.  Given a BlogEntry object, it would know how to publish it as a regular entry or a static article, unpublish it, handle updating or deleting it, etc.  This would give us a single entity that could own all the steps involved in publishing.
  2. Create unit tests around the publication process.  The logic around the entire process was more complicated than you'd think and the old code was poorly structured, so it broke with disturbing regularity.  Hence I wanted some automated checking to reduce the risk of bugs.

So the design is actually pretty straight-forward: create a "publisher" class, give it operations for each of the things we do with blog entries (create, update, delete, etc.), copy the existing logic for those cases into the corresponding methods, update the endpoints to use this new class, and call it a day.  
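
In other words, something shaped roughly like this - a sketch of the intent rather than the actual class:

// One place that owns everything we do with entries.  The method names
// here are illustrative; BlogEntry is the existing domain class.
class Publisher
{
    public function publishEntry(BlogEntry $entry) { /* publish as a regular entry */ }
    public function publishArticle(BlogEntry $entry) { /* publish as a static article */ }
    public function update(BlogEntry $entry) { /* save changes to a published entry */ }
    public function unpublish(BlogEntry $entry) { /* turn it back into a draft */ }
    public function delete(BlogEntry $entry) { /* remove it entirely */ }
}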

So it was mostly just a reorganization - there wasn't any significant new logic that needed to be written.  Simple, right?  What could go wrong?

Results

While I was happy with the result, this project turned out to be a much larger undertaking than I'd assumed.  I knew it was going to be a relatively large task, but I was off by a factor of more than two.  Below is a table summarizing the project statistics and comparing them to the original estimates (from PSP data I captured using Process Dashboard). 

Planned and actual project statistics
                         Planned    Actual
Total Hours              29:01      64:00
LOC added and modified   946        3138
LOC/Hour                 32.6       49.0
Total Defects            82.1       69.0
Defects/KLOC             86.8       21.7

When I originally did the conceptual design and estimate, I had planned for relatively modest changes to the Article, BlogEntry, and WebPages classes and the creators.php file.  I also planned for new Publisher and BlogUpdater classes, as well as associated tests and some tests for the WebPages class.  This came to 29 hours and 946 new or changed lines of code across nine source files.  Definitely a big job when you consider I'm working in increments of two hours or less per day, whenever I get around to it.

In actuality, the changes were much larger in scope.  I ended up changing 27 additional files I hadn't considered, and ended up creating two other new utility classes (although I did ditch the BlogUpdater class - I no longer even remember what it was supposed to do).  The resulting 3138 lines of code took me 64 hours spread over five months.

Effects of Testing

I did test-driven development when working on this project.  I've found TDD to be useful in the past and it was very helpful to me here.  It was also very effective in meeting my second goal of building a test suite around the publication logic.  PHPUnit reports statement coverage for the Publisher tests at 97.52% and close to 100% coverage for the tested methods in the WebPages class (I only wrote tests for the endpoint that handles creating and publishing entries).

More importantly, using TDD also helped me to untangle the logic of the publication process.  And it turns out there was actually a lot to it.  I ended up generating about 2000 lines of test code over the course of this project.  It turns out that the design and unit test phase occupied 65% of the total project time - about 41 hours.  Having a comprehensive test suite was immensely helpful when I was rebuilding the publication logic across weeks and months.  It allowed me to have an easy check on my changes without having to load all of the code I worked on three weeks ago back into my brain.

Sadly, the existing code was not such that I could easily write tests against it.  In fact, many of the additional changes came from having to break dependencies in the existing code to make it possible to unit test.  Luckily, most of them were not hard to break, e.g. using an existing file system abstraction layer, but the work still adds up.  It would have been very nice to have an existing test suite to prevent regressions in the rewrite.  Unfortunately, even integration tests would have been awkward, and even if I could have written them, it would have been very hard to get the level of coverage I'd need to be confident in the refactor.

Conclusion

In terms of the results, this project worked out pretty well.  It didn't really go according to plan, but I got what I was looking for in the end - a better publication design and a good test suite.  However, it was a long, slow slog.  Maybe that was too big a slice of work to do all at once.  Perhaps a more iterative approach could have kept things moving at a reasonable pace.  I'll have to try that on the next big project.

LnBlog: Blogging the redesign

Today, we're going to talk a little about design and refactoring.  As a case-study, we're going to use a little blogging application called LnBlog.  You probably haven't heard of it - it's not very popular.  However, you have used it, at least peripherally, because it's running this site.  And you also have at least a passing familiarity with the author of that application, because it's me. 

Motivation

Software is an "interesting" field.  The cool new technologies, frameworks, and languages get all the press and they're what everybody wants to work with.  But let's be honest: it's generally not what makes the money.  I mean, how could it be?  It just came out last week!

No, if you have the good fortune to work on a grown-up, profitable product, it's almost certainly going to be the "old and busted" tech.  It might not be COBOL or Fortran, but it's almost certainly "legacy code".  It might be C++ or Java or, in our case, PHP, but it's probably old, poorly organized, lacking unit tests, and generally hard to work with.

I work on such a product for my day job.  It's a 10-year-old PHP codebase, written in an old-fashioned procedural style.  There are no unit tests for the old code, and you couldn't really write them even if you wanted to.  Sure, there's a newer part with proper design, tests, etc., but the old code is the kind of stuff that "works for the most part", but everybody is afraid to touch it because it's so brittle and tightly coupled that God alone knows what will break when you make a change.

This also applies to LnBlog.  It was my very first PHP application.  I started it way back in early 2005, in the bad old days of PHP 4.  Over the next two or three years, I managed to turn it into something that was relatively functional and full-featured.  And for the last ten years or so, I've managed to keep it working.

Of course, it hasn't gotten a whole lot of love in that time.  I've been busy and, for the most part, it worked and was "good enough".  However, I occasionally need to fix bugs or want to add features, and doing that is a truly painful process.  So I would very much like to alleviate that pain.

The Issue

Let me be honest: I didn't really know what I was doing when I wrote LnBlog.  I was about four years out of school and had only been coding for about six or seven years total.  And I was working mostly in Visual Basic 6 at the time, which just barely counts.  It was also only my third web-based project, and the first two were written in classic ASP and VBScript, which also just barely counts.

As a result, it contains a lot of questionable design decisions and overly-complicated algorithms.  The code is largely procedural, kind of buggy, and makes poor use of abstraction.  So, in short, it's not great.

But, in fairness to myself, I've seen worse.  In fact, I've seen a lot worse.  It does have a class hierarchy for the domain objects (though it's a naive design), an abstraction layer for data access (though it's inconsistently used), and a templating system for separating markup from domain logic (though the templates are an ungodly mess).  And it's not like I had a role model or mentor to guide me through this - I was figuring out what worked on my own.  So while it's not great, I think it's actually surprisingly good given the circumstances under which it was built.

The Goal - Make the Code "Good"

So I want to make LnBlog better.  I've thought about rewriting it, but decided that I wouldn't be doing myself any favors by going that route.  I also hold no illusions of a grand re-architecture that will fix all the problems and be a shining beacon of design perfection.  Rather, I have a relatively modest list of new features and bug fixes, and I just want to make the code good enough that I can make changes easily when I need to and be reasonably confident that I'm not breaking things.  In other words, I want to do a true refactoring.

If you haven't read Martin Fowler's book, the word "refactoring" is not a synonym for "changing code" or "rewriting code".  Rather, it has a very specific meaning: improving the internal design of code without changing the external behavior.  In other words, all you do is make the code easier to work with - you don't change what it does in any way.  This is why people like Bob Martin tell you that "refactor X" should never be an item in your Scrum backlog.  It is purely a design and "code cleanliness" activity, not a "feature" you can deliver.

So my goal with LnBlog is to gradually reshape it into what it should have been in the first place.  This is partially to make changing it easier in the future.  But more importantly, it's a professional development goal, an exercise in code craftsmanship.  As I mentioned above, I've worked professionally on many systems that are even more messed up than LnBlog.  So this is a study in how to approach refactoring a system.  

And So It Begins...

My intention is to write a number of articles describing this process.  I've already completed the first step, which is rewriting some of the persistence and publication logic.  I'm using the PSP to track my planned and actual performance, so I'll have some actual data to use in my discussion of that process.  Hint: so far, the two are very different.

With any luck, this project will enable me to draw some useful or interesting conclusions about good and bad ways to approach reworking legacy systems.  Maybe it will also enlighten some other people along the way.  And if nothing else, I should at least get a better codebase out of it.

Ext Direct errors

Note to self: when Ext Direct calls start failing, look in the request headers for error messages.  I'm not sure whether it's Ext itself or just our framework, but for whatever reason, Ext Direct calls seem to more or less swallow server-side errors.

In this particular case, I was experimenting with some of the code that our in-house development framework uses to render maps.  We have OpenLayers on the front-end and a custom PHP back-end that we communicate with in part through Ext Direct, which is the handy-dandy AJAX RPC framework that comes packaged with Sencha's ExtJS.

So anyway, I made some changes, reloaded the page, and all my Ext Direct calls were failing.  No meaningful error messages, nothing in the JavaScript console, and the response body was just empty.  So what the heck was happening?  (Yeah, I know, I could have just run the unit tests, since we actually have unit tests for the framework code.  But that didn't occur to me because so much of the application code is missing them and I was just experimenting anyway.  Get off my back!) 

Then I noticed, just by chance, that the request headers in the network tab of Chrome's dev tools looked weird.  In particular, they contained this header:
X-Powered-By:PHP/5.3.21 Missing argument 1 for...

So that's what happened to the error message — it got dumped into the headers.  Not terribly helpful, but good to know.
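
For what it's worth, one generic way to make those errors harder to lose (this is just a sketch, not our framework's actual error handling) is to convert PHP errors into exceptions at the top of the RPC entry point, so they come back as a failed call instead of disappearing:
// Turn notices/warnings into exceptions so the RPC layer has something to report.
set_error_handler(function ($severity, $message, $file, $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
});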

Flask on ICDSoft

A year or two ago, I decided to update my skill set a little and brush up on my Python. In particular, I wanted to do a little web development in Python, which I hadn't done in nearly five years. Since I wanted to start with something fairly basic, I decided to go with the Flask micro-framework. I ended up using that to build Lnto, which I've mentioned before and I swear will get its own page someday.

One big problem with this project was that my hosting is kind of bare-bones. Don't get me wrong - the service is fine. I use ICDSoft, whom I've been with for several years. I have no complaints about the service I've received - in fact I'm quite happy with it. However, it's mostly focused on PHP and MySQL and there's no shell access. But on the other hand, I'm paying less than $5 a month with only a one-year commitment, so I don't have much to complain about.

Anyway, the problem with running Flask apps, or pretty much anything other than PHP, is that they have no documentation for that whatsoever. There's a FAQ, but it says absolutely nothing about Python other than that they "support" it. As far as I can tell, that just means that they have a Python interpreter installed and the Apache CGI handler is configured to run *.py files. There's certainly no mention of running things using WSGI, which seems to be the recommended method for running most Python frameworks.

Another problem is actually installing the Flask libraries. The documentation for, well, pretty much every Python framework or app out there tells you the best way to install it is with pip or easy_install. But, of course, you need shell access to run those, assuming they're even installed on the server. (And I did check - they're not installed.)

Fortunately, getting around these problems was relatively easy using virtualenv, which I'd nearly forgotten existed. This creates a virtual Python environment which is isolated from the rest of the system. A side-benefit of this is that virtualenv creates a copy of pip in your virtual environment.

You can use virtualenv directly from the source distribution. This was required in my scenario, since I lack any shell access, let alone root privileges. I simply extracted the virtualenv source archive, uploaded it to my host, and ran the following command (I used PHPsh, a web-based shell emulator, but copying the commands into a PHP script would have worked just as well):
python /path/to/virtualenv-1.11.4/virtualenv.py /path/to/myapp/venv

This creates a virtual environment in the directory you specify (/path/to/venv in the examples below). You can then install packages into that environment using the "activate" script to configure the shell, like this:
. /path/to/venv/bin/activate && pip install Flask

That was easy. I now have a Python environment with Flask installed. All I need to do at this point is configure my application code to use it. That's accomplished with a few lines to initialize the virtualenv environment and start up Flask as a CGI app:
# Activate the virtualenv from inside the script (Python 2 style)
activate_this = '/path/to/venv/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))

# Run the Flask WSGI app through the plain CGI handler
from myapp import app
from wsgiref.handlers import CGIHandler
CGIHandler().run(app)

I just re-ran that entire process using the latest version of virtualenv and it's actually quite painless.

And as a side-note, the reason I did that was because I noticed the other day that Lnto had suddenly stopped working - the server was just returning 500 errors. Which was odd because I hadn't changed anything with the app or the database in weeks. However, the answer was found on the virtualenv PyPi page:

Python bugfix releases 2.6.8, 2.7.3, 3.1.5 and 3.2.3 include a change that will cause "import random" to fail with "cannot import name urandom" on any virtualenv created on a Unix host with an earlier release of Python 2.6/2.7/3.1/3.2, if the underlying system Python is upgraded.

When I created that virtualenv environment, the server was running Python 2.6. But when I checked yesterday, the Python version was 2.7. So apparently ICDSoft upgraded their servers at some point. No big deal - just recreated the environment and I was good to go!

Ultimate PHP suckage list

The other day, one of my co-workers posted a link to a semi-comprehensive list of everything that's wrong with PHP. This should be required reading for anyone who claims to like PHP.

The best thing about this article is that I actually learned a number of things from it. Apparently PHP has a lot of WTFs that I simply hadn't run into. Though it's relatively minor, my personal favorite is:

"gzgetss - Get line from gz-file pointer and strip HTML tags." I'm dying to know the
series of circumstances that led to this function's conception.


I would also love to know what motivated someone to include this function in the zlib extension. I mean, how many people have ever actually used that function? Like, three? It's ridiculously specialized and isn't part of the C API (I know because I checked) - there's no reason for it to exist. To me, that perfectly sums up the language's lack of any coherent design or sense of priority.

Of course, there were plenty of other nice little gems. For instance, the fact that when $x is null, $x++ is 1, but $x-- is null. Or the notion that named parameters were rejected as a feature because they would "make for messier code". There's certainly lots of information and humor value there for anybody who is familiar with PHP.
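
And yes, the increment/decrement one really does work that way - it's easy to check:
$x = null;
var_dump(++$x);  // int(1) - incrementing null gives you 1
$y = null;
var_dump(--$y);  // NULL - decrementing null leaves it null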

Personally, after reading that, I think it's time for me to brush up on my C# and Python. I've been making my living writing PHP for five years now, and I fear it may be causing brain damage. After all, Edsger Dijkstra said that BASIC warps the mind, and PHP has been called "the Visual Basic of the web". And after reading this article, can there really be any doubt?

Streaming ZIP files in PHP

I ran into an interesting bug in OSX the other day. Well, I regard it as a bug, anyway - Apple may feel differently, as it's been there for a while.

I was using a PHP library called ZipStream. It basically lets you create a zip archive on the fly and stream it to the client as each file is compressed, as opposed to generating the entire archive on the server and then sending it to the client. This is nice because the user doesn't have to sit through a (potentially) long delay before the download starts - you can start sending data right away.

Anyway, the library was working wonderfully...until I tested the archive on my MacBook. Turns out that OSX doesn't like the archives that ZipStream generates. Or, rather, the OSX Archive Utility doesn't like them. When you double-click the archive in Finder, rather than properly decompressing it, OSX extracts it into a .cpgz file. And if you double-click that file, it extracts into another archive, and so on ad infinitum.

By way of contrast, everything else seems to be able to extract the archive normally. The built-in Windows zip handler manages it fine, as do WinRAR and 7-zip; on OSX, Safari's built-in zip handling transparently decompresses it without problems; even the OSX command-line "unzip" program handles it without problems. It's just the Archive Utility - which is, unfortunately, the default handler in Finder.

Luckily, the solution is pretty simple. It turns out that the OSX archive tool doesn't like the "version needed to extract" set by ZipStream. The value set for ZipStream's archives is 0x0603. If you change that to 0x000A, then the OSX Archive Utility will open the file normally, just like every other program. Of course, you have to modify ZipStream itself to get this to work, but that's not really a big deal - it's just a one-line change.
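
For reference, that field lives near the start of each ZIP "local file header".  Here's a hedged sketch of that layout (the variable names and flag values are mine, not ZipStream's actual code), just to show what the one-line change amounts to:
$generalFlags      = 0x0808;           // bit 3 (data descriptor, for streaming) + bit 11 (UTF-8 names)
$compressionMethod = 0x0008;           // deflate

$localHeader = pack('V', 0x04034b50)   // local file header signature ("PK\x03\x04")
             . pack('v', 0x000A)       // version needed to extract - the value that was 0x0603
             . pack('v', $generalFlags)
             . pack('v', $compressionMethod);
// ...followed by mod time/date, CRC-32, compressed/uncompressed sizes, and the file name.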

I'm not entirely sure why the OSX archiver doesn't like that version number. Perhaps that flag implies some other features that the Archive Utility doesn't support. Or maybe it requires additional metadata which wasn't set in the archive, and so it was technically out of spec. But to me, it really doesn't matter - either way, it's a bug in the OSX archiver. If the zip file was out of spec, they should just detect that and handle it, because everybody else does. And if the program doesn't support other features implied by that version number, then they should have either implemented those features (I mean, it's not like the ZIP format is new or anything) or they should have done the check based on the archive content rather than version number - if the file doesn't use any of the unsupported features, then there's no reason that the archiver shouldn't handle it correctly.

Pathological PHP

You know what annoys me? People with crazy ideas. Especially when they pimp them like crazy.

That's why this tutorial on "code separation" from a web forum I occasionally visit annoys me so much. The author links to this thing like his life depends on it. Whenever somebody has the nerve to post a code snippet that has both HTML and PHP in it, he brings it up. Even if the code is just echoing a variable inside an HTML block. It's ridiculous.

Don't get the wrong idea - separation of concerns is obviously a good thing. If you're outputting HTML and querying the database in the same file, you're doing things wrong. But this guy takes it to absurd lengths and insists that you should never have any PHP code mixed in with your HTML. Not even echo statements.

The real kicker is the content of this "tutorial". It's basically a half-baked template system that does nothing but string replacement of pre-defined placeholders. At best, it grossly oversimplifies the problem. I suppose it does demonstrate that it's possible to output a page without having HTML and PHP in the same file (as if anyone really doubted that), but that's about it.

The thing that really bothers me about this approach is that the author promotes it as resulting in code that's easier to understand than code that has both PHP and HTML in it. Except that it's not. The guy apparently just has a pathological fear of having two different languages in the same file. It's completely irrational.

The problem with his approach is that it doesn't actually solve the problem, but just moves it. Sure, basic replacement like that is fine for simple cases, but as soon as the requirements for your markup get more complicated, things blow up. For example, how do you do conditionals? Well, you have another template file and you do a check in your controller for which one to inject into the page. What about loops? Well, you have a template file for the loop content and you run the actual loop in your controller, build up the output, and inject that into the page.

The net result? What would normally be a fairly simple page consisting of one template with a loop and two conditionals is now spread across six templates (one main template, one for the loop body, and two for each if statement) and pushes all the display logic into the controller. So instead of one "messy" template to sort through, you now have a seven-file maze that accomplishes the same thing.

I find it difficult to see how this is any sort of improvement. At best, it's just trading one type of complexity for another in the name of some abstract principle that mixing code and markup is evil. Of course, if you want to follow that principle, you could always go the sane route and just use something like Smarty instead. But let's be honest - that's just using PHP code with a slightly different syntax. It may be useful in some cases, but it's not really fundamentally different from just writing your template files in PHP.

Personally, I've come to be a believer in keeping things simple. PHP is a template system. It was originally designed as a template system. It's good at it. So there's no need to introduce additional layers of template abstraction - just stick with PHP. There may be cases where things like Smarty are useful, but they're far from necessary. And the half-baked templating systems like those advocated in that tutorial are just intellectual abortions. There's no need to reinvent a less functional version of the wheel when you can just use a working, tested wheel.
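
Just to illustrate what I mean, here's the sort of thing the "tutorial" approach spreads across half a dozen files - a loop and a conditional handled inline in a plain PHP template (the variable names are made up for the example):
<ul>
<?php foreach ($entries as $entry): ?>
    <li>
        <?php echo htmlspecialchars($entry['title']); ?>
        <?php if (!empty($entry['is_draft'])): ?>
            <em>(draft)</em>
        <?php endif; ?>
    </li>
<?php endforeach; ?>
</ul>
One file, and you can see at a glance what it renders.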

Developer interviews

At my current company, we've been trying to hire a new back-end PHP developer for some time. As the senior engineer on staff, and the only back-end person, it's my job to handle the technical portion of the screening process, including resume screening and technical interview. Unfortunately, I have lots of other work to do, so I never had time to come up with a really good procedure to test candidates. In addition, most of the interviews are over the phone with people in other cities, which makes it difficult to have them write code without some advance preparation - which, again, I haven't had time to do.

So, as a compromise, I came up with a standard list of questions that I ask candidates. I originally wrote this list with the intention of using it for junior developers (1 to 3 years of experience), thinking that anyone with 5+ years of experience would breeze through it. However, I've found that a disturbingly large percentage of more experienced developers fumble more than one of the questions. Unfortunately, I'm not yet sure if that's just because we're getting poor quality applicants, or if it's because my standards are too high.

Therefore, in the interests of helping my fellow developers - both those looking for new opportunities and those who are interviewing the first group - as well as in the hopes of getting some good feedback, I present my list of interview questions. Note that I do not include answers. There are two reasons for this:
1) There is not always a single "right" answer - it is perfectly valid for the question to turn into a conversation.
2) If you're an interviewer and you don't already know the answer, you shouldn't be asking the question; if you're an interviewee and you don't know the answer, you should look it up and learn something new.
With each question I've included some notes on the question's intent. Some of these are conceptual, some are simply tests of knowledge, and some are spring-boards to further discussion.

  1. Solve the FizzBuzz problem. How would you solve it in an object-oriented fashion? Using recursion?
     OK, I hardly ever ask this question; you really need to be in the same room to make it work. However, it's still useful to see if the candidate can program and to see how they think. For the object-oriented and recursive versions, I'm not so much looking for the right answer as I am for the approach - and I tell them as much. In particular, I'm looking to see where the candidate goes with the open-ended requirement for object-orientation. Do they go for procedural code that just happens to use some objects? Or do they go straight to polymorphic classes, maybe with a factory pattern thrown in?
  2. Define "inheritance" and "polymorphism". Give an example of when you might use each.
     This is your "OOP 101" question. Any half-competent developer should be able to describe inheritance and give the stock "shape" or "animal" example. Polymorphism is a bit more slippery - I often don't get a good answer to that part of the question. In fact, a lot of people seem to think that polymorphism is basically the same thing as method overloading and don't really make the connection to inheritance.
  3. What is the difference between an "abstract class" and an "interface"?
     I actually ask this because one of our products is built primarily in FLEX, and interfaces abound in our ActionScript code. With this, I'm really interested in seeing if the candidate understands the idea of explicitly specifying a contract in the code, as opposed to relying on, e.g., PHP's loose typing to make things "just work".
  4. What is "SQL injection" and how might you protect against it? What about XSS/CSRF?
     This is, of course, the basic security question. On the SQL injection side, I'm specifically looking for awareness of prepared statements, e.g. in MySQLi or PDO. More generally, I'm looking for an awareness of filtering and validation. On the script injection side, I like to see what candidates say they use to filter input. An awareness of standard escaping and sanitizing functions (strip_tags(), htmlentities(), etc.) is always good - knowledge of the filter extension gets bonus points.
  5. In JavaScript, what does it mean to say that functions are "first class" or "first order" objects?
     This is a lead-in to determine how much, if anything, the candidate knows about the more functional aspects of JavaScript - anonymous functions, closures, etc. Surprisingly (at least to me), hardly anyone seems to be familiar with the terminology of "first class" entities. So far I've only had two people know what I was talking about right off the bat.
  6. What is the difference between NULL and the empty string in SQL?
     This one is more of a trivia question to test the depth of knowledge of SQL. Bonus points if the candidate mentions SQL's three-valued logic in the follow-up question: "What is the result of the comparison NULL = ''?"
  7. What is an "index" in a database and how would you decide what columns to index?
     This one is just to check that the candidate has some basic knowledge of database performance considerations.
  8. Given a sample HR database that contains the following tables:
     Employees (employee_id INTEGER PRIMARY KEY, employee_name TEXT, ...)
     Departments (department_id INTEGER PRIMARY KEY, department_name TEXT, ...)

     What would you have to add to the schema to associate employees with departments?

     This is the basic SQL competency question. The typical answer is to add a foreign key from Employees to Departments, at which point I spring the "what if an employee needs to be in two departments" scenario and see if they come up with a join table. Also, it gives a good opportunity to see if they understand when to use inner versus outer joins. It's surprising how many people struggle with this question.
  9. Describe a project you worked on that was particularly enjoyable or interesting. What did you like about it?
     I stole this from an interview I did once where I was the applicant. It can be enlightening to see what catches a person's interest. If nothing else, it can help you get an idea of where their head is; are they a hard-core CS person, a touchy-feely UI person, or something else entirely?

Personally, I consider this list to be inadequate at best. But as I said, I haven't had time to develop an adequate test and this is at least better than asking "So how well do you know X?", if for no other reason than it's harder to BS an acceptable answer. Any thoughts or opinions are welcome.

PHP uploads on IIS

I sometimes forget that UNIXy things aren't as easy on Windows.

Today I spent way too long trying to figure out why PHP file uploads weren't working on IIS 7. First, I was getting permission denied messages on "C:\Windows\Temp". OK, that's fine, we'll just change upload_tmp_dir to something else, say "C:\Inetpub\wwwroot\uploads".

So I changed that in my php.ini and restarted IIS. The result? Same permission denied message on "C:\Windows\Temp". Hmm.... Apparently, according to the docs, if you don't have write permissions on the upload_tmp_dir, PHP falls back to the system default. So I checked the permissions - IIS_IUSERS has read/write permissions. Apparently that's not good enough. I ended up adding write permissions for the "Users" group, which didn't sit well, but worked. However, after I removed write permissions for that group, just to check if that was the fix, it still worked. (Edit: Apparently it actually didn't work after that. Man, I really must have been out of it last night.)

So now, uploads work, but I have no idea what's going on or what the proper ACLs are. It's probably just too late and I need to go to bed. But for future reference, make sure to change the permissions on Windows when working with uploads.

Good intentions, bad idea

Today I'm going to discuss a comedy of errors. This starts out with a nasty bug that surfaced in my company's product a couple of months ago, and finally became clear when I was doing some prep work for implementing a CDN. It's a tale of good intentions, bad ideas, and code I didn't write, but have to fix.

First, the bug.

To explain the bug, I have to tell you a little about how my company's new product works. Basically, it's a drag-and-drop UI builder for Flash ads. The user designs their ad on a design surface in a FLEX app, saves it, and can then publish the result. However, rather than actually compile a SWF for the ad, we're currently assembling all the assets at run-time on the client-side. Our ad tags serve up a shell SWF file and inject a few parameters into it, including a URL to a "recipe" file that we use to cook up the ad. This is just an XML file that contains the information for the ad's constituent elements. The SWF file parses it and pulls/creates all the needed objects to build the ad. There were various reasons for doing it this way, but I won't get into those.

Now, this bug was a real head-scratcher. The actual problem was simple - the shell SWF just wasn't rendering the ad. No error, no message - just didn't work. However, it only happened in IE - Firefox, Chrome, Opera, and Safari worked just fine. It also only happened in our production and test environments - our dev servers and demo server worked fine in IE. The SWF files were identical in every environment - I know because I triple checked. What's more, I could see the XML file being requested in the server logs, so the SWF wasn't totally crapping out. And, again, it worked in other browsers, so it didn't seem like there could be an issue with the SWF.

Well, after researching and messing around with this for the better part of a day, our QA person found a link that put us on the right track. In fact, there are a bunch of such links. It turned out to be an issue with the HTTP headers on the XML file. The file was being served over SSL with the "no-cache" header set. Turns out IE doesn't like this. Apparently "no-cache" keeps the file completely out of cache when used on SSL, which means not even long enough for the browser to pass it off to the Flash plugin. Apparently we would have seen an error for this if we did the SWF file in ActionScript 3, but we used ActionScript 2 (apparently most of our customers require ads to be in AS 2 - don't ask me why), which has a penchant for failing silently. And the reason it didn't happen in all four environments is because the Apache configurations were actually not all the same. Go figure.
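
The immediate fix went into the Apache config for the affected environments, but the gist of it, sketched here as a PHP response header rather than our actual Apache directives, is simply to stop marking the file uncacheable when it's going to IE over SSL:
// Keeps IE happy over SSL while still discouraging shared caches from holding the file.
header('Cache-Control: private, max-age=0');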

Fast forward two months to the discovery of the cause.

I'm looking to implement a CDN. We've got one that does origin pull, so it shouldn't be a big deal, right? Well, yeah, but we still have to make some changes, because those "recipe" files are on the same path as the rest of our media and it won't do to have the CDN caching them when users are editing them and trying to preview the output. So I need to fix it so that, at least for the previews, we can serve the recipes from our own server instead of the CDN.

A few months ago, when we implemented user uploads, we (and by "we" I mean "another guy on my team") added the concept of a "media URL" to our system. The idea is that we could just change this media URL to switch to a CDN without having to change any URLs in code. This was implemented as a method on one of our back-end classes. It would return the base domain and path, if applicable, from which media is served. So building a URL would look like this:
$image_url = ObfuscatedAdClassModel::getMediaURL();
$image_url .= '/path/to/naughty_pic.jpg';

The getMediaURL() method just checks the database/Memcache for a saved URL and returns the saved value or a static default if nothing is found. Easy-peasy.

Or not. You see, I exaggerated in that last sentence - what I meant was, that's what getMediaURL() should do. In actuality, it does a bit more. In fact, as an object lesson in overcomplication, I'm posting the redacted code below.

static public function getMediaURL(){
   global $cache_enabled;

   $db = self::getDB();
   if (!is_object($db)) {
      throw new Exception('Error getting database connection');
   }

   $settings_key='media_url';
   
   if( $cache_enabled ) {
      $settings_value = $db->getCache($settings_key);
   }

   if (!$settings_value){
      $settings_value=DEFAULT_MEDIA_URL;
      $query = "SELECT value FROM settings WHERE key='$settings_key'";
      $ret = $db->query($query);
      if (is_array($ret[0])) {
         $settings_value = $ret[0]["value"];
      } elseif ($ret === true){
         try{
            $db->startTrans();
            $have_lock = $db->query( "LOCK TABLE settings IN ACCESS EXCLUSIVE MODE;" );
            if ($have_lock){
               $query = "SELECT value FROM settings WHERE key='$settings_key'";
               $ret = $db->query($query);
               if (!is_array($ret[0])) {
                  $query = "INSERT INTO settings (key, value) VALUES ('$settings_key','$settings_value')";
                  $ret = $db->query($query);
                  if($ret !== true){
                     $err_msg = "Unable to insert setting: $settings_key";
                     Log::error($err_msg);
                     throw new Exception($err_msg);
                  }
               }
            } else {
               $err_msg = "Could not acquire table lock or table does not exist: settings ".$db->getLastError();
               Log::error($err_msg);
               throw new Exception($err_msg);
            }
            $db->commitTrans();
         } catch (Exception $e){
            $db->rollbackTrans();
            $err_msg = 'DB error while attempting to update settings.';
            Log::error($err_msg);
            throw new Exception($err_msg);
         }
      } else {
         $err_msg = 'DB error while attempting to load settings.';
         Log::error($err_msg);
         throw new Exception($err_msg);
      }
      
      if( $cache_enabled ) {
         $db->setCache($settings_key, $settings_value, 3600);
      }
   }
   if ( $_SERVER["SERVER_PORT"] == 443 ){
      return "https://" . $settings_value . "/";
   } else {
      return "http://" . $settings_value . "/";
   }
}

This should be just a simple look-up in a generic settings table. However, when the desired setting is not found, this method tries to insert it into the database. This is completely unnecessary and leads us to five levels of nested "if" blocks and a table lock. We already have the default value, so why not just return that? And the table lock is just paranoid. This is not a setting that's likely to change more than once every couple of months. And if we do get a minor inconsistency, so what? You think users are going to complain that an ad didn't render properly?
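
Just to be clear about what I'm complaining about, here's roughly what I think the method should have been. This is a sketch, not the actual replacement code - it keeps the same helpers and the port-based scheme selection, and ignores the cache-enabled flag and error handling for brevity:
static public function getMediaURL(){
   $db = self::getDB();
   $settings_key = 'media_url';

   // Check the cache, then the database, then fall back to the default.
   $settings_value = $db->getCache($settings_key);
   if (!$settings_value) {
      $ret = $db->query("SELECT value FROM settings WHERE key='$settings_key'");
      $settings_value = (is_array($ret) && isset($ret[0]['value'])) ? $ret[0]['value'] : DEFAULT_MEDIA_URL;
      $db->setCache($settings_key, $settings_value, 3600);
   }

   $scheme = ($_SERVER['SERVER_PORT'] == 443) ? 'https' : 'http';
   return $scheme . '://' . $settings_value . '/';
}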

Note also that the returned URL is set to straight HTTP or SSL based on the current server port. Hint: this will be important later.

Now, the reason I'm looking at this is because I need to adjust how the URLs to the recipe files are handled in our system. Given our media URL scheme, if we set our static content to serve from the CDN, the recipes will go with it. For previewing in-progress ads, this won't work. So I need to change the recipe URLs to use something other than the media URL. So I find where the recipe URL is injected into the client-side code and start tracing it backward.

To my surprise, I find that the recipe URL is actually not set dynamically using getMediaURL(). It turns out it's coming from our back-end ad object, via a getMetadata() getter method, as a full, absolute URL. And what is this metadata? Well, it's an associative array of seemingly random data that we serialize and cram in the database. And by "we", I mean the same guy who wrote getMediaURL(). For the record, I told him it was a bad idea.

So if the recipe URL is coming out of this metadata, what does that mean? That it's stored in the database. So I start grepping for the key for the metadata of the recipe URL. And I find it in the web service method in our management dashboard that saves the XML files.

Let's pause here. Now, if you're very sharp, you may have asked yourself earlier why we were serving out this XML file over SSL. We're serving ads right? They don't need to go over SSL. And this Flash problem was specific to SSL, so if we just served them over straight HTTP, we should have been good. So why didn't we do that?

That's a good question. It had occurred to me after I fixed that bug (by changing the Apache configuration), but when I looked, I couldn't find where the URL was set to HTTPS. Plus I didn't know why we were serving them over HTTPS, so I assumed there must be some reason for it. And besides, the bug was "fixed" and I had lots of other work to do at the time, so it wasn't a priority.

Again, if you're sharp, you can probably see that there was no good reason. The recipes were being served over SSL by accident! You see, our management dashboard, which is where the recipe file is saved, runs over HTTPS, and getMediaURL() selects the protocol based on the current one. So when we called getMediaURL() to build the recipe URL to save in the database, it came back as an HTTPS URL.

So there you have it. A crazy, hard to diagnose bug, caused by a method that's too smart for its own good, and hidden by an ill-conceived half-abstraction. I hate to speak negatively of a friend and former colleague, but this was really a lesson in poor system design. He needed to separate things out a bit more. He should have done less in getMediaURL() and factored the generic "metadata" out into separate properties rather than lumping them all together.

Situations like this are why we have guidelines in programming. Things like "don't put serialized strings in the database", "a method should do one thing", and "don't use global variables" can seem arbitrary to the inexperienced. After all, it's so much quicker and easier to do all those things. But those of us who've been around the block a few times get that queasy feeling. We know there's a reason for those guidelines - those things can easily come back and bite you hard later on. Sure, they might not become a problem, but if they do, it's going to be much harder to fix them later than to "do it right" the first time. The hard part of software development is not "making things work" - a trained chimp can do that. The real art is in keeping things working over the long haul. That's what separates the real pros from the ones who are just faking it.

The @depends annotation in PHPUnit 3.4

OK, I'm not crazy! At least, not for the reason I thought.

I was playing with adding some dependencies to my Selenium PHPUnit tests. See, PHPUnit 3.4 has this handy little test dependency feature, the @depends annotation. It's basically a PHPDoc comment that includes a test name. You add that to a test and, if PHPUnit hasn't run the dependency yet, it skips the current test.
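
If you haven't seen it, the usage looks something like this (a trimmed-down example, not my actual Selenium tests):
class DependsExampleTest extends PHPUnit_Framework_TestCase
{
    public function testLogin()
    {
        // Imagine this drives Selenium through the login form.
        $this->assertTrue(true);
    }

    /**
     * @depends testLogin
     */
    public function testDashboard()
    {
        // Skipped automatically if testLogin didn't pass - or, with the bug
        // described below, skipped on Windows no matter what.
        $this->assertTrue(true);
    }
}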

Well, I'm working on Windows. And every time I added an @depends to a test, PHPUnit would skip it. Even if the dependency had run! Even on the example from the documentation! I thought I was losing my mind.

However, it turns out that it's a known bug in PHPUnit. Basically just a line ending issue. The bug ticket has a 2 character patch that fixes it, or you can just save your files with UNIX line endings. Go figure.

Update: Wow, talk about weird timing. It turns out this issue was fixed in PHPUnit 3.4.1, which was released the day after I posted this.

A little Selenium and PHPUnit oddity

Here's a weird little "feature" I came across the other day. It seems that Selenium discriminates between spacing in source and rendered HTML. I suppose this shouldn't be that surprising, but I just hadn't thought to account for it.

Here's what happened: I was writing a Selenium test case using PHPUnit's Selenium extension and I kept getting a failure on a simple assertTextPresent() call. I checked the page manually, and the text was definitely there. I selected the text and right-clicked to try the assertion in Selenium IDE, and it worked there. Then I tried copying the expected string out of my PHP test case and checking that in the IDE, and that's where it failed.

The only difference: one extra space in the PHP string. The real kicker is that that space is present in the page source. But when I stopped to think about it, it makes perfect sense. Selenium interacts with the page through the DOM, and multiple spaces get rendered as one in HTML. The source is irrelevant - it's the rendered code that counts. I'll have to remember that for next time.
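
In other words, something like this (the page and strings are made up, but it's the same shape as my actual test):
class SpacingTest extends PHPUnit_Extensions_SeleniumTestCase
{
    protected function setUp()
    {
        $this->setBrowser('*firefox');
        $this->setBrowserUrl('http://www.example.com/');
    }

    public function testCollapsedSpaces()
    {
        $this->open('/some/page');
        // The page *source* contains "Items  found:" with two spaces, but Selenium
        // checks the rendered DOM text, where they collapse to one.
        $this->assertTextPresent('Items found:');     // passes
        // $this->assertTextPresent('Items  found:'); // this is the one that fails
    }
}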

Apparently we were "hacked"

You know that weird jump in auto-incremented IDs we saw in our database at work the other day? Well, we discovered the cause: apparently we were "hacked".

I use the work "hacked" in the loosest sense of the word. The culprit contacted us the next day, making a number of demands and issuing veiled threats. I may get into the details in another post (it's a long story), but the point is that he appeared to have, at best, a script kiddie level understanding of security. He pointed out a number of flaws in the site, but the ID incrementing was the only one that could actually have caused us any real trouble - and he didn't even mention it! We assume that he actually did that by accident and didn't realize the potential implications if we ran out of ID numbers.

The flaw that allowed the ID incrementing was actually quite simple. No SQL injection or cross-site scripting. It was a simple case of passing too much data into a function.

Here's how it works. Our site, which runs on PHP and MySQL, is built on a custom MVC framework. It uses a custom Active Record-style ORM, which is where the flaw lies. The ORM performs database updates and inserts by validating the object's fields against the database schema. Basically, it reads the schema from the database, compares that with the data to be updated, and constructs the SQL accordingly. So when you execute the save() method on an object, it will save the values for the fields that appear in the table schema, but ignore any other fields in the object. The ORM also has static update() and insert() methods that take an associative array, mapping the indexes to field names and performing this same validation. So if you have an array of data, only some of which maps to actual columns in the underlying table, you can just pass the whole thing and not have to go through and separate out the fields you need to save.

That last point is where we got in trouble. We have a method that adds items to our media table. It takes an array of data, does some sanitizing and validation, calls the insert() method to add it to the media table, and adds appropriate records to other tables. The problem was that, in the place where this was called most frequently, we were passing in $_POST as the data array. And while this method did sanitize the fields that we wanted to add to the database, it didn't check for extra fields that just happen to be valid fields in the media table. So, to make a long story short, if you were to put an "id" field in the POST and assign it an integer value, our ORM would happily add that field and value to the INSERT statement it sent to the database.
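
To sketch the shape of it (the class and field names here are invented - this isn't our actual code), the vulnerable version hands the raw POST data straight to the ORM, and the fix is just to whitelist the columns you actually intend to set:
// Before: anything in $_POST that happens to match a column in the media table
// gets saved - including an "id" the caller has no business setting.
MediaModel::insert($_POST);

// After: only the fields we expect are allowed through.
$allowed = array('title', 'description', 'filename', 'mime_type');
MediaModel::insert(array_intersect_key($_POST, array_flip($allowed)));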

Of course, this was easily fixed. In fact, it wasn't even hard to find. I did a fair amount of work on our ORM at the beginning of the year, so once I made the connection between the "hacker" and the ID number jump, the source of the bug was immediately obvious. It's just one of those things that nobody ever thought to check until it became a problem.

So the moral of the story is: security is all about attention to detail. Following the "rules" is all well and good, but it's not enough. In our case, we were sanitizing data to protect against XSS attacks and using PDO prepared statements to protect against SQL injection attacks, but it wasn't enough. By forgetting to check for unexpected additional input, we left ourselves open to a completely different type of attack. Of course, it's a significantly less serious class of attack - maxing out our auto-incrementing IDs is recoverable, if annoying - but it's still an issue.

With any luck some good will come out of this. I think we've all learned to be a little more mindful of such issues. And perhaps this will act as a cue to management that maybe - just maybe - it would be better to do some actual testing and review of new code before it's released, rather than just pushing things into production and hoping they work.

PHP is developed by morons

Well, it's official: the people who develop PHP are morons. Or, rather, the people responsible for adding namespaces to PHP 5.3 are.

Why do I say this? Because I just read an announcement on Slashdot that they've decided on the operator to use for separating namespaces in PHP 5.3: the backslash (\).

Seriously? The friggin' backslash? What kind of choice is that? Last I knew they'd pretty much decided to go with the double colon (::), like C++, which at least makes sense. But the backslash?

What's worse, just look at the RFC listing the operators they were considering. In addition to the backslash, they had the double star (**), double caret (^^), double percent (%%), a shell prompt (:>), a smiley face (:)), and a triple colon (:::). For God's sake, it looks like they picked this list out of a hat. They might as well have just used the string NAMESPACESEPARATOR. It's no less absurd than any of those.

Now, let's be realistic for a minute. In terms of syntax, PHP is a highly derivative language. It's an amalgamation of Perl, C++, and Java, with a dash of a few other things thrown in.

Given that heritage, there's really only a handful of choices for namespace separators that even make sense. The first, and most natural, is the double colon (::). This is what C++ uses and it's already used for static methods and class members in PHP. So the semantics of this can naturally be extended to the generic "scope resolution operator." Keeps things clean and simple.

The second choice is the dot (.), which is what's used in Java, C#, Python, and many others. This is a bit unnatural in PHP, as dot is the string concatenation operator, but it at least offers consistency with other related languages.

Third is...actually, that's it. There are only 2 valid choices of namespace separator. And the PHP namespace team didn't pick either one. Nice work guys.

The Slashdot article also linked to an interesting consequence of the choice of backslash: it has the potential to mess up referencing classes in strings. So if your class starts with, say, the letter "t" or "n", you're going to have to be very careful about using namespaces in conjunction with functions that accept a class name as a string. Just what we needed. As if PHP isn't messed up enough, now the behaviour of a function is going to depend on the names of your classes and the type of quotes you use.
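
To spell that out: in a double-quoted string, sequences like "\t" and "\n" are escapes, so a namespaced class name can silently turn into something else entirely (the class names here are made up):
var_dump("App\tools\Renderer");   // the "\t" has turned into a tab character
var_dump('App\tools\Renderer');   // single quotes leave the backslashes alone
var_dump("App\\tools\\Renderer"); // doubling the backslashes in double quotes also works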

I guess I'm going to have to bone up on my C#, because PHP seems to be going even farther off the deep end than before. It was always a thrown-together language, but this is just silly. The backslash is just a stupid choice for this operator and there's just no excuse for it.

The impressiveness of ugly hacks

If you've been a programmer for any length of time, you've probably seen lots of hacks designed to work around limitations of a particular language or toolkit. You may have even come up with some yourself. They're a mainstay of programming.

The thing about hacks is that we all know we're not really supposed to use them. At least, those of us who are moderately competent know that. Of course, sometimes we use them anyway, either because we have no choice or because the alternatives are just as bad (like all the nasty CSS hacks for Internet Explorer), but we're generally not happy about it. Coming up with hacks is more like a game. It's an excuse to push the technology to its limits, to see how far we can bend it to get what we want. And if you have enough knowledge and, just as importantly, imagination, you can do some pretty impressive things.

I started thinking about this last night when I came across what I considered a very impressive hack for PHP. The limitation it addresses is that PHP binds the self keyword at compile-time, rather than run-time. The upshot of that is that if you have a static method foo() in a base class Fizz and override it in a child class Buzz, any methods in your base class that call self::foo() will always use the implementation in Fizz, even if they're called through Buzz.

There was a work-around for this limitation in the comments on the get_class() page on php.net. This particular hack used a trick I never would have thought to consider - using the results of the debug_backtrace() function to loop through the call-stack and determine the class of the calling method.
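
The gist of it looks something like this. This is my own reconstruction of the idea, not the exact code from that comment - and in PHP 5.3, late static binding (the static keyword) makes the whole thing unnecessary:
class Fizz {
    public static function create() {
        // Walk the call stack and use the class of the calling method, so a call
        // made from inside a subclass resolves to that subclass instead of Fizz.
        $class = __CLASS__;
        foreach (debug_backtrace() as $frame) {
            if (isset($frame['class']) && is_subclass_of($frame['class'], __CLASS__)) {
                $class = $frame['class'];
                break;
            }
        }
        return new $class();
    }
}

class Buzz extends Fizz {
    public static function make() {
        return self::create();   // with the hack, this gives you a Buzz, not a Fizz
    }
}

var_dump(get_class(Buzz::make()));   // "Buzz"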

Just stop and think about that - using a debugging function to control how calls to class methods are resolved. It's both brilliant and completely insane. It just feels wrong - like you're going way too far for a piece of functionality that's easily achieved by using a different approach (i.e. instance methods). If I ever saw one of my co-workers put that in our codebase, I'd beat him in the head with a keyboard until he apologized. And yet, at the same time, it makes me smile, because this is one of those things you really shouldn't be able to do, and yet you can.