At least it's a resume builder

Note: This is a short entry that's been sitting in my drafts folder since March 2010, i.e. from half a career ago. My "new" job at the time was with an online advertising startup. It was my first and only early-stage startup experience. In retrospect, it was useful because it exposed me to a lot of new things, not only in terms of technology, but also people, processes, and ways of approaching software development (looking at things from the QA perspective was particularly eye-opening). It was not, however, enjoyable. I've also worked for later-stage startups and found that much more enjoyable. Sure, you don't get nearly as much equity when you come in later, but there's also less craziness. (And let's face it, most of the time the stock options never end up being worth anything anyway.)

Wow. I haven't posted anything in almost six months. I'm slacking. (Note: If only I'd known then that I wouldn't publish this for nine years...)

Actually, I've been kind of busy with work. I will have been at the "new" job for a year next month. The first six months I was doing the QA work, which was actually kind of interesting, as I'd never done that before. I did some functional test automation, got pretty familiar with Selenium and PHPUnit, got some exposure to an actual organized development process. Not bad, overall.

On the down side, the last six months have been a bit more of a cluster file system check, if you get my meaning. Lots of overtime, throwing out half our existing code-base, etc. On the up side, I've officially moved over to development and we're using Flash and Flex for our new product, which are new to me.

The good part: Flex is actually not a bad framework. It's got its quirks, but it's pretty powerful and, if nothing else, it beats the pants off of developing UIs in HTML and JavaScript. And while it's not my favorite language in the world, ActionScript 3 isn't bad either. It's strongly typed, object-oriented, and generally fairly clean.

The bad part: Flash is not so nice. It wasn't quite what I was expecting. I guess I assumed that "Flash" was just a design environment for ActionScript programming. Actually, it's more of an animation package that happens to have a programming language bolted onto it. The worst part is that our product requires that we do the Flash portion in ActionScript 2, which seriously sucks. I mean, I feel like I'm back in 1989. And the code editor in Flash CS4 is...extremely minimal. As in slightly less crappy than Windows Notepad. I am seriously not enjoying the Flash part.

(Note: On the up side, none of this matters anymore because Flash is now officially dead.)

Running my own calendar server

Note: This is an article I started in October of 2012 and never finished. Fortunately, my feelings on the issue haven't changed significantly. So I filled it out into a real entry. Enjoy!

As I alluded to in a (not so) recent entry on switching RSS readers, I'm anti-cloud.

Of course, that's a little ambiguous. The fact is, "cloud" doesn't really mean anything anymore. It's pretty much come to refer to "doing stuff on somebody else's server." So these days we refer to "having your e-mail in the cloud" rather than "using a third-party webmail service," like we did 15 years ago. But really it's exactly the same thing - despite all the bells and whistles, GMail is not fundamentally different than the Lycos webmail account I used in 1999. It still amounts to relying entirely on some third-party's services for your e-mail needs.

And if the truth were known, I'm not even really against "the cloud" per se. I have no real objection to, say, hosting a site on an Amazon EC2 or Windows Azure instance that I'm paying for. What I really object to is the "public cloud." You know, all those "cloud services" that companies offer for free - things like GMail and Google Calendar spring to mind.

And it's not even that I object to using these services. It's just that I don't want to rely on them for anything I deem at all important. This is mostly because of the often-overlooked fact that users have no control over these services. The providers can literally cut you off at a moment's notice and there's not a thing you can do about it. With a paid service, you at least have some leverage - maybe not much, but they generally at least owe you some warning.

There are, of course, innumerable examples of this. The most recent one for me is Amazon Music. They used to offer a hosting service where you could upload your personal MP3 files to the Amazon cloud and listen to them through their service. I kinda liked Amazon Music, so I was considering doing that. Then they terminated that service. So now I use Plex and/or Subsonic to listen to my music straight from my own server, thank you very much!

As a result, I have my own implementation of a lot of stuff. This includes running my own calendar server. This is a project that has had a few incarnations, but that I've always felt was important for me. Your calendar is a window into your everyday life, a record of every important event you have. Do you really want to trust it to some third party? Especially one that basically makes its money by creating a detailed profile of everything you do so that they can better serve ads to you? (I think we all know who I'm thinking of here....)

For several years I used a simple roll-your-own CalDAV server using SabreDAV. That worked fine, but it was quite simple and I needed a second application to provide a web-based calendar (basically a web-based CalDAV client). So I decided to switch to something a little more full-featured and easier to manage.

So these days, I just run my own OwnCloud instance. At its core, OwnCloud is basically a WebDAV server with a nice UI on top of it. In addition to nice file sync-and-share support, it gives me web-based calendar and contact apps with support for CalDAV and CardDAV respectively. It also has the ability to install additional apps to provide more features, such as an image gallery, music players, and note-taking apps. Most of the more impressive apps are for the enterprise version only, or require third-party services or additional servers, but all I really wanted was calendar and contact support.

To get the full experience, I also use the OwnCloud apps on my laptop and phone to sync important personal files, as well as the DAVx5 app on my phone to synchronize the Android calendar and contacts database with my server. Overall, it works pretty well and doesn't really require much maintenance. And most important, I don't have to depend on Google or Amazon for a service that might get canned tomorrow.

LnBlog Refactoring Step 2: Adding Webmention Support

About a year and a half ago, I wrote an entry about the first step in my refactoring of LnBlog.  Well, that's still a thing that I work on from time to time, so I thought I might as well write a post on the latest round of changes.  As you've probably figured out, progress on this particular project is, of necessity, slow and extremely irregular, but that's an interesting challenge in and of itself.

Feature Addition: Webmention

For this second step, I didn't so much refactor as add a feature.  This particular feature has been on my list for a while and I figured it was finally time to implement it.  That feature is webmention support.  This is the newer generation of blog notification, similar to Trackback (which I don't think anyone uses anymore) and Pingback.  So, basically, it's just a way of notifying another blog that you linked to them and vice versa.  LnBlog already supported the two older versions, so I thought it made sense to add the new one.

One of the nice things about Webmention is that it actually has a formal specification that's published as a W3C recommendation.  So unlike some of the older "standards" that were around when I first implemented LnBlog, this one is actually official, well structured, and well thought out.  So that makes things slightly easier.
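
To give a concrete idea of what the protocol looks like on the wire, sending a webmention amounts to discovering the target page's webmention endpoint and then POSTing two form fields to it.  Here's a minimal sketch of that flow in PHP - this is not the code I actually wrote for LnBlog, just an illustration based on my reading of the spec, with placeholder URLs (a real implementation also has to check the HTTP Link header and anchor tags, resolve relative URLs, and so forth):

    <?php
    // Step 1: Fetch the target page and look for its webmention endpoint.
    // (Sketch only - the spec also requires checking the Link header, etc.)
    function discoverWebmentionEndpoint(string $targetUrl): ?string {
        $html = @file_get_contents($targetUrl);
        if ($html === false) {
            return null;
        }
        $pattern = '/<link[^>]+rel=["\']?webmention["\']?[^>]*href=["\']([^"\']+)["\']/i';
        return preg_match($pattern, $html, $m) ? $m[1] : null;
    }

    // Step 2: POST the source and target URLs as an ordinary form submission.
    function sendWebmention(string $endpoint, string $source, string $target): int {
        $ch = curl_init($endpoint);
        curl_setopt_array($ch, [
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => http_build_query(['source' => $source, 'target' => $target]),
            CURLOPT_RETURNTRANSFER => true,
        ]);
        curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
        return $status;  // any 2xx response means the receiver accepted (or queued) the mention
    }

    $endpoint = discoverWebmentionEndpoint('https://example.com/their-post');
    if ($endpoint !== null) {
        // source = the page that contains the link; target = the page being linked to
        sendWebmention($endpoint, 'https://example.org/my-post', 'https://example.com/their-post');
    }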

Unlike the last post, I didn't follow any formal process or do any time tracking for this addition.  In retrospect I kind of wish I had, but the work was very stop-and-start and I didn't think about tracking until it was too late to matter.  Nevertheless, I'll try to break down some of my process and results.

Step 1: Analysis

The first step, naturally, was analyzing the work to be done, i.e. reading the spec.  The webmention protocol isn't particularly complicated, but like all specification documents, it looks much more complicated than it is once you put all the edge cases and optional portions together.

I actually looked at the spec several times before deciding to implement it.  Since my time for this project is limited and only available sporadically, I was a little intimidated by the unexpected length of the spec.  When you have maybe an hour a day to work on a piece of code, it's difficult to get into any kind of flow state, so large changes that require extended concentration are pretty much off the table.

So how do we address this?  How do you build something when you don't have enough time to get the whole thing in your head at once?

Step 2: Design

Answer: you document it.  You figure out a piece and write down what you figured out.  Then the next time you're able to work on it, you can read that and pick up where you left off.  Some people call this "design".

I ended up reading through the spec over several days and eventually putting together UML diagrams to help me understand the flow.  There were two flows, sending and receiving, so I made one diagram for each, which spelled out the various validations and error conditions that were described in the spec.

[Diagram: workflow for sending webmentions]
[Diagram: workflow for receiving webmentions]
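
To give a flavor of what the receiving diagram covers, the spec's required checks boil down to something like the following.  Again, this is just a sketch based on my reading of the spec, not LnBlog's actual code, and the function name is made up:

    <?php
    // Sketch of the receive-side validation required by the Webmention spec.
    function validateIncomingWebmention(string $source, string $target, string $myBlogUrl): void {
        // Both parameters must be present and must be http(s) URLs.
        foreach ([$source, $target] as $url) {
            if (!filter_var($url, FILTER_VALIDATE_URL) || !preg_match('#^https?://#i', $url)) {
                throw new InvalidArgumentException("Not a valid URL: $url");
            }
        }

        // The source and target must not be the same page.
        if ($source === $target) {
            throw new InvalidArgumentException('Source and target must be different');
        }

        // The target must be a page that this blog actually serves.
        if (strpos($target, $myBlogUrl) !== 0) {
            throw new InvalidArgumentException('Target is not one of our pages');
        }

        // Finally, fetch the source and confirm that it really does link to the target.
        $sourceContent = @file_get_contents($source);
        if ($sourceContent === false || strpos($sourceContent, $target) === false) {
            throw new RuntimeException('Source does not appear to mention the target');
        }
    }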

Those diagrams were really all I needed as far as design for implementing the webmention protocol.  It's pretty straightforward and I made the diagrams detailed enough that I could work directly from them.  The only real consideration left was where to fit the webmention implementation into the code.

My initial thought was to model a webmention as a new class, i.e. to have a Webmention class to complement the existing TrackBack and Pingback classes.  In fact, this seemed like the obvious implementation given the code I was working with.  However, when I started to look at it, it became clear that the only real difference between Pingbacks and Webmentions is the communication protocol.  It's the same data and roughly the same workflow and use-case.  It's just that Pingback goes over XML-RPC and Webmention uses plain-old HTTP form posting.  It didn't really make sense to have a different object class for what is essentially the same thing, so I ended up re-using the existing Pingback class and just adding a "webmention" flag for reference.
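
In code terms, the change was basically trivial - conceptually something like this (illustrative only; the real class has more to it than this, and the flag name here is just for the sake of example):

    <?php
    // Illustrative sketch - not the actual LnBlog class.
    class Pingback {
        public $source_uri;   // the page that linked to us
        public $target_uri;   // our entry that was linked to
        public $title;
        public $is_webmention = false;  // new: did this arrive via Webmention rather than Pingback?
    }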

Step 3: Implementation

One of the nice things about having a clear spec is that it makes it really easy to do test-driven development because the spec practically writes half your test cases for you.  Of course, there are always additional things to consider and test for, but it still makes things simpler.

The big challenge was really how to fit webmentions into the existing application structure.  As I mentioned above, I'd already reached the conclusion that creating a new domain object for webmentions was a waste of time.  But what about the rest of it?  Where should the logic for sending them go?  Or receiving?  And how should sending webmentions play with sending pingbacks?

The first point of reference was the pingback implementation.  The old implementation for sending pingbacks lived directly in the domain classes.  So a blog entry would scan itself for links, create a pingback object for each, ask the pingback whether its URI supported pingbacks, and then send the pingback request.  (Yes, this is confusing.  No, I don't remember why I wrote it that way.)  As for receiving pingbacks, that lived entirely in the XML-RPC endpoint.  Obviously none of this was a good example to imitate.

The most obvious solution here was to encapsulate this stuff in its own class, so I created a SocialWebClient class to do that.  Since pingback and webmention are so similar, it made sense to have one class handle both of them.  After all, the only real difference in sending them is the message protocol.  The SocialWebClient class has a single method, sendReplies(), which takes an entry, scans its links, and for each one detects whether the URI supports pingback or webmention and sends the appropriate notification (or a webmention if it supports both).  Similarly, I created a SocialWebServer class for receiving webmentions, with an addWebmention() method that is called by an endpoint to save incoming mentions.  I had originally hoped to roll the pingback implementation into that as well, but that was slightly inconvenient due to the use of XML-RPC, so I ended up pushing it off until later.
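
In outline, the two new classes look something like this (simplified - the real versions take dependencies for HTTP access and storage, and the helper methods shown here are just for illustration):

    <?php
    // Simplified outline of the two new classes.
    class SocialWebClient {
        // Scan an entry's links and notify each target, preferring Webmention
        // when a target advertises support for both Webmention and Pingback.
        public function sendReplies($entry) {
            foreach ($entry->getLinks() as $url) {
                if ($this->supportsWebmention($url)) {
                    $this->sendWebmention($entry->permalink(), $url);
                } elseif ($this->supportsPingback($url)) {
                    $this->sendPingback($entry->permalink(), $url);
                }
            }
        }
        // ... endpoint discovery and the actual sending live in private methods ...
    }

    class SocialWebServer {
        // Called by the HTTP endpoint to validate and store an incoming webmention.
        public function addWebmention($source, $target) {
            // Validate per the spec, look up the local entry for $target,
            // and save the mention as a reply to that entry.
        }
    }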

Results

As I mentioned, I didn't track the amount of time I spent on this task.  However, I can retroactively calculate how much code was involved.  Here's the lines-of-code summary as reported by Process Dashboard:

Base:  8057
Deleted:  216
Modified:  60
Added:  890
Added & Modified:  950
Total:  8731

For those who aren't familiar, the "base" value is the lines of code in the affected files before the changes, while the "total" is the total number of lines in affected files after the changes.  The magic number here is "Added & Modified", which is essentially the "new" code.  So all in all, I wrote about a thousand lines for a net increase of roughly 700 lines (8731 - 8057 = 674).

Most of this was in the new files, as reported by Process Dashboard below.  I'll spare you the 31 files that contained assorted lesser changes (many related to fixing unrelated issues), since none of them had even 100 lines changed.

Files added: Total lines
lib\EntryMapper.class.php 27
lib\HttpResponse.class.php 60
lib\SocialWebClient.class.php 237
lib\SocialWebServer.class.php 75
tests\unit\publisher\SocialWebNotificationTest.php 184
tests\unit\SocialWebServerTest.php 131

It's helpful to note that of the 714 lines added here, slightly less than half (315 lines) is unit test code.  Since I was trying to do test-driven development, this is to be expected - the rule of thumb is "write at least as much test code as production code".  That leaves the meat of the implementation at around 400 lines.  And of those 400 lines, most of it is actually refactoring.

As I noted above, the Pingback and Webmention protocols are quite similar, differing mostly in the transport protocol.  The algorithms for sending and receiving them are practically identical.  So most of that work was in generalizing the existing implementation to work for both Pingback and Webmention.  This meant pulling things out into new classes and adjusting them to be easily testable.  Not exciting stuff, but more work than you might think.

So the main take-away from this project was: don't underestimate how hard it can be to work with legacy code.  Once I figured out that the implementation of Webmention would closely mirror what I already had for Pingback, this task should have been really short and simple.  But 700 lines isn't really that short or simple.  Bringing old code up to snuff can take a surprising amount of effort.  But if you've worked on a large, brown-field code-base, you probably already know that.

Global composer

Nice little trick I didn't realize existed: you can install Composer packages globally.

Apparently you can just do composer global init ; composer global require phpunit/phpunit to get PHPUnit installed in your home directory rather than in a project directory, so you can add it to your path and use it anywhere.  It works just like installing to a project - the init creates a composer.json and the require adds packages to it.  On Linux, I believe this stuff gets stored under ~/.composer/, whereas on Windows, it ends up under ~\AppData\Roaming\Composer\.

That's it.  Nothing earth-shattering here.  Just a handy little trick for things like code analyzers or other generic tools that you might not care about adding to your project's composer setup (maybe you only use them occasionally and have no need to integrate them into your CI build).  I didn't know about it, so I figured I'd pass it on.

Your code needs GUTs

Note: This is another "from the archives" essay that I started two years ago.  It was inspired by this talk by Kevlin Henney as well as the general low quality of the unit test suite in the product I was working on at the time.  If you're browsing YouTube, I generally recommend any of Kevlin Henney's talks.  He's a bit like Uncle Bob Martin - insightful and entertaining with a deep well of experience to back it up.

Does your code have GUTs?  It should.  If you're a professional programmer, you really should have GUTs.  And by "GUTs", I mean "Good Unit Tests".

Unless you've been living under a rock for the last decade or two (or you're just new), the industry has decided that unit tests are a thing all "Serious Engineers" need to write.  For years I've been hearing about how you need to have unit tests, and if you don't you're not a real programmer.  So I decided to get with the program and start writing unit tests.

And you know what?  I failed miserably.  It was a serious train wreck.

Why did I fail?  Because I listened to those blog posts and podcast episodes that just tell you to write unit tests, and then I went out and just tried to write unit tests.  My problem was that I didn't know what a good unit test looked like.  So I wrote tests, but they were brittle, hard to understand, hard to maintain, and not even especially useful.  The result was that after a while, I concluded that these "unit test" people didn't know what they were talking about and had just latched onto the latest fad.  And then every year or two I would try again and reach the same conclusion.

So the real question is not "do you have unit tests?", but "do you have good unit tests?"  Because, as I alluded to above, bad unit tests are not necessarily better than no unit tests.  A bad unit test suite will break all the time and not necessarily give you any useful information, which makes maintaining it more costly than just dropping it.  

What do GUTs look like?

Good unit tests have three main properties:

  1. Readability
  2. Maintainability
  3. Durability

What does this mean?  Well, readability means just what it sounds like - that you can read and understand the test cases.  Without being an expert in the system under test, you should be able to look at a test case and be able to figure out what behavior it's testing.

Likewise, maintainability means the same thing as it does for production code - that you should be able to update and expand the test suite without undue effort or pain.

Durability simply means that your tests should not break willy-nilly.  Obviously there are lots of potential changes you can make to code that would break your tests, but the tests should not break unless they really need to.  So a durable test suite should not start failing because internal implementation details of the code under test were changed.

Guidelines for writing tests

So how do you write tests that have those three properties?  Well, I have a few guidelines and common practices that I use.  At work, I routinely find myself suggesting these things during code reviews.  Some people might disagree, but I've found these to be helpful and I'll try to justify my opinions.  Hopefully you'll find them useful.

Note: for the sake of simplicity and consistency, examples will be of a blogging system written in PHP, but these principles and ideas are by no means specific to such a system.  I just use that because I happen to have plenty of examples at hand so that I don't have to come up with fake ones.

Test names

Naming your tests is an important, but often-overlooked aspect of test writing.  Your test cases should have descriptive names.  And by "descriptive", I mean it should tell you three things:

  1. What behavior is being tested.
  2. What the conditions of the test are.
  3. What the expected outcome is.

If you use the "generate test cases" feature of your IDE, or something like that, you might end up with test names like testSaveEntry or testPublishEntry.  These are bad names.  For one thing, they only tell you the name of the function they're testing.  For another, they guide you into a convention of a one-to-one mapping of test classes and methods to production code classes and methods.  This is limiting and unnecessary.  You should have as many test classes and methods as you need.  I sometimes have an entire test class just to test one production method.  I don't recommend that as a general rule, but there are cases where it makes sense.

When in doubt, I recommend choosing a test naming convention.  I often use "test<name of method under test>_When<conditions of test>_<expected result>".  So, for example, if I was writing a test to check if the publishEntry() method throws an exception when the database update fails, I might name it testPublishEntry_WhenDbUpdateFails_Throws.  Or if I wanted to test that a text sanitizing function HTML encodes any angle brackets that it finds in the input, I might call it testSanitize_WhenDataContainsHtml_ReturnsDataWithTagsEscaped.  Obviously you can use a different convention, or no convention at all, but the point is that the test name tells you everything you need to know about what is being tested.
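
In PHPUnit terms, those two examples would look something like this (the test class name here is incidental):

    <?php
    class PublisherTest extends \PHPUnit\Framework\TestCase
    {
        public function testPublishEntry_WhenDbUpdateFails_Throws() {
            // ...test body omitted...
        }

        public function testSanitize_WhenDataContainsHtml_ReturnsDataWithTagsEscaped() {
            // ...test body omitted...
        }
    }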

One thing to note here: the above names are kind of long.  I know - that's OK.  Remember that these are test names.  They're not methods that somebody is going to be calling in other code.  Nobody is ever going to have to type that name again.  So don't get hung up on the length.

Also of note, you should think about what the expected behavior is.  Things like testSaveEntryWorks are not enlightening.  What does "work" mean?  The same goes for things like testSaveEntryMakesExpectedDatabaseCalls.  OK...what are the expected database calls?  If you can't come up with something specific, that probably means you need to think more about your test, maybe even break it into multiple tests.

A good guideline to keep in mind is that it should be possible to write a pretty-printer that can read the names of all your test methods and print out a low-level specification for your system.  Ideally, you should be able to figure out how everything works just by looking at the test names.

Granularity

Big tests are bad.  Why?  Because the bigger the test is, the more things can break it.  This is the same reason that big functions and methods are bad - the more they do, the more opportunity there is for something to go wrong.  So if you can, it's better to keep things small.  Just as each function should do one thing, and do it well, each test should test one thing, and test it well.

And when I say "one thing", I really mean one thing.  Ideally, each test should have one, and only one, reason to fail.  The goal is that when a test fails, it should be pretty obvious what went wrong.  So if the only thing a test asserts is that X = Y, and it fails, then you can be pretty sure X isn't equal to Y.

On the other hand, if you're asserting a bunch of different things, it's harder to pinpoint the problem.  Furthermore, since most test frameworks will fail the test at the first failed assertion, you can end up with one failure masking another, i.e. the first of several assertions fails, so you fix it, and then the next assertion in that test fails, etc.  

So if you need to check a bunch of things, then write a bunch of tests.  Don't try to cram them all into one test.
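
For example, rather than one big test that saves an entry and then checks the timestamp, the published flag, and everything else in one go, I'd split it up like this (BlogEntry here is just a stand-in class for the sake of illustration):

    <?php
    class BlogEntrySaveTest extends \PHPUnit\Framework\TestCase
    {
        // Each test asserts exactly one thing, so each has only one reason to fail.
        public function testSave_WhenEntryIsNew_SetsPublicationDate() {
            $entry = new BlogEntry();
            $entry->save();
            $this->assertNotNull($entry->getPublicationDate());
        }

        public function testSave_WhenEntryIsNew_MarksEntryAsPublished() {
            $entry = new BlogEntry();
            $entry->save();
            $this->assertTrue($entry->isPublished());
        }
    }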

Don't assert too much, but always assert something

A corollary to the rule that a test should only test one thing is that a test should test at least one thing.  In code reviews for junior developers, I sometimes see tests that don't actually contain any assertions.  They have a bunch of setup code that configures mocks to make a method run, but no explicit assertions or call expectations on mocks.  All such a test really asserts is "this method doesn't throw an exception."

Needless to say, that's not a very strong assertion.  Heck, I can write methods that don't throw an exception all day long.  If the test fails, all you have to do is delete all the code in the method you're testing and there you go - it won't throw anymore.  Granted, your application won't do anything, but at least the test suite will pass.

So always make sure you're making some explicit assertion.  Otherwise, there's no point in writing the test.
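
In practice that usually means either asserting on the return value or putting an explicit expectation on a mock.  For example (the Mailer and UserNotifier classes are hypothetical, but the PHPUnit calls are real):

    <?php
    class UserNotifierTest extends \PHPUnit\Framework\TestCase
    {
        public function testNotify_WhenUserHasEmailAddress_SendsOneMessage() {
            $mailer = $this->createMock(Mailer::class);
            // The explicit expectation: this test fails unless send() is called exactly once.
            $mailer->expects($this->once())->method('send');

            $notifier = new UserNotifier($mailer);
            $notifier->notify('bob@example.com');
        }
    }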

Readability

Ideally, your tests should be a low-level specification - they should be clear and detailed enough that you could hand the tests to a developer and they could reverse engineer every part of your system based on that.  You probably won't actually accomplish that, but that's the goal we're shooting for in terms of coverage and clarity.

So given that, it's obviously very important that tests be readable.  A test that's hard to read isn't doing its job of communicating the intent of the system.  But what, exactly, does "readable" mean?

In my view, it's a combination of things.  Obviously, the action of the test should be clear.  You should be able to look at the test and easily be able to see what it's doing without having to search around through other code.  But you also want the test to be simple enough that you can easily understand it and not miss the forest for the trees.

One simple strategy to address this is to adopt a test structure convention.  This is typically an "arrange-act-assert" layout, where you group the setup code, the action of the code under test, and the assertions into their own sections, so it's clear where each one starts and ends.  Where people sometimes get into trouble with this is when using mocking frameworks that require you to declare expectations and assertions on mocks up-front, before they are called (you usually have to do "arrange-assert-act" in that case, but it's the same idea).  What I often see is people declare a mock, declare some assertion on it, then declare another mock with its assertions farther down the test method, and so on.  This makes the test harder to read because you have to hunt through all the setup code to determine which mocks have assertions on them.  There isn't a single place you can look to see the expected result.
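
Here's what that looks like in practice - the sections are grouped and labeled, even though the exception expectation has to come before the "act" step (the production classes here are hypothetical):

    <?php
    class EntryPublisherTest extends \PHPUnit\Framework\TestCase
    {
        public function testPublishEntry_WhenDbUpdateFails_Throws() {
            // Arrange: a database mock whose update always fails
            $db = $this->createMock(Database::class);
            $db->method('update')->willReturn(false);
            $publisher = new EntryPublisher($db);

            // Assert: declared up front, as PHPUnit requires, but kept in one place
            $this->expectException(PublishException::class);

            // Act
            $publisher->publishEntry(new BlogEntry());
        }
    }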

Another good strategy is the judicious use of utility functions.  This is especially the case if you have lots of similar tests with a substantial amount of setup code, which is not an uncommon situation.  For example, if you have a method that calls a remote web service, you might want a bunch of different test methods that check all the different error conditions and return values; the mock object setup for many of those tests is probably very similar.  In fact, it might be the same except for one or two values.  You could handle that by just copying and pasting the setup code and changing what you need, but then you end up with a sea of almost identical code and it becomes harder to see what the differences between the tests are.  The solution is to encapsulate some of that setup code in a utility method that you can just call from each of the tests.  This allows you to have your mocks configured in one place and keeps all those extraneous details out of the test body.

The one thing to watch out for with utility methods is that they should be as simple as possible.  When I can, I try to make them take no parameters and just give them a name that describes how they configure my mocks.  Remember that all the setup doesn't have to be encapsulated in a single function  - you can call multiple utility functions that configure different things.  In cases where it makes sense for the utility methods to take parameters, I try to limit how many they take, so as to keep the scope of what the function does clear.  And if you find yourself putting conditionals or loops in your utility functions, you might want to think about your approach.  Remember, the more logic there is in your tests, the more likely it is that your tests will have bugs, which is the last thing you want.
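
For example, for a class that wraps a remote weather service (all of the production classes here are made up, but the PHPUnit calls are real), the setup helpers might look like this:

    <?php
    class WeatherReporterTest extends \PHPUnit\Framework\TestCase
    {
        private $httpClient;

        protected function setUp(): void {
            $this->httpClient = $this->createMock(HttpClient::class);
        }

        // Parameterless helpers named for the situation they configure.
        private function givenTheWeatherServiceIsDown() {
            $this->httpClient->method('get')->willThrowException(new RuntimeException('timeout'));
        }

        private function givenTheWeatherServiceReturns(string $json) {
            $this->httpClient->method('get')->willReturn($json);
        }

        public function testGetReport_WhenServiceIsDown_ReturnsUnavailableMessage() {
            $this->givenTheWeatherServiceIsDown();

            $reporter = new WeatherReporter($this->httpClient);

            $this->assertSame('Weather data unavailable', $reporter->getReport());
        }

        public function testGetReport_WhenServiceReturnsData_FormatsTheTemperature() {
            $this->givenTheWeatherServiceReturns('{"temp": 72}');

            $reporter = new WeatherReporter($this->httpClient);

            $this->assertSame('Currently 72 degrees', $reporter->getReport());
        }
    }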

Conclusion

These are just some of the techniques and considerations that I employ when writing unit tests.  They've been working for me so far, but your mileage may vary.

My main goal here is just to give some ideas to people who are new to this unit testing thing.  Remember, writing unit tests is a skill that you have to learn.  This is especially the case when doing test-driven development.  Despite what you may have been led to believe, it's not a simple and obvious thing that you can just do cold.  So be patient, try different things, and take your time.  It's the journey, not the destination, that's the point.