PSP Break-down, part 4: Results

Welcome to the end of my series of PSP posts.  In part one, I started with an overview of the Personal Software Process.  Part two covered the process of learning about the PSP and how to apply it.  Part three was the sales pitch of nice things that the PSP is supposed to enable you to do.  Now, in part four, we'll get down to brass tacks and talk about how well the PSP actually works for me.

I'll start with a quick overview of the good, bad, easy, and hard parts.  Then I'll dive into a deeper discussion of my experiences and the details of what did and didn't work.

PSP at a Glance

This table gives you a nice little overview of my experience.  It rates various parts of using the PSP on two axes - whether I judge them to be useful or not (good/bad) and how difficult they are to apply in practice (easy/hard).

Easy and good: Maintaining process, Defect tracking, Time tracking, Size tracking, Code reviews, PROBE estimates

Easy and bad: Reviewing on paper, Setting up tool support, Test report template, Coding standards

Hard and good: Planning process, Design reviews, Postmortem analysis, Defect data analysis, Creating relative size tables, Creating review checklists

Hard and bad: Design templates, Design verification methods

As you can see, I find the benefits of the PSP to outweigh the drawbacks.  For me, it's useful enough that I plan to keep using it, in some form, for the foreseeable future. 

Using the PSP

Despite what you might read about how cumbersome the PSP is to use, I actually didn't find it that difficult at all.  Granted, it takes a little getting used to - every process change does.  But once I had the basics down, actually using and sticking to the process wasn't that hard.  The use of written scripts with well defined entry and exit criteria helps keep you honest and disciplined.

Likewise, with the help of Process Dashboard, I found the entire data collection process to be relatively painless.  There are a few pain points, of course.  For me, the biggest one was properly configuring a line counting tool so that I could measure project size.  This is actually more annoying than you'd think.  I use the integrated line counter in Process Dashboard to count change size and cloc to measure initial size for estimation purposes, mostly because I've integrated it into my IDE.  The Process Dashboard tool has some nice features, but will require you to write custom language definitions for pretty much anything that doesn't use C-style syntax.  It uses a fairly easy to follow XML format for configuration, but still....  Cloc has much better language support, but is harder to customize.  As a further annoyance, while Process Dashboard does have VCS diff support, it currently only supports Subversion.  So if you're using Git, Mercurial, or anything else reasonably modern, you'll have to set up two copies of your local repo for before and after comparison.
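To make the before-and-after comparison concrete, here is a minimal sketch of counting change size between two versions of a file.  It is not how Process Dashboard or cloc work internally; it simply uses Python's standard difflib, and it assumes a counting standard of "non-blank physical lines" with comments counted like any other line.  The function name count_changes is made up for this example.

```python
from difflib import SequenceMatcher


def count_changes(before: str, after: str) -> tuple[int, int]:
    """Return (added, deleted) counts of non-blank physical lines."""
    # Assumption: blank lines are ignored; comments count as ordinary lines.
    old = [line for line in before.splitlines() if line.strip()]
    new = [line for line in after.splitlines() if line.strip()]
    added = deleted = 0
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" block counts as deleted old lines plus added new lines.
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, deleted


if __name__ == "__main__":
    before = "a = 1\nb = 2\nc = 3\n"
    after = "a = 1\nb = 20\nc = 3\nd = 4\n"
    print(count_changes(before, after))
```

Treating a modified line as one deletion plus one addition is a simplification; a stricter counting standard might track modified lines separately.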

Code Review

Personal pre-commit code review is one of those things that every responsible developer does, but hardly anyone seems to talk about.  I know I've been doing an informal version of it for years.  It simply consisted of looking over the diff of my changes before hitting the "commit" button and making sure that I didn't make any obvious mistakes, that I'm not checking in any changes I don't mean to, etc.  It's something you quickly learn to do after making a few embarrassing mistakes.

Doing an organized code review with a checklist really takes this to the next level.  Instead of being a CYA thing, code review becomes a way to preemptively find problems in your code.  And at its best, it can be hugely effective.  As any experienced developer knows, there are classes of problems that are hard to find in testing, but stick out like a sore thumb when you actually stop and read the code.

The only hard thing about code review is actually customizing your review checklist.  I find that it's difficult for me to do that simply by looking at the defect categories in my data because those seldom tell you anything actionable.  There are a few defect categories that readily translate into checklist items, but there are many defects that are more subtle and are either difficult to categorize or difficult to generalize into checklist items.
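One way to get a bit more out of the raw defect data is to look only at defects that slipped past review and were removed in test, ranked by fix time.  The sketch below assumes a simplified defect log; the entries, the tuple layout, and the function name escaped_review are all invented for illustration and are not taken from any real data set or from Process Dashboard's export format.

```python
from collections import Counter

# Hypothetical defect log: (defect type, phase removed, fix minutes, note)
DEFECT_LOG = [
    ("Checking", "test", 12, "missing null check on query result"),
    ("Syntax", "compile", 1, "missing semicolon"),
    ("Checking", "test", 20, "empty array not handled"),
    ("Interface", "code review", 5, "arguments in wrong order"),
    ("Function", "test", 35, "off-by-one in pagination"),
]


def escaped_review(log, late_phases=("test",)):
    """Rank defect types removed in late phases by total fix time."""
    counts = Counter()
    minutes = Counter()
    for defect_type, removed_in, fix_minutes, _note in log:
        if removed_in in late_phases:
            counts[defect_type] += 1
            minutes[defect_type] += fix_minutes
    # Most expensive types first: (type, number of defects, total minutes)
    return [(t, counts[t], minutes[t]) for t, _ in minutes.most_common()]


if __name__ == "__main__":
    for defect_type, count, total in escaped_review(DEFECT_LOG):
        print(f"{defect_type}: {count} defects, {total} min")
```

The ranking only narrows down where to look.  Turning a type like "Checking" into a usable checklist item still means reading the individual notes.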

The one thing I really didn't like about the PSP code review process was the recommendation to do it on paper.  Using paper does have the benefit that you can get more code in front of you at a time, and it is easier to make annotations.  However, it also bypasses the navigational and analysis power baked into modern IDEs.  For instance, Komodo lets me easily navigate between functions, access standard library documentation at a click, and search for uses of an identifier.  Those things are much more tedious to check on paper.

But the big kicker for me was that trying to review a diff on paper is just painful.  It's hard enough on screen with color highlighting, but it really sucks on paper.  And on paper I don't have things like the Komodo diff viewer's feature to jump from a diff item to that location in the file to view the context.  It might work well to review new code on paper, but for changes to existing code it feels really clunky.

Design Reviews

While we're on the topic of reviews, let's talk about design reviews for a minute.  Again, this is a really good idea, for the same reasons that code review is a good idea.  And the PSP does offer some productive advice in the recommendation to adopt a design standard and a checklist for common errors.

However, the specific methods the PSP recommends just don't work for me.  The four standard design templates are a nice idea, but they feel very repetitive and clunky to work with.  And even if I switched to the UML equivalent, the recommended list of artifacts is just too much for a lot of the things I do.  And at the risk of having my CS degree rescinded, I have to admit that I have trouble constructing a state machine for many of the projects I do - at least, one that's even remotely enlightening.  In general, I just find them painful to work with and biased toward "new program" development rather than incremental enhancement.

And the design verification techniques are even worse!  They're time-consuming and tedious - you're basically executing your program on paper.  It's a nice idea, and might come in handy occasionally, but frankly, I have less confidence in my ability to perform those verification exercises correctly than I do in my code being correct in the first place.  And, again, they're just way too heavy for most of the projects I do.

I'm still working on finding a design and design review approach that's sustainable for me.  Since my last few shops have used agile methodologies, most of the "projects" I do are fairly small enhancements to existing code - usually just two or three days, seldom more than a week.  So a heavy-weight design process with lots of templates or UML diagrams just isn't going to work. 

My current approach is as follows (note that I'm using a variant on the PSP3 process in Process Dashboard):

  1. Sketch out a high-level design in a word processor.  This is a refinement of the conceptual design used for planning, usually in the form of plain prose and bullet lists.
  2. Review that primarily for feasibility, completeness, and requirements coverage.
  3. For each component I've broken out, do a more detailed "transient design" (I'll describe that in a moment).
  4. Review the transient design primarily for completeness, correctness, and requirements coverage.

I refer to the detailed designs as "transient designs" because I don't actually create separate documents for them.  I blend implementation and design and actually do the design right in the source code.  I generally stub out things like classes and methods and fill in the details either with actual code (for simpler items) or with "design annotations", which are just comments that use special formatting to mark them as design artifacts.  Sometimes they're pseudo-code, other times they're just descriptions of what needs to be done, whatever seems appropriate.  Then, in the code phase, I simply replace those annotations with the actual implementation.  It's certainly not perfect, but it seems to be working well enough for me so far.  As a next step, I'm going to look at a TDD-like approach and try incorporating unit test definitions as one of the design artifacts.
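As an illustration of what a transient design might look like, here is a small Python sketch.  The "# DESIGN:" marker, the InvoiceExporter class, and the pending_annotations helper are all hypothetical; the actual comment format used is not specified above, and any consistent marker would do.

```python
import re

# Hypothetical marker format for design annotations.
DESIGN_MARK = re.compile(r"#\s*DESIGN:\s*(.*)")

# A stubbed-out class as it might look at the end of the design phase.
STUB = '''
class InvoiceExporter:
    def export(self, invoice_ids):
        # DESIGN: load invoices in one query, keyed by id
        # DESIGN: for each invoice, render a CSV row; skip voided ones
        # DESIGN: return the rows joined with newlines
        raise NotImplementedError
'''


def pending_annotations(source):
    """Return (line number, text) for each design annotation still present."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = DESIGN_MARK.search(line)
        if match:
            found.append((number, match.group(1)))
    return found


if __name__ == "__main__":
    for number, text in pending_annotations(STUB):
        print(f"line {number}: {text}")
```

A helper like this could double as an exit check for the code phase: if any annotations remain, the design has not been fully replaced by implementation yet.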

Estimation

In my experience so far, PSP estimation using PROBE actually works remarkably well. In the data from my last job, I was eventually able to get to the point where my actual development times were generally within about 15% of my estimates.  I consider that to be pretty good, especially when you consider that the estimates were done in minutes and based on fairly sketchy user-stories.
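For readers who haven't seen it, the core of PROBE is a linear regression over your own history: past estimated sizes against past actual times (or sizes).  The sketch below shows that regression step and the relative error calculation behind a figure like "within 15%".  The history values are invented for illustration, and this leaves out the rest of PROBE, such as the proxy size estimate itself and the prediction interval.

```python
def fit_line(xs, ys):
    """Least-squares fit; returns (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean
    sxx = sum(x * x for x in xs) - n * x_mean ** 2
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def relative_error(estimate, actual):
    """Signed error as a fraction of the estimate (0.15 means 15% over)."""
    return (actual - estimate) / estimate


if __name__ == "__main__":
    # Hypothetical history: estimated size in lines, actual time in minutes.
    estimated_size = [120, 250, 90, 400, 180]
    actual_minutes = [310, 560, 240, 900, 450]
    intercept, slope = fit_line(estimated_size, actual_minutes)
    projected = intercept + slope * 200  # new task estimated at 200 lines
    print(f"projected time: {projected:.0f} min")
    print(f"error if it takes 500 min: {relative_error(projected, 500):+.1%}")
```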

Of course, estimation is a learned skill, and PROBE doesn't change that.  You still need to be able to accurately account for the possible changes when constructing the conceptual design.  And as with any data-driven approach, the results are going to be sensitive to the quality of your data.  So if your relative size tables are just made up rather than being based on your past work, then don't expect your estimates to be too accurate.

It's also important to note that there's a bias against project diversity here.  For example, line counts can differ wildly for different programming languages, different problem domains, etc.  So if you tend to work on projects that are generally very similar to each other, then PROBE will work much better than if all your projects are widely divergent.  My data from my last job is based largely on a single code-base, so while the purposes of the individual projects varied wildly, the technology stack was consistent.

The hardest part about estimation, at least for me, is coming up with those relative size tables.  It's one of those things that sounds easy, but actually isn't.  For one thing, I don't have a tool to automatically count lines and methods in classes - much less one that works across a diversity of languages.  For another, my data largely comes from work, which is a problem because I do mostly web development on a team, which means I need to extract my method and class size data from a code base written in five different languages by five people.  When you wrote a quarter of the methods in one class, half of a third of the methods in another class, etc., how do you count all that?  You can just forget that and count the files you have, but then it's not really your data, so it's not clear how useful it will be. 
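For a single language, the counting itself is not hard to automate.  As a rough sketch, Python's standard ast module can report the number of methods and their physical line spans per class.  This only handles Python source, so it does nothing for the multi-language, multi-author problem described above, and the function name class_sizes is made up for this example.

```python
import ast


def class_sizes(source):
    """Map class name -> (method count, total physical lines spanned by methods)."""
    sizes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = [
                item for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            # Spans run from the "def" line to the last line of the body, so
            # blank lines and comments inside a method are included.
            lines = sum(m.end_lineno - m.lineno + 1 for m in methods)
            sizes[node.name] = (len(methods), lines)
    return sizes


if __name__ == "__main__":
    sample = (
        "class Account:\n"
        "    def deposit(self, amount):\n"
        "        self.balance += amount\n"
        "\n"
        "    def is_empty(self):\n"
        "        return self.balance == 0\n"
    )
    print(class_sizes(sample))
```

The end_lineno attribute requires Python 3.8 or later.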

I've also found it challenging to come up with useful categorizations for my relative size tables.  Perhaps it's just the products I've been working on, but I end up with a whole lot of database-related classes and a smattering of other categories.  That's fine for those products, but it's hard to figure out how to extrapolate that to other kinds of projects.  My suspicion is that that's a result of sub-optimal system design.  I'm currently trying to adhere religiously to the SOLID principles, which should result in more, and more targeted, classes, which should solve that problem.
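Once you have lines-per-method figures for a category, building the table is mechanical.  The sketch below assumes the usual log-normal approach: take the natural log of each value, then place the five size ranges at the mean and at one and two standard deviations either side.  The "database" sample values are invented, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics

LABELS = ["very small", "small", "medium", "large", "very large"]


def relative_sizes(loc_per_method):
    """Return the size (lines per method) at the midpoint of each range."""
    logs = [math.log(value) for value in loc_per_method]
    mean = statistics.mean(logs)
    sigma = statistics.stdev(logs)  # assumption: sample standard deviation
    # Ranges sit at mean - 2 sigma, mean - sigma, mean, mean + sigma, mean + 2 sigma.
    return {
        label: math.exp(mean + offset * sigma)
        for label, offset in zip(LABELS, range(-2, 3))
    }


if __name__ == "__main__":
    # Hypothetical lines-per-method values for a "database" category.
    database = [4.5, 6.0, 7.2, 9.0, 12.5, 18.0, 25.0]
    for label, size in relative_sizes(database).items():
        print(f"{label:>10}: {size:.1f} lines per method")
```

With only a handful of data points in a category, the standard deviation is not very stable, which is one more reason a smattering of small categories is hard to use.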

Other Bad Things

There are a few other annoying things about the PSP as Humphrey describes it.  While the general focus on templates and checklists is not bad in and of itself, their value tends to vary.  The test report template, for instance, is one that I've not found particularly valuable.  While it is useful to sketch out your testing strategy, or maybe make a quick checklist of your test cases, the test report template is more like something that you'd give to a QA team to do manual testing.  It has a bias towards verbosity that makes it seem like more effort than it's worth.

Likewise with the focus on coding standards.  We can all agree that having coding standards, and following them consistently, is a very good thing.  Everyone should do it.  However, I've been working as a software developer for a long time now and my "coding standard" is something I've long since internalized.  I don't need to spend time formalizing it or checking it in my code reviews.  Ditto the size counting standard.  You can get really fancy if you want to, but I'm not convinced that anything much more complicated than counting physical lines is likely to be helpful.  I suspect that, at least for my purposes, any elaborate counting standard would just serve to complicate measurement.

And, of course, there's the simple fact of process overhead.  It's really not too bad when you use Process Dashboard, but it's still there.  For example, there's a non-trivial amount of work that goes into configuring Process Dashboard itself.  It's a useful and powerful tool, but it's not always simple to use.  There's also the analysis time used to assess and correct your process.  This is unquestionably valuable, but it's still some additional time that you need to plan for.  And, of course, there's just the time to actually follow the process.  This isn't actually that much, but for very small tasks (e.g. one or two hours), your estimates might be thrown off by the fact that your standard phase break-down results in a phase that's one or two minutes, and it takes you longer than that just to type in the data you need.
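The small-task problem is easy to see with a little arithmetic.  The sketch below splits a total estimate across phases using a historical percentage breakdown and flags any phase that comes out shorter than a couple of minutes.  The percentages in PHASE_SHARE are invented for illustration; real ones would come from your own time log.

```python
# Hypothetical historical share of total time spent in each phase.
PHASE_SHARE = {
    "planning": 0.05,
    "design": 0.15,
    "design review": 0.05,
    "code": 0.40,
    "code review": 0.10,
    "compile": 0.02,
    "test": 0.18,
    "postmortem": 0.05,
}


def split_estimate(total_minutes, shares=PHASE_SHARE):
    """Distribute a total estimate across phases by historical share."""
    return {phase: total_minutes * share for phase, share in shares.items()}


def tiny_phases(plan, threshold_minutes=2.0):
    """Phases whose planned time is below the threshold."""
    return [phase for phase, minutes in plan.items() if minutes < threshold_minutes]


if __name__ == "__main__":
    plan = split_estimate(60)  # a one-hour task
    for phase, minutes in plan.items():
        print(f"{phase:>14}: {minutes:4.1f} min")
    print("too small to log meaningfully:", tiny_phases(plan))
```

With these made-up shares, a one-hour task leaves barely a minute for the smallest phase, which is less than the time it takes to record it.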

And Some Good Things to End On

Last but not least, I wanted to highlight two more "good but hard" things from the table above: the planning and postmortem stages.  At first, these seemed like silly, pro forma phases to me, but they're actually quite valuable.  And the most valuable thing about them is something that doesn't show up in the script: they make you stop and reflect.

The planning phase forces you to think about what you're doing.  To construct an estimate, you have to think about the project you're trying to do, break it down, and define its scope.  Even if you don't believe there's value in the estimate itself, simply going through the process gives you lots of good insight into just what you're trying to do, which reduces the number of surprises later in the development cycle and makes everything go smoother in general.

Likewise, the postmortem stage prompts you to reflect on how you're doing and figure out how you can improve.  It's like a one-man sprint retrospective.  And like the sprint retrospective, it's actually the most important part of the process.  The simple task of looking at your statistics and filling out a Process Improvement Proposal forces you to stop and focus on your performance and what you can do to improve your work.  I find that simply looking at that PIP line in the postmortem exit criteria keeps me honest and makes me stop and think of something I could improve.  And if you can't think of at least one thing you could do better, you're just lying to yourself.


So there you have it.  In many ways, the PSP is working well for me.  Some of Humphrey's suggestions work well out of the box, some don't, and some need tweaking.  I do find that my process is evolving away from the "standard" PSP, but that's neither bad nor unexpected.  The basic techniques and ideas are still useful to me, and that's what matters.
