LnBlog Refactoring Step 3: Uploads and drafts

It's time for the third, slightly shorter, installment of my ongoing series on refactoring my blogging software.  In the first part, I discussed reworking how post publication was done and in the second part I talked about reworking things to add Webmention support.  This time, we're going to talk about two mini-projects to improve the UI for editing posts.

This improvement is, I'm slightly sad to say, pretty boring.  It basically involves fixing a "bug" that's really an artifact of some very old design choices.  These choices led to the existing implementation behaving in unexpected ways when the workflow changed.

The Problem

Originally LnBlog was pretty basic and written almost entirely in HTML and PHP, i.e. there was no JavaScript to speak of.  You wrote posts either in raw HTML in a text area box, using "auto-markup", which just automatically linkified things, or using "LBCode", which is my own bastardized version of the BBCode markup that used to be popular on web forums.  I had implemented some plugins to support WYSIWYG post editors, but I didn't really use them and they didn't get much love.

The old LnBlog post editor

Well, I eventually got tired of writing in LBCode and switched to composing all my posts using the TinyMCE plugin.  That is now the standard way to compose your posts in LnBlog.  The problem is that the existing workflow wasn't really designed for WYSIWYG composition.

In the old model, the idea was that you could compose your entire post on the entry editing page, hit "publish", and it would all be submitted to the server in one go.  There's also a "review" button which renders your post as it would appear when published and a "save draft" button to save your work for later.  These also assume that submitting the post is an all-or-nothing operation.  So if you got part way done with your post and decided you didn't like it, you could just leave the page and nothing would be saved to the server.

At this point it is also worth noting how LnBlog stores its data.  Everything is file-based and entries are self-contained.  That means that each entry has a directory and that directory contains all the post data, comments, and uploaded files that belong to that entry.

What's the problem with this?  Well, to have meaningful WYSIWYG editing, you need to be able to do things like upload a file and then be able to see it in the post editor.  In the old workflow, you'd have to write your post, insert an image tag with the file name of your picture (which would not render), add your picture as an upload, save the entry (either by saving the draft or using the "preview", which would trigger a save if you had uploads), and then go back to editing your post.  This was an unacceptably clunky workflow.

On top of this, there was a further problem.  Even after you previewed your post, it still wouldn't render correctly in the WYSIWYG editor.  That's because the relative URLs were inconsistent.  The uploaded files got stored in a special, segregated draft directory, but the post editor page itself was not relative to that directory, so TinyMCE didn't have the right path to render it.  And you can't use an absolute URL because the URL will change after the post is published.
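To make the relative URL problem concrete, here is a minimal sketch in Python (not LnBlog's actual PHP code) of how the same relative image path resolves against two different base URLs.  The URLs and directory names are made up for illustration and are not LnBlog's real layout.

```python
from urllib.parse import urljoin

# Hypothetical URLs for illustration only; LnBlog's real layout may differ.
EDITOR_PAGE = "https://example.com/blog/pages/editentry.php"
DRAFT_DIR = "https://example.com/blog/drafts/01_1234/"


def resolve(base_url: str, src: str) -> str:
    """Resolve a relative src against base_url, as a browser would."""
    return urljoin(base_url, src)


# The same <img src="photo.jpg"> points at different places depending on
# which page the editor is being served from.
from_editor = resolve(EDITOR_PAGE, "photo.jpg")
from_draft = resolve(DRAFT_DIR, "photo.jpg")

print(from_editor)  # resolves next to the editor page, where the file is not
print(from_draft)   # resolves inside the draft directory, where the file is
```

Only the second base URL finds the uploaded file, which is why the editor page needs to be relative to the draft's storage directory.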

So there were two semi-related tasks to fix this.  The first was to introduce a better upload mechanism.  The old one was just a regular <input type="file"> box, which worked but wasn't especially user-friendly.  The second one was to fix things such that TinyMCE could consistently render the correct URL for any files we uploaded.

The solution - Design

The actual solution to this problem was not so much in the code as it was in changing the design.  The first part was simple: fix the clunky old upload process by introducing a more modern JavaScript widget to do the uploads.  So after looking at some alternatives, I decided to implement Dropzone.js as the standard upload mechanism.

The new, more modern LnBlog post editor.

The second part involved changing the workflow for writing and publishing posts.  The result was a somewhat simpler and more consistent workflow that reduces the number of branches in the code.  In the old workflow, you had the following possible cases when submitting a post to the server:

  1. New post being published (nothing saved yet).
  2. New post being saved as a draft (nothing saved yet).
  3. Existing draft post being published.
  4. Existing draft post being saved.
  5. New (not yet saved) post being previewed with attached files.
  6. Existing draft post being previewed with attached files.

This is kind of a lot of cases.  Too many, in fact.  Publishing and saving were slightly different depending on whether or not the entry already existed, and then there were the preview cases.  These were necessary because extra processing was required when an entry was previewed with new attachments because, well, if you attached an image, you'd want to see it.  So this complexity was a minor problem in and of itself.

So the solution was to change the workflow such that all of these are no longer special cases.  I did this by simply issuing the decree that all draft entries shall always already exist.  In other words, just create a new draft when we first open the new post editor.  This does two things for us:

  1. It allows us to solve the "relative URL" problem because now we can make the draft editing URL always relative to the draft storage directory.
  2. It eliminates some of those special cases.  If the draft always exists, then "publish new post" and "publish existing draft" are effectively the same operation.  When combined with the modern upload widget, this also eliminates the need for the special "preview" cases.
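As a rough illustration of the "draft always exists" idea, here is a small Python sketch (LnBlog itself is PHP, and this is not its code).  The function names, the entry.txt file name, and the assumption that publishing simply moves the draft directory are all hypothetical, chosen only to show how the branches collapse.

```python
import shutil
import tempfile
from pathlib import Path


def open_editor(drafts_root: Path) -> Path:
    """Create an empty draft directory as soon as the editor is opened."""
    drafts_root.mkdir(parents=True, exist_ok=True)
    draft = Path(tempfile.mkdtemp(prefix="draft_", dir=drafts_root))
    (draft / "entry.txt").write_text("")  # hypothetical post data file
    return draft


def attach_file(draft: Path, name: str, data: bytes) -> Path:
    """Uploads go straight into the draft directory, so no preview step is needed."""
    target = draft / name
    target.write_bytes(data)
    return target


def save_draft(draft: Path, body: str) -> None:
    """Saving never has to ask whether the draft exists yet."""
    (draft / "entry.txt").write_text(body)


def publish(draft: Path, entries_root: Path, entry_id: str) -> Path:
    """Publishing is one operation: move the existing draft into place (an assumption)."""
    entries_root.mkdir(parents=True, exist_ok=True)
    entry = entries_root / entry_id
    shutil.move(str(draft), str(entry))
    return entry


# Usage: a brand-new post and an existing draft follow the same code path.
root = Path(tempfile.mkdtemp())
draft = open_editor(root / "drafts")
attach_file(draft, "photo.jpg", b"fake image bytes")
save_draft(draft, '<img src="photo.jpg">')
entry = publish(draft, root / "entries", "example_post")
```

Because the directory exists from the moment the editor opens, none of these functions needs a "new versus existing" branch, and the upload is available to the editor immediately.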

The implementation - Results

I won't get into the actual implementation details of these tasks because, frankly, they're not very interesting.  There aren't any good lessons or generalizations to take from the code - it's mostly just adapting the idiosyncratic stuff that was already there.

The implementation was also small and went fairly smoothly.  The upload widget was actually the hard part - there were a bunch of minor issues in the process of integrating that.  There were some issues with the other part as well, but less serious.  Much of it was just integration issues that weren't necessarily expected and would have been hard to foresee.  You know, the kind of thing you expect from legacy code.  Here are some stats from Process Dashboard:

Project                         File Upload    Draft always exists
Hours to complete (planned):    4:13           3:00
Hours to complete (actual):     7:49           5:23
LOC changed/added (planned):    210            135
LOC changed/added (actual):     141            182
Defects/KLOC (found in test):   42.6           27.5
Defects/KLOC (total):           81.5           44.0

As you can see, my estimates here were not great.  The upload part involved more trial and error with Dropzone.js than I had expected and ended up with more bugs.  The draft workflow change went better, but I ended up spending more time on the design than I initially anticipated.  However, these tasks both had a lot of unknowns, so I didn't really expect the estimates to be that accurate.
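To put rough numbers on how far off those estimates were, here is a quick Python calculation using only the figures from the table above.  The helper names are just for illustration.

```python
def minutes(hhmm: str) -> int:
    """Convert an H:MM duration such as '4:13' into minutes."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def pct_over(planned: float, actual: float) -> float:
    """Percentage by which actual exceeds planned (negative means under)."""
    return (actual - planned) / planned * 100


# Figures taken from the Process Dashboard table.
upload_time = pct_over(minutes("4:13"), minutes("7:49"))
draft_time = pct_over(minutes("3:00"), minutes("5:23"))
upload_loc = pct_over(210, 141)
draft_loc = pct_over(135, 182)

print(f"File upload time:   {upload_time:+.1f}%")
print(f"Draft exists time:  {draft_time:+.1f}%")
print(f"File upload LOC:    {upload_loc:+.1f}%")
print(f"Draft exists LOC:   {draft_loc:+.1f}%")
```

That works out to roughly 85% and 79% over on time, while the size estimates missed by about a third in opposite directions (about 33% under for the upload work and 35% over for the draft change).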

Take Away

The interesting thing about this project was not so much what needed to be done but why it needed to be done. 

Editing posts is obviously a fundamental function of a blog, and it's one that I originally wrote way back in 2005.  It's worth remembering that the web was a very different place back then.  Internet Explorer was still the leading web browser; PHP 5 was still brand new; it wasn't yet considered "safe" to just use JavaScript for everything (because, hey, people might not have JavaScript enabled); internet speeds were still pretty slow; and browsing on mobile devices was just starting to become feasible.  In that world, a lot of the design decisions I made at the time seemed pretty reasonable.

But, of course, the web evolved.  The modern web makes it much easier for the file upload workflow to be asynchronous, which offers a much nicer user experience.  By ditching some of the biases and assumptions of the old post editor, I was more easily able to update the interface.

One of the interesting things to note here is that changing the post editing workflow was easier than the alternatives.  Keeping the old workflow was by no means impossible.  I kicked around several ideas that didn't involve changing it.  However, most of those had other limitations or complications and I eventually decided that they would ultimately be more work.  

This is something that comes up with some regularity when working with an older code-base.  It often happens that the assumptions baked into the architecture don't age well as the world around the application progresses.  Thus, when you need to finally "fix" that aspect of the app, you end up having to do a bit of cost-benefit analysis.  Is it better to re-vamp this part of the application?  Or should you shim in the new features in a kinda-hacky-but-it-works sort of way?

While as developers, our first instinct is usually to do the "real" fix and replace the old thing, the "correct" answer is seldom so straight-forward.  In this case, the "real" fix was relatively small and straight-forward.  But in other cases, the old assumptions are smeared through the entire application and trying to remove them becomes a nightmare.  It might take weeks or months to make a relatively simple change, and then weeks or months after that to deal with all the unforeseen fallout of that change.  Is that worth the effort?  It probably depends on what the "real" fix buys you.

I had a project at work once that was a great example of that.  On the surface, the request was a simple "I want to be able to update this field", where the field in question was data that was generally but not necessarily static. In most systems, this would be as simple as adding a UI to edit that field and having it update the datastore.  But in this case, that field was used internally as the unique identifier and was used that way across a number of different systems.  So this assumption was everywhere.  Everybody knew this was a terrible design, but it had been that way for a decade and was such a huge pain to fix that we had been putting it off for years.  When we finally bit the bullet and did it right, unraveling the baked-in assumptions about this piece of data took an entire team over a month.  At an extremely conservative estimate, that's well over $25,000 to fix "make this field updatable".  That's a pretty hefty price tag for something that seems so trivial.

The point is, old applications tend to have lots of weird, esoteric design decisions and implementation-specific issues that constrain them.  Sometimes removing these constraints is simple and straight-forward.  Sometimes it's not.  And without full context, it's often hard to tell which one it will be.  So whenever possible, try to have pity on the future maintenance programmer who will be working on your system and anticipate those kinds of issues.  After all, that programmer might be you.
