Chrome debugging victory

This post is about a Chrome bug that I fixed a few months ago in new product I was working on. Normally, this would not be noteworthy, but this one was a real mind-bender.

The product I was working on at the time is an add-on to one of my company's services. To give a little background, our bread and butter is high-resolution aerial imagery - think satellite images, but with better resolution and with oblique shots taken from multiple directions in addition to the overheads. Our biggest customers for these fly-overs are county governments, who use the imagery for a variety of purposes. For example, when I was working for a county government, we used Pictometry's images for 911 dispatching, so that dispatchers could give detailed information on a property to police and other first responders.

Anyway, counties will typically order a new fly-over every year or every few years so that their images remain up to date. One of the add-on services we offer with this is a "change finder" service, where we'll analyze the images from two different years and provide the customer with a data set highlighting the structures that have changed. The product I was working on is actually an add-on to that service. It's a web-based application targeted at local assessors that gives them a simple, no-setup-needed way to access their change data and provides a basic workflow for analyzing it and making determinations on property values. So rather than spending an hour driving out to inspect a location, the assessor can use a rich web UI to see that the property now has garage that wasn't there last year and estimate how big it is. Saves them time and keeps the tax rolls up-to-date.

As for the application itself, it's based on our next-generation internal development framework. The back-end is built on PHP and PostgreSQL/Oracle with the front-end being built on Ext JS. It features a dual-pane "before and after" map viewer that's built on OpenLayers and integrates with the customer GIS data. There is a set of search tools that lets the customer narrow down and navigate their data set. In addition, the images are overlaid with outlines to highlight what properties have changed.

But enough exposition - on to the bug! This one was a real head-scratcher. Basically, sometimes the UI would just freak out. All the controls outside the map area would just turn into one solid gray field, effectively rendering them invisible. They still worked, if you knew where to click, but you couldn't see them. They'd come back for a second when you panned the map image, but then disappear when you stopped panning. And it wasn't a transient JavaScript thing - reloading the page didn't help. And the real kicker was that this bug only happened in Chrome (we didn't test other Webkit-based browsers, as Chrome is the only one we officially support).

This bug had been languishing in our queue for weeks. It wasn't consistently reproducible. We found that it happened less if you turned off Chrome's hardware acceleration, but it didn't go away entirely. Sometimes it would happen consistently, whereas other times you could go a full week and never see it. We eventually determined that it was related to the number of images overlayed in the map area, because hiding enough of them made the issue go away. We also found that it happened more at larger window sizes. In retrospect, that was probably because that allows room for more images.

Since the non-map portion of the UI was what was getting messed up, I hypothesized that it might be due do some weird interaction with the styes used by Ext JS. However, that didn't pan out. It turned out that there was nothing remarkable about Ext's styles and every "discovery" I made ended up reducing to "more images in the map". The one insight I was able to glean was that the issue was the number of images visible in the viewport, not just the number present in the map area DOM.

I eventually found the source of this bug through sheer brute force, i.e. process of elimination. I opened up the Chrome developer tools and starting hiding and removing DOM nodes. I started outside the map area and gradually pruned the DOM down to the smallest number of elements that still caused the bug.

It was at that point that I finally found a suspicious CSS rule. It was a CSS3 transform coming from our client-side image viewing component. Apparently the rule was originally put there to trigger hardware acceleration in some browsers. I have no idea why it caused this bug, though - I don't think we ever did figure that out. But it was an easy fix and I was rather proud of myself for having figured it out. I suppose that just goes to show that sometimes persistence is more important than brilliant insight when it comes to bug fixing.

You can reply to this entry by leaving a comment below. This entry accepts Pingbacks from other blogs. You can follow comments on this entry by subscribing to the RSS feed.

Comments #

    Ugh

    You should be using Chromium

Add your comments #

A comment body is required. No HTML code allowed. URLs starting with http:// or ftp:// will be automatically converted to hyperlinks.