Vim emulation in Komodo

Author's note: Here's yet another installment of "From the Archives".  Clearly I haven't really felt like coming up with new ideas lately.  I blame the pandemic.  Seriously - I'm not going to be back in the office until at least next summer.  Even though restrictions have eased (at least for the time being), lockdown fatigue has definitely long since set in.

At any rate, this is another one on using Komodo with Vim emulation.  This was written on April 16, 2014, just a few days after the last one on the same topic.  These days I'm using Vim all the time, so none of this is relevant to me anymore.  However, it is a nice example of how it's possible to extend a good IDE to customize your workflow.

In this case Komodo is (or was) customizable using JavaScript, which is nice - lots of people know JavaScript.  The downside is that, to do anything actually useful, you also needed XUL and SciMoz, the Mozilla Scintilla binding.  These are less commonly known, to put it mildly.

To be fair, Vim isn't much better on this score.  While it supports multiple scripting languages, the primary one is, of course, VimScript, which is...not a great language.  However, it's also quite old, quite well documented, and there are lots of examples of how to use it.  The VimScript API is also pretty stable, as opposed to Komodo, which was in the process of moving away from the XUL-based stuff when I stopped using it.  And really, a little VimScript will actually take you farther than you think.

In any event, I guess the idea is that it's good to know how to customize your editor, at least a little.  You know, sharpening the saw, knowing your tools, and all that.  Good stuff.  Enjoy!


Since upgrading to Komodo IDE, I've been looking a little more at customizing my development environment.  This is actually made somewhat easier by Komodo's "sync" feature, which will synchronize things like color schemes, key bindings, etc. between IDE instances via ActiveState's cloud.

Anyway, as part of this I've also been looking more at the Vim keybindings.  I've been a casual Vim user for a very long time, but I was never hard-core enough to do things like stop using the arrow keys or give up using ctrl+C and ctrl+V for copy and paste.  So now I'm trying to do just that.

Of course, Komodo's VI emulation mode is a pale imitation of what's available in Vim.  However, even that pale imitation is actually pretty good.  In fact, it's even better than Komodo's documentation would lead you to believe.  In addition to the basic modal editing stuff, Komodo supports a decent range of movement commands, variants of change and delete commands, etc.  Basically, it supports everything I already knew about plus a lot more.  So now I'm trying to get that extra stuff into my muscle memory.

In the course of looking at some Vim command guides, I naturally came across some handy looking commands that Komodo didn't support.  So I'm going to try to fix that.

The first one is the reg command.  Vim's registers were something I hadn't really worked with before, but it turns out that they're not only pretty cool, but that Komodo actually has some support for them.  I only know this because the documentation mentions the key binding for the "set register" command.  However, it doesn't implement the reg command, so you can't actually see what's in any of those registers.

So, long story short, I fixed that with a macro.  Just create a new macro named "reg" in your "Vi Commands" folder in your toolbox and add the following code (this requires another macro, executed at start-up, containing the "append_to_command_output_window" function lifted from here):

// Get any arguments passed to the ":reg" command (e.g. ":reg a b").
var viCommandDetails = Components.classes['@activestate.com/koViCommandDetail;1'].
                                getService(Components.interfaces.koIViCommandDetail);
var count = new Object();  // out-parameter required by the XPCOM interface
var args = viCommandDetails.getArguments(count);

append_to_command_output_window('');
append_to_command_output_window("--- Registers ---");

// List every register Komodo knows about, or only the ones named as arguments.
for (var item in gVimController._registers) {
    if (args.length > 0 && args.indexOf(item) < 0) {
        continue;
    }
    if (typeof gVimController._registers[item] !== 'undefined') {
        append_to_command_output_window('"' + item + '   ' + gVimController._registers[item].trimRight());
    }
}

This allows you to type ":reg" and get a list of the current registers in the "command output" window in the bottom pane.

Another good one:

// Grab the Scintilla editor object for the current view.
var scimoz = ko.views.manager.currentView.scimoz;
if (scimoz.selText.length == 0) {
    // Nothing selected: treat this as Vim's "escape back to normal mode".
    ko.commands.doCommand('cmd_vim_cancel');
} else {
    // Text is selected: keep the normal "copy" behavior.
    ko.commands.doCommand('cmd_copy');
}

This can be bound to ctrl+C, letting you keep the default "copy text" behavior when there is text selected while still getting Vim's "back to normal mode" when nothing is selected.

Komodo and Vim

Author's Note: We're back with another installment of "From the Archives", the blog show where I declare writing bankruptcy and just post an old, half-finished article that's been sitting in my drafts for years.  This entry is from April 9, 2014.  This was in the midst of my long stint as a Komodo IDE user. 

One of my favorite things about Komodo was that it had pretty good Vim emulation.  I started using that because a few years before I'd spent a lot of time going back and forth between a Windows PC and a Macbook Pro.  The Macbook keyboard had that weird Apple layout going and it routinely messed with me, so I eventually gave up and decided to use Vim-mode because that's the same on both platforms.

Of course, things have changed since then.  I've become a full-time Vim user, and have all the fancy faux-IDE stuff set up.  I actually like it so much that I stopped using PHPStorm for work and switched to doing all my development in Vim.  So this post is no longer relevant to me, but it at least has a few handy links, so enjoy!


I've been a Vim user more or less since I started using Linux.  Mind you, I was never really a hard core Vim user.  I still use the arrow keys, for instance, and manage to get by on maybe a couple dozen keybindings and commands.  I have no clue how Vim's scripting or configuration systems work.  All I know about ctags is that they're a thing that exists.  So really, I'm more of a dabbler.

The other part of this is that I like at least a small amount of IDE in my normal working-day editor.  I kind of like having some sort of "project view" of my files, a code hierarchy viewer, some form of Intellisense, etc.  And while you can get most of the stuff I like in Vim, it's not there out of the box.  And even when you can add it, you can't count on having an obvious graphical way to manipulate it.  Typically, you just have to read the documentation to find out what the key bindings are to trigger everything.

So the upshot of this is that I use Komodo IDE with the Vi emulation setting.  This essentially turns on a Vim emulation mode that makes the editor modal and enables a lot of the standard Vim keybindings as well as a small subset of common commands.  So I get Vim goodness with all the convenience of a full IDE.  I had never really looked closely at just how much of Vim Komodo would emulate, though - I just knew it supported everything I commonly used.

Well, I had some extra time after finishing all my bug fixes the other day, and since really learning Vim has been on my list of things to do for, er, over 10 years, I decided to look up some Vim reference sheets and see how many of the commands and keybindings actually worked in Komodo.  Turns out it was a pretty decent amount.  (Note from the future: I didn't write down the details at the time and don't care enough to catalog now that I no longer use Komodo.  Suffice it to say that Komodo's Vi emulation was actually pretty good.  Maybe not as good as IdeaVim, but pretty good.)

Solo should have had this discussion

Earlier this year, before the entire world caught fire, I posted a review of Solo: A Star Wars Story.  (Spoiler: it stinks.)  In that post, I mentioned that Solo raised the topic of droid slavery (which, if droids are actually sentient, is the only word for it) but failed to do anything substantive with it.  Well, it turns out somebody made a video on that very topic.

The video is quite interesting and well argued.  It gets into a bit of the sci-fi history of using robots as an allegory for various sorts of social oppression and digs deep into the specifics of how droids are depicted in Star Wars.  It's a very good analysis with lots of examples drawn from the various films, including Solo.

The narrator makes an interesting point that the Star Wars franchise in general has tried to have it both ways on droids.  On the one hand, the heroic droids like R2-D2 and C-3PO are clearly intended to be fully sentient characters, with feelings, desires, and distinct personalities.  But on the other hand, all those expendable droids, like the battle droids from the prequel trilogy, seem intended to be viewed as "just machines".  That allows the Jedi and Naboo to slaughter bad guys by the thousands without any blood or messy moral consequences.

I'd often been confused by that.  Given the various depictions, it wasn't always clear how droid sentience was supposed to be viewed.  It's nice to know that, apparently, it's not just me being thick - it's equivocation in the stories themselves.

Actually, maybe that disappearing "knowledge" is OK

A couple of weeks ago I posted an entry about the disappearance of online academic journals and how that's a bad thing.  Well, this article made me rethink that a little bit.

The author, Alvaro de Menard (who seems knowledgeable, but on whom I could find no background information, so caveat emptor), apparently participated in Replication Markets, which is a prediction market focused on the replicability of scientific research.  This is not something I was familiar with, but the idea of a prediction market is basically to use the model of economic markets to predict other things.  The idea is that the participants "bet" on specific outcomes and that incentives are aligned in such a way that they gain if they get it right, lose if they get it wrong, and maintain the status quo if they don't bet.

In this case, the market was about predicting whether or not the findings of social science studies could be replicated.  As you probably know, half the point of formalized scientific studies is that other researchers should be able to repeat the study and replicate the results.  If the result can be replicated consistently, that's good evidence that the effect you're observing is real.  You probably also know that science in general, and social science in particular, has been in the midst of a replication crisis for some time, meaning that for a disturbingly large number of studies, the results cannot be replicated.  The exact percentage varies, depending on what research area you're looking at, but it looks like the overall rate is around 50%.

It's a long article, but I highly recommend reading de Menard's account.  The volume of papers he looked at gives him a very interesting perspective on the replication crisis.  He skimmed over 2500 social science papers and assessed whether they were likely to replicate.  He says that he only spent about 2.5 minutes on each paper, but that his results were in line with the consensus of the other assessors in the project and with the results of actual replication attempts.

The picture painted by this essay is actually pretty bleak.  Some areas are not as bad as you might think, but others are much worse.  To me, the worst part is that the problem is systemic.  It's not just the pressure to "publish or perish".  Even among the studies that do replicate, many are not what you'd call "good" - they might be poorly designed (which is not the same thing as not replicable) or just reach conclusions that were pretty obvious in the first place.  As de Menard argues, everyone's incentives are set up in a perverse way that fails to promote quality research.  The focus is on things like statistical significance (which is routinely gamed via p-hacking), citation counts (which don't seem to correlate with replicability), and journal rankings.  It's all about producing and publishing "impactful" studies that tick all the right boxes.  If the results turn out to be true, so much the better, but that's not really the main point.  But the saddest part is that it seems like everybody knows this, but nobody is really in a position to change it.

So...yeah.  Maybe it's actually not such a tragedy that all those journals went dark after all.  I mean, on one level I think it kinda still is a bad thing when potentially useful information disappears.  But on the other hand, there probably wasn't an awful lot of knowledge in the majority of those studies.  In fact, most of them were probably either useless or misleading.  And is it really a loss when useless or misleading information disappears?  I don't know.  Maybe it has some usefulness in terms of historical context.  Or maybe it's just occupying space in libraries, servers, and our brains for no good reason.

Stupid PHP serialization

Author's note: Here's another old article that I mostly finished, but never published - I'm not sure why.  This one is from way back on August 23, 2013.

I was working for deviantART.com at the time.  That was a very interesting experience.  For one, it was my first time working 100% remote, and with a 100% distributed team, no less.  We had people in eastern and western Europe, and from the east coast to the west coast of the US.  It was also a big, high-traffic site.  I mean, not Google or Facebook, but I believe it was in the top 100 sites on the web at the time, according to Alexa or Quantcast (depending on how you count "top 100").

It also had a lot of custom tech.  My experience up to that point had mostly been with fairly vanilla stuff, stitching together a bunch of off-the-shelf components.  But deviantART was old enough and big enough that a lot of the off-the-shelf tools we would use today weren't widespread yet, so they had to roll their own.  For instance, they had a system to do traits in PHP before that was actually a feature of PHP (it involved code generation, in case you were wondering).

Today's post from the archives is about my first dealings with one such custom tool.  It nicely illustrates one of the pitfalls of custom tooling - it's usually not well documented, so spotting and resolving issues with it isn't always straightforward.  This was a case of finding that out the hard way.  Enjoy!


Lesson learned this week: object serialization in PHP uses more space than you think.

I had a fun problem recently.  And by "fun", I mean "WTF?!?"  

I got assigned a data migration task at work last week.  It wasn't a particularly big deal - we had two locations where users' names were being stored and my task was to consolidate them.  I'd already updated the UI code to read and save the values, so it was just a matter of running a data migration job.

Now, at dA we have this handy tool that we call Distributor.  We use it mainly for data migration and database cleanup tasks.  Basically, it just crawls all the rows of a database table in chunks and passes the rows through a PHP function.  We have many tables that contain tens or hundreds of millions of rows, so it's important that data migrations can be done gradually - trying to do it all at once would hammer the database too hard and cause problems.  Distributor allows us to set how big each chunk is and configure the frequency and concurrency level of chunk processing.
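
Just to give a rough idea of the shape of the thing, a chunked crawler boils down to a loop like the one below.  This is only an illustrative sketch - the function, table, and column names are all made up, and the real Distributor was far more sophisticated (scheduling, concurrency, persisting job state, etc.):

<?php
// Hypothetical sketch of a chunked table crawler - not dA's actual code.
// Walk the table in fixed-size chunks keyed on an auto-increment id and
// hand each chunk to a "job" callback.
function crawl_in_chunks(PDO $db, $table, $chunkSize, callable $job)
{
    $lastId = 0;
    while (true) {
        $stmt = $db->prepare(
            "SELECT * FROM {$table} WHERE id > :last_id ORDER BY id LIMIT :limit");
        $stmt->bindValue(':last_id', $lastId, PDO::PARAM_INT);
        $stmt->bindValue(':limit', $chunkSize, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

        if (count($rows) === 0) {
            break;                      // no more rows - we're done
        }

        $job($rows);                    // run the migration/cleanup job on this chunk
        $lastId = end($rows)['id'];     // resume after the last row we processed
        sleep(1);                       // throttle so we don't hammer the database
    }
}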

There are three parts to Distributor that come into play here: the distributor runner (i.e. the tool itself), the "recipe" which determines what table to crawl, and the "job" function which performs the actual migration/cleanup/whatever.  We seldom have to think about the runner, since it's pretty stable, so I was concentrating on the recipe and the job.

Well, things were going well.  I wrote my recipe, which scanned the old_name table and returned rows containing the userid and the name itself.  And then I wrote my migration job, which updated the new_name table.  (I'm fabricating those table names, in case you didn't already figure that out.)  Distributor includes a "counter" feature that allows us to trigger logging messages in jobs and totals up the number of times they're triggered.  We typically make liberal use of these, logging all possible code paths, as it makes debugging easier.  It all seemed pretty straightforward, and it passed through code review without any complaints.

So I ran my distributor job.  The old_name table had about 22 million rows in it, so at a moderate chunk size of 200 rows, I figured it would take a couple of days.  When I checked back a day or two later, the distributor runner was reporting that the job was only 4% complete.  But when I looked at my logging counters, they reported that the job had processed 28 million rows.  WTF?!?  

Needless to say, head-scratching, testing, and debugging followed.  The short version is that the runner was restarting the job at random.  Of course, that doesn't reset the counters, so I'd actually processed 28 million rows, but most of them were repeats.  

So why was the runner resetting itself?  Well, I traced that back to a database column that's used by the runner.  It turns out that the current status of a job, including the last chunk of data returned by the crawler recipe, is stored in the database as a serialized string.  The reason the crawler was restarting was because PHP's unserialize() function was erroring out when trying to deserialize that string.  And it seems that the reason it was failing was that the string was being truncated - it was overflowing the database column!

The source of the problem appeared to be the fact that my crawler recipe was returning both a userid and the actual name.  You see, we typically write crawlers to just return a list of numeric IDs.  We can look up the other stuff at run-time.  Well, that extra data was just enough to overflow the column on certain records.  That's what I get for trying to save a database lookup!
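
To make the failure mode concrete, here's a tiny sketch of how PHP's serialize() inflates data.  (The field names and values are made up; the point is the relative sizes.)  Every element gets a type tag and a length prefix, so one extra string field per row adds up quickly when the stored state is a whole chunk of rows:

<?php
// Hypothetical crawler state: IDs only vs. IDs plus names.
$idsOnly = array(
    array('userid' => 123456),
    array('userid' => 123457),
);
$withNames = array(
    array('userid' => 123456, 'name' => 'somedeviant'),
    array('userid' => 123457, 'name' => 'anotherdeviant'),
);

// serialize() wraps everything in markers like a:2:{...} and s:6:"userid",
// so the second string comes out roughly twice as long as the first.
echo strlen(serialize($idsOnly)), "\n";
echo strlen(serialize($withNames)), "\n";

Multiply that difference by a couple hundred rows per chunk and it's easy to see how the serialized job state could blow past a database column that was only ever expected to hold a list of IDs.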