Saturday, August 01, 2009

Learning to Love Jetpack, Part 2

Mozilla Jetpack is an interesting beast, sitting (as it does) at the crossroads of Spidermonkey and XPCOM. It brings JavaScript programmers within a stone's throw of the impressive XPCOM API with its 1450 interfaces and 890 components, plus it unlocks a world of cross-platform AJAX capabilities and local data persistence. Mind you, Jetpack does not actually hand you the keys to the entire XPCOM universe; that may come later. Right now you just get access to certain wrapped objects. But there's more than enough power under the hood to give Greasemonkey a run for its money.

If you're already familiar with Greasemonkey, you'll grok the basics of Jetpack instantly. A fundamental pattern is firing a script when a page loads in Firefox (except, you have to start thinking in terms of "tabs," not pages). So for example,
   jetpack.tabs.onReady( callback );

function callback( doc ) {
// do something
}
This is a pretty common pattern. Your callback function is triggered when the target document's DOMContentLoaded event is fired. You can manipulate the DOM in your callback before the page actually renders in the browser window. So for example, you might want to filter nodes in some way, rearrange the page, make AJAX calls, attach your own event handlers to page objects, or wreak any manner of other havoc, before the page is actually displayed to the user. This is a standard Greasemonkey paradigm.
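For instance, here's the sort of thing you might do in that callback. This is only a sketch (the click-handler body is a placeholder), but it strips every image out of the page and hangs a handler on every link before the user ever sees the page:

jetpack.tabs.onReady( function( doc ) {
    // remove every image before the page renders
    var imgs = doc.getElementsByTagName( 'img' );
    while ( imgs.length ) {                        // live collection, so loop on length
        imgs[ 0 ].parentNode.removeChild( imgs[ 0 ] );
    }
    // attach a click handler to every link
    var links = doc.getElementsByTagName( 'a' );
    for ( var i = 0; i < links.length; i++ ) {
        links[ i ].addEventListener( 'click', function( e ) {
            // do whatever mischief you like here
        }, false );
    }
});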

The "We're not in Kansas any more, Toto" feeling starts to hit you when you realize that your script can walk the entire list of open tabs and vulture any or all DOMs and/or window objects, for all frames in all tabs; something you can't do in Greasemonkey, since GM is scoped to the current window.

Also, you have access to jQuery. So if you want to see how many links a page contains, you can do:
   var linkCount = $('a').length;
and that's that. If you're not already a jQuery user, you'll want to vault that learning curve right away in order to get max value out of the Jetpack experience. It's not a requirement, but you're shortchanging yourself if you don't do it.

Developing for Jetpack takes a little getting used to. First, of course, you have to install the Jetpack extension. The direct download link, at the moment, is here, but it could go stale by the time you read this. If so, go straight to https://jetpack.mozillalabs.com/.

To get to the development environment, you have to type "about:jetpack" in Firefox's address bar and hit Enter. When you do that, you'll see something like this:



There are several links across the top of the page (Welcome, Develop, etc.). It's not obvious that they are links, because they are not in the usual shade of blue and aren't underlined. Nevertheless, to do any actual code development in the embedded Bespin editor, you have to click the word "Develop" (which I've circled in red above). This brings up a page where, if you scroll down, you'll see a black-background text editor console.



NOTE: Not visible in this screenshot are the final two lines of code:

var ONCE_A_MINUTE = 1000*60;
setInterval( getTweet, ONCE_A_MINUTE );

Note that right under the console, you'll find the words "try out this code." (See red arrow.) They are not highlighted by default and thus show no evidence of being clickable. But if you roll your mouse over the words, they get a grey highlight as shown here.

Note: If you make an edit to your code and click "try out this code" a second time, you may find that nothing happens until you refresh the entire page in the browser. Fortunately, you don't lose your work. But it feels scary nonetheless to refresh the page immediately after making a code change.

I find it really odd that Jetpack has these obvious user-interface design gaffes. These aren't bugs but straight-out poor UI design decisions. What makes it so odd is that some world-class UI experts (such as Aza Raskin) are involved with Jetpack. Guys. Come on. I mean, really.

Maybe I'll do a code runthrough (for something a little more interesting than the above code) next time. For now, note that the code shown above makes use of Jetpack's built-in Twitter library (which wraps Twitter API functions and simplifies some of the AJAX calls, although I don't know why I should have to learn Jetpack's own Twitter API now). The code shown above simply checks Twitter every 60 seconds for any updates created by a particular user (me, in this case). If a new update is found, the relevant tweet is shown in a toaster notification in the bottom righthand corner of the desktop:



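Since the screenshot cuts off most of the code, here's a rough reconstruction of the idea. I've swapped Jetpack's built-in Twitter wrapper for a plain jQuery call against Twitter's public JSON API (Jetpack scripts are allowed to do cross-site AJAX, so this works); the URL, the lastSeen bookkeeping, and the exact form of the notifications.show() call are illustrative assumptions, not gospel:

var lastSeen = null;

function getTweet() {
    var url = 'http://twitter.com/statuses/user_timeline/kasthomas.json?count=1';
    $.getJSON( url, function( tweets ) {
        if ( tweets && tweets.length && tweets[ 0 ].id !== lastSeen ) {
            lastSeen = tweets[ 0 ].id;
            jetpack.notifications.show( tweets[ 0 ].text );   // toaster popup
        }
    });
}

var ONCE_A_MINUTE = 1000 * 60;
setInterval( getTweet, ONCE_A_MINUTE );   // the two lines quoted in the note above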
So far so good. But what if you want to give your script to a friend? How does your friend install it? Surely not by using the Bespin console?

Well, assuming your friend has installed the Jetpack add-on already, you can give him or her the script in a text file called something like myscript.jetpack.js. Or better yet, put that script online somewhere. Then you also need to have a page somewhere that contains this line in the HTML (in the head portion):

<link rel="jetpack" href="myscript.jetpack.js" name="TabList">

When your friend opens the page that contains this line, a warning-strip will appear at the top of the Firefox window saying that the page contains a script; do you want to install it? Answer yes, and you get a big scary red page that, if you scroll to the bottom, has these buttons:



Obviously, you need to click the button on the right. At that point, the script will be installed and "live."

There's a lot more to Jetpack development than what I've described here, but this should be enough to get you started. Next time I'll present a (marginally) more meaningful code example so you can get yet another taste of what Jetpack has to offer. Then I'll get back to blogging about more important things, like the hazards of chewing gum while programming, or the high cost of not doing adequate usability testing.

Or maybe I'll just sit back, put my feet up, and tweet.

Thursday, July 16, 2009

Learning to Love Jetpack, Part 1

I've been having a torrid on-again/off-again love affair with Mozilla Jetpack for the past several weeks, and I have to say, it's been a bit exhausting at times. But try as I may to walk away, I keep coming back for more. I guess fools rush in where less impetuous web developers fear to tread.

It should be said up front that Jetpack is quite immature at this point (having been announced only a couple months ago) and there's a new release almost every week. Putting it through its paces feels a bit like driving a concept car. It's fun, it's exciting, it'll amaze your friends. But is it ready for production?

The basic concept is compelling: Make it possible to develop JavaScript-based Firefox extensions that have special security privileges (the ability to do cross-site AJAX, for example) combined with the ability to vulture page objects at load time, a la Greasemonkey. The goal is to let mere mortals write Firefox add-ons without having to get mired in the XPCOM morass. (If you've ever tried to write a Firefox extension, you know what I'm hinting at. If not, you can get a good whiff here.)

What can you do with Jetpack? Right now, not a lot, other than peek and poke the DOM as a page loads. True, the MozLabs crew recently added audio-recording support, and there's a persistence API that doesn't rely on cookies. Plus you can iterate through tabs, put stuff in the status bar, and create toaster popup notifications. (Woo hoo.) But still, not a lot. You can do most of this kind of stuff with Greasemonkey.

That'll all change soon, though, as Jetpack's APIs expose more and more XPCOM internals. Make no mistake: a year from now, no one will be mentioning Greasemonkey and Jetpack in the same breath.

Even now, though, Greasemonkey and Jetpack are pretty far apart, under the covers. One difference is the runtime scope. Jetpack runs "above" all open tabs. This is quite handy, because it means you can easily enumerate and manipulate all open tabs (I'll provide some source code for this in a later post), something that's all but impossible to do in Greasemonkey.

Another nice thing about Jetpack is that it comes preloaded with jQuery. You don't have to do anything special to access jQuery methods; just start using them.

There's some built-in support (convenience methods) for the Twitter API, which is kind of interesting.

And you get an integrated development environment with Bespin, which is pretty nice. That, combined with instant-on loading of scripts (no need to restart Firefox), makes for a rapid dev/test cycle, greatly reducing Rolaids consumption. I'll lead you through the dev workflow in my next post so you can get an idea of what it's like to develop Jetpack scripts.

There are a couple of issues (one of them quite serious) to be aware of, though. First, you can't install and use a Jetpack script without installing Jetpack. In other words you can't just give a script to a friend and say "Here, install this, it's cool." Instead it's "Go to the Mozilla Labs download page, install this week's alpha build of Jetpack, along with Firebug 1.4, and pray God my script still works on your machine next week."

The MozLabs guys say that eventually, Firefox may come with embedded Jetpack support so that no one need proactively do a Jetpack install before being able to use Jetpack scripts. That would be a Very Good Thing, except for one potentially nasty issue.

The nastiness has to do with the way Jetpack facilitates memory leakage. Simply put, it's extraordinarily easy to write scripts that eventually cause Firefox to hang. One can argue that adhering to best practices will prevent this (which is true), but I think that if the Jetpack agenda truly does revolve around getting mere mortals (people with modest JavaScript skills) to participate en masse in creating Firefox extensions, the potential exists for disaster. You're inevitably going to have large numbers of amateur programmers getting into trouble with memory leakage, and that's not going to do anything good for Firefox's already poor reputation for memory leakage, nor for Jetpack's reputation.

As I see it, the problem is really twofold. Fold Number One has to do with the way XPConnect works. (XPConnect is the bridging technology that allows JavaScript to interoperate with XPCOM objects written in C++.) Without going into gory detail, C++ and JavaScript have different memory management models. One is a world of reference counting, the other is a mark-and-sweep world similar to Java. When you wrap a C++ object in such a way that it's usable from JavaScript, you're entering a whole new universe of memleak possibilities. This is the domain where Jetpack lives.

The second aspect of the problem is that the kinds of capabilities that attract programmers to something like Greasemonkey or Jetpack tend to draw on programming patterns that are inherently dangerous from a memleak point of view, chief among them the Observer pattern. Sometimes I think the Observer pattern is actually better termed The Cyclic Reference Memleak pattern, because you're basically creating objects and/or wrappers that maintain references to each other. It's a great way to generate memory leaks.
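To make that concrete, here's a minimal sketch (generic DOM code, nothing Jetpack-specific beyond the onReady hook) of how an innocent-looking observer pins a whole document in memory, along with the obvious mitigation:

jetpack.tabs.onReady( function( doc ) {
    var bigCache = doc.getElementsByTagName( '*' );   // lots of wrapped page nodes

    function handler( e ) {
        // closes over 'doc' and 'bigCache'; as long as it stays registered,
        // everything it can reach stays reachable too
    }
    doc.addEventListener( 'click', handler, false );

    // Mitigation: unhook the observer and drop references when the page goes away
    doc.defaultView.addEventListener( 'unload', function() {
        doc.removeEventListener( 'click', handler, false );
        bigCache = null;
    }, false );
});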

Again, any competent developer (myself not included) will understand the importance of best practices here, and staying out of trouble is not rocket science. But to expect the average script kiddie to know or care about memleak mitigation is like expecting the average McDonald's customer to know how to make a roux.

In any case, Jetpack is an interesting beast and I continue to be fascinated by (and infatuated with) it -- enough so, that I intend to devote at least a couple more blog posts to it. Check back here in a day or two. We'll have some fun.

Friday, July 10, 2009

JavaScript Expression Closures and Why They Rock

JavaScript expression closures (new in JS 1.8) are one of those things that seem very so-what the first time you hear about them, but leave you slapping your forehead and saying "OMG this is so darn freakin' cool" repeatedly once you start using them. Why? Because OMG. It's so darn freakin' cool.

The basic notion is that instead of writing short utility functions (which I have a million of) as, for example
function square( x ) {
    return x * x;
}
you just write
function square( x ) x * x
which (yes yes, I know) looks very so-what, but bear with me for a moment. The short syntax starts to become more compelling when you begin using it in anonymous functions, particularly callback functions (which, as you know, tend to be anonymous a great deal of the time). For example, suppose you have a custom comparator for the sort() method of Array:

function descending( a, b ) {
    return b - a;   // a comparator must return a number (negative/zero/positive), not a boolean
}

[5,1,3,2,4].sort( descending ); // [5,4,3,2,1]

You can instead write it as:
[5,1,3,2,4].sort( function(a,b) b - a );
Another fairly trivial example: Suppose you want a function that will constrain strings to a certain length. You can do something like:
function limit(s, n) s.length > n ? s.substring(0,n): s
As a more elaborate example, suppose you have the following code, designed to convert a "camelCase" string to underscore_format.
function underscore( str ) {

    function toUnderscore( s ) {
        return '_' + s.toLowerCase();
    }

    return str.replace( /[A-Z]/g, toUnderscore );
}

// underscore( "aName") --> "a_name"

With the new closure syntax, you can do:
function underscore( str )
    str.replace( /[A-Z]/g,
        function( s ) '_' + s.toLowerCase() );
Here I've converted not only the outer function, but also the callback to replace(), to expression-closure form. (I split it into 3 lines for readability, but it will still execute correctly as a 3-liner. You don't have to write these things as one-liners.)

Still not convinced? Try using this syntax (supported in Firefox 3+, and anywhere else JS 1.8 is implemented) in some functions yourself, and see if your code doesn't become easier to write, shorter, and (in most cases) more readable. It may be only a small improvement on the more verbose older syntax, but an improvement is an improvement. I'll take it.

Thursday, July 09, 2009

Why Google Chrome OS is a nonstarter

As the entire world knows by now, Google recently announced its intention to muscle its way into the operating-system space (supposedly) by way of something called Google Chrome Operating System.

But is it really an operating system? By Google's own account, it's actually an instant-on windowing system sitting atop a Linux kernel, and it will run on certain netbooks only, using certain chipsets only. Google is reportedly working with Acer, Adobe, ASUS, Freescale, Hewlett-Packard, Lenovo, Qualcomm, Texas Instruments, and Toshiba to "deliver an extraordinary end user experience." I take it that means Flash will be supported, since Adobe is on the partner list.

But where are the value-adds in this picture? What, exactly, does Chrome OS bring to the table that you can't already get elsewhere?

Not much, it turns out.

"Speed, simplicity and security are the key aspects of Google Chrome OS," acccording to Google. Speed, in this case, means instant-on. Turn your netbook on, it lights up, you have e-mail and browsing. Of course, you might have to wait a few seconds while the netbook (re)acquires a wi-fi connection, but at least you don't sit there for two minutes waiting for Godot.

This sounds like a great technological advance until you realize that the same instant-on capabilities promised by Chrome OS are already available via HyperSpace from BIOS vendor Phoenix Technologies Ltd., Splashtop from DeviceVM Inc. and Cloud from Good OS, as well as (more recently) Presto from Xandros Inc. In addition, Dell is putting special instant-on features in its Latitude laptops, and oh by the way, I can put my last-year's-model Dell laptop to sleep any time I want and wake it up later in the day, right now.

Bottom line, instant-on is not new, and it'll be even less new when netbooks running Google Chrome OS become available for consumers in the second half of 2010 (according to Google).

Simplicity is another supposed value-add. What this really seems to mean is that you can only run web apps, and you have only one UI to learn (Chrome's). Which is fine. I spend most of my day in a browser already, thank you.

Security is the third main value add, according to Google. Security expert Bruce Schneier has already derided this claim, however, calling it "idiotic." Far be it from me to disagree with such a distinguished expert.

What are we left with? In terms of new technology, not much, really. There's nothing here you can't already get elsewhere. The only hope Google has of differentiating itself in this market is to offer a jaw-dropping user experience, something so compelling that nothing else even compares. In other words, they have to out-Apple Apple. I have yet to see Google do that -- with anything. Maybe this time they'll pull off a miracle. But somehow I doubt it.

Tuesday, July 07, 2009

An exercise in riddle-solving

1 2 3 4 5 = 1
5 4 3 2 1 = 2
1 1 1 1 1 = 5
2 2 2 2 2 = 1
3 3 3 3 3 = 6
4 4 4 4 4 = 2
5 5 5 5 5 = 7
6 6 6 6 6 = 3
1 1 2 2 2 = 6
3 3 4 4 4 = 7
1 1 1 1 2 = 1
1 1 1 1 3 = 6
1 1 1 1 4 = 2
1 1 1 1 5 = 7
2 2 2 2 6 = 3
2 2 2 2 5 = 7
2 2 2 2 4 = 2
2 2 2 2 3 = 6
2 2 2 2 2 = 1
6 1 1 1 1 = 2
5 1 1 1 1 = 8
4 1 1 1 1 = 5
3 1 1 1 1 = 2
2 1 1 1 1 = 8

3 1 4 1 5 = ?


This puzzle comes by way of a terrific blog post by James Marcus Bach, who in turn got the puzzle from Trey Klein. Bach candidly reports: "I found it difficult to solve. Took me a couple of hours."

As a rule, I hate puzzles of the recreational (purposeless, do-nothing) sort -- colloquially known as brain-teasers -- simply because they accomplish nothing except to waste time and make me feel stupid. I either already know "the trick," or when I find out what the trick is, it turns out not to be useful for anything other than solving the puzzle in question. Then I feel cheated and stupid.

But I'm inherently a masochist, so naturally, when I saw the above puzzle, I just had to try it. ;)

I resolved not to spend more than two or three minutes on it, though.

It turns out, I got the puzzle right, and it took me only a minute or so. You might want to stare at the numbers yourself for a couple minutes now, before reading the next paragraph.

The supposed "answer" and its reasoning (involving a somewhat painful-looking formula with modulo arithmetic) is given here. I took a far more pragmatic approach. I looked only at occurrences of "5 =" and saw that in 3 out of 4 cases, a 7 appeared on the right-hand side of the equals sign. If this is some kind of casino game and I'm betting real money, and I see "5 =" come up again, I'm betting on 7 being the answer.

I decided to reverse every line of the problem and try it again. I looked at each occurrence of "7 =". And again, 3 out of 4 times, "7 =" ends up paired with 5.

That's it, I decided. The answer must be 7.

It turns out 7 is the correct answer.

"But you just guessed!" someone will say. "You didn't prove that the answer is 7."

My retort is that the person who wrote the explanation given here didn't prove that 7 is the inevitable answer either, because he or she didn't show that the given explanation is the only explanation that will work; he or she merely gave an explanation that is consistent with 7 being the answer.

I'm 100% sure if I had 3 hours, I could come up with at least a couple of formulas that, given the input data shown above, will be consistent with an answer of 7. I can also come up with a couple of formulas that are consistent with an answer of 1. So which formula (out of these several formulae) is "correct"? Is any one formula provably the only possible correct one? That's the question. I'd argue it's not even possible to know if that question can be answered.

Still, is it reasonable to attack a problem like this heuristically, and "bet" on an answer that seems (statistically) likely to be correct? Is it better to spend 3 hours arriving at a formula that's consistent with a given answer (but that could be "shot down" by a different formula later)? What kind of approach do you take if you're in an out-of-control spacecraft and you need an answer within 3 minutes or you'll burn up on re-entry? What if you're in the Titanic and have a full 24 hours' notice of icebergs ahead and need to decide on the correct heading to take? Do you "bet the farm" on a heuristic method -- or on a formula that seems right (and has the appearance of being rigorous, because it's so formulaic) but isn't provably correct?

I think the approach you take depends on the situation, but it also depends on your personal working style. In the absence of a reason not to, I tend to take a heuristic approach. My style is not to waste time, even when I have time to waste. What's yours?

Wednesday, July 01, 2009

Taking microURLs to the next level

If you've been following my Tweets lately, you may have noticed that I've begun using 3.ly ("threely") as my URL-shortener of choice. Before that, it was bit.ly, and before bit.ly I used TinyURL.

Like many Twitter users, I moved from TinyURL to bit.ly for the simple reason that bit.ly produces tinier URLs. Threely gives even smaller URLs. But that's not why I'm moving from bit.ly to 3.ly.

Threely has an ambitious goal: to take micro-URL rewriting to the next level. The Threely folks haven't yet said what their list of services will include. But already, they offer an interesting glimpse of what they might be up to. All you have to do is append a hyphen to any 3.ly URL in order to activate a hit counter. If you surf to the hyphenated URL, you come to a preview page that displays the hit count and one or more continuation links. (To see what I mean, try http://3.ly/ymZ-, which links to one of my previous blog posts but first takes you to the hit-counter page.)

One can imagine a range of services that Threely could offer based on short query strings. For example, what if, by appending "-1" to a 3.ly URL, you could have Threely not only track hit-counts but also send you a daily or weekly report of traffic on that URL via e-mail? Suppose "-2" means the same thing, except your e-mail contains verbose results instead of cursory ones. The verbose results might include the HTTP headers from each visitor's GET request, the date/time of each request, etc.

There isn't a Threely API yet, but one can imagine that Threely's API could offer various takes on "URL-rewriting-as-a-service" (URaaS? oh dear God please not another acronym...). Or should we just call it micro-analytics?

As I say, exactly what the Threely folks have in mind, I don't know. But I'll be watching their home page closely, and signing up as a registered user as soon as they make sign-ups available. I have a feeling this could get interesting.

Monday, June 29, 2009

Making music with Apple IIgs technology


Over the weekend I found myself using floppy disks for the first time in I-don't-know-how-many Moore epochs. It was a brain-bending reminder of how much great technology was built atop operating systems that fit into a few K of memory and Mylar storage media.

Back in the mid-1980s, Bob Yannes, the guy responsible for the MOS Technology SID (Sound Interface Device) chip in the Commodore 64 and the Ensoniq Digital Oscillator Chip (Ensoniq ES5503 DOC) that powered the Apple IIGS computer's audio system, introduced the first low-cost digital sound-sampling keyboard, the Ensoniq Mirage. At $2000 (in 1985 dollars!), it might not seem low-cost, but you have to realize that the Mirage's main competition at the time was the Fairlight CMI Series II, which, at ~£27,000, only very successful musicians could afford.

The Mirage was an incredibly versatile (and for its day, quite impressive-sounding) MIDI keyboard, with a 333-note sequencer (big stuff back then!), velocity-sensitive keys, and of course, the ability to use sampled sound. It loaded sound samples from 128K single-sided double-density floppy disks. Each disk held six 32-note "voices" plus a copy of the operating system.

In my case, I couldn't afford a Mirage keyboard ($2000 was too steep for me), but around 1987 Ensoniq came out with a rack-mount (keyboardless) version of the Mirage that was much cheaper (~$1300). Once it became "last year's model," I picked one up, new, at Sam Ash for $1000. The rack-mount unit (exactly the same as what I have) is pictured above.

My Mirage box had collected so much dust in the basement that I had to spend 5 minutes just cleaning the chassis exterior. I was very worried, of course, that the accompanying floppy disks (without which the device is useless), now 20 years old, would be so corrupted with flipped bits that I wouldn't even be able to boot the machine. Cheap floppies always seemed to go bad on me in two or three years. Could these still be good after 20 years?

Yes. Amazingly, I was able to boot the Mirage and load sound samples from disk. And just as amazingly, everything else worked -- MIDI in/out/thru, all the front panel buttons, etc. What had me stumped was that I couldn't find the user's manual -- and this was a complex box to operate. Not really complex, exactly, just burdened with a poor user interface: many buttons, many numeric inputs, no text messages whatsoever. It's a matter of knowing which buttons to push in which order.

I looked online for a user's manual, assuming there would be one. There wasn't. Instead, it turns out there's a lively aftermarket in Mirage user manuals, which often sell for more than used Mirage boxes themselves! But I did finally find a "cheat sheet" showing all the parameter codes. From there, I figured out the button-combos myself.

I wired the MIDI-in port to my Yamaha DX-7's MIDI-out, and lo! There was sound coming out the Mirage's mono output jack. And the layered sound (Mirage over DX-7) was actually quite marvelous, especially when piped through Cakewalk's digital effects (running on my Compaq Presario).

I often wonder if we really need the 120 gigabyte drives and 2GHz CPUs and mega-ginormous "operating systems" (so-called) on which we so slavishly rely today. Do we need to live in such digital squalor, really?

I look at the Mirage and think: "Not."

Friday, June 26, 2009

Google Voice: Cloud Meets Cell

Yesterday, to surprisingly little fanfare, Google started fulfilling invites to its new Google Voice service, which was announced back in March. This is essentially the culmination of Google's GrandCentral acquisition of 2007, but have no doubt, it's not the end of the story. It's the beginning of one.

It's hard to sound-bite Gvox. Ostensibly, it's a way to sign up for a single vanity phone number that lets you do voicemail and SMS text retrieval using VoIP from any device. According to Google:
[ Google Voice ] improves the way you use your phone. You can get transcripts of your voicemail and archive and search all of the SMS text messages you send and receive. You can also use the service to make low-priced international calls and easily access Goog-411 directory assistance.
If that sounds boring, it's because it is. But I think it may be a handful of frost scraped off the tip of a big icy thing floating in the ocean. If all Gvox did were to offer superior service to Verizon's Visual Voicemail, it would be a significant advance. But there's the potential for more. Much more.

Even before the GrandCentral acquisition, Google had big plans in the VoIP and mobile domains. You have to look no farther than United States Patent Application 20080232574 to get a hint of what I mean.

But use your imagination. If your cell phone becomes a gateway to Android cloud apps, you've essentially got the power to retool your phone with a virtualized OS, and obtain access to web apps galore. In fact, with a little sleight-of-hand, Gvox can give the appearance of completely reskinning your phone and offering a whole new menu system. Want to store your address book in the Gvox cloud? Done. Get auto-complete as you enter phone numbers? Do Google white-pages 411 lookups? Do reverse phone lookups? Record a podcast (and store it in the cloud)? Upload a cell-phone video to YouTube in a single speed-dial click? You get the idea. Let your imagination (ahem...) roam.

One thing I wish Gvox would do that I don't think it does (yet) is full many-to-many mappings of phone numbers. Right now you just get a one-to-many mapping of a vanity number to all your other numbers, which is handy enough because it means people can always reach you with just a single phone number, but I think it would be cool (especially if you're a marketer) to have the ability to map many phone-number aliases to a single target number and get advanced analytics for the resultant traffic. Who called from which number? Which ad or blog placement or campaign was the number associated with? Etc. etc.

We'll probably get all that -- and more. As I say, right now, on the surface, Google Voice doesn't sound all that exciting. But under the surface, there's a whole new world waiting. The Internet is your new dial tone.

Thursday, June 25, 2009

Java app server popularity wanes, except for one



Java app server popularity is on the decline -- if Google search volume is any indication. The above graphic (from Google Trends) shows five years of data representing search volume (upper portion) and news-citation volume (lower) for keywords "apache tomcat," "websphere," "jboss," and (just for fun) "weblogic." The trend lines are pretty convincing, it seems to me. Only Tomcat (which of course lacks full-blown app server status and is the orange in this apples-to-apples comparison) has a relatively steady trend line. The trajectory for all others is about the same as that of the U.S. Airways jet that landed in the Hudson.

In recent years, some of the lack of app-server-targeted new development has shifted to development that targets runtime frameworks like Spring. But it appears even interest in Spring has peaked.



The news for Java runtime containers is not all bad, however. One open-source application server -- namely Glassfish -- has been coming on strong.



What's interesting about Glassfish, of course (aside from the fact that it has a microkernel based on OSGi), is that it's a Sun Microsystems-backed project, with Sun providing commercial support for the enterprise version of the server. That means the future of the enterprise version rests with Oracle. One wonders what, if anything, Oracle will do with it.

Sunday, June 21, 2009

I thunk, therefore I am

Every once in a while I'm humbled to find that there's a word, in computer science, for something fairly common, that I've never heard of before. Today's word? Thunk.

According to Wikipedia, "thunk" may refer to:
  • a piece of code to perform a delayed computation (similar to a closure)
  • a feature of some virtual function table implementations (similar to a wrapper function)
  • a mapping of machine data from one system-specific form to another, usually for compatibility reasons
The original paper on thunks was P.Z. Ingerman's somewhat inscrutable (to me, at least) piece in Communications of the ACM, Vol. 4, No. 1 (1961).
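To make sense #1 concrete, here's what a thunk looks like in JavaScript. It's just a sketch; expensiveLookup is a stand-in for anything you'd rather not compute until someone actually asks for it.

function expensiveLookup( id ) {
    // pretend this hits a database or does heavy math
    return id * id;
}

function makeThunk( id ) {
    var cached, evaluated = false;
    return function() {              // the thunk itself
        if ( !evaluated ) {
            cached = expensiveLookup( id );
            evaluated = true;
        }
        return cached;
    };
}

var lazyValue = makeThunk( 12345 ); // nothing computed yet
// ...later, only if we actually need it:
var result = lazyValue();           // forces the computation (and caches it)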

I am, as I say, humbled to find that I did not know the term thunk. All this time, I thought it was the past tense of think.

Wednesday, June 17, 2009

How do you get 10K Twitter-followers legitimately?

NOTE: This post is obsolete. You want this one instead.

I explained yesterday why it's important, professionally, for me to have a large number of Twitter followers, and I pointed out that having quality followers is important: What good is it to have 10K followers if they're all trying to sell you an MLM scam?

I also promised that today, I'd tell you how I got to the 10K Twitter-follower mark legitimately, without sacrificing quality, which is to say without resorting to the sleazy list-juicing tricks that so many "marketing experts" seem to use on Twitter to gain followers.

Here is what worked for me.

I'm in the analyst business. My company doesn't compete directly with Gartner or Forrester or any of the other big analyst firms, but some of Gartner's (and Forrester's, etc.) customers do have an interest in some of the areas we cover. Thus it stands to reason that if someone (a given Twitter user) is following the Twitterstream of a particular analyst working for XYZ Group, that person might very well be interested in following my stream.

So imagine this scenario. Joe Twitter-user works in IT for a big company. He has decided to follow Analyst Linda Lou's Twitterstream because she covers content technology at XYZ Group.

I decide to follow Joe Twitter-user on Twitter -- and Joe notices that someone named @kasthomas has decided to follow him. He says: "Hmm, who is this @kasthomas character? I better go check him out." Joe arrives at my Twitter page and sees from my Bio that I'm a content-technology analyst. He scratches his chin, shrugs, and says "Okay, maybe I'll follow this guy and compare what he has to say with what Linda Lou has to say."

The nice thing about Twitter is that people's follower lists are public. I can get the Twitter usernames of all of, say, Forrester's list of 18K+ followers. I can inspect that list by going directly to http://twitter.com/forrester/followers, or I can harvest the list programmatically by using the Twitter API and a little Javascript.

I have written various scripts (most of which run in the Firebug console; some are Greasemonkey scripts) that do things like harvest a given account's followers, do a diff between my follower list and another list, and batch-follow a given group of IDs. I also have scripts for finding out who I'm following that is not following me back, and purging those people from my Following list. I've blogged before about some of these scripting techniques, which (as I say) often involve only a few lines of code that can be run in the Firefox/Firebug console.
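Here's a stripped-down sketch of the harvest-and-diff part, meant to be pasted into the Firebug console while you're sitting on a twitter.com page (so the calls are same-origin). The endpoints are the mid-2009 REST API, the 5,000-IDs-per-call paging limit is ignored, and the synchronous XHR is pure console laziness -- verify all of it against the current API docs before trusting it.

function fetchIds( screenName ) {
    var xhr = new XMLHttpRequest();
    xhr.open( 'GET', 'http://twitter.com/followers/ids/' + screenName + '.json', false );
    xhr.send( null );
    return JSON.parse( xhr.responseText );   // native JSON is in Firefox 3.5+
}

var theirs = fetchIds( 'forrester' );    // everyone following @forrester
var mine   = fetchIds( 'kasthomas' );    // everyone following me

// Diff: IDs that follow them but not me -- candidates to go look at
var mineLookup = {};
for ( var i = 0; i < mine.length; i++ ) mineLookup[ mine[ i ] ] = true;

var candidates = [];
for ( var j = 0; j < theirs.length; j++ ) {
    if ( !mineLookup[ theirs[ j ] ] ) candidates.push( theirs[ j ] );
}
console.log( candidates.length + ' candidate accounts to look at' );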

Bottom line, it's not rocket science. You follow people with common interests. They follow you back. Or not.

Now as to how you can meaningfully follow and process incoming tweets from thousands of followees, that's another story. And another blog, for another day.

For more on this, see my 99-cent e-book.

Tuesday, June 16, 2009

Opera unveils just what I said they would

A few days ago, I predicted (based on no inside information whatsoever; strictly an educated guess) that Opera would unveil today exactly what it did, in fact, unveil today: http://unite.opera.com/.

What can I say? Once in a while you get lucky.

Monday, June 15, 2009

What does it mean to have 10K Twitter followers?

NOTE: This post, about how I do Twitter, is obsolete now, in that I currently have well over 300,000 followers, not 10K, and I've learned a lot about Twitter in the four years since this post was written. See this 2013 post for more up-to-date info. 

NOTE: If you came here wanting to know what "+K" means in a tweet, it refers to Klout.com kred points. If you came here to learn what the 'K' means in 10K, it means thousand. Lots of people Google it, apparently.

Somebody on Twitter asked me over the weekend what it means to have 10K Twitter followers. Since most people learn in grade school that 'K' used in a numeric context means "thousand," I have to assume that the person's question was rhetorical in nature rather than numerical. Either that, or he failed 5th grade.

Still, let me make it perfectly clear. Having 10K Twitter followers means just one thing. It means you have ten times more than 1K Twitter followers.

It's important for me to have an audience, because by profession I'm a person who transfers knowledge and influences opinion. If I have a larger audience, I'm more effective at what I do. That's fundamentally why I went about the six-month-long task of trying to attract and keep ten thousand Twitter followers, a goal I reached on Sunday.

Of course, it doesn't help to have followers if they are all robots, nut cases, and mouthbreathers. Quality counts. Unfortunately, Twitter attracts its share of hucksters, scammers, lamers, and marketing hangers-on, and many of them spend their days and nights trying to follow people in hopes of a pingback of some kind. I have some of those people in my 10K, but not so many as to make the remainder not worth having. In fact, I consider the quality of my follower list to be extremely high, and I'll explain why I think that -- and just how I got to the 10K mark, incidentally -- tomorrow.

For more on this subject, see my 99-cent e-book.

Saturday, June 13, 2009

Here's what Opera is about to unveil

I love a good mystery as much as the next guy, but many people (myself included) are finding Opera's latest mysterious claim a bit over-the-top. In case you haven't been following this story, the Opera folks are saying that on June 16 they will unveil something that will "reinvent the Web."

Many people have speculated as to what it could be (I will tell you what I think it is in a moment). Some have said, based on a tantalizing tweet by Hicksdesign, that Opera has found a way to put the Internet on a USB stick. However, that's (yawn) been done.

Others have suggested that the new Opera will offer a seamless (don't you hate that word now?) way to sync everything to everything, so that all your contact info, e-mail archives, cached web pages, notes-to-self, car keys, and loose pocket change are synchronized across all your devices, including your refrigerators, all the time.

That's been done too. More or less.

Folks, let me tell you what's going to happen. I have a pretty strong hunch (but no inside info, I assure you) on this one. This is something I've thought about for years -- it has needed to happen for years -- and I'll be thrilled if Opera pulls it off, although whether people will flock to adopt it is another question.

The answer is that Opera is going to embed a web server in itself.

When you fire up Opera, you'll be operating a secure server and you will be able to serve all kinds of content (whatever you want, basically: bookmarks, contacts, cached content, arbitrary files from a roped-off area of your local storage, web pages of your own) to other Opera users, at the very least, and maybe all browser users, at the very most. The security aspects will be interesting, but presumably they've got a solution there, too.

Such a trick would solve the sync-anything problem trivially, as a side benefit. The more interesting question is what kind of two-way AJAX apps and mashups people will be able to write when they can use each other's browser as a web server. The Web goes from being a bunch of big public servers plugged into a common backbone, to a confederation of micro-servers distributed across individual devices running Opera -- a Web within a Web, the peer-to-peer Web. Except instead of running a P2P protocol, you'll be running good old HTTP.

The embedded-server browser (possibly with embedded Derby or other database) is what I see coming on the 16th. Or something like it.

Anyone got a better guess?

Wednesday, June 10, 2009

What has become of the old-fashioned Encyclopædia?

Answer:

Photo courtesy http://www.rob-matthews.com

This bound version of Wikipedia contains only the featured articles from Wikipedia. It obviously does not encompass all of Wikipedia. If it did, it would stand as tall as . . .well, a tree.

Monday, June 08, 2009

CMIS, or DMIS?

It occurred to me the other day that CMIS (Content Management Interoperability Services, the proposed OASIS "common protocol" for Enterprise Content Management) is actually a Document Management standard, not a Content Management standard. Its name should therefore be DMIS.

For proof, one need look no further than the data model. "Document" and "Folder" are the principal first-class objects in the CMIS model. Thus, "content" (the 'C' in 'CMIS') is assumed, by the authors of the standard, to mean "document."

The CMIS data model is also RDBMS-centric and SQL-friendly (as it is in all good DM systems). It follows the tried-and-true relational model of every respected legacy DM system.

I might add that the authors of the standard have basically declared WCM to be out of scope.

Basically, anything that doesn't fit the "everything is a document or a folder" model is either out of scope or will be extremely difficult to force-fit into the CMIS mold. At least, that's how it's starting to look to me.

I can't see WCM or DAM fitting easily into the CMIS worldview (which is a 1999 worldview, in terms of content being nothing more than documents and folders). What do you do with XMP in a CMIS world? Indeed, what do you do with unstructured content, in general? CMIS looks for content that's structured. That's not today's world. Sorry.

So CMIS is, for all practical purposes, a document-management standard -- a way to let DM systems (mostly legacy DM systems) talk to each other. There's nothing at all wrong with that. DM is still a critical piece of the ECM puzzle. But it's important not to mistake CMIS for what it is not and can never be: a universal content management API.

Sunday, June 07, 2009

WCM and ECM vendor home-page loadability: a quick followup

Back in April, you may recall, I ran a quick performance test on the home pages of various WCM and ECM software vendors using the popular YSlow tool. I blogged the results and compiled the test data into a 121-page PDF ("A Comparison of Home Page Loadability Scores for Major WCM and ECM Vendors") that can be downloaded (free) here.

I checked the download page's stats today and was surprised to find that the report has been seen by 360 visitors since I posted it on April 23. I don't know how many downloads it has had, but it doesn't really matter since the download page has a PDF viewer built in, and you can just peruse the document online.

For such an obscure, technical, narrowly conceived document, I think 360 views is pretty remarkable. I had expected about 40 views (corresponding to the number of vendors tested).

What will be interesting is to run the same battery of tests again in a month or two, to see who has tried to restructure their pages to improve their YSlow rankings.

Note-to-Self: Run YSlow on all the big-WCM/ECM-vendor web sites again soon. And this time, add a few more vendors.

Thursday, June 04, 2009

StreamGraphs are neat


Twitter StreamGraph for the last 1000 occurrences of 'CMS'.

As an unrepentant eye-candyholic, I love to stumble across things like the Twitter StreamGraphs from Neoformix. What the graph plots is the instantaneous volume, over time, of the most recent 1000 occurrences of a given keyword or hashtag in the twitstream. There's nothing special about plotting the data in abscissa-symmetric fashion; it just looks neat and is cognitively "easy."

The plot shown above (click on it to enlarge it) shows the last 1000 occurrences of "CMS" in the twitstream. If you go to the link given earlier, you can generate your own StreamGraph for any keyword of your choice.

I might add that the Neoformix site itself is a great place to learn about novel data-visualization techniques (and new twists on existing techniques). Lots of eye-candy to be had there. I say drop what you're doing and go there now. You deserve a snack.

Sunday, May 31, 2009

Fractal Imaging


The other day, a friend of mine made me very happy. He returned my only copy of my favorite book, Fractal Imaging, by Ning Lu. This book has long been my favorite technical book, bar none, but I am beginning to think it might just be my favorite book of any kind.

"Virtue, like a river, flows soundlessly in the deepest place." You don't expect to encounter this sort of statement in an übergeek tome of this magnitude, and yet Lu scatters such proverbs (as well as quotations from Nietzsche, Joseph Conrad, Ansel Adams, Shakespeare, Jim Morrison, and others) throughout the text. This alone, of course, makes it quite an unusual computer-science book.

But the best part may be Lu's ability to blend the oft-times-sophisticated concepts of measure theory (and the math behind iterated function systems) with beautiful graphs, line drawings, transformed images (some in color; many downright spectacular), and the occasional page or two of C code. The overall effect is mesmerizing. Potentially intimidating math is made tractable (fortunately) through Lu's clear, often inspiring elucidations of difficult concepts. Ultimately, the patient reader is rewarded with numerous "Aha!" moments, culminating, at last, in an understanding of (and appreciation for) the surpassing beauty of fractal image transformation theory.

What is fractal imaging? Well, it's more than just the algorithmic generation of ferns (like the generated image above) from non-linear equation systems. It's a way of looking at ordinary (bitmap) images of all kinds. The hypothesis is that any given image (of any kind) is the end-result of iterating on some particular (unknown) system of non-linear equations, and that if one only knew what those equations are, one could regenerate the image algorithmically (from a set of equations) on demand. The implications are far-reaching. This means:

1. Instead of storing a bitmap of the image, you can just store the equations from which it can be generated. (This is often a 100-to-1 storage reduction.)
2. The image is now scale-free. That is, you can generate it at any scale -- enlarge it as much as you wish -- without losing fidelity. (Imagine being able to blow up an image onscreen without it becoming all blocky and pixelated.)
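To make the "ferns from equations" idea concrete before going on: below is a small sketch of Barnsley's classic fern. Four affine maps, picked at random with fixed probabilities and applied repeatedly to a single point, are literally all there is to it. The coefficients are the well-known published ones; the coarse ASCII grid is just a cheap stand-in for a real plot.

var maps = [
    { a:  0.00, b:  0.00, c:  0.00, d: 0.16, e: 0, f: 0.00, p: 0.01 },
    { a:  0.85, b:  0.04, c: -0.04, d: 0.85, e: 0, f: 1.60, p: 0.85 },
    { a:  0.20, b: -0.26, c:  0.23, d: 0.22, e: 0, f: 1.60, p: 0.07 },
    { a: -0.15, b:  0.28, c:  0.26, d: 0.24, e: 0, f: 0.44, p: 0.07 }
];

var W = 60, H = 30, grid = [];
for ( var row = 0; row < H; row++ ) {
    grid[ row ] = [];
    for ( var col = 0; col < W; col++ ) grid[ row ][ col ] = ' ';
}

var x = 0, y = 0;
for ( var i = 0; i < 50000; i++ ) {
    // pick a map according to its probability
    var r = Math.random(), acc = 0, m = maps[ maps.length - 1 ];
    for ( var j = 0; j < maps.length; j++ ) {
        acc += maps[ j ].p;
        if ( r < acc ) { m = maps[ j ]; break; }
    }
    var nx = m.a * x + m.b * y + m.e;
    var ny = m.c * x + m.d * y + m.f;
    x = nx; y = ny;
    // the fern lives roughly in x: [-2.2, 2.7] and y: [0, 10]
    var gx = Math.floor( ( x + 2.2 ) / 4.9 * ( W - 1 ) );
    var gy = Math.floor( ( 1 - y / 10 ) * ( H - 1 ) );
    if ( gy >= 0 && gy < H && gx >= 0 && gx < W ) grid[ gy ][ gx ] = '*';
}

console.log( grid.map( function( line ) { return line.join( '' ); } ).join( '\n' ) );

Fifty thousand iterations of four little equations, and a fern appears out of nowhere.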

Georgia Tech professor Michael Barnsley originated the theory of fractal decomposition of images in the 1980s. He eventually formed a company, Iterated Systems (of which Ning Lu was Principal Scientist), to monetize the technology, and for a while it looked very much as if the small Georgia company would become the quadruple-platinum tech startup of the Eighties. Despite much excitement over the technology, however, it failed to draw much commercial interest -- in part because of its computationally intensive nature. Decompression was fast, but compressing an image was extremely slow (especially with computers of the era), for reasons that are quite apparent when you read Ning Lu's book.

Iterated Systems eventually became a company called MediaBin, Inc., which was ultimately acquired by Interwoven (which, in turn, was recently acquired by Autonomy). The fractal imaging technology is still used in Autonomy's MediaBin Digital Asset Management product, as (for example) part of the "similarity search" feature, where you specify (through the DAM client's GUI) a source image and tell the system to find all assets in the system that look like the source image. Images in the system have already been decomposed into their fractal primitives. When the source image's primitives (which, remember, are scale-free) are known, they can be compared against sets of fractal shards in the system. When the similarity between shard-sets is good enough, an "alike" image is presumed to have been found.

It's a fascinating technology and one of the great computer-imaging developments of the 20th century, IMHO. If you're into digital imaging you might want to track down Ning Lu's book. A 40-page sample of it is online here. I say drop what you're doing and check it out.

Thursday, May 21, 2009

U.S. Patent Office web site runs on Netscape iPlanet?

I suppose I shouldn't be so surprised at this, but I never seriously thought I would encounter the name "iPlanet(TM) Web Server Enterprise Edition" on an in-production web site again any time soon. And yet it turns out, the U.S. Patent and Trademark Office's publicly searchable database of patent applications is powered by a 2001 version of Netscape's app server.

The iPlanet "welcome page" can be found at http://appft1.uspto.gov/, which is one of the hosts for the Patent Application search site.

Brings back memories, doesn't it?

Sunday, May 17, 2009

WolframAlpha fails to impress

I've been fooling around with WolframAlpha (the much-ballyhooed intelligent search engine) yesterday and today, and all I can say is, it feels very alpha.

I tried every query I could think of to get the annual consumption of electricity in the U.S., and got nowhere. On the other hand, if you just enter "electricity," your first hit is "Coulomb's law 2.0mC, 5.0mC, 250cm," your second hit is "12A, 110V," and your third hit is "diode 0.6 V." Which seems (how shall I say?) pretty useless.

As it turns out, WolframAlpha is also extraordinarily slow most of the time and hasn't performed well in load tests. Apparently Wolfram's AI doesn't extend to figuring out how to make something scale.

I'll suspend judgment on WA for a while longer, pending further testing. But right now it looks and smells like the answer to a question nobody asked.

Saturday, May 16, 2009

Google patent on floating data centers

This may be old news to others, but I only learned about it just now, and I have to assume there are still people who haven't heard it yet, so:

It seems the U.S. Patent and Trademark Office has awarded Google a patent for a floating data center that uses the ocean to provide both power and cooling. The patent, granted on 28 April 2009, tells how floating data centers would be located 3 to 7 miles from shore, in 50 to 70 meters of water. The technique would obviously use ocean water for cooling, but according to the patent, Google also intends to use the motion of ocean surface waves to create electricity, via “wave farms” that produce up to 40 megawatts of electrical power. As a side-benefit, floating data centers (if located far enough out to sea) are not subject to real estate or property taxes.

More details of how the wave energy machines would work can be found here.

The interesting thing will be to see where, in the world, Google would put its first offshore data center. Any guesses?

Thursday, May 14, 2009

How to get 62099 unique visitors to your blog in one day

I spent the last two days in New York City attending a conference (the Enterprise Search Summit) and didn't have time to check Google Analytics (for my blog traffic) until just now. Imagine my shock to discover that my May 12 post, "One of the toughest job-interview questions ever," drew 62099 unique visitors from 143 countries. Plus over 125 comments.

So I guess if you want to draw traffic to your blog, the formula is very simple:

1. Get a job interview with a large search company.
2. Talk about one of the interview questions in excruciating depth.
3. (Optionally) Spend an inordinate amount of time discussing algorithms and such.

Conversely, if you want to completely kill your blog's traffic numbers, the formula for driving people away seems to be:

1. Discuss CMIS (the new OASIS content-management interop standard).
2. Fail to mention some kind of programming topic.
3. Avoid controversy and don't piss anyone off.

Hmm, I wonder which recipe I should gravitate toward over the next few weeks?

Aw heck, screw recipes. This is not a cooking show.

Tuesday, May 12, 2009

One of the toughest job-interview questions ever

I mentioned in a previous post that I once interviewed for a job at a well-known search company. One of the five people who interviewed me asked a question that resulted in an hour-long discussion: "Explain how you would develop a frequency-sorted list of the ten thousand most-used words in the English language."

I'm not sure why anyone would ask that kind of question in the course of an interview for a technical writing job (it's more of a software-design kind of question), but it led to a lively discussion, and I still think it's one of the best technical-interview questions I've ever heard. Ask yourself: How would you answer that question?

My initial response was to assail the assumptions underlying the problem. Language is a fluid thing, I argued. It changes in real time. Vocabulary and usage patterns shift day-to-day. To develop a list of words and their frequencies means taking a snapshot of a moving target. Whatever snapshot you take today isn't going to look like the snapshot you take tomorrow -- or even five minutes from now.

So the first question is: Where do we get our sample of words from? Is this about spoken English, or written English? Two different vocabularies with two different frequency patterns. But again, each is mutable, dynamic, fluid, protean, changing minute by minute, day by day.

Suppose we limit the problem to written English. How will we obtain a "representative sampling" of English prose? It should be obvious that there is no such thing. There is no "average corpus." Think about it.

My interviewer wanted to cut the debate short and move on to algorithms and program design, but I resisted, pointing out that problem definition is extremely important; you can't rush into solving a problem before you understand how to pose it.

"Let's assume," my inquisitor said, "that the Web is a good starting place: English web-pages." I tormented my tormentor some more, pointing out that it's dangerous to assume spiders will crawl pages in any desirable (e.g., random) fashion, and anyway, some experts believe "deep Web content" (content that's either uncrawlable or has never been crawled before) constitutes the majority of online content -- so again, we're not likely to obtain any kind of "representative" sample of English words, if there even is such a thing as a representative sample of the English language (which I firmly maintain there is not).

By now, my interviewer was clearly growing impatient with my petulance, so he asked me to talk about designing a program that would obtain a sorted list of 10,000 most-used words. I dutifully regurgitated the standard crawl/canonicalize/parse/tally sorts of things that you'd typically do in such a program.

"How would you organize the words in memory?" my tormentor demanded to know.

"A big hash table," I said. "Just hash them right into the table and bump a counter at each spot."

"How much memory will you need?"

"What've you got?" I smiled.

"No, seriously, how much?" he said.

I said assuming 64-bit hardware and software, maybe something like 64 gigs: enough memory for a 4-billion-slot array of 16 bytes of data per slot. Most words will fit in that space, and a short int will suffice for a counter in each slot. (Longer words can be hashed into a separate smaller array.) Meanwhile you're using 32 bits (64 available; but you're only using 32) of address space, which is enough to hash words of length 7 or less with no collisions at all. (The typical English word has entropy of about 4.5 bits per character.) Longer words entail some risk of hash collision, but with a good hash function that shouldn't be much of a problem.

"What kind of hash function would you use?" the interviewer asked.

"I'd try a very simple linear congruential generator, for speed," I said, "and see how it performs in terms of collisions."

He asked me to draw the hash function on the whiteboard. I scribbled some pseudocode that looked something like:

HASH = INITIAL_VALUE;
FOR EACH ( CHAR IN WORD ) {
    HASH *= MAGIC_NUMBER
    HASH ^= CHAR
    HASH %= BOUNDS
}
RETURN HASH

I explained that the hash table array length should be prime, and the BOUNDS number is less than the table length, but coprime to the table length. Good possible values for the MAGIC_NUMBER might be 7, 13, or 31 (or other small primes). You can test various values until you find one that works well.
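(For the record, here's roughly what that whiteboard scribble looks like as working JavaScript. The constants are placeholders -- pick a genuinely prime table length and a coprime BOUNDS in real life -- and I've moved the mod ahead of the XOR so JavaScript's 32-bit bitwise operators don't silently truncate anything.)

var TABLE_LENGTH  = 4294967291;   // placeholder: a prime near 2^32 (the "4-billion-slot" table)
var BOUNDS        = 4294967279;   // placeholder: smaller than, and coprime to, TABLE_LENGTH
var MAGIC_NUMBER  = 31;
var INITIAL_VALUE = 5381;

function hash( word ) {
    var h = INITIAL_VALUE;
    for ( var i = 0; i < word.length; i++ ) {
        h = ( h * MAGIC_NUMBER ) % BOUNDS;           // stays exact: well below 2^53
        h = ( h ^ word.charCodeAt( i ) ) >>> 0;      // unsigned 32-bit XOR
    }
    return h % BOUNDS;                               // indexes straight into the table
}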

"What will you do in the event of hash collisions?" the professor asked.

"How do you know there will be any?" I said. "Look, the English language only has a million words. We're hashing a million words into a table that can hold four billion. The load factor on the table is negligible. If we're getting collisions it means we need a better hash algorithm. There are plenty to choose from. What we ought to do is just run the experiment and see if we even get any hash collisions. "

"Assume we do get some. How will you handle them?"

"Well," I said, "you can handle collisions via linked lists, or resize and rehash the table -- or just use a cuckoo-hash algorithm and be done with it."

This led to a whole discussion of the cuckoo hashing algorithm (which, amazingly, my inquisitor -- supposedly skilled in the art -- had never heard of).
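(If you haven't heard of it either: cuckoo hashing uses two tables and two hash functions. A new key goes into its slot in table one; whoever was living there gets kicked over to its alternate slot in table two, possibly evicting someone else in turn, and so on, up to some relocation limit. The payoff is that a lookup never touches more than two slots. Here's a bare-bones sketch, with toy hash functions and no resizing.)

function CuckooHash( size ) {
    this.size = size;
    this.t1 = new Array( size );
    this.t2 = new Array( size );
}

// two deliberately different (toy) hash functions
CuckooHash.prototype.h1 = function( key ) {
    var h = 0;
    for ( var i = 0; i < key.length; i++ ) h = ( h * 31 + key.charCodeAt( i ) ) % this.size;
    return h;
};
CuckooHash.prototype.h2 = function( key ) {
    var h = 0;
    for ( var i = 0; i < key.length; i++ ) h = ( h * 37 + key.charCodeAt( i ) + 1 ) % this.size;
    return h;
};

CuckooHash.prototype.get = function( key ) {
    var a = this.t1[ this.h1( key ) ], b = this.t2[ this.h2( key ) ];
    if ( a && a.key === key ) return a.value;
    if ( b && b.key === key ) return b.value;
    return undefined;
};

CuckooHash.prototype.put = function( key, value ) {
    var s1 = this.h1( key ), s2 = this.h2( key );
    if ( this.t1[ s1 ] && this.t1[ s1 ].key === key ) { this.t1[ s1 ].value = value; return true; }
    if ( this.t2[ s2 ] && this.t2[ s2 ].key === key ) { this.t2[ s2 ].value = value; return true; }

    var entry = { key: key, value: value };
    for ( var i = 0; i < 32; i++ ) {                  // relocation limit
        var slot = this.h1( entry.key );
        var evicted = this.t1[ slot ];
        this.t1[ slot ] = entry;
        if ( !evicted ) return true;

        slot = this.h2( evicted.key );                // kick the evictee to its other slot
        var bumped = this.t2[ slot ];
        this.t2[ slot ] = evicted;
        if ( !bumped ) return true;

        entry = bumped;                               // re-place whoever that displaced
    }
    return false;   // in real life: rehash with new functions, or grow the tables
};

// var counts = new CuckooHash( 101 );
// counts.put( 'the', 1 );   counts.get( 'the' );   // 1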

This went on and on for quite a while. We eventually discussed how to harvest the frequencies and create the desired sorted list. But in the end, I returned to my main point, which was that sample noise and sample error are inevitably going to moot the results. Each time you run the program you're going to get a different result (if you do a fresh Web crawl each time). Word frequencies are imprecise; the lower the frequency, the more "noise." Run the program on September 10, and you might find that the word "terrorist" ranks No. 1000 in frequency on the Web. Run it again on September 11, and you might find it ranks No. 100. That's an extreme example. Vocabulary noise is pervasive, though, and at the level of words that rank No. 5000+ (say) on the frequency list, the day-to-day variance in word rank for any given word is going to be substantial. It's not even meaningful to talk about precision in the face of that much noise.

Anyway, whether you agree with my analysis or not, you can see that a question like this can lead to a great deal of discussion in the course of a job interview, cutting across a potentially large number of subject domains. It's a question that leads naturally to more questions. And that's the best kind of question to ask in an interview.

Sunday, May 10, 2009

Can CMIS handle browser CRUD?

I've mentioned before the need for concrete user narratives (user stories) describing the intended usages of CMIS (Content Management Interoperability Services, soon to be an OASIS-blessed standard API for content management system interoperability). When you don't have user stories to tie to your requirements, you tend to find out things later on that you wished you'd found out earlier. That seems to be the case now with browser-based CRUD operations in CMIS.

I don't claim to be an expert on CMIS (what I know about CMIS would fill a very small volume, at this point), but in reading recent discussions on org.oasis-open.lists.cmis, I've come across a very interesting issue, which is that (apparently) it's not at all easy to upload a file, or fetch a file and its dependent files (such as an HTML page with its dependent CSS files), from a CMIS repository using the standard Atom bindings.

The situation is described (in a discussion-list thread) by David Nuescheler this way: "The Atom bindings do not lend themselves to be consumed by a lightweight browser client and for example cannot even satisfy the very simple use-case of uploading a file from the browser into a CMIS repository. Even simple read operations require hundreds of lines of JavaScript code."

Part of the problem is that files in the repository aren't natively exposed via a path, so you can't get to a file using an IRI with a normal file and path name like "./main.css" or "./a/b/index.html" or whatever. Instead, files have an ID in the repository (e.g., /12257894234222223) which is assigned
by the repository when you create the file. That wouldn't be so bad, except that there doesn't appear to be an easy way (or any way) to look up a URL using an ID (see bug CMIS-169).

Based on the difficulty encountered in doing browser CRUD during the recent CMIS Plugfest, David Nuescheler has proposed looking into adding an additional binding based on JSON GETs for reading and multi-part POSTs for writing -- which would make it possible to do at least some CMIS operations via AJAX. The new binding would probably be called something like the web-, browser-, or mashup-binding. (Notice how the name "REST" is nowhere in sight -- for good reason. CMIS as currently implemented is not how REST is supposed to work.)
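Just to show the flavor of what Nuescheler is after (and to be clear: nothing below is real CMIS -- the URL and field names are invented purely for illustration), a browser-friendly JSON binding would let you read repository content with a few lines of jQuery, with uploads handled by an ordinary multipart form POST that every browser already knows how to do:

// Hypothetical endpoint and response shape, purely for illustration.
$.getJSON( '/cmis/browser/children', { id: '12257894234222223' }, function( data ) {
    // imagine getting back plain JSON: { objects: [ { id: ..., name: ... }, ... ] }
    for ( var i = 0; i < data.objects.length; i++ ) {
        console.log( data.objects[ i ].name + ' (' + data.objects[ i ].id + ')' );
    }
});

Compare that with hundreds of lines of Atom-parsing JavaScript, and the appeal of a JSON/multipart binding is obvious.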

Granted, CMIS was not originally designed with browser mashups in mind, but the fact is, that's where most of the traction is going to come from if the larger ecosystem of web developers decides to latch onto CMIS in a big way. SOAP has a dead-horse stench about it; no one I know likes to deal with it; but an Atom binding isn't a very useful alternative if the content you need can't be addressed or can't easily be manipulated in the browser using standard AJAX calls.

So let's hope the CMIS technical committee doesn't overlook the most important use-case of all: CMIS inside the browser. Java and .NET SOAP mashups are important, but let's face it, from this point forward all the really important stuff needs to happen in the browser. If you can't do browser CRUD with a new content-interoperability standard, you're starting life with the umbilical cord wrapped around your neck.

Thursday, May 07, 2009

DOM Storage: a Cure for the Common Cookie

One of the things that's always annoyed me about web app development is how klutzy it is to try to persist data locally (offline, say) from a script running in the browser. The cookie mechanism is just so, so . . . annoying.

But it turns out, help is on the way. Actually, Firefox has had a useful persistence mechanism (something more useful than cookies, at least) since Firefox 2, in the so-called DOM Storage API. Internet Explorer prior to version 8 also had a similar feature, called "userData behavior," that allows you to persist data across multiple browser sessions. But it looks like the whole browser world (even Google Chrome) will eventually move to the DOM Storage API, if for no other reason than that it is getting official W3C blessing.
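To give you a taste, the key/value part of the API is about as simple as it gets. Something along these lines (assuming a browser that exposes localStorage, and a native JSON object or json2.js for serialization):

// localStorage survives browser restarts; sessionStorage lasts only as long as the tab.
localStorage.setItem( 'prefs', JSON.stringify( { theme: 'dark', fontSize: 14 } ) );

var prefs = JSON.parse( localStorage.getItem( 'prefs' ) || '{}' );   // read it back
alert( prefs.theme );                                                // "dark"

localStorage.removeItem( 'prefs' );                                  // clean up when done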

The spec in question is definitely a work in progress, although the key/value-pair implementation has (as I say) been part of Firefox for quite some time. What's still being worked out is the structured-storage piece -- the part where you can use some flavor of SQL to solve your CRUD-and-query needs in a businesslike manner.

Why do we need a W3C structured-storage DOM API when there are already such things as Gears LocalServer, Dojo OfflineRest, etc. (not to mention the HTML 5 ApplicationCache mechanism)? A good answer is given here by Nikunj R. Mehta, who heads up Oracle's related BITSY project.

Most of the current debate around how structured storage should be handled in the DOM Storage API revolves around whether or not to use SQLite (or SQL at all). The effort is presently heading toward SQLite, but Vladimir Vukićević has done an eloquent job of presenting the downside(s) to that approach. The argument against SQL is that AJAX programmers shouldn't have to be bothered to learn and use something as heavy and obtuse as SQL just to do persistence, when there are more script-friendly ways to do things (using an approach something like that of, say, CouchDB). As much as I tend to sympathize with that argument, I think the right thing to do is stay with industry standards around structured data storage, and SQL is a pretty universal standard. Of course, then people will argue that "SQL" isn't just one thing, there are many dialects, etc. Which is true, and a good example is SQLite, which the current DOM Storage effort is centering on. I happen to agree with Vladimir Vukićević (and others) who say that SQLite is just too limiting and quirky a dialect to be using in this situation.
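For reference, the SQLite-flavored draft (as it stands today; it could easily change) looks roughly like this from script:

// Draft structured-storage API; names and signatures subject to change.
var db = openDatabase( 'scratch', '1.0', 'Scratch data', 2 * 1024 * 1024 );

db.transaction( function( tx ) {
    tx.executeSql( 'CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)' );
    tx.executeSql( 'INSERT INTO notes (body) VALUES (?)', [ 'remember the milk' ] );
    tx.executeSql( 'SELECT body FROM notes', [], function( tx, result ) {
        for ( var i = 0; i < result.rows.length; i++ ) {
            console.log( result.rows.item( i ).body );
        }
    });
});

You can see both the appeal (real queries, transactions) and the objection (you're now writing SQL in the browser, against one particular SQLite dialect).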

I bring all this up not to argue for a particular solution, however, so much as to just make you aware of some of what's going on, and the current status of things. Explore the foregoing links if you want to get involved. If having a common data-persistence and offline/online HTTP-cache API that will work uniformly across browsers means anything to you, maybe you should join the discussion (or at least monitor it).

Getting beyond cookies is essential at this point; the future of web-app development is at stake and we can't wait any longer to nail this one. (We also can't do goofy things like rely on hidden Flash applets to do our heavy lifting for us.) It's time to have a real, standardized, browser-native set of persistence and caching APIs. There's not a moment to lose.

Wednesday, May 06, 2009

How to know what Oracle will do with Java

Oracle Corporation, capitalism's quintessential poster child, has done something strange. It has bought into a money-losing proposition.

The paradox is mind-bending. Oracle is one of the most meticulously well-oiled money-minting machines in the history of computing, yet it finds itself spending $7.4 billion to acquire a flabby, financially inept technocripple (Sun Microsystems) that lost $200 million in its most recent quarter. One can't help but wonder: What is Oracle doing?

What they're doing, of course, is purchasing Java (plus Solaris and a server business; plus a few trinkets and totems). The question is why. And what will they do with it now that they have their arms around it?

The "why" part is easy: Oracle's product line is tightly wed to Java, and the future of Java must be kept out of the hands of the competition (IBM in this case; and perhaps also the community). What Oracle gains is not technology, but control over the future, which is worth vastly more.

What's in store for Java, then? Short answer: Whatever Larry Ellison wants. And Larry wants a lot of things, mostly involving increased revenue for Oracle.

There's a huge paradox here, though, for Ellison and Oracle. On the one hand, Java represents a massive amount of code to develop and maintain; it's a huge cash sink. Sun tried to monetize Java over the years through various licensing schemes, but in the end it was (and is) still a money pit. On the other hand, you don't maintain control over something by letting the community have it. Is there a happy medium?

Because of the huge costs involved in maintaining Java, it makes sense to let the community bear as much of the development and maintenance burden as possible. After all, does Oracle really want to be in the business of maintaining and testing things like Java2D? Probably not. That kind of thing will get thrown over the castle wall.

But chances are, JDBC won't be.

The core pieces of Java that are of most interest to Oracle will be clung to tightly by Oracle. That includes its future direction.

Therefore the most telling indicator of what Oracle intends to do with Java (allow the community a real say, or keep the platform under lock and key) will be how it resolves the Apache/TCK crisis. The issue here, in case you haven't been following it, is that Sun has created additional functionality for Java 7 that goes beyond the community-ready, Apache-blessed, Sun-blessed open-source version of Java 6. But Sun is treating its Java 7 deltas as private intellectual property, as evidenced by its steadfast refusal to provide, first, a spec for Java 7, and second, a test kit (TCK) compatible with Apache license requirements. Until this dispute is resolved, there will be no open-source Java 7, and community (Apache) involvement with future versions of Java will ultimately end.

There are some who believe that if Oracle continues Sun's policy of refusing to support open-source Java beyond Java 6, the community will simply fork Java 6 and move forward. This is probably wishful thinking. The forked Java would quickly become a play-toy "hobby" version of Java, missing important pieces of functionality found in the "real" (Oracle) Java, and ultimately lacking acceptance by enterprise. It's tempting to think that with sufficient help from, say, IBM, the community version of Java might eventually become a kind of analog to Linux (the way Linux gathered steam independently of UNIX-proper). But that has to be considered a long shot. The mother ship is still controlled by Oracle. (Even if there's a mutiny, the boat's rudder is owned, controlled, and operated by Larry Ellison and Company.)

It's in Oracle's interest to control the Enterpriseness of Java EE with an iron grip. The non-enterprise stuff means nothing. Oracle will try to find a way to hive off the non-EE cruft and throw it into the moat, where the community can grovel over it if they so wish.

And a good early indication of that will be if Oracle back-burners JavaFX and throws it over the wall, into the moat (which I think it will).

My advice? Step away from the moat . . . unless you want to get muddy.

Tuesday, May 05, 2009

Adobe's Linux Problem

Adobe Systems is at a critical turning point in its long, slow march in the direction of RIA platform domination (which, should Adobe emerge the winner in that sphere, could have profound implications for all of us, as I've blogged about earlier). It is time for the company to decide whether it wants to embrace Linux "with both arms," so to speak. It's put-up-or-shut-up time. Either Linux is of strategic importance to the Adobe agenda, or it is not. Which is it?

"But," you might be saying, "Adobe has made it clear that it is committed to supporting Linux. Look at the recently much-improved Acrobat Reader for Linux, and the effort to bring Flash and Flex to Linux. Adobe is investing heavily in Linux. It's very clear."

Then why has Adobe killed Flex Builder for Linux?

It's funny, if you read some of the blog commentary on this, how many Flex developers are defending Adobe's decision to abandon further development of Flex Builder for Linux, saying (like corporate apologists) there simply isn't enough demand for Flex on Linux to justify the necessary allocation of funds.

I have no doubt whatsoever that a straight bean-counting analysis of the situation will show that the short-term ROI on Flex-for-Linux is indeed poor, and that from a quarterly-earnings point of view it's not the right way to satisfy shareholder interests. Agreed, point conceded.

But that's called being shortsighted. The Linux community may be only a small percentage of the OS market, but in terms of mindshare, the Linux developer community is a constituency of disproportionate influence and importance. Also, as a gesture of seriousness about Open Source, the importance of supporting Flex tools on Linux is hard to overestimate.

But it's not just about Flex tools. Adobe has had a schizophrenic Linux "strategy" for years. It back-burnered proper support for Acrobat Reader (and PDF generally) on Linux for years. Flash: ditto. And even a product like FrameMaker (which began its life as a UNIX product, interestingly, and was available in a Solaris version until just a few months ago) has been neglected as a potential Linux port, even though Adobe did, in fact, at one time have a Linux version of FrameMaker in public beta.

Adobe has a long history of going after the lowest-hanging fruit (and only the lowest-hanging fruit) in the Linux world, and it continues that tradition today. The only problem is, you can't claim to be an ardent supporter of Open Source and ignore the Linux community, nor can you aspire to RIA platform leadership in the Web-app world of the future without including in your plans the fastest growing platform in computing.

Adobe's shortsightedness in its approach to Linux may be good for earnings-per-share (short-term) but is emblematic of the company's inability to articulate a longer-term vision that embraces all of computing. It undermines the company's credibility in the (ever growing) Open Source world and speaks to a mindset of "quarterly profits über alles" that, frankly, is disappointing in a company that aspires to RIA-platform leadership. IBM and others have found a way to invest in Open Source and alternative platforms without compromising long-term financial goals or putting investor interests at risk. The fact that Adobe can't do this shows lack of imagination and determination.

How much can it possibly cost to support Flex Builder on Linux, or (more to the point) to have a comprehensive, consistent policy of support for Linux going forward?

Conversely: How much does it cost not to have it?

Monday, May 04, 2009

The most important job interview question to ask an R&D candidate

I've been thinking about what one question I would ask a job candidate (for an R&D job) if I could ask only one question. This assumes I've already asked my favorite high-level question, which I discussed in yesterday's post.

Most good "R&D job" questions, of course, are open-ended and have no single "right" answer. They're intended as a starting point for further discussion, and a gateway to discovering the reasoning process of the candidate.

One of the better such questions I've heard during an interview came when I applied for a job at a well-known search company. One of the five people who interviewed me asked: "Explain how you would develop a frequency-sorted list of the ten thousand most-used words in the English language." This was an outstanding question on many levels and led to a very lively hour-long discussion. But I'll save that for another day.

To me, if I'm interviewing someone who is going to be involved in writing code, and I can only ask one question in the course of an interview, it would be: "Explain what 'bad code' means to you."

If the person starts going down the road of "See what kind of warnings the compiler gives you," "run it through lint," etc., I would steer the person back on track with: "Aside from that, what would you do if I gave you, say, a couple thousand lines of someone else's code to look at? How would you judge it? What sorts of things would make the code 'good' or 'bad' in your eyes? Assume that the code compiles and actually works."

If the talk turns immediately to formatting issues, that's not good.

Presence or absence of comments: Starts to be relevant.

Coding conventions (around the naming of variables and such): Yeah yeah yeah. That's good. What else?

What about the factoring of methods? Is the code overfactored? Underfactored? Factored along the wrong lines? How can you tell? (This also leads to the question of how long is too long for a class or method.)

What about evidence of design patterns? Does it look like the person who wrote the code doesn't know about things like Observer, Visitor, and Decorator patterns?

Does the code exhibit any antipatterns? Is it just plain hard to follow because of methods trying to "do too much," overuse of custom exceptions, overuse of cryptic parent-class methods, constructors or method declarations with 15 million formal parameters, etc.?
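To make the parameter-list point concrete, here's a made-up example (JavaScript, but the smell is language-agnostic):

// Hard to call correctly, hard to read at the call site:
function createAccount( first, last, email, phone, street, city, state, zip, isAdmin ) {
    return { first: first, last: last, email: email, phone: phone,
             street: street, city: city, state: state, zip: zip, isAdmin: isAdmin };
}

// Friendlier: one options object, so call sites document themselves
// and adding a field later doesn't break every existing caller.
function createAccountFromOptions( options ) {
    return {
        first:   options.first,
        last:    options.last,
        email:   options.email,
        isAdmin: options.isAdmin || false
    };
}

createAccountFromOptions( { first: 'Ada', last: 'Lovelace', email: 'ada@example.com' } );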

What about performance? Does it look like the code might be slow? (Why?) Could the author have perhaps designated more things "final"?

Is code repeated anywhere?

Is the code likely to create garbage-collection concerns? Memory leakage? Concurrency issues?

This list goes on and on. You get the idea.

Special extra-credit points go to the candidate who eventually asks larger questions, like: Was this code written to any particular API? Is it meant to be reusable? (Is it part of a library versus plain old application code? How will people be using this code?) Is it meant to have a long lifetime, or will this code be revisited a lot (or possibly extended a lot)?

I'm sure you probably have favorite R&D questions of your own (perhaps ones you've been asked in interviews). If so, please leave a comment; I'd like to see what you've got.

Sunday, May 03, 2009

If I could ask only one job interview question

Someone asked me the other day (in response to my earlier blog about job interviews) what question I would ask during a job interview if I could ask only one question. Which is, itself, a very interesting question.

I initially responded by asking for more context, specifically: What kind of position am I trying to fill? Is the candidate in question applying for an R&D job (e.g., web-application developer)? Or is she applying for a software-industry position that requires no programming knowledge per se?

In general, as a hiring manager, the thing that interests me the most is the person's ability to get work done, and in technology, that, to me, means I want someone who is an incredibly fast learner. There's no way you can come into a job already possessing 100% of the domain knowledge you're expected to have; some on-the-job learning is bound to be necessary. Beyond that, even when you've adapted to the job, constant learning is a requirement. I've never seen a tech job where that wasn't true.

So one of my favorite interview questions is: "If you have a hard assignment that involves subject domains you know little or nothing about, what would be your approach to attacking the problem?"

If the person's first words are "Consult Google," that's fine -- that's a given, actually. But I want to know more. Consult Google, sure (consult Wikipedia, maybe), but then what?

What I don't want to hear as the very first thing is "Go ask someone in R&D" or "come and ask you." If your very first tactic is to disturb busy coworkers (without first doing any homework yourself), it means you're too lazy to at least try to find answers on your own. It also means you're inconsiderate of the value of other people's time. Newbies who post questions on forums that could easily have been answered with a little prior research tend to get rude treatment, for precisely this reason. You don't bother experts (with non-expert questions, especially) unless you've already done all the work you possibly can to solve the problem yourself. Only then should you bother other people.

Some good answers to the question of "How would I attack a difficult problem" might include:
  • Go straight to the authoritative source documentation. For example, if the problem involves E4X syntax, go read ECMA-357. If the problem involves XPath, go to the W3C XPath Language Recommendation. And so on. Go to the source! Then work your way out from there.
  • See what's already been done internally, inside the organization, on this problem. That means consulting company or departmental wikis, reading internal documents (meeting minutes, business intelligence reports, etc.), reading source code and/or code comments, and so on.
  • Find out who the acknowledged experts (in the industry) are on the subject in question and go look at their articles and blogs. Consult forums, too, if applicable. Post questions on forums, if you can do so without revealing private company information.
  • If you have a friend who is knowledgeable on the subject, reach out to the person and pick his or her brain (again providing you're able to do that without revealing proprietary information about your current project). I don't care if you bother someone outside the organization with endless questions.
Finally, if you need to, find out who inside the organization is the domain expert on the subject, and ask that person if you could have a little of his or her time.

In summary, I need someone who is smart and a fast learner, but also resourceful and self-motivated.

This post is getting to be longer than I thought, so I'll stop. Tomorrow I want to speak to the issue of what question I would ask an R&D candidate, if I could ask any question during a job interview. That'll be fun, I promise.

Saturday, May 02, 2009

There's a DAM Elephant in the Room


Typically, in my day job as an analyst, I'm on the receiving side of briefings, but the other day I actually gave one to a customer wanting to know more about the Digital Asset Management (DAM) marketplace. I took questions on a wide range of issues. But then, at one point, the customer put forward a really thought-provoking question, something I myself have been wondering about for some time: Where is Adobe Systems in the DAM world? What's it doing in DAM?

The reason this is such a good question is that Adobe already has most of the necessary pieces to put together a compelling enterprise DAM story (even if it hasn't yet assembled them into a coherent whole). Some of the more noteworthy pieces include:
  • Some very interesting workflow and rights-management bits in the LiveCycle suite.
  • Adobe Version Cue, which provides a versioning and collaboration server for workgroup scenarios. Version Cue uses an embedded instance of MySQL and has SOAP interfaces.
  • Adobe Bridge, a lightbox file-preview and file-management application with some metadata editing and other tools built-in. This piece is bundled into the Adobe Creative Suite products. (Interestingly enough, Bridge is a SOAP client that can talk to Adobe Version Cue servers.)
And of course, the CS products themselves are used extensively by the same creative professionals whose needs are addressed by conventional DAM products of the Artesia, MediaBin, or North Plains variety. Most of the big DAM offerings try hard (with various degrees of success) to integrate smoothly with Adobe's creative tools, InDesign in particular.

The one piece that's missing from all this is a standards-based enterprise repository. What Adobe could use right about now is a robust ECM repository (CMIS-compliant, of course) built on industry standards, something written in Java that will play well with JRun and offer pluggable JAAS/JACC security, with LDAP directory friendliness, etc. That's a lot of code to write on your own, so obviously it would behoove Adobe to either partner with an ECM player or leverage an open-source project. Or maybe both.

You may or may not remember that back in March 2008, Adobe launched its Adobe Share service, built atop open-source ECM product Alfresco.

Then in June 2008, Adobe and Alfresco announced a partnership to embed Alfresco's content management software into Adobe's LiveCycle Enterprise Suite.

Later, in September 2008, Adobe partnered with Alfresco in a deal that had Alfresco powering the popular Acrobat.com site. (That site is currently on the verge of surpassing LiveMeeting.com and OfficeLive.com for traffic.)

Could Alfresco be the linchpin of a future Adobe DAM strategy? Hard to tell, but the handwriting on the wall, it seems to me, is starting to become legible.

As far as the DAM industry as a whole is concerned, Adobe is clearly the elephant in the room at this point. When and if this giant rich-media pachyderm decides to step forward and enter the DAM world proper, it could cause the ground to shake. It might set off Richter scales as far away as China.

My opinion? Now is not too early to take shelter under a nearby doorway or other structurally reinforced part of the building.