Monday, September 29, 2008

A number that's not equal to itself

All this time, I've been thinking NaN is not a number. What an idiot I've been.

In JavaScript:

   typeof NaN == 'number'   // true

And yet of course, NaN == NaN is false.
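
That self-inequality doubles as a NaN test. Here's a minimal sketch of the idea (the helper name isReallyNaN is made up for illustration):

```javascript
// NaN is the only JavaScript value that is not equal to itself,
// so a self-comparison works as a NaN test.
function isReallyNaN(x) {
  return x !== x;
}

var results = {
  typeofNaN: typeof NaN,              // "number"
  selfEqual: NaN == NaN,              // false
  viaHelper: isReallyNaN(0 / 0),      // true
  // The global isNaN() coerces its argument to a number first, so a
  // non-numeric string reports true, unlike the self-comparison.
  globalOnString: isNaN("abc"),       // true
  helperOnString: isReallyNaN("abc")  // false
};
```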

There you go. Amaze your friends.

Wednesday, September 24, 2008

Great hack: PNG-compressed text



I only recently stumbled across what's got to be the most outlandish scripting hack I've seen in a long time. Jacob Seidelin tells of how he managed to stuff text into a PNG image, then get it back out with the <canvas> getImageData( ) method. What's neat about it? Mainly the free compression you get with the PNG format. For example, when Jacob put the 124 KB Prototype library into PNG format, it shrank to 30 KB. Of course, it makes for an awful-looking image (see above), which one might think of as a degenerate case of steganography, i.e. embedded data in an image, minus the image.

The trick doesn't work in all browsers, since it depends on canvas support. And it's kind of pointless given that you can use gzip instead. But it's neat in that it opens the door to browser steganography, embedding of private metadata, and potentially lots of other cool things.
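
To give a feel for the decoding half, here's a rough sketch. The pixel layout is my own assumption, not necessarily what Jacob uses: one character code per pixel in the red channel, with a red value of 0 marking the end of the text. The function name decodeTextFromPixels is likewise made up.

```javascript
// Assumed layout (an illustration, not necessarily the original scheme):
// one character code per pixel, stored in the red channel; a red value
// of 0 marks the end of the text.
function decodeTextFromPixels(rgba) {
  var chars = [];
  for (var i = 0; i < rgba.length; i += 4) {  // RGBA: 4 bytes per pixel
    var code = rgba[i];                       // red channel only
    if (code === 0) {
      break;
    }
    chars.push(String.fromCharCode(code));
  }
  return chars.join("");
}

// In a browser, the pixel array would come from the canvas, roughly:
//   var ctx = canvas.getContext("2d");
//   ctx.drawImage(img, 0, 0);
//   var rgba = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
```

The decoder only needs an array of bytes, so it can be tried on a hand-built array without any canvas at all.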

Tuesday, September 23, 2008

JavaScript beautifiers suck

I keep looking for an online code beautifier that will convert my distinctly simian-looking Greasemonkey scripts to properly indented, formatted source code. My current favorite code editor (Notepad) doesn't provide proper code formatting. I know what you're thinking: Why aren't you using a proper IDE in the first place? Then you wouldn't have this problem! Well, first of all, I am thinking of upgrading to Wordpad. But it doesn't do formatting either. Second of all, I haven't found a JavaScript IDE worthy of the name, which is why I use Notepad. More on that in a minute.

I spent an hour the other day looking for an online beautifier that would do a makeover on my ugly JavaScript. What I found is that most people point either to this one or this one. (I tried others as well.) They either don't keep my existing newlines, or don't indent "if" blocks properly (or at all), and/or just plain don't indent consistently. Quite unacceptable.

Finally I gave up on the online schlockware and went straight to Flexbuilder (which has been sitting unused on my desktop), and I thought "Surely this will do the trick."

Imagine the look of abject horror on my face when I found that the ActionScript editor could not do the equivalent of Control-Shift-F (for Java in Eclipse). In fact, the formatter built into Flexbuilder's ActionScript editor won't even do auto-indenting: You have to manually grab blocks of code and do the old shift-right/shift-left indent/outdent thing by hand, over and over and over again, throughout your code, until the little beads of blood begin to form on your forehead.

I'm left, alas, with half-solutions. But unfortunately, two or three or ten half-solutions don't add up to a solution. (How fortunate we would all be if they did.)

Monday, September 22, 2008

Firebug on Vista giving problems

Is it just me or does anyone else find Firebug+FF3 on Vista to be flaky? It loses my console code if I switch tabs (not windows, just going to another tab and coming back). Sometimes the FB console stops working or won't execute "console.log( )". And it seems as though weird bugs show up in the Firefox console that don't show up in the Firebug log pane, and vice versa.

Also, I don't appreciate having to manually turn on the console for every web domain I go to. What a PITA. I wonder if that behavior can be disabled somehow? Right now, I'm feeling disabled.

Thursday, September 18, 2008

JavaScript runs at C++ speed, if you let it

The common perception (ignorance of the crowd) is that JavaScript is slow. What I'm constantly finding, however, is that people will hand-craft a JavaScript loop to do, say, string parsing, when they could and should be using the language's built-in String methods (which typically run much faster, since they're native code).

Example: You need a "trim" function to remove leading and trailing whitespace from user-entered text in a form. If you go out on the web and look at what people are doing in their scripts, you see a lot of things like:

function trim10 (str) {
    var whitespace = ' \n\r\t\f\x0b\xa0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000';
    for (var i = 0; i < str.length; i++) {
        if (whitespace.indexOf(str.charAt(i)) === -1) {
            str = str.substring(i);
            break;
        }
    }
    for (i = str.length - 1; i >= 0; i--) {
        if (whitespace.indexOf(str.charAt(i)) === -1) {
            str = str.substring(0, i + 1);
            break;
        }
    }
    return whitespace.indexOf(str.charAt(0)) === -1 ? str : '';
}


I took this code verbatim from a web page whose author claims (ironically) that it's an incredibly fast routine!

Compare with:

function trim(a) {
    return a.replace(/^ +/,"").replace(/ +$/,"");
}

In testing, I found the shorter routine faster by 50% on very small strings with very few leading or trailing spaces, and faster by 300% or more on strings of length ~150 with ten to twenty leading or trailing spaces.

The better performance of the shorter function has nothing to do with it being shorter, of course. It has everything to do with the fact that the built-in JavaScript "replace( )" method (on the String pseudoclass) is implemented in C++ and runs at compiled-C speed.

This is an important point. Interpreters are written in C++ (SpiderMonkey) or Java (Rhino). The built-in functions of the ECMAScript language are implemented in C++ in your browser. Harness that power! Use the built-in functions of the language. Never hand-parse strings with "indexOf" inside for-loops (etc.) when you can use native methods that run at compiled speed. Why walk if you can ride the bullet train?
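
One caveat on the comparison: the short trim above strips only the space character, while trim10 also handles tabs, newlines and so on. A closer equivalent would use \s in the regular expressions. Here's a sketch, along with a crude timing helper for anyone who wants to repeat the test. The names trimWhitespace and timeIt are made up, and the timings will vary by engine and input.

```javascript
// Variant of the regex trim that strips everything matched by \s
// (tabs, newlines, etc.), not just the space character.
function trimWhitespace(str) {
  return str.replace(/^\s+/, "").replace(/\s+$/, "");
}

// Hypothetical timing helper: calls fn(arg) `iterations` times and
// returns the elapsed time in milliseconds.
function timeIt(fn, arg, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    fn(arg);
  }
  return Date.now() - start;
}

var sample = "  \t hello world \r\n ";
var trimmed = trimWhitespace(sample);   // "hello world"
var elapsedMs = timeIt(trimWhitespace, sample, 100000);
```

Engines that implement ECMAScript 5 or later also ship a native String.prototype.trim( ), which is the most direct way to ride the bullet train where it's available.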

The implications here for client/server web-app design are quite far-reaching. If you are using server-side JavaScript, and your server runtimes are Java-based, it means your server-side scripts are running (asymptotically, at least) at Java speed. Well-written client-side JavaScript runs (asymptotically) at C++ speed. Therefore, any script logic you can move to the client should be moved there. It's madness to waste precious server cycles.

Madness, I say.

Wednesday, September 17, 2008

Getting Greasemonkey to work in Firefox3 on Vista

Wasn't happening for me until I started with a fresh (empty) FF3 user profile. Vista seems to be the problem in all of this. GM on FF3 on WinXP works fine, but with Vista, GM doesn't install properly unless you zero out your FF3 profile first. At least, that's the state of things today as I write this (17 Sept 2008). Hopefully it will get fixed soon. Until then ...

The procedure is:

1. In FF3, go to Organize Bookmarks and export your bookmarks as HTML so you don't foolishly lose them.

2. In the Vista "Start" panel, choose Run...

3. Launch Firefox with a command line of "firefox -profilemanager".

4. When the profile manager dialog appears, create a new profile.

5. When FF launches, install Greasemonkey.

6. Import your bookmarks.

7. Exit Firefox. Return to step 3. When profile manager dialog appears, delete your old profile. (Or else leave it and have to contend with logging in to one or the other profile whenever FF launches.)

Whew + sheesh.

Wednesday, September 10, 2008

How to use JS 1.7 in Greasemonkey?

Problem: I need to be able to use the 'yield' keyword in a Greasemonkey script. This is a JavaScript 1.7 language feature available in Firefox 2 and later. You must explicitly "turn on" support for this feature, however, by specifying

<script type="application/javascript;version=1.7"/>

in the HTML page.

That's not what I need. I need to turn it on in Greasemonkey's execution context.

Others have run into this problem. It appears, however, that the Greasemonkey guys won't do anything about it.

I was hoping there'd be some clever back-door way to do this, but that seems unlikely. There appears, alas, to be no workaround, short of the usual (for Greasemonkey) expedient of vulturing the unsafeWindow, which is of course repulsive and unacceptable.

If anyone knows of a non-ugly solution to this problem (the problem of how to use 'yield' in Greasemonkey scripts), please advise.

Tuesday, September 09, 2008

Selection object in Firefox

I've learned some interesting things about the way selections work in Mozilla.

Every window has a singleton selection object, even when the user has selected no items on the rendered page. Therefore, window.getSelection( ) always succeeds.

If you simply want user-selected text as a string, getSelection( ).toString( ) will work. But if you really intend to walk the selected DOM nodes, or process the selection in any non-trivial way, you will need to access its Range objects with

window.getSelection( ).getRangeAt( i );

There is a "rangeCount" property on the Selection object, so that you can know how many Ranges were selected by the user. In Firefox 2.0 and prior, the rangeCount was never more than one. But in Firefox 3, the user can do multi-selection of page contents. (Try it: Hold the Control key down as you swipe across various pieces of a page.) That means the range count can be more than one.

If you need to process a Range's contents, be sure to use the cloneContents( ) method, not the extractContents( ) method. The latter will actually remove nodes from the DOM tree, affecting the rendered page's appearance. (That is to say, content suddenly disappears!)
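
Putting those pieces together, a minimal sketch of walking every Range in a selection might look like this (the function name cloneSelectedRanges is made up for illustration):

```javascript
// Collects a detached copy of each selected Range's contents.
// cloneContents() leaves the page alone; extractContents() would
// pull the nodes out of the document.
function cloneSelectedRanges(selection) {
  var fragments = [];
  for (var i = 0; i < selection.rangeCount; i++) {
    fragments.push(selection.getRangeAt(i).cloneContents());
  }
  return fragments;
}

// In a browser:
//   var fragments = cloneSelectedRanges(window.getSelection());
```

With nothing selected, rangeCount is 0 and the function simply returns an empty array.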

This is all spelled out at the Moz Developer Center page on Ranges.

Friday, September 05, 2008

XPath Query in Sling

I've been playing with Sling lately, and I was pleasantly surprised to find that Sling comes with a JSON query servlet that exposes SQL and XPath query capability through a RESTful HTTP GET syntax. (Thanks to Moritz Havelock for pointing this out.)

But I quickly ran into a small problem. (And just as quickly, the solution.) Allow me to explain.

The problem: I want to search for nodes in the repository that have a (multivalued) "pets" attribute containing the value "dog." Note that the "pets" attribute might have multiple values. I want to filter against just one. Therefore I can't do an equality test. I must use the XPath contains() function.

My test query was:

http://localhost:7402/content.query.json?
queryType=xpath&statement=//*[contains(@pets,'dog')]


This produced an InvalidQueryException, with a message of "Unsupported function: contains (500)".

I was a bit surprised that the servlet seemed to know nothing about any contains() function. A true "WTF moment."

Taking my hint from the stack trace, I quickly ran a Google Code Search on org.apache.jackrabbit.core.query.xpath, and immediately found the answer in XPathQueryBuilder.java: It turns out you have to use the function's qualified name, jcr:contains(). Like so:

http://localhost:7402/content.query.json?
queryType=xpath&statement=//*[jcr:contains(@pets,'dog')]


I'm so much of an XPath newb that I don't even know if I should have been surprised by this, but it did stymie me briefly. Anyway, it works now and I'm thrilled to be able to do XPath queries right from the GET-go.
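
If you build these URLs from script rather than typing them into the address bar, it's safer to escape the XPath statement. Here's a small sketch, assuming the same host, port and servlet path as in my test query (the helper name buildXPathQueryUrl is made up):

```javascript
// Builds a query URL for the JSON query servlet, escaping the XPath
// statement so characters like [ ] @ : and , survive the trip.
function buildXPathQueryUrl(baseUrl, statement) {
  return baseUrl +
    "?queryType=xpath" +
    "&statement=" + encodeURIComponent(statement);
}

var url = buildXPathQueryUrl(
  "http://localhost:7402/content.query.json",
  "//*[jcr:contains(@pets,'dog')]"
);
```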

Tuesday, September 02, 2008

Google Chrome: nice console, ugly browser


I downloaded Chrome today and immediately started using the JavaScript console. It's pretty nice, but if you're already accustomed to Firebug in Firefox, it's no substitute. Also, what good is Chrome if you can't use Greasemonkey scripts with it?

The JS engine isn't SpiderMonkey, even though the Chrome guys apparently used some Mozilla code to put this thing together; it's Google's own V8, and it doesn't include E4X. And so help me, I haven't figured out how to enter a newline in the console without triggering an eval( ). In other words, I can only enter one line of code at a time, and then I have to execute it. As soon as I hit Enter, CR, Control-Enter, etc., the code on the current line executes. Oh well...

As a browser, this thing is not terribly impressive, from what I can tell.

In any case, Chrome itself strikes me as too fugly to deal with. I'm not sure which I'd rather do: spend a work-day using Chrome as my main browser, or jam prickly-pears into both my eyes at once.

I think I'll stay with Firefox until Chrome gets out of beta. Which (if it's like Gmail) it never will.

Friday, August 29, 2008

Pretty-print serialized DOM

Another great Mozilla feature: pretty-format a serialized DOM tree. The following code will serialize an entire web page and pretty-format the markup:

var serializer = new XMLSerializer( );
var str = serializer.serializeToString( document.documentElement );
var pretty = XML( str ).toXMLString( );


As mentioned in my earlier post about XMLSerializer, the XML you get isn't perfect: element names come out ALL CAPS for some weird reason. And you get a bunch of automatic entity substitutions, most of which you probably want, others of which will simply break things if you try to deserialize the text back into a DOM later. (Forget about easy roundtripping.) But overall, it's a really useful trick.

I was hoping maybe this trick would also (as a free bonus) pretty-format any embedded scripts inside CDATA sections, but of course no such luck. In fact, due to automatic entity substitution, <![CDATA[ gets converted to &lt;![CDATA[, which is hilarious in a sad kind of way.

Serializing DOM nodes to XML in Firefox

I keep having to re-teach myself this, so I might as well post it where I can always find it!

Let 'd' be any arbitrary DOM node. To serialize the node (and its descendants) via JavaScript:

var serializer = new XMLSerializer();
var str = serializer.serializeToString( d );


Having a top-level XMLSerializer object in Mozilla is so nice. So, so nice.

But sadly, the output from the serializeToString( ) method is not the kind of XML I'd like to see. Element names come out ALL CAPS whether or not that's what you want (and it's never what I want). It also converts the greater-than and less-than symbols inside scripts to their entity equivalents, even if you enclose your scripts in CDATA sections. To me, entity substitution inside a CDATA section makes no sense whatsoever.

Still, it's handy to be able to serialize a DOM tree. Even if the tags come out ALL CAPS.
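
If the ALL CAPS element names really bother you, one possible workaround is a post-processing pass over the serialized string. This is only a rough sketch with a made-up helper name (lowercaseTagNames). It's regex-based, so it will also rewrite anything inside scripts or text that happens to look like an upper-case tag, and it ignores namespaced names.

```javascript
// Lower-cases upper-case element names in a serialized markup string.
// Attribute names, attribute values and text are left untouched.
function lowercaseTagNames(markup) {
  return markup.replace(
    /<(\/?)([A-Z][A-Z0-9]*)(?=[\s>\/])/g,
    function (match, slash, name) {
      return "<" + slash + name.toLowerCase();
    }
  );
}

// In Firefox, the input would be the serializer output:
//   var str = new XMLSerializer().serializeToString(d);
//   var fixed = lowercaseTagNames(str);
var fixed = lowercaseTagNames('<DIV class="Note"><P>Hi<BR/></P></DIV>');
```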

Tuesday, February 12, 2008

Stackless Stack

I made mention of the Stackless Stack in my CMS Watch blog the other day. I need to write a followup blog on it, explaining the OSGi connection.

Who knows if someday I won't also need to write a blog about the Javaless JVM?

Friday, December 21, 2007

Schema-Typed Languages

BEA Systems may have invented something quite novel and useful. In a recent patent application, BEA's John Schneider proposes using XML schema definitions as data types in, say, ECMAScript. The main intuition is that you would use an import statement to make the interpreter aware of a particular schema definition. From that point on, you could instantiate whole objects based on that schema def, or (if it's a simple data type) declare variables to be of type "MyElement.xsd" and manipulate them directly. Type-checking is delegated to the schema validator; and suddenly you have a scripting language that acts like a strongly typed language and groks XML to boot.

At first blush, it sounds and feels a lot like a new twist on object relational mapping, but it's actually a bit more than that. This goes to the heart of language design and behavior.

Neat. I wonder what BEA plans to do with it next?

Friday, December 14, 2007

IE8 and XHTML

I ran across an interesting post by Mary Jo Foley talking about Internet Explorer 8. It mentions that IE8 probably still won't support XHTML properly.

This is all so very wrong.

Wednesday, December 12, 2007

Google Charts

Projects under the heading "Google Apps" don't tend to excite me very much these days, but this one is too good to go unmentioned.

Google Charts is a simple REST-style API for creating graphs and charts on the fly, such as this one:



Details here:
http://code.google.com/apis/chart/#encoding_data
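
To show what "REST-style" means here, this is a sketch of building a line-chart URL using what the docs call simple encoding, where each data point becomes one character from a 62-character alphabet and an underscore marks a missing value. The helper names are made up, the endpoint is the one documented at the time of writing, and Google has since retired this image API, so treat it as an illustration rather than something to deploy.

```javascript
// Simple encoding: values are scaled to 0..61 and mapped to one
// character each; "_" stands for a missing or out-of-range value.
var SIMPLE_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Assumes maxValue > 0.
function simpleEncode(values, maxValue) {
  var out = "s:";
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    if (typeof v === "number" && v >= 0 && v <= maxValue) {
      var index = Math.round((SIMPLE_ALPHABET.length - 1) * v / maxValue);
      out += SIMPLE_ALPHABET.charAt(index);
    } else {
      out += "_";
    }
  }
  return out;
}

// cht = chart type (lc = line chart), chs = size in pixels, chd = data.
function lineChartUrl(values, maxValue, width, height) {
  return "http://chart.apis.google.com/chart" +
    "?cht=lc" +
    "&chs=" + width + "x" + height +
    "&chd=" + simpleEncode(values, maxValue);
}

var chartUrl = lineChartUrl([0, 61], 61, 200, 125);
```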

Monday, July 23, 2007

Menus as Non-Modal Dialogs

I was thinking the other day about how best to keep the details of application logic hidden from Swing widgets (in the spirit of Martin Fowler's Presentation Model), the main intuition being that a user app can/should (arguably) be modeled as a set of nonvisual capabilities to which utterly dumb GUI widgets can later be mapped. Achieving this in a clean way is incredibly difficult. (Or at least for me it is.)

I had an epiphany of sorts. When you design a standalone user app (a menu-driven desktop app), what's the first piece of UI you design? The menu system. And what is a menu? In Swing (Java), it's a series of nested buttons. (JMenu and JMenuItem inherit from javax.swing.AbstractButton.)

The menubar never goes away. Some apps let you hide it, in which case it's merely made invisible (it doesn't actually get released from memory). There's a name, of course, for collections of buttons that never go away: a non-modal dialog. My epiphany was/is that a menu system is a collection of non-modal dialogs. (And I hate non-modal dialogs, both as a user and as a programmer.)

In the typical menu-driven app, menus are non-modal dialogs in which each button "knows too much" about deep application internals. The ever-changing state of the entire app is controlled through this collage of interdependent buttons, and managing the underlying ill-formed dependency graph is difficult. This is why menu apps are a pain in the ass to write.

Friday, March 09, 2007

Fractal-Dimensional Transforms

I was on the back porch thinking about image transforms the other morning, and it occurred to me that we just assume that many types of data are either one-dimensional, two-dimensional, or three-dimensional, etc. (with nothing in between), despite the fact that fractals are everywhere in nature. And we apply transformations and convolutions (2-dimensional DCT, in the case of JPEG) to the data without regard for the data's true dimensionality.

So I'm left wondering: how do you do, say, a 2.2D DCT or DFT? What if I want to convolve the fractal residue of a time series?