Tuesday, September 09, 2008

Selection object in Firefox

I've learned some interesting things about the way selections work in Mozilla.

Every window has a singleton selection object, even when the user has selected no items on the rendered page. Therefore, window.getSelection( ) always succeeds.

If you simply want user-selected text as a string, getSelection( ).toString( ) will work. But if you really intend to walk the selected DOM nodes, or process the selection in any non-trivial way, you will need access its Range objects with

window.getSelection( ).getRangeAt( i );

There is a "rangeCount" property on the Range object, so that you can know how many Ranges were selected by the user. In Firefox 2.0 and prior, the rangeCount was never more than one. But in Firefox 3, the user can do multi-selection of page contents. (Try it: Hold the Control key down as you swipe across various pieces of a page.) That means the range count can be more than one.

If you need to process a Range's contents, be sure to use the cloneContents( ) method, not the extractContents( ) method. The latter will actually remove nodes from the DOM tree, affecting the rendered page's appearance. (That is to say, content suddenly disappears!)

This is all spelled out at the Moz Developer Center page on Ranges.

Friday, September 05, 2008

XPath Query in Sling

I've been playing with Sling lately, and I was pleasantly surprised to find that Sling comes with a JSON query servlet that exposes SQL and XPath query capability through a RESTful HTTP GET syntax. (Thanks to Moritz Havelock for pointing this out.)

But I quickly ran into a small problem. (And just as quickly, the solution.) Allow me to explain.

The problem: I want to search for nodes in the repository that have a (multivalued) "pets" attribute containing the value "dog." Note that the "pets" attribute might have multiple values. I want to filter against just one. Therefore I can't do an equality test. I must use the XPath contains() function.

My test query was:

http://localhost:7402/content.query.json?
queryType=xpath&statement=//*[contains(@pets,'dog')]


This produced an InvalidQueryException, with a message of "Unsupported function: contains (500)".

I was a bit surprised that the servlet seemed to know nothing about any contains() function. A true "WTF moment."

Taking my hint from the stack trace, I quickly ran a Google Code Search on org.apache.jackrabbit.core.query.xpath, and immediately found the answer in XPathQueryBuilder.java: It turns out you have to use the function's qualified name, jcr:contains(). Like so:

http://localhost:7402/content.query.json?
queryType=xpath&statement=//*[jcr:contains(@pets,'dog')]


I'm so much of an XPath newb that I don't even know if I should have been surprised by this, but it did stymie me briefly. Anyway, it works now and I'm thrilled to be able to do XPath queries right from the GET-go.

Tuesday, September 02, 2008

Google Chrome: nice console, ugly browser


I downloaded Chrome today and immediately started using the JavaScript console. It's pretty nice, but if you're already accustomed to Firebug in Firefox, it's no substitute. Also, what good is Chrome if you can't use Greasemonkey scripts with it?

The JS engine is presumably based on Spidermonkey (since the Chrome guys apparently used a lot of Mozilla code to slap this thing together). But they forgot to include E4X. And so help me, I haven't figured out how to enter a newline in the console without triggering an eval( ). In other words, I can only enter one line of code at a time, and then I have to execute it. As soon as I hit Enter, CR, Control-Enter, etc., the code on the current line executes. Oh well...

As a browser, this thing is not terribly impressive, from what I can tell.

In any case, Chrome itself strikes me as too fugly to deal with. I'm not sure which I'd rather do: spend a work-day using Chrome as my main browser, or jam prickly-pears into both my eyes at once.

I think I'll stay with Firefox until Chrome gets out of beta. Which (if it's like Gmail) it never will.

Friday, August 29, 2008

Pretty-print serialized DOM

Another great Mozilla feature: pretty-format a serialized DOM tree. The following code will serialize an entire web page and pretty-format the markup:

var serializer = new XMLSerializer( );
var str = serializer.serializeToString( document.documentElement );
var pretty = XML( str ).toXMLString( );


As mentioned in my earlier post about XMLSerializer, the XML you get isn't perfect: element names come out ALL CAPS for some weird reason. And you get a bunch of automatic entity substitutions, most of which you probably want, others of which will simply break things if you try to deserialize the text back into a DOM later. (Forget about easy roundtripping.) But overall, it's a really useful trick.

I was hoping maybe this trick would also (as a free bonus) pretty-format any embedded scripts inside CDATA sections, but of course no such luck. In fact, due to automatic entity substitution, <![CDATA[ gets converted to &lt;![CDATA[, which is hilarious in a sad kind of way.

Serializing DOM nodes to XML in Firefox

I keep having to re-teach myself this, so I might as well post it where I can always find it!

Let 'd' be any arbitrary DOM node. To serialize the node (and its descendants) via JavaScript:

var serializer = new XMLSerializer();
var str = serializer.serializeToString( d );


Having a top-level XMLSerializer object in Mozilla is so nice. So, so nice.

But sadly, the output from the serializeToString( ) method is not the kind of XML I'd like to see. Element names come out ALL CAPS whether or not that's what you want (and it's never what I want). It also converts the greater-than and less-than symbols inside scripts to their entity equivalents, even if you enclose your scripts in CDATA sections. To me, entity substitution inside a CDATA section makes no sense whatsoever.

Still, it's handy to be able to serialize a DOM tree. Even if the tags come out ALL CAPS.

Tuesday, February 12, 2008

Stackless Stack

I made mention of the Stackless Stack in my CMS Watch blog the other day. I need to write a followup blog on it, explaining the OSGi connection.

Who knows if someday I won't also need to write a blog about the Javaless JVM?

Friday, December 21, 2007

Schema-Typed Languages

BEA Systems may have invented something quite novel and useful. In a recent patent application, BEA's John Schneider proposes using XML schema definitions as data types in, say, ECMAScript. The main intuition is that you would use an import statement to make the interpreter aware of a particular schema definition. From that point on, you could instantiate whole objects based on that schema def, or (if it's a simple data type) declare variables to be of type "MyElement.xsd" and manipulate them directly. Type-checking is delegated to the schema validator; and suddenly you have a scripting language that acts like a strongly typed language and groks XML to boot.

At first blush, it sounds and feels a lot like a new twist on object relational mapping, but it's actually a bit more than that. This goes to the heart of language design and behavior.

Neat. I wonder what BEA plans to do with it next?

Friday, December 14, 2007

IE8 and XHTML

I ran across an interesting post by Mary Jo Foley talking about Internet Explorer 8. It mentions that IE8 probably still won't support XHTML properly.

This is all so very wrong.

Wednesday, December 12, 2007

Google Charts

Projects under the heading "Google Apps" don't tend to excite me very much these days, but this one is too good to go unmentioned.

Google Charts is a simple REST-style API for creating graphs and charts on the fly, such as this one:



Details here:
http://code.google.com/apis/chart/#encoding_data

Monday, July 23, 2007

Menus as Non-Modal Dialogs

I was thinking the other day about how best to keep the details of application logic hidden from Swing widgets (in the spirit of Martin Fowler's Presentation Model), the main intuition being that a user app can/should (arguably) be modeled as a set of nonvisual capabilities to which utterly dumb GUI widgets can later be mapped. Achieving this in a clean way is incredibly difficult. (Or at least for me it is.)

I had an epiphany of sorts. When you design a standalone user app (a menu-driven desktop app), what's the first piece of UI you design? The menu system. And what is a menu? In Swing (Java), it's a series of nested buttons. (JMenu and JMenuItem inherit from javax.swing.AbstractButton.)

The menubar never goes away. Some apps let you hide it, in which case it's merely made invisible (it doesn't actually get released from memory). There's a name, of course, for collections of buttons that never go away: a non-modal dialog. My epiphany was/is that a menu system is a collection of non-modal dialogs. (And I hate non-modal dialogs, both as a user and as a programmer.)

In the typical menu-driven app, menus are non-modal dialogs in which each button "knows too much" about deep application internals. The ever-changing state of the entire app is controlled through this collage of interdependent buttons, and managing the underlying ill-formed dependency graph is difficult, and this is why menu apps are a pain the ass to write.

Friday, March 09, 2007

Fractal-Dimensional Transforms

I was on the back porch thinking about image transforms the other morning, and it occurred to me that we just assume that many types of data are either one-dimensional, two-dimensional, or three-dimensional, etc. (with nothing in between), despite the fact that fractals are everywhere in nature. And we apply transformations and convolutions (2-dimensional DCT, in the case of JPEG) to the data without regard for the data's true dimensionality.

So I'm left wondering: how do you do, say, a 2.2D DCT or DFT? What if I want to convolve the fractal residue of a time series?

Wednesday, January 24, 2007

Privacy Leakage Patent

Identity data-mining disturbs me. What disturbs me even more is that you can patent a technique for, say, guessing someone's age based on their purchasing habits (which is what Amazon has succeeded in doing).

Evil, evil, evil.

Thursday, January 11, 2007

jrunscript

It turns out JDK 6 comes with a JavaScript console facility so that you can play with Rhino interactively from a command line. Look for a file called jrunscript.exe in your JDK's /bin directory.

A pretty good article on Java/JavaScript integration in Java 6 can be found on the Sun Developer Network site right here.

Wednesday, January 03, 2007

OpenOffice.org Dev Hurdles

Over the holidays I decided to wade into the murky waters of OpenOffice development. I was quickly up to my neck in mud.

It turns out I'm not the only one. Key OOo insiders are acutely aware that the barriers to participation in OOo development are way too high (keeping community participation in OOo development way too low).

It's not just that finding all the code is hard or that the C++ codebase is around 7 million lines of code. It's that a full compile-and-build of OOo takes 15 hours on a typical desktop PC. If you can get it to build at all.

Some of the entry-barrier issues are more fully discussed in Jens Heiner Rechtien's 31 Dec 2006 blog.

Friday, December 22, 2006

JRuby for OpenOffice Development

Juergen Schmidt (who gave a talk at last week's Javapolis conference on why Java programmers should get more involved with OpenOffice) blogged yesterday about the prospect of using JRuby for OOo development:

I also met two Sun colleagues Thomas Enebo and Charles Oliver Nutter, two of the JRuby core developers, and brainstormed a little bit with them about the support of JRuby in OpenOffice.org. JRuby comes directly with Java in the future and the integration work into NetBeans is ongoing. So it would be great to have a good support for JRuby from UNO as well. JRuby as one of the main scripting languages for OpenOffice.org with a smart integration in NetBeans is a really cool idea and I hope that we can deliver something in this direction. We will see what's possible and when!

Some interesting podcasts from Javapolis (including quite a few on agile development) are here.

Wednesday, November 08, 2006

Project Tamarin

By now everyone has heard the news that Adobe will donate code for its ActionScript VM to the Mozilla Foundation for use in Firefox. For a quick snapshot of what's going on, see:

A lot of the blog commentary around this has centered on Flash. IMHO this has little to do with Flash. It has everything to do with ECMA4/JS2 (see my blog entry previous to this one) and the future of AJAX. It will also keep Adobe honest in terms of making sure ActionScript doesn't continue on the path of becoming its own bastard variant of JavaScript (a la JScript), which is to say a not-quite-compliant dialect of ECMA-262.

The ability to run JIT-compiled JavaScript on a VM is killer, because it knocks down all complaints of JS being slow. And it also opens the door to ultra-fast JS on the server (and pure-JS doublesided AJAX).

The VM architecture looks like this:




But again, it's not really about .swf, it's about compiling JS2 into bytecode, which is an incredibly important advancement.

Brendan Eich held an IRC chat yesterday in which he and Kevin Lynch of Adobe fielded questions about Tamarin. A few interesting factoids came to light:

  • Acrobat's JS engine will move from Spidermonkey to Tamarin.
  • The expansion factor for jitting bytecode to x86 is roughtly from 5X for strongly typed, early-bindable code, to 20X for loosly typed, unbindable code. Thus, you pay a price in memory hunger for the ability to JIT-compile JS, but JS2's new typing system mitigates it somewhat.
  • The Tamarin codebase comprises 135,000 lines of C++ (smaller than I would have thought). This is sure to grow but Brendan Eich indicated very strongly that Firefox needs to shrink, not grow, hence there will be pressure to keep Tamarin as lean and efficient as possible.
  • Tamarin is not 64-bit-ready. But if the project gets the kind of (huge) traction that it appears it will get in the community, the "64-bit Flash" question may finally get solved. And maybe ES4/JS2 will get a "long" data type in addition to int/uint/double. ;^)

Thursday, November 02, 2006

New ECMA Draft

ECMA's 262 revision-4 working group just published a draft spec of what will hopefully become (by next summer) JavaScript 2.0. This is the first major upgrade to the JavaScript language in almost a decade. Guaranteed to take Ajax to the next level.

Tuesday, October 24, 2006

Fuzzing

I learned about fuzzing today. Think of it as fault discovery by random input. The underlying assumption: If unexpected input makes an app produce unexpected behavior, you're hosed. Hackers rely on fault-injection to find vulnerabilities. QA can use it to find bugs.

There's a list of open-source fuzzers here.