Thursday, July 08, 2010

Using E4X in Acrobat JavaScript

With all the talk of AJAX in yesterday's blog, I somehow managed to never once talk about XML. (How ironic is that?) But it turns out that one of the pleasures of doing JavaScript programming in Acrobat is the ease with which you can manipulate XML, thanks to the Acrobat interpreter's built-in support for E4X.

If you're not familiar with E4X -- ECMAScript for XML (standardized as ECMA-357) -- you really should take time to look into it. It's a very handy syntax for working with XML. You'll find that E4X has been implemented in SpiderMonkey (Gecko's JavaScript engine) since version 1.6.0 and in Rhino (Mozilla's other JavaScript engine) since version 1.6R1. It's also supported in ActionScript, so it is available in Flash CS3, Adobe AIR and Adobe Flex. Add Adobe Acrobat to that list.

How or why would you ever want to use E4X in Adobe Acrobat? Let me give a brief example. The other day, I wrote a short script that harvests all of the annotations from a PDF document. I wanted to put the data in XML format. The script that formats the data as XML looks like this:

// Pass this function an array of Annotations
function getAnnotationsAsXML( annots ) {

    var xmlOutput = <annots></annots>;

    // getAnnots() returns null when the document has no annotations
    if ( !annots )
        return xmlOutput.toXMLString();

    for ( var i = 0; i < annots.length; i++ )
    {
        // get the properties for each annotation
        var props = annots[i].getProps();

        // add an <annot> element
        xmlOutput.* += <annot/>;

        // start adding child elements
        var parent = xmlOutput.annot[i];
        parent.* = <author>{props.author}</author>;
        parent.* += <contents>{props.contents}</contents>;
        parent.* += <page>{props.page}</page>;
        parent.* += <creationDate>{props.creationDate}</creationDate>;
        parent.* += <type>{props.type}</type>;
    }

    return xmlOutput.toXMLString();
}

To test the above code, first open a PDF (using Acrobat Pro) that already contains some annotations. Next, open the JavaScript console in Acrobat (Control-J), copy and paste the code into the console, then add a line:

// Use AcroJS API call 'getAnnots()'
// to harvest all annotations
getAnnotationsAsXML( this.getAnnots( ) );

Highlight (select) all of the code and execute it in the console by hitting Control-Enter. Assuming the document you've got open contains annotations, you should see some XML appear in the console. In my case, I got:

<annots>
  <annot>
    <author>Admin</author>
    <contents>We need to strike this.</contents>
    <page>729</page>
    <creationDate>Tue Jul 06 2010 14:43:57 GMT-0400 (Eastern Daylight Time)</creationDate>
    <type>Highlight</type>
  </annot>
  <annot>
    <author>Admin</author>
    <contents>I am underlining this.</contents>
    <page>729</page>
    <creationDate>Tue Jul 06 2010 14:44:12 GMT-0400 (Eastern Daylight Time)</creationDate>
    <type>Underline</type>
  </annot>
  <annot>
    <author>Admin</author>
    <contents>I liked this.</contents>
    <page>57</page>
    <creationDate>Tue Jul 06 2010 15:04:21 GMT-0400 (Eastern Daylight Time)</creationDate>
    <type>Highlight</type>
  </annot>
  <annot>
    <author>Admin</author>
    <contents>This does not seem right.</contents>
    <page>57</page>
    <creationDate>Tue Jul 06 2010 15:04:32 GMT-0400 (Eastern Daylight Time)</creationDate>
    <type>Text</type>
  </annot>
  <annot>
    <author>Admin</author>
    <contents>Is this the correct copyright date?</contents>
    <page>1</page>
    <creationDate>Tue Jul 06 2010 18:22:39 GMT-0400 (Eastern Daylight Time)</creationDate>
    <type>Highlight</type>
  </annot>
</annots>

Of course, there are many more properties on Annotations than I've captured here. But you get the idea.

Not bad for a dozen lines of code. Sure beats messing around with DOM methods, DOM serialization, etc.
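
For contrast, here is a rough sketch of what the same job might look like without E4X, using nothing but string concatenation. Treat it as an illustration rather than a recommended approach: the names escapeXml, getAnnotationsAsXMLStrings and fakeAnnot are made up for this example, and the stub object only imitates the getProps() call so the code can run outside Acrobat. Note that the escaping, which E4X handles for you inside the curly braces, has to be done by hand here.

```javascript
// Hypothetical helper: escape the characters that are special in XML text.
function escapeXml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical non-E4X counterpart of getAnnotationsAsXML().
// 'annots' is an array of objects that expose getProps(), or null.
function getAnnotationsAsXMLStrings(annots) {
    var fields = ["author", "contents", "page", "creationDate", "type"];
    var lines = ["<annots>"];
    for (var i = 0; annots && i < annots.length; i++) {
        var props = annots[i].getProps();
        lines.push("  <annot>");
        for (var j = 0; j < fields.length; j++) {
            var name = fields[j];
            lines.push("    <" + name + ">" + escapeXml(props[name]) + "</" + name + ">");
        }
        lines.push("  </annot>");
    }
    lines.push("</annots>");
    return lines.join("\n");
}

// Stub standing in for a real Annotation (an assumption for testing only).
var fakeAnnot = {
    getProps: function () {
        return {
            author: "Admin",
            contents: "Fish & chips <today>",
            page: 0,
            creationDate: "2010-07-06",
            type: "Text"
        };
    }
};

var xml = getAnnotationsAsXMLStrings([fakeAnnot]);
```

Inside Acrobat you would pass this.getAnnots() instead of the stub array. Every new element means more quote-and-plus bookkeeping, which is exactly the clutter the E4X version avoids.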
