“XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox”

Subtitle: Learn how to manipulate XML in the Firefox browser using JavaScript features
Synopsis: In this third article of the XML in Firefox 1.5 series, you learn to manipulate XML with the JavaScript implementation in Mozilla Firefox. In the first two articles, XML in Firefox 1.5, Part 1: Overview of XML features and XML in Firefox 1.5, Part 2: Basic XML processing, you learned about the different XML-related facilities in Mozilla Firefox, and the basics of XML parsing, Cascading Style Sheets (CSS), and XSLT stylesheet invocation.

Continuing with the series, this article provides examples for loading an XML file into Firefox using script, for applying XSLT to XML, and for loading XML with references to scripts. In particular, the latter trick is used to display XML files with rendered hyperlinks, which is unfortunately still a bit of a tricky corner of the XML/Web story. I elaborate more on this trick in my tutorial “Use Cascading Stylesheets to display XML, Part 2”.

[Uche Ogbuji]

via Copia

"XML in Firefox 1.5, Part 2: Basic XML processing"

Subtitle: Do a lot with XML in Firefox, but watch out for some basic limitations

Synopsis: This second article in the series, "XML in Firefox 1.5," focuses on basic XML processing. Firefox supports XML parsing, Cascading Style Sheets (CSS), and XSLT stylesheets. You also want to be aware of some limitations. In the first article of this series, "XML in Firefox 1.5, Part 1: Overview of XML features," Uche Ogbuji looked briefly at the different XML-related facilities in Firefox.

I also updated part 1 to reflect the Firefox 1.5 final release.

This article is written at an introductory level. The next articles in the series will be more technically in-depth, as I move from plain old generic XML to fancy stuff such as SVG and E4X.

[Uche Ogbuji]

via Copia

Four Mozilla/XML bugs to vote on (or to help with)

In a recent conversation with colleagues some of the limitations of XML processing in Mozilla came up. I think some of these are really holding Mozilla and Firefox back from being a great platform for XML processing, and so I wanted to highlight them here. Remember that the key to bringing attention to an important bug/request is to vote for it in the tracker, so please consider doing so. I already have done.

18333: "XML Content Sink should be incremental". The description says it all:

Large XML documents, such as the W3C's XSLT spec, take an incredibly long time to load into view source. The browser freezes/blocks (is "not responding" according to Windows) while it processes, and finally unlocks after the entire source of the document is ready for display.

Firefox will never really be a friendly platform for XML processing until this is addressed. There is not really a problem addressing this using Mozilla's underlying parser, Expat. Worst case, one could use that parser's suspend/resume facility (we recently took advantage of this to allow Python-generator-based access to 4Suite Saxlette parsing). The real issue is the amount of work that would need to be done across the Mozilla code base. Unfortunately, Mozilla insiders have been predicting a fix for this problem for a while, and unless there's a sudden boost in votes or, better yet, resources to help fix the problem, I'm not feeling very optimistic.
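To make the idea of an incremental sink concrete, here is a minimal sketch in Python rather than in Mozilla's own code. It uses the standard library's XMLPullParser, which sits on top of Expat, and it shows plain chunk-by-chunk feeding, not the suspend/resume facility mentioned above. The function name and the sample chunks are made up for illustration.

```python
from xml.etree.ElementTree import XMLPullParser


def iter_closed_tags(chunks):
    """Yield element tags as their end tags are parsed, chunk by chunk."""
    parser = XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)  # hand over only what has arrived so far
        for _event, elem in parser.read_events():
            yield elem.tag
    parser.close()  # let the parser finish anything it was still holding
    for _event, elem in parser.read_events():
        yield elem.tag


# Hypothetical document arriving in three network-sized pieces.
CHUNKS = ["<doc><a>1</a>", "<b>2</b><c>", "3</c></doc>"]
tags = list(iter_closed_tags(CHUNKS))
```

The consumer never needs the whole document in hand before it starts working, which is the property the bug asks of the content sink.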

69799: "External entities are not included in XML document". Using Betty Harvey's example,

<!DOCTYPE myXML[
<!ENTITY extFile SYSTEM "extFile.xml">
]>
<myXML>&extFile;</myXML>

Is rendered as if Mozilla read

<myXML></myXML>

Of course you have to watch out for XSS type attacks, but I imagine Mozilla could handle this the same way it does loaded stylesheets: by restricting to same host domain as the document entity.
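The skip-versus-include behavior can be sketched with Python's binding to Expat, the same parser Mozilla builds on. This is only an illustration of the parser-level mechanism, not Mozilla's code: the TRUSTED_ENTITIES table is a hypothetical stand-in for a same-host fetch, and the entity body is invented.

```python
import xml.parsers.expat

DOC = b"""<!DOCTYPE myXML [
<!ENTITY extFile SYSTEM "extFile.xml">
]>
<myXML>&extFile;</myXML>"""

# Hypothetical stand-in for same-host fetching: only these system IDs resolve.
TRUSTED_ENTITIES = {"extFile.xml": b"<greeting>hello</greeting>"}


def parse_events(doc, include_external):
    """Return (kind, value) events, optionally expanding external entities."""
    events = []

    def attach(p):
        p.StartElementHandler = lambda name, attrs: events.append(("start", name))
        p.CharacterDataHandler = lambda data: events.append(("text", data))

    def external_ref(context, base, system_id, public_id):
        body = TRUSTED_ENTITIES.get(system_id)
        if body is None:
            return 1  # untrusted: leave the entity out but keep parsing
        sub = parser.ExternalEntityParserCreate(context)
        attach(sub)
        sub.Parse(body, True)
        return 1

    parser = xml.parsers.expat.ParserCreate()
    attach(parser)
    if include_external:
        parser.ExternalEntityRefHandler = external_ref
    parser.Parse(doc, True)
    return events


skipped = parse_events(DOC, include_external=False)
included = parse_events(DOC, include_external=True)
```

Without the handler, the parser reports only the myXML element, which matches the rendering described above. With it, the greeting element from the entity shows up inside myXML.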

193678: "support exslt:common". The node-set extension function is pretty much required for complex XSLT processing, so support from Mozilla would really help open up the landscape of what you can do with XSLT in the browser.

98413: "Implement XML Catalogs". A request to implement OASIS Open XML Catalogs. This could do a lot to encourage support for external entities because some performance problems could be reduced by using a catalog to load a local version of the resource.

A few on my personal would-be-nice-to-fix-but-not-essential list are:

[Uche Ogbuji]

via Copia

"Tip: Use data URIs to include media in XML"

There are many ways to link to non-XML content within XML, including binary content. Sometimes you need to roll all such external content directly into the XML. Data scheme URIs are one way to specify a full resource within a URI, which you can then use in XML constructs. In this tip, Uche Ogbuji shows how to use this to bundle related media into a single file.
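As a quick sketch of the idea (not code from the tip itself), the following Python builds a base64 data URI and drops it into an attribute. The image element and its href attribute are a hypothetical vocabulary chosen for illustration, and the six bytes are only a GIF signature, not a complete image.

```python
import base64
import xml.etree.ElementTree as ET


def make_data_uri(data: bytes, media_type: str) -> str:
    """Wrap raw bytes in a base64-encoded data scheme URI."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"


# Hypothetical vocabulary: an <image> element carrying the media in href.
uri = make_data_uri(b"GIF89a", "image/gif")
image = ET.Element("image", {"href": uri})
serialized = ET.tostring(image, encoding="unicode")
```

Because the whole resource travels in the attribute value, the XML file stays self-contained at the cost of roughly a third more bytes for the base64 encoding.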

I also touch a bit on unparsed entities and notations in this brief article.

Side note: Of course URLs are a subset of URIs, but I did want to mention that I prefer to use the term "URI" for the data scheme because it feels to me much more of an identifier-by-value than a locator. (I suppose it could be considered a trivial locator.)

[Uche Ogbuji]

via Copia

Switching from MH and Evolution to maildir, Dovecot IMAP and Thunderbird

Updated. New details added.

I got fed up with Evolution recently. My main beef with it is that it's all about magic on the back end. All sorts of details, such as what you've deleted or what you've marked as spam are flagged using proprietary metadatabases. The persistence of this information also seems to be dodgy, which means that often, when Evolution crashes it forgets such details, and makes you re-delete, re-file and re-flag things. To make this worse, Evolution crashes a lot, and it seems to especially crash after you've done a lot of message deleting and moving. Over time, Evolution has cost me a lot of tedious work (I get a lot of spam and throwaway mail) and I got to the point where I could no longer stand a tool that increased my workload for little gain.

The main reason I stuck with Evolution for so long is that I have used the MH system to manage my mail for a long time. I like the idea of one message per file. Evolution was the only very modern MUA I could find that supported MH folders. I'd used EXMH for a long time, but it's showing its age, and Sylpheed-claws has way too many rough edges.

So it was time for a change on my Ubuntu workstation, and these are just my notes about what it took to complete the switch of mail management systems.

MH to Maildir

In the end I decided to switch to Maildir, which is like MH in its one-file-per-message philosophy, but seems to be supported by more modern MUAs. I also decided to hedge my bets by moving to a local IMAP server to expose the Maildir folders to just about any MUA.

For the main conversion I used Dr. Jürgen Vollmer's mh2maildir. It did the trick very well. All my MH folders were stored in a directory Mail, and I wanted to have the resulting Maildir in .maildir. I invoked the script as follows:

./bin/mh2maildir.sh -r -R ~/Mail/ ~/.maildir

I knew I had no unread e-mail, thus the -r option. Most users will need the -R option. The original MH directory is left alone, as far as I can tell, but of course I always recommend backing up your mail folders before any bulk action.

Update. You may need to make some structural adjustments in the resulting .maildir folder. Firstly, many MH tools use an explicit folder named "inbox", which is thus converted into a Maildir folder named "inbox". Most Maildir tools, however, treat the top folder as the implicit inbox. I addressed this as follows:

cd ~/.maildir
mv tmp tmp.save
mv inbox/* .
rm -rf inbox

Also, some Maildir tools such as Mutt and Dovecot expect to see folders as hidden files (with a leading dot). I handled this by renaming folders, for example:

mv Writing/ .Writing/

If you have nested folders, you have to use a dot to separate levels. A folder "eggs" within "spam" is represented by a top-level directory called ".spam.eggs". I had to do the likes of:

mv XML/XML-DEV .XML.XML-DEV
mv XML .XML

Of course leave alone the "cur", "new" and "tmp" directories which make up the internal structure of each Maildir folder.
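If you have many folders, the renaming rule is easy to script. Here is a minimal sketch that only computes the rename plan and touches nothing on disk. The function names are my own, and it assumes folder paths are given relative to the Maildir root. Nested folders are listed before their parents, matching the order of the mv commands above.

```python
def maildir_name(mh_folder: str) -> str:
    """Map an MH-style path such as 'XML/XML-DEV' to '.XML.XML-DEV'."""
    parts = [p for p in mh_folder.split("/") if p]
    return "." + ".".join(parts)


def plan_renames(mh_folders):
    """Return (source, destination) pairs, deepest folders first."""
    cleaned = [f.strip("/") for f in mh_folders]
    deepest_first = sorted(cleaned, key=lambda f: (-f.count("/"), f))
    return [(f, maildir_name(f)) for f in deepest_first]


plan = plan_renames(["XML", "XML/XML-DEV", "Writing/"])
```

Moving the deepest folders first matters because renaming a parent first would invalidate the paths of its children.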

Dovecot for IMAP

I went with the Dovecot server for IMAP. I just used Synaptic to install dovecot-common and dovecot-imapd, and followed the article "Setting up an IMAP server with dovecot". The trickiest line in setting up /etc/dovecot/dovecot.conf seems to be the default_mail_env.

default_mail_env = maildir:/home/%u/.maildir

Update. I also had to enable imap with the lines:

protocols = imap
imap_listen = *

And for some reason I had to manually start dovecot the first time:

sudo dovecot

After this, the usual sudo /etc/init.d/dovecot restart seems to work.

Thunderbird

The Thunderbird set-up was very straightforward. In fact, I can't really think of anything to say about it. Just set it up with localhost as an IMAP server, and you're set.

Finally, updating .procmailrc

I use fetchmail to download my mail from my external IMAP server, and procmail to put it into my local folders. I had to change procmail to work with Maildir rather than MH folders. The changes turned out to be very simple. In my .procmailrc, having defined MAILDIR=$HOME/.maildir, I changed lines such as

|/usr/lib/mh/rcvstore +Writing

to

$MAILDIR/.Writing/

Nested folders are similar. You use periods rather than slashes for level separators, not forgetting the leading period. Thus

|/usr/lib/mh/rcvstore +Writing/Tech

becomes

$MAILDIR/.Writing.Tech/

Since Mutt/Dovecot don't use explicit "inbox" folders,

|/usr/lib/mh/rcvstore +inbox

becomes just

$MAILDIR/

The trailing slashes are essential. They indicate that the destination is a Maildir folder and so procmail does the right thing with it.
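The three rewrites above follow one rule, which can be sketched as a small Python helper for converting a long .procmailrc in bulk. The function name is hypothetical, and the pattern assumes rcvstore lives at /usr/lib/mh/rcvstore as in my file. Adjust it if yours differs.

```python
import re

# Matches action lines such as "|/usr/lib/mh/rcvstore +Writing/Tech".
RCVSTORE = re.compile(r"^\|\s*/usr/lib/mh/rcvstore\s+\+(\S+)$")


def convert_action(line: str) -> str:
    """Rewrite an MH rcvstore action as a Maildir destination."""
    match = RCVSTORE.match(line.strip())
    if match is None:
        return line  # not an rcvstore action: leave it untouched
    folder = match.group(1).strip("/")
    if folder == "inbox":
        return "$MAILDIR/"  # the top-level Maildir is the implicit inbox
    # Leading period, periods between levels, and the essential trailing slash.
    return "$MAILDIR/." + folder.replace("/", ".") + "/"


converted = [
    convert_action("|/usr/lib/mh/rcvstore +Writing"),
    convert_action("|/usr/lib/mh/rcvstore +Writing/Tech"),
    convert_action("|/usr/lib/mh/rcvstore +inbox"),
]
```

Lines that are not rcvstore actions, such as recipe conditions, pass through unchanged.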

Mutt is useful for testing that all is OK with your IMAP set-up. Use regular slashes rather than periods for folder separators in IMAP URLs, for example:

mutt -f imap://uogbuji@localhost/Writing/Tech

[Uche Ogbuji]

via Copia

Thunderbird crash recovery of composed messages

Dare laments Firefox's lack of text area content saving upon crashing. At first I found this strange because Firefox does save text area content in my experience. Then I remembered that I always install SessionSaver. I suspect that's where I might be getting my protection from. It did make me wonder whether XForms content is similarly protected. These days I like to use Chime's XForm document with the Firefox XForms extension to post to Copia, and I should test how it handles crashes.

But the main point of this entry is to make a related rant and lazyweb request about Thunderbird. I learned the hard way that unlike Evolution, Thunderbird does not auto-save messages you are composing. That means that my habit of starting drafts and then switching to another task is very dangerous. If I do not manually save the draft and Thunderbird crashes, I lose my work. This is stupid. Evolution would save all compose window content in files named ".evolution-<opaque-id>", and would offer to restore these windows upon restart. If I can't find an extension along the lines of SessionSaver for Thunderbird, I'll have to ditch it. Do any of my LazyWeb friends know of such an extension? Googling and other searching turned up blanks.

[Uche Ogbuji]

via Copia

Finding xforms.xpi for Firefox 1.5

The maintainer of the official extensions page for Mozilla XForms has been very slow to update the main link to point to an xforms.xpi that works with the Firefox 1.5 release. A comment on this page as well as the XForms project page point to a nightly FTP location as the up-to-date source for the XPI. I've used the XPI from that link on one Windows box and two Ubuntu boxes. It worked on all but one Ubuntu box. Today I tried a reinstall for the problem case, but when I tried the above link I got an FTP 550 error ("Failed to change directory"). The directory is still in the index for its parent, so I'm not sure what's up. Indeed I'm not able to change to any of the children of ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/ with Firefox or ncftp.

I did some searching and found this Australian mirror. The nightly date is different, so I can't be sure it's the same xpi (I don't have the other xforms.xpi to checksum), but perhaps it will help someone else. It didn't help me; that xpi doesn't seem to work any more than the earlier one I tried. XForms simply don't render.

I do hope Mozilla gets its XForms act together. There is some hope now that lead developer Beaufour has found a temporary sponsor (woohoo!). He says:

I guess we should release a new version of the XPI soon too, to include some of the stuff that has been in the trunk for a while now, and I should try to get back to my weekly/bi-weekly “XForms Status Updates” too.

Yes to all that, but do please also make sure people can actually find the extension for use.

[Uche Ogbuji]

via Copia

XForms Submission to Copia (Mozilla / FormsPlayer)

Uche recently set up Copia to accept HTTP PUT submission of content as Atom entry instances. I wrote 3 XForms documents which collect the data from a form and submit it to the service (each for a separate XForms implementation):

This post was submitted using the Mozilla XForms implementation

Forms Player is the most compliant of the 3 (it supports full XForms 1.0 and some aspects of XForms 1.1) but functions as an Internet Explorer plugin.

Mozilla XForms is an up-and-coming effort to build native XForms support into Mozilla. The supported feature set has now reached a point where a majority of the useful capabilities are supported.

FormsFaces is a JavaScript library that attempts to implement XForms functionality completely independent of the browser. Unfortunately I wasn't able to get the submission action to fire properly in order to submit new content from a FormsFaces XForm.
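Whichever implementation is used, what reaches the service is an HTTP PUT whose body is an Atom entry. The following Python sketch shows roughly what such a request looks like outside of XForms. It makes several assumptions: the URL is a placeholder, the fields mirror the title, author, category and content inputs of the forms, and the entry is not guaranteed to match what Copia actually expects (a complete Atom entry would also need an id and an updated element).

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry(title, author, category, content):
    """Serialize a bare-bones Atom entry as UTF-8 bytes."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    author_el = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author
    ET.SubElement(entry, f"{{{ATOM_NS}}}category", {"term": category})
    ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": "text"}).text = content
    return ET.tostring(entry, encoding="utf-8")


def build_put(url, body):
    """Prepare (but do not send) a PUT request carrying the entry."""
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/atom+xml"},
    )


# Placeholder URL; the real service endpoint is not given here.
body = build_entry("Test post", "chimezie", "xforms", "Hello from a script")
request = build_put("http://localhost:8080/entries/test-post", body)
# urllib.request.urlopen(request) would actually send it.
```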

The FormsPlayer implementation is available here and the Mozilla implementation is available here. The primary difference is styling (specifically the CSS necessary to style forms individually) and the mechanism for invoking the XForms processor.

With FormsPlayer, the following bits are needed:


    
FormsPlayer has failed to load!

<?import namespace="xf" implementation="#FormsPlayer" ?>

With the Mozilla implementation, nothing is needed (since the support is native to the browser).

With FormsFaces, the following would be needed to include the JavaScript library that facilitates XForms support:

<script type="text/javascript" src="/path/to/formfaces.js"></script>

The current features of XForms supported by FormsFaces are listed here

Below are the two stylesheets used to style the FormsPlayer XForms and the Mozilla XForm, followed by screenshots of the rendered XForms in IE and Firefox 1.5b2.

FormsPlayer XForms CSS

        * {
            margin:0;
            padding:0;
        }

        .title {
            text-align: center;
        }

        xf\:input,xf\:switch {
            display: block;
        }
        .author_input .value {                
            width: 7em;
        }

        .title_input .value {                
            width: 30em;
        }

        xf\:input xf\:label {
            font-weight: bold;
            padding-right: 5px;
            width: 100px;
            float: left;
        }

        xf\:textarea xf\:label {
            font-weight: bold;
            padding-right: 5px;
            width: 100px;
            float: left;
        }

        xf\:secret xf\:label {
            font-weight: bold;
            padding-right: 5px;
            width: 100px;
            float: left;
        }            

        .textarea-value {
            width: 50em;
            height: 30em;            
        }

        .leftPadded {
            padding-left: 100px;
        }            

        .category_input .value {
            width: 20em;
        }

Mozilla XForms CSS

        @namespace xf url("http://www.w3.org/2002/xforms");
        * {
            margin:0;
            padding:0;
        }

        .title {
            text-align: center;
        }                        

        xf|secret.author_input {
            display: table-row;                
        }

        xf|secret.author_input secret {                
            width: 7em;            
        }

        xf|secret.author_input > xf|label span {
            display: table-cell;
            width:100px;
            font-weight: bold;               
        }

        xf|input.category_input {
            display: table-row;
        }

        xf|input.category_input > xf|label span {
            display: table-cell;
            width: 100px;                
            font-weight: bold;               
        }            

        xf|input.category_input input {
            width: 20em;
        }

        xf|textarea.content_input {
            display: table-row;
        }

        xf|textarea.content_input > xf|label span {
            display: table-cell;
            width: 100px;                
            font-weight: bold;
            vertical-align: top;                
        }            

        xf|textarea.content_input textarea {
            width: 50em;
            height: 30em;
        }            

        #show_content xf|trigger {
            display: block;
            padding-left: 200px;                
        }

        xf|input.title_input {
            display: table-row;
        }

        xf|input.title_input > xf|label span {
            display: table-cell;
            width: 100px;                
            font-weight: bold;
        }            

        xf|input.title_input input {
            width: 30em;
        }


        .leftPadded {
            padding-left: 200px;                
        }

FormsPlayer Screenshot

Copia Entry Submission FormsPlayer XForms

Mozilla Screenshot

Copia Entry Submission Mozilla XForms

The Mozilla implementation's support for XForms CSS styling is discussed here (briefly)

Chimezie Ogbuji

via Copia

Today's XML WTF: Internal entities in browsers

This unnecessary screw-up comes from the Mozilla project, of all places. Mozilla's XML support is improving all the time, as I discuss in my article on XML in Firefox, but the developer resources seem to lag the implementation, and this often leads to needless confusion. One that I ran into recently could perhaps be given the summary: "not everything in the Mozilla FAQ is accurate". From the Mozilla FAQ:

In older versions of Mozilla as well as in old Mozilla-based products, there is no pseudo-DTD catalog and the use of entities (other than the five pre-defined ones) leads to an XML parsing error. There are also other XHTML user agents that do not support entities (other than the five pre-defined ones). Since non-validating XML processors are not required to support entities (other than the five pre-defined ones), the use of entities (other than the five pre-defined ones) is inherently unsafe in XML documents intended for the Web. The best practice is to use straight UTF-8 instead of entities. (Numeric character references are safe, too.)

Look at the claim that "non-validating XML processors are not required to support entities (other than the five pre-defined ones)". Someone either didn't read the spec, or is intentionally throwing up a spec distortion field. The XML 1.0 spec provides a table in section 4.4: "XML Processor Treatment of Entities and References" which tells you how parsers are allowed to treat entities, and it flatly contradicts the bogus Mozilla FAQ statement above.

The main reason for the "WTF" is the fact that the Mozilla implementation actually gets it right. That it should. It's based on Expat. AFAIK Expat has always got this right (I've been using Expat about as long as the Mozilla project has been), so I'm not sure what inspired the above error. Mozilla should be touting its correct and useful behavior, rather than giving bogus excuses to its competitors.
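You can see the same Expat behavior without a browser. As a small sketch, Python's ElementTree (also built on Expat, and non-validating) expands an entity declared in the internal subset, both in element content and in an attribute value. The tiny document here is made up for the check and is not XHTML.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE p [
<!ENTITY internal "This is text placed as internal entity">
]>
<p title="&internal;">&internal;</p>"""

# A non-validating parse still has to expand the internal entity.
para = ET.fromstring(DOC)
content_text = para.text
title_text = para.get("title")
```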

This came up last week in the IBM developerWorks forum where a user was having problems with internal entities in XHTML. It turns out that he was missing an XHTML namespace (and based on my experimentation was probably serving up XHTML as text/html which is generally a no-no). It should have been a clear case of "Mozilla gets this right, and can we please get other browsers to fix their bugs?" but he found that FAQ entry and we both ended up victims of the red herring for a little while.

I didn't realize that the Mozilla implementation was right until I wrote a careful test case in preparation for my next Firefox/XML article. The following CherryPy code is a test server set-up for browser rendering of XHTML.

import cherrypy

INTENTITYXHTML = '''\
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
         "http://www.w3.org/TR/xhtml/DTD/xhtml1-strict.dtd" [
<!ENTITY internal "This is text placed as internal entity">
]>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
  <head>
    <title>Using Entity in xhtml</title>
  </head>
  <body>
    <p>This is text placed inline</p>
    <p>&internal;</p>
    <abbr title="&internal;">Titpaie</abbr>
  </body>
</html>
'''

class root:
    @cherrypy.expose
    def text_html(self):
        cherrypy.response.headerMap['Content-Type'] = "text/html; charset=utf-8"
        return INTENTITYXHTML

    @cherrypy.expose
    def text_xml(self):
        cherrypy.response.headerMap['Content-Type'] = "text/xml; charset=utf-8"
        return INTENTITYXHTML

    @cherrypy.expose
    def app_xml(self):
        cherrypy.response.headerMap['Content-Type'] = "application/xml; charset=utf-8"
        return INTENTITYXHTML

    @cherrypy.expose
    def app_xhtml(self):
        cherrypy.response.headerMap['Content-Type'] = "application/xhtml+xml; charset=utf-8"
        return INTENTITYXHTML

cherrypy.root = root()
cherrypy.config.update({'server.socketPort': 9999})
cherrypy.config.update({'logDebugInfoFilter.on': False})
cherrypy.server.start()

As an example, this code serves up a content type text/html when accessed through a URL such as http://localhost:9999/text_html. You should be able to work out the other URL to content type mappings from the code, even if you're not familiar with CherryPy or Python.

Firefox 1.0.7 handles all this very nicely. For text_xml, app_xml and app_xhtml you get just the XHTML rendering you'd expect, including the correct text in the attribute value with the mouse hovered over "Titpaie".

IE6 (Windows) and Safari 1.3.1 (OS X Panther) both have a lot of trouble with this.

IE6 in the text_xml and app_xml cases complains that it can't find http://www.w3.org/TR/xhtml/DTD/xhtml1-strict.dtd. In the app_xhtml case it treats the page as a download, which is reasonable, if not convenient.

Safari in the text_xml, app_xml and app_xhtml cases complains that the entity internal is undefined (??!!).

IE6, Safari and Mozilla in the text_html case all show the same output (looking, as it should, like busted HTML). That's just what you'd expect for a tag soup mode, and emphasizes that you should leave text_html out of your XHTML vocabulary.

All this confusion and implementation difference illustrates the difficulty for folks trying to deploy XHTML, and why it's probably not yet realistic to deploy XHTML without some sort of browser sniffing (perhaps by checking the Accept header, though it's well known that browsers are sometimes dishonest with this header). I understand that the MSIE7 team hopes to address such problems. I don't know whether to expect the same from Safari. My focus in research and experimentation has been on Firefox.
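For what it's worth, sniffing on the Accept header can be sketched in a few lines of Python. This is a deliberately simple illustration, not a full implementation of HTTP content negotiation: the function names are hypothetical, wildcards such as */* are intentionally not counted as XHTML support, and the sample header values are only representative of what browsers of this era send.

```python
XHTML = "application/xhtml+xml"


def accepts_xhtml(accept_header: str) -> bool:
    """True only if the header names application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != XHTML:
            continue
        quality = 1.0  # default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        return quality > 0
    return False


def choose_content_type(accept_header: str) -> str:
    """Pick the media type to serve the XHTML document under."""
    return XHTML if accepts_xhtml(accept_header) else "text/html"


# Representative header values, not captured from real sessions.
gecko_like = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"
wildcard_only = "*/*"
```

Ignoring wildcards is the cautious choice here, since a browser that sends only */* gives no evidence that it can actually render XHTML served under an XML media type.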

One final note is that Mozilla does not support external parsed entities. This is legal (and some security experts claim even prudent). The relevant part of the XML 1.0 spec is section 4.4.3:

When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the processor MUST include its replacement text. If the entity is external, and the processor is not attempting to validate the XML document, the processor MAY, but need not, include the entity's replacement text. If a non-validating processor does not include the replacement text, it MUST inform the application that it recognized, but did not read, the entity.

I would love Mozilla to adopt the idea in the next spec paragraph:

Browsers, for example, when encountering an external parsed entity reference, might choose to provide a visual indication of the entity's presence and retrieve it for display only on demand.

That would be very useful. I wonder whether it would be possible through a Firefox plug-in (probably not: I guess it would require very tight Expat integration for plug-ins).

[Uche Ogbuji]

via Copia

New life for PyXPCOM?

Way back in the day I wrote about PyXPCOM, a means of using Python to script the Mozilla browser, and the project had a lot of promise.

Mark Hammond was the considerable brains behind PyXPCOM, as well as the Win32 and .NET APIs through Python, and many other things. Indeed, he received the 2003 ActiveState Active Award in Python (the same year Mike Olson and I got one for XSLT). Unfortunately, he has been way below the radar for the past couple of years, and no one has really picked up the torch on PyXPCOM. The project has been largely languishing for so long that it was quite exciting to see Brendan Eich, keeper of the Mozilla roadmap, include the following among his "Mozilla 2.0 platform must-haves":

8. Python support, perhaps via Mono (if so, along with other programming languages).

I'm not sure just how Mono would fit in. Would they build a little CLR sandbox into Mozilla so that Python.NET code could run?

Anyway, if you care about being able to script Mozilla through Python (and I think you should), please leave a comment on Brendan's article. Here's a note about some of the comments already in place on the matter:

#8 scares me only for the potentially huge installer file. If it were optional this would be incredibly cool. If it were optional developers would have a headache.

I think it should be enough for Mozilla to include the PyXPCOM stubs, and use the user's own installed Python, which should alleviate this fear.

Hmm. What do you think about Parrot (Perl 6) support? Soon, Parrot will be something like [stable], and the hope is that it will support a lot of languages, includes Python. I would give it a chance, sounds good.

From what I've followed about Parrot and its intended use as a basis for other languages such as Python, I'm not comfortable with such an approach.

Python support can be provided via Jython which is much older than the .NET python implementation.

It seems people want to offer up every VM incarnation on the planet as a possible base for Mozilla/Python, but I'm spoiled by the potential I saw through Hammond's work, and I really would want the project to at least try picking up from there. I was therefore glad to see Brendan's response:

We already have Python integrated with XPCOM, thanks to Mark Hammond and Active State. If nothing better comes along in the way of a unified runtime, we will fully integrate Mark's work so you can write <script type="application/x-python"> in XUL.

Whether Python support will be bundled in libxul or not, I'm pushing for a scheme that lets extension languages be loaded dynamically. So if you have connectivity or can deploy an extra file, you should be able to use Python as well as JS from XUL. That's my goal, at least.

See my next entry for the Mozilla 2.0 "managed code" virtual machine goals that any would-be universal runtime has to meet, or come close to meeting, to win.

This sounds just right, and I'll keep my eye open for the follow-up article he mentions. Another poster mentions:

I would like to see the ability to talk to Mozilla from outside Python code. A program I am writing allows importing contacts from various data sources. I can do Outlook and Evolution easily, but have given up on Mozilla contacts.

In theory I need to use XPCom with the PyXPCom wrapper but I challenge anyone to actually get that working on Windows, Linux and Mac and have a redistributable program. (There are no binaries of PyXPCom for example).

Yes, PyXPCOM does allow this in theory, and I think Brendan's entire point is that it's important for Mozilla developers to put in the work to address the problem stated in the second paragraph.

If you're trying to work with PyXPCOM, keep an eye on the mailing list. Folks have been posting their problems, and others have been sharing their recipes for getting PyXPCOM to work, including Matt Campbell, Scott Robertson, Jean-François Rameau, and Michael Thornhill.

[Uche Ogbuji]

via Copia