I'll cut your ass in half and leave you with a semi-colon
—Mr. Man
QNames in content have been on my brain today. See a follow-up posting
for more on why.
First of all I think we should find a new name for this phenomenon,
because I don't think QNames qua QNames are key to the problem. For one
thing, you have the problem even if you only use a prefix in content, as
XSLT does in, say, the extension-element-prefixes attribute, or XPath in
html:*/html:span (the second step is a QName, but not the first). I
think a better name for this problem is "hidden namespaces" because
that's exactly the problem: the document depends on a construct that is
hiding a namespace in a separate layer where generic processing cannot
see it.
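To make the hiding concrete, here is a minimal Python sketch using the standard library's ElementTree. The stylesheet fragment is made up for illustration; only the XPath expression comes from the text above. It shows that a namespace-aware parser resolves the element name to its namespace but hands back the attribute value as plain text, prefixes and all.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Illustrative stylesheet fragment (an assumption, not taken from a real project).
DOC = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <xsl:copy-of select="html:*/html:span"/>
  </xsl:template>
</xsl:stylesheet>
"""

root = ET.fromstring(DOC)
copy_of = root.find(".//{%s}copy-of" % XSLT_NS)

# The parser resolved the element's prefix to a namespace name...
element_name = copy_of.tag
# ...but the prefixes inside the attribute value remain opaque text.
hidden = copy_of.get("select")

print(element_name)
print(hidden)
```

Nothing in the parsed tree ties the html prefix in that string back to the XHTML namespace; only an application that knows the attribute holds XPath can make the connection.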
Whatever the name, I re-read today a couple of important documents
regarding the issue. First of all there is the TAG finding on
QNames, which is
unfortunately not much more than an agglomeration of existing wisdom.
Norm Walsh, the editor of that document, wrote of a more radical
direction as part of his "XML
2.0" article. I like his
ideas (though I'm partial to Jeffrey Yasskin's ampersand
variation), and I
hope conversation soon drives toward something along those lines. XML
is almost ten years old, and I see nothing wrong with a bit of a shake-up.
Until then, I think we can deploy two safeguards to protect ourselves
from the subtle problems of namespaces. I call them: "sanity within the
document, and registries without". The two components are very
different in character.
Firstly, we need to discard the idea of in-document scoping of
namespaces. It seemed a great idea at the time, even to me, but in
practice it's a mess, and Joe English was the first one to illuminate
the mess in the light of a brilliant
metaphor (Google cache of original since XML-DEV is down now).
(See my article "Principles of XML design: Use XML namespaces with
care"
for more on this). If we can rely on sanity in XML documents we can at
least simplify state processing a good deal. Ideally all the XML
sources in an XML processing pipeline would emit sane XML.
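A rough sanity check is easy to sketch in Python. This assumes "sane" means, roughly, that within one document no prefix is bound to more than one namespace and no namespace is bound to more than one prefix; that is my paraphrase for illustration, not a quotation of the original definition. The function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def namespace_bindings(xml_text):
    """Collect every prefix/namespace declaration made anywhere in the document."""
    by_prefix = defaultdict(set)
    by_uri = defaultdict(set)
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events yield one (prefix, uri) pair per namespace declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        by_prefix[prefix].add(uri)
        by_uri[uri].add(prefix)
    return by_prefix, by_uri


def is_sane(xml_text):
    """True if prefixes and namespaces map one-to-one (assumed definition)."""
    by_prefix, by_uri = namespace_bindings(xml_text)
    one_uri_per_prefix = all(len(uris) == 1 for uris in by_prefix.values())
    one_prefix_per_uri = all(len(prefixes) == 1 for prefixes in by_uri.values())
    return one_uri_per_prefix and one_prefix_per_uri


print(is_sane('<a xmlns:x="urn:one"><x:b/></a>'))
print(is_sane('<a xmlns:x="urn:one"><b xmlns:x="urn:two"/></a>'))
```

A pipeline stage could run a check like this on its output and refuse to emit anything that rebinds a prefix partway through the document.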
Secondly, I think the time has come for namespace registries. It would
definitely be nice to build on the unfortunately stalled
RDDL, but whereas the goal of RDDL is to
provide human readable information, what I think we really need in a
namespace registry is a little nugget of machine-readable data.
Drumroll please...
A list of preferred prefixes for a namespace (supporting lookup of
namespace name to well-known prefix, and vice versa). I know this will
be controversial. Prefixes are supposed to be insignificant. Users
should have flexibility to use whatever prefix, blah blah blah. I'm
sorry, but that's all theoretically nice, but we have practical problems
to solve. The fact that the most powerful constructs in XPath depend
for their semantics on the whimsy of prefix choices should bother you a
bit. The fact that Canonical XML had to abandon the idea of normalizing
prefixes should bother you even more. It's time to just say that xsl
means "The XSLT namespace" (yeah, yeah: "what version?" etc.—hard
problems would still remain) and that if you choose to use it for a
different namespace, you're technically compliant with namespaces, but
you're asking for a heaping helping of trouble, buddy.
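The nugget of machine-readable data could be as small as a two-way lookup table. The sketch below is purely hypothetical: the entries, the function names, and the idea of listing several prefixes with the first one preferred are all assumptions for discussion, not an existing registry format. The namespace names themselves are the real W3C ones.

```python
# Hypothetical registry contents: namespace name -> preferred prefixes,
# most preferred first. A real registry would be published and maintained elsewhere.
PREFERRED_PREFIXES = {
    "http://www.w3.org/1999/XSL/Transform": ["xsl"],
    "http://www.w3.org/1999/xhtml": ["html", "xhtml"],
    "http://www.w3.org/2000/svg": ["svg"],
}

# Reverse index for the "vice versa" lookup: well-known prefix -> namespace name.
NAMESPACE_FOR_PREFIX = {
    prefix: uri
    for uri, prefixes in PREFERRED_PREFIXES.items()
    for prefix in prefixes
}


def preferred_prefix(uri):
    """Return the most preferred prefix for a namespace, or None if unregistered."""
    prefixes = PREFERRED_PREFIXES.get(uri)
    return prefixes[0] if prefixes else None


def namespace_for(prefix):
    """Return the namespace a well-known prefix stands for, or None if unregistered."""
    return NAMESPACE_FOR_PREFIX.get(prefix)


def binding_is_conventional(prefix, uri):
    """False only when a registered prefix is bound to some other namespace."""
    registered = NAMESPACE_FOR_PREFIX.get(prefix)
    return registered is None or registered == uri


print(preferred_prefix("http://www.w3.org/1999/xhtml"))
print(namespace_for("xsl"))
print(binding_is_conventional("xsl", "urn:example:not-xslt"))
```

With a table like this, a tool that meets html:span in content could make a reasonable guess at the hidden namespace, and a linter could flag a document that binds xsl to something else.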
For now I'm just throwing out ideas to help organize my thoughts, and
for discussion. It seems to me that if we could rely on authors and
tools, supported by a registry, to produce sane documents that (wherever
possible) used essentially reserved prefixes, including, of course, for
hidden namespaces, we could simplify namespace-aware processing a great
deal. I can think of some practical hurdles for the registry idea, but
I can't think of any reason why it's not even worth a try.
[Uche Ogbuji]