Robert Bachmann stopped by
#atom to mention
that he'd tried to run an Atom file on the non-normative RELAX
NG for the Atom RFC draft (I
haven't seen an RNC for the final RFC itself). It failed because he
used xml:lang on an atom:name child element. That failure
contradicts the Atom spec, which says:
Any element defined by this specification MAY have an xml:lang attribute, whose content indicates the natural language for the element and its descendents.
The RNC did not specify this attribute in a couple of cases. The RNC is non-normative, but in this case there is no reason for divergence from the spec. I whipped up an atom.rnc that fixes the bug. Here's the diff from the version I found on-line.
This set off a discussion between Anne van
Kesteren and me. I feel that xml:lang
only makes sense for some Atom elements, and that perhaps allowing it on
all of them could be confusing. What, for example, does it mean to have
xml:lang on the
atom:uri child of
atom:author? I suppose an
outlandish (pun intended) interpretation could be references to
localized sites, but that's really the province of the likes of XHTML's
hreflang attribute. Moreover, I'm a bit puzzled by the bit from the
Atom spec that seems to support my leaning:
The language context is only significant for elements and attributes declared to be "Language-Sensitive" by this specification.
So if it's not significant, why allow it? I think maybe there should
have been a split in attribute sets between
atomCommonAttributes and a
atomCommonLanguageSensitiveAttributes, where the former would omit
xml:lang. Also, I'm used to the convention where
xml:lang is used with content
models that allow a language-sensitive element to be repeated, providing
for multiple language versions in the same document. There are many
cases in Atom where this would not be possible. For example, you could
not have an English
atom:title and a French one within the same
atom:entry element. You could get tricky by using a single atom:title with
type="xhtml" and multiple language versions within
xhtml:div, but this feels a bit constricting.
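To make that workaround concrete, here is a minimal Python sketch of what a consumer might have to do to pull the language versions back out of a single atom:title. The sample entry, the nested xhtml:div layout and the title_versions helper are all my own assumptions for illustration; nothing in the Atom spec prescribes this arrangement.

```python
import xml.etree.ElementTree as ET

# Clark-notation names for the elements and attribute involved.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XHTML_DIV = "{http://www.w3.org/1999/xhtml}div"
ATOM_TITLE = "{http://www.w3.org/2005/Atom}title"

# Hypothetical entry: one atom:title whose xhtml:div wrapper holds
# one inner xhtml:div per language version.
SAMPLE = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <title type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <div xml:lang="en">A bilingual title</div>
      <div xml:lang="fr">Un titre bilingue</div>
    </div>
  </title>
</entry>"""


def title_versions(entry):
    """Return {language tag: title text} for the inner xhtml:div elements."""
    wrapper = entry.find(ATOM_TITLE).find(XHTML_DIV)
    return {
        div.get(XML_LANG): "".join(div.itertext()).strip()
        for div in wrapper.findall(XHTML_DIV)
    }


versions = title_versions(ET.fromstring(SAMPLE))
print(versions)
```

The consumer has to know about this private layout in advance, which is part of why the approach feels constricting compared with simply repeating the element.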
Anne doesn't mind
xml:lang everywhere, and pointed out that
xml:lang="" is an option for specifying no language context (rather
than language context inherited from parent). I think in the end I
could go either way on the question.
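Anne's point about xml:lang="" is easy to see in code. The following is a rough Python sketch of how a consumer could resolve the language context of each element, with an empty value cancelling whatever was inherited. The sample entry and the effective_languages helper are hypothetical, and the sketch ignores the spec's "Language-Sensitive" distinction entirely.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry, made up for illustration.
SAMPLE = """\
<entry xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Hello</title>
  <author>
    <name xml:lang="fr">Example Author</name>
    <uri xml:lang="">http://example.org/</uri>
  </author>
</entry>"""


def effective_languages(elem, inherited=None, out=None):
    """Map each element's local name to its language context.

    None means "no language context", either because nothing was
    declared or because xml:lang="" cancelled the inherited value.
    """
    if out is None:
        out = {}
    lang = elem.get(XML_LANG, inherited)
    if lang == "":
        lang = None  # xml:lang="" resets the context
    out[elem.tag.replace(ATOM, "")] = lang
    for child in elem:
        effective_languages(child, lang, out)
    return out


langs = effective_languages(ET.fromstring(SAMPLE))
print(langs)
```

Here atom:title and atom:author inherit "en" from atom:entry, atom:name overrides it with "fr", and atom:uri ends up with no language context at all.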
This discussion also made me think of
xml:space. This special
attribute might get a mention right in the XML
spec, but that doesn't
mean it doesn't have to be addressed in XML applications. Even in the
case of DTD, the spec says
In valid documents, this attribute, like any other, must be declared if it is used.
The same goes for RELAX NG, the conventional schema language for Atom.
There is no
xml:space to be found in either the normative RFC or the
non-normative schema, but the rules for Atom
allow for this attribute (as well as
xml:id and just about any other
XML or 'global' attribute). I assume that the intention is for
applications to treat this attribute using the suggested semantics in
the XML 1.0 spec. I do wish Atom had been explicit about this, as
the XSLT 1.0 spec is, for example.
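For what it's worth, here is a small Python sketch of how an application might apply the XML 1.0 semantics to Atom text: find the nearest xml:space declaration on the element or its ancestors, and leave whitespace alone only when it says "preserve". The sample entry and the text_honoring_xml_space helper are hypothetical, and collapsing whitespace in the "default" case is my own assumption, since XML 1.0 leaves default processing up to the application.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry, made up for illustration.
SAMPLE = """\
<entry xmlns="http://www.w3.org/2005/Atom" xml:space="preserve">
  <title>  padded title  </title>
  <summary xml:space="default">  padded summary  </summary>
</entry>"""


def text_honoring_xml_space(root, elem):
    """Return elem's text, honoring the nearest xml:space declaration."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    mode = "default"
    node = elem
    while node is not None:
        declared = node.get(XML_SPACE)
        if declared is not None:
            mode = declared
            break
        node = parents.get(node)
    text = elem.text or ""
    # Assumption: "default" processing here means collapsing whitespace.
    return text if mode == "preserve" else " ".join(text.split())


root = ET.fromstring(SAMPLE)
title_text = text_honoring_xml_space(root, root.find(ATOM + "title"))
summary_text = text_honoring_xml_space(root, root.find(ATOM + "summary"))
print(repr(title_text), repr(summary_text))
```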