Robert Bachmann stopped by #atom to mention that he'd tried to validate an Atom file against the non-normative RELAX NG schema for the Atom RFC draft (I haven't seen an RNC for the final RFC itself). Validation failed because he used xml:lang in an atom:name child of atom:author. That rejection contradicts the Atom spec, which says:
Any element defined by this specification MAY have an xml:lang attribute, whose content indicates the natural language for the element and its descendents.
The RNC did not specify this attribute in a couple of cases. The RNC is non-normative, but in this case there is no reason for divergence from the spec. I whipped up an atom.rnc that fixes the bug. Here's the diff from the version I found on-line.
This did set up a discussion between Anne van Kesteren and me. I feel that xml:lang only makes sense for some Atom elements, and that perhaps allowing it on all of them could be confusing. What, for example, does it mean to have xml:lang on the atom:uri child of atom:author? I suppose an outlandish (pun intended) interpretation could be references to localized sites, but that's really the province of the likes of XHTML's hreflang attribute. Moreover, I'm a bit puzzled by the bit from the Atom spec that seems to support my leaning:
The language context is only significant for elements and attributes declared to be "Language-Sensitive" by this specification.
So if it's not significant, why allow it? I think maybe there should have been a split in attribute sets between atomCommonAttributes and an atomCommonLanguageSensitiveAttributes, where the former would omit xml:lang.
Also, I'm used to the convention where xml:lang is used with content models that allow a language-sensitive element to be repeated, providing for multiple language versions in the same document. There are many cases in Atom where this would not be possible. For example, you could not have an English atom:title and a French one within the same atom:entry element. You could get tricky by using a single atom:title with type="xhtml" and multiple language versions within the xhtml:div, but this feels a bit constricting.
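To make that workaround concrete, here is a minimal sketch in Python. It assumes a single text construct (atom:title here) with type="xhtml" whose xhtml:div carries one span per language. The entry, its wording, and the use of span elements are all made up for illustration, and the entry leaves out other required children such as atom:id and atom:updated.

```python
import xml.etree.ElementTree as ET

# Clark-notation name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Hypothetical entry: one title, two language versions inside the div.
ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <title type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <span xml:lang="en">Hello, world</span>
      <span xml:lang="fr">Bonjour, le monde</span>
    </div>
  </title>
</entry>"""

entry = ET.fromstring(ENTRY)
spans = entry.findall("atom:title/xhtml:div/xhtml:span", NS)

# Collect the title text keyed by its declared language.
titles = {span.get(XML_LANG): span.text for span in spans}
print(titles)
```

A consumer has to know to look inside the div for the language variants, which is part of why this feels constricting compared with simply repeating atom:title.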
Anne doesn't mind xml:lang everywhere, and pointed out that xml:lang="" is an option for specifying no language context (rather than language context inherited from parent). I think in the end I could go either way on xml:lang everywhere.
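The inheritance rule, and the xml:lang="" escape hatch Anne mentioned, can be sketched in a few lines of Python. This is only an illustration of how a consumer might compute the language context for each element. The feed content and the helper name effective_languages are hypothetical, and the sketch ignores the spec's notion of which elements are actually "Language-Sensitive".

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: a default language, an override, and a reset.
FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Example feed</title>
  <author>
    <name xml:lang="fr">Exemple</name>
    <uri xml:lang="">http://example.org/</uri>
  </author>
</feed>"""


def effective_languages(root):
    """Map each element to the language it declares or inherits."""
    result = {}

    def walk(elem, inherited):
        # An explicit xml:lang, even an empty one, overrides the parent's.
        lang = elem.get(XML_LANG, inherited)
        # An empty value means "no language context", reported as None.
        result[elem] = lang or None
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


root = ET.fromstring(FEED)
by_tag = {
    elem.tag.replace(ATOM, "atom:"): lang
    for elem, lang in effective_languages(root).items()
}
print(by_tag)
```

Here atom:title and atom:author pick up "en" from the feed, atom:name overrides it with "fr", and atom:uri ends up with no language context at all.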
This discussion also made me think of xml:space. This special attribute might get a mention right in the XML spec, but that doesn't mean it doesn't have to be addressed in XML applications. Even in the case of DTD, the spec says:
In valid documents, this attribute, like any other, must be declared if it is used.
The same goes for RELAX NG, the conventional schema language for Atom. There is no xml:space to be found in either the normative RFC or the non-normative schema, but the rules for Atom's undefinedAttribute do allow for this attribute (as well as xml:id and just about any other XML or 'global' attribute). I assume that the intention is for applications to treat this attribute using the suggested semantics in the XML 1.0 spec. I do wish Atom had been explicit about this, as is, for example, the XSLT 1.0 spec.
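As a rough illustration of why xml:space and xml:id slip through, here is a Python predicate that mimics the name class I understand undefinedAttribute to use, namely any attribute except xml:base, xml:lang, and attributes in no namespace. That reading of the schema is my assumption, and the function name is hypothetical, so check it against the RNC before relying on it.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def allowed_as_undefined_attribute(namespace, local_name):
    """Approximate the assumed name class: * - (xml:base | xml:lang | local:*)."""
    if not namespace:
        # Attributes in no namespace (local:*) are excluded.
        return False
    if namespace == XML_NS and local_name in ("base", "lang"):
        # These two are declared explicitly among the common attributes.
        return False
    return True


# xml:space and xml:id are not excluded, so the wildcard admits them.
print(allowed_as_undefined_attribute(XML_NS, "space"))
print(allowed_as_undefined_attribute(XML_NS, "id"))
print(allowed_as_undefined_attribute(XML_NS, "lang"))
```

Under that assumption the schema accepts xml:space as just another foreign attribute, which says nothing about how an Atom processor should interpret it.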