Dave Pawson asked for help with using Python's os.walk() to emit a
nested XML representation of a directory listing. The semantics of
os.walk make this a bit awkward, and I have a good deal to say on the
matter, but first I want to post some code for Dave and others with
the same need before diving into a fuller discussion. Here's the code.

import os
import sys

# MarkupWriter comes from 4Suite (the Ft package)
from Ft.Xml import MarkupWriter

root = sys.argv[1]
writer = MarkupWriter(indent=u"yes")

def recurse_dir(path):
    for cdir, subdirs, files in os.walk(path):
        writer.startElement(u'directory', attributes={u'name': unicode(cdir)})
        for f in files:
            writer.simpleElement(u'file', attributes={u'name': unicode(f)})
        for subdir in subdirs:
            # Descend ourselves so each child element nests inside its parent
            recurse_dir(os.path.join(cdir, subdir))
        writer.endElement(u'directory')
        # Only the first tuple from os.walk is used; stop before it descends
        break

writer.startDocument()
recurse_dir(root)
writer.endDocument()

Save it as dirwalker.py or whatever. The following is sample usage
(in UNIXese):
$ mkdir foo
$ mkdir foo/bar
$ touch foo/a.txt
$ touch foo/b.txt
$ touch foo/bar/c.txt
$ touch foo/bar/d.txt
$ python dirwalker.py foo/
<?xml version="1.0" encoding="UTF-8"?>
<directory name="foo/">
  <file name="a.txt"/>
  <file name="b.txt"/>
  <directory name="foo/bar">
    <file name="c.txt"/>
    <file name="d.txt"/>
  </directory>
</directory>
$ rm -rf foo
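
Notice that the code preempts the recursiveness of os.walk in order to
impose its own recursion: the break ends each os.walk loop after the
first directory, and recurse_dir descends into the subdirectories
itself. This is the touchy issue I want to expand on. Check in later
on today...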
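
For comparison, here is a minimal sketch of the opposite approach: let os.walk do all the descending and rebuild the nesting from its flat stream of (directory, subdirectories, files) tuples. It is not the code from this post. It assumes Python 3 and the standard library's xml.etree.ElementTree in place of 4Suite's MarkupWriter, and the function name walk_to_xml and the parents dictionary are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def walk_to_xml(root):
    """Build nested <directory> elements from a single os.walk pass.

    Hypothetical helper, not part of the original script.
    """
    parents = {}  # path of a subdirectory not yet visited -> parent's element
    top = None
    for cdir, subdirs, files in os.walk(root):
        subdirs.sort()  # sorting in place fixes the order os.walk descends in
        parent = parents.pop(cdir, None)
        if parent is None:
            # The first tuple describes the root directory itself
            elem = top = ET.Element("directory", name=cdir)
        else:
            elem = ET.SubElement(parent, "directory", name=cdir)
        for f in sorted(files):
            ET.SubElement(elem, "file", name=f)
        for d in subdirs:
            # os.walk reports each child as os.path.join(cdir, d) later on,
            # so that joined path is the lookup key
            parents[os.path.join(cdir, d)] = elem
    return top


if __name__ == "__main__":
    # Recreate the foo/ layout from the sample session in a scratch directory
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "foo")
        os.makedirs(os.path.join(root, "bar"))
        for parts in (("a.txt",), ("b.txt",), ("bar", "c.txt"), ("bar", "d.txt")):
            open(os.path.join(root, *parts), "w").close()
        tree = walk_to_xml(root)
        ET.indent(tree)  # pretty-printing helper, available from Python 3.9
        print(ET.tostring(tree, encoding="unicode"))
```

The extra bookkeeping is the price of going along with os.walk: because each directory arrives as its own tuple with no nesting information, the parent of every pending subdirectory has to be remembered by path. This only works with the default top-down order, where a parent is always reported before its children. Symbolic links to directories are listed by os.walk but not descended into by default, so in this sketch they would not show up as elements.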
via Copia