Dave Pawson asked for help with using Python's os.walk() to emit a
nested XML representation of a directory listing. The semantics of
os.walk make this a bit awkward, and I have a good deal to say on the
matter, but I first wanted to post some code for Dave and others with
such a need before diving into a fuller discussion. Here's the code.
    import os
    import sys
    root = sys.argv[1]

    from Ft.Xml import MarkupWriter
    writer = MarkupWriter(indent=u"yes")

    def recurse_dir(path):
        for cdir, subdirs, files in os.walk(path):
            writer.startElement(u'directory', attributes={u'name': unicode(cdir)})
            for f in files:
                writer.simpleElement(u'file', attributes={u'name': unicode(f)})
            for subdir in subdirs:
                recurse_dir(os.path.join(cdir, subdir))
            writer.endElement(u'directory')
            break

    writer.startDocument()
    recurse_dir(root)
    writer.endDocument()
Save it as dirwalker.py or whatever. The following is sample usage
(in UNIXese):
    $ mkdir foo
    $ mkdir foo/bar
    $ touch foo/a.txt
    $ touch foo/b.txt
    $ touch foo/bar/c.txt
    $ touch foo/bar/d.txt
    $ python dirwalker.py foo/
    <?xml version="1.0" encoding="UTF-8"?>
    <directory name="foo/">
      <file name="a.txt"/>
      <file name="b.txt"/>
      <directory name="foo/bar">
        <file name="c.txt"/>
        <file name="d.txt"/>
      </directory>
    </directory>
    $ rm -rf foo
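Notice that the code is really preempting the recursiveness of os.walk
in order to impose its own recursion: the break ends the loop after the
first tuple os.walk yields, so each call handles a single directory and
recurse_dir descends into the subdirectories itself. This is the touchy
issue I want to expand on. Check in later on today...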
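For readers without 4Suite at hand, here is a minimal sketch of the same preemption trick. It makes two assumptions that are not in the code above: it targets Python 3, and it swaps in the standard library's xml.etree.ElementTree for MarkupWriter. The name dir_to_element is hypothetical. Taking only the first tuple with next() does the job that the break does in the loop version.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def dir_to_element(path):
    """Return a <directory> element for path, recursing by hand.

    Hypothetical helper: ElementTree stands in for 4Suite's MarkupWriter.
    """
    # Take only the first (cdir, subdirs, files) tuple from os.walk,
    # which has the same effect as the `break` in the loop version.
    first = next(os.walk(path), None)
    if first is None:
        # os.walk yields nothing for a missing or unreadable directory.
        return ET.Element("directory", {"name": path})
    cdir, subdirs, files = first
    elem = ET.Element("directory", {"name": cdir})
    for f in sorted(files):
        ET.SubElement(elem, "file", {"name": f})
    for subdir in sorted(subdirs):
        # Our own recursion replaces the descent os.walk would have made.
        elem.append(dir_to_element(os.path.join(cdir, subdir)))
    return elem


if __name__ == "__main__":
    # Build a small throwaway tree and print its XML form.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "foo", "bar"))
        open(os.path.join(tmp, "foo", "a.txt"), "w").close()
        open(os.path.join(tmp, "foo", "bar", "c.txt"), "w").close()
        tree = dir_to_element(os.path.join(tmp, "foo"))
        print(ET.tostring(tree, encoding="unicode"))
```

Sorting the names is a choice of this sketch, made only so the output order is stable; os.walk itself makes no promise about ordering.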
via Copia