Re: parsing config files from an api... xml?

From: Måns Rullgård (mru_at_mru.ath.cx)
Date: 10/03/04


Date: Sun, 03 Oct 2004 00:02:38 +0200

Christopher Browne <cbbrowne@acm.org> writes:

> Martha Stewart called it a Good Thing when juliagoolia@eml.cc (julia) wrote:
>> I was wondering if there is project or library out there which
>> allows one to parse and edit, update various config files on linux.
>> For example, I'd like an API to parse the apache config files to add
>> new virtual servers.
>>
>> The problem of course is that all the files follow a different
>> syntax. It would be nice if there was a standard like XML or
>> something. That way, someone could easily edit which ever config
>> they wanted using the same API.
>>
>> I want to do this because I'd like to be able to easily configure a
>> linux server via the web. Webmin is pretty good, but I'd like to
>> simplify it further.
>
> XML would by no means make things "simpler;" it conspicuously
> introduces a way to make fatal formatting errors easier to introduce
> and harder to find.

The way XML has developed, the only practical way to use it is to
generate it programmatically, as well as to parse it. Even then, there are subtle
ways to get things wrong and end up with garbage. A little horror
story may illustrate.

I need to create some simple images (buttons etc. for a web page) from
data in an XML file. I do this by applying an XSLT transformation
producing SVG files, which are then rasterized by Batik. At first, I
did this by running the xslt program from the Apache Xalan package,
and then running batik-rasterizer on the files produced. Everything
worked well, but it was slow, and not automated. So I decided to do
it programmatically from a Java servlet in the Apache Tomcat web
server. This is where things get weird.

Performing the XSLT transform was simple. It required only half a
dozen or so temporary objects to feed the garbage collector. What
seemed like the best approach was to have the transformation
produce a DOM tree and hand this over to Batik to produce PNG files.
Not so. It produced files, for sure. There was just this little
annoyance that they were all the correct dimensions, but completely
transparent. There are ways to produce transparent PNG files using
somewhat fewer CPU cycles.
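
For reference, the "transform straight into a DOM tree" step can be
sketched with nothing but the JDK's javax.xml.transform classes. This
is not the code from the servlet described above: the stylesheet, the
input document and the class and method names are all invented for the
illustration, and the hand-over to Batik is left out.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

final class XsltToDom {
    // Hypothetical stylesheet: turns <button .../> into a tiny SVG document.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns='http://www.w3.org/2000/svg'>"
      + "<xsl:template match='/button'>"
      + "<svg width='{@width}' height='{@height}'>"
      + "<text x='4' y='16'><xsl:value-of select='@label'/></text>"
      + "</svg>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation and returns the result as an in-memory DOM
    // tree instead of a serialized file.
    static Document transform(String inputXml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        DOMResult result = new DOMResult(); // the transformer creates the Document
        t.transform(new StreamSource(new StringReader(inputXml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document svg = transform("<button width='120' height='24' label='OK'/>");
        System.out.println(svg.getDocumentElement().getNodeName());
        System.out.println(svg.getDocumentElement().getAttribute("width"));
    }
}
```

The Document returned here is the kind of tree that would then be
handed to the rasterizer.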

As a natural way to debug the problem, I saved the intermediate SVG
DOM trees to files, hoping to find some error there. Curiously
enough, these were perfect, and were rendered properly by Batik.
Continuing my bug hunt, I modified the Java code to feed these SVG
files to the rasterizer, instead of using the in-memory DOM tree. As
if by magic, the files came out exactly the way I intended them to
look.

A Google session turned up some pages explaining how saving a DOM tree
to disk and reparsing it will under certain circumstances produce a
slightly different tree, which somehow is also the correct one.
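
One circumstance of this kind can be shown with JDK classes alone. The
post does not say which circumstance applied, so treat this as an
assumed illustration rather than a diagnosis: an element created with
the DOM Level 1 call createElement() has no namespace, even when it
carries an xmlns attribute, while a namespace-aware parser reading the
serialized text assigns one. The class and method names below are made
up for the sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DomRoundTrip {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder();
    }

    // Builds a tree the DOM Level 1 way: the element has a name and an
    // xmlns attribute, but no namespace URI of its own.
    static Document buildInMemory() throws Exception {
        Document doc = newBuilder().newDocument();
        Element svg = doc.createElement("svg");
        svg.setAttribute("xmlns", SVG_NS);
        doc.appendChild(svg);
        return doc;
    }

    // Serializes the tree to text with an identity transform, then parses
    // that text again with a namespace-aware parser.
    static Document roundTrip(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return newBuilder().parse(new InputSource(new StringReader(out.toString())));
    }

    public static void main(String[] args) throws Exception {
        Document inMemory = buildInMemory();
        Document reparsed = roundTrip(inMemory);
        // The two trees serialize to the same text, yet disagree here.
        System.out.println("in memory: " + inMemory.getDocumentElement().getNamespaceURI());
        System.out.println("reparsed:  " + reparsed.getDocumentElement().getNamespaceURI());
    }
}
```

A consumer that looks elements up by namespace would treat the two
trees differently even though they look identical on disk. Whether
that is what happened with Batik here is a guess.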

I have always had my doubts about the merits of XML, and this
experience shows me that there is indeed reason to be careful about
thoughtlessly jumping on the bandwagon, just because XML happens to
have received a bit more hype than it deserves.

-- 
Måns Rullgård
mru@mru.ath.cx