Re: Please brainstorm: Word-processor compatible with version control

On Tue, 10 Feb 2009 13:20:26 -0600, Ron Johnson wrote:

On 02/10/2009 12:59 PM, Hendrik Boom wrote:
I'd like a word processor compatible with version control systems
(hereafter abbreviated VCS). Having been duly impressed for decades now
by how useful VCSs are for programming, I'd like to use them for writing as

I use monotone as my VCS, but I don't suppose my trials are unique to

There are a few other requirements, too, such as the ability to export to
file formats often demanded by publishers (such as PDF, Word, and plain
ASCII text).

AbiWord's XML is probably close to what you want.

Last time I looked, AbiWord's XML was absolutely stuffed with
obsessive-compulsive layout information. Presumably to make sure that
what I get is exactly what I got the first time by accident.

Has it changed?

Here are more details. Most of the problem is that the file formats
inflate tiny changes into huge changes.

(1) When I arrive at two versions of a document (maybe one has spelling
errors corrected, and the other is rewritten from a different POV), I'd
like to be able to merge the changes. Often there are one-word
changes that appear on the same line of text. Conventional merge tools
just register this as a conflict, even though it's trivial to resolve.
This is because VCSs tend to be line-oriented.
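
To make point (1) concrete, here is a minimal sketch of a word-level three-way merge for a single paragraph, written in Python with the standard difflib module. The function names word_edits and merge_words are made up for illustration; this is not how monotone or any other VCS merges. It assumes whitespace inside the paragraph carries no meaning, and it deliberately reports a conflict when both sides change the same or adjacent words differently.

```python
import difflib


def word_edits(base, other):
    """List (start, end, replacement) for each word-level change from base to other."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2, tuple(other[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def merge_words(base_text, ours_text, theirs_text):
    """Three-way merge of one paragraph at word granularity.

    Raises ValueError when the two sides touch the same or adjacent words
    in different ways; an identical edit made on both sides is applied once.
    """
    base = base_text.split()
    ours = word_edits(base, ours_text.split())
    theirs = word_edits(base, theirs_text.split())

    # Conservative conflict check: overlapping or adjacent edits must be identical.
    for s1, e1, r1 in ours:
        for s2, e2, r2 in theirs:
            touching = s1 <= e2 and s2 <= e1
            if touching and (s1, e1, r1) != (s2, e2, r2):
                raise ValueError(f"conflict near word {min(s1, s2)}")

    # Apply the combined, non-overlapping edits to the base from left to right.
    merged, pos = [], 0
    for start, end, replacement in sorted(set(ours) | set(theirs)):
        merged.extend(base[pos:start])
        merged.extend(replacement)
        pos = end
    merged.extend(base[pos:])
    return " ".join(merged)


base = "The quick brown fox jumps over the lazy dog"
ours = "The quick brown fox leaps over the lazy dog"
theirs = "The quick brown fox jumps over the sleepy dog"
print(merge_words(base, ours, theirs))
# The quick brown fox leaps over the sleepy dog
```

A line-oriented tool sees both edits as changes to the same line and flags a conflict; at word granularity they do not touch each other, so they merge cleanly.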

A user might want to, for example, change the margins, or convert from
single-column to multi-column. That's why single-line-per-paragraph is
so useful.

Useful, yes, if your editor does word-wrapping without entering the extra
line-breaks into the file.

(2) Word processors tend to insert an overkill of layout information.
Often a simple change of layout policy causes every line of the text to
be changed, leaving proper merging hopeless. In the past, AbiWord
suffered from this. I have no idea if it still does. Precise layout
information belongs in a style sheet, not in the main text. I thought
this was understood since the days of SGML.

I think it still does. But a line in a paragraph, so maybe it's better

(3) Word processors that leave text in a human-readable form (properly
word-wrapped, for example) cause insertion of a single character (such
as a spelling change) to affect the layout of entire paragraphs.
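
The effect in (3) is easy to reproduce. The sketch below (Python, standard library only) hard-wraps a toy paragraph before and after a one-character insertion and counts how many stored lines differ, then does the same with the paragraph kept on a single line. The helper name diff_size and the sample text are invented for the illustration.

```python
import difflib
import textwrap


def diff_size(old_lines, new_lines):
    """Return (lines removed, lines added) between two lists of lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


before = "aaaa bbbb cccc dddd eeee ffff"
after = "aaaaa bbbb cccc dddd eeee ffff"  # one character inserted

# Stored hard-wrapped at 9 columns: the insertion pushes words down the paragraph.
wrapped = diff_size(textwrap.wrap(before, 9), textwrap.wrap(after, 9))

# Stored as one line per paragraph: only that one line changes.
unwrapped = diff_size([before], [after])

print(wrapped, unwrapped)
# (3, 4) (1, 1)
```

In the hard-wrapped form every line of the paragraph is rewritten, so a second, independent edit anywhere in that paragraph will collide with it in a line-oriented merge.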

(4) Word processors that use a binary file format are hopelessly
inaccessible to a VCS. Word and WordPerfect are examples of this. So
is the ODT file format used by OpenOffice.

ODT is zipped XML. Otherwise, they'd be *huge*.
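
Because an .odt file is an ordinary zip archive, the XML can at least be pulled out for inspection or diffing. A minimal sketch in Python, assuming the usual OpenDocument package layout in which the document body lives in a member named content.xml; the function name odt_content_xml is made up here:

```python
import zipfile


def odt_content_xml(odt_file):
    """Return the text of content.xml from an .odt package (path or file object)."""
    with zipfile.ZipFile(odt_file) as package:
        return package.read("content.xml").decode("utf-8")
```

This only makes the text visible to a diff tool. It does not by itself make the document mergeable, since the VCS still stores the compressed archive.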

There's .fodt, too, which is similar, but everything's in one file, and is
*not* compressed. Unfortunately, the entire document now appears to be
one line of text. It seems it would be easy for OpenOffice to insert
gratuitous newlines into the text in standard places, perhaps after every
sentence, or after every markup tag, without changing the semantics of the

I'm currently using an ad-hoc notation in UTF-8, edited in emacs,
formatted by homebrew code. I'm careful never to change the source
layout significantly while editing, but even so I have trouble merging
multiple independent changes within a line. Breaking it all up into a
sequence of one-word lines is technically feasible, and will work with
most VCSs, but is a hopeless way to edit.
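
One way to get the one-word-per-line form without having to edit in it is to convert on the way into and out of the VCS. The following is only a sketch in Python with hypothetical function names. It assumes paragraphs are separated by blank lines and that whitespace inside a paragraph is not significant, so the round trip normalises each paragraph to a single line.

```python
import re


def to_word_lines(text):
    """Storage form: one word per line, an empty line between paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join("\n".join(p.split()) for p in paragraphs)


def from_word_lines(stored):
    """Editing form: one line per paragraph, a blank line between paragraphs."""
    return "\n\n".join(" ".join(p.split("\n")) for p in stored.split("\n\n"))


draft = "It was a dark\nand stormy night.\n\nThe rain fell."
stored = to_word_lines(draft)
print(from_word_lines(stored))
# It was a dark and stormy night.
#
# The rain fell.
```

The cost is that the file in the repository is no longer the file being edited, so every checkout and commit has to go through the conversion.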

I suspect I'll be able to hack up something to export to *some* of the
more conventional file-formats. I'm already producing PostScript my
printer will take, and a weird mark-up that cuts and pastes well into

Isn't there something that already does most of what I really need?

I'd take another look at AbiWord. And maybe file a couple of specific
bugs against it regarding integration with VCS.

-- hendrik
