

Stephane Rodriguez

Great paper. I'll add my own comments to it.

First some terminology,

"ECMA 376" is a set of file formats subject to ECMA and now to ISO.

"Office 2007" is a set of file formats which extend "ECMA 376" file formats.

The Office 2007 file formats themselves are undocumented; the ECMA 376 formats are documented.

ECMA 376 file formats are documented, but only at a syntactic level. Once you work out the true meaning of every single attribute, you realize the documentation would need to be more like 600,000 pages, not 6,000. A particular difficulty is keeping some kind of control over the virtually infinite combinations of such attributes.

A quick analysis of the underlying schemas reveals that a simple concept such as text formatting is expressed in no fewer than 6 different and incompatible ways. This suggests that the file formats were designed only to comply with existing legacy formats, which are themselves the result of 15 years of inside/outside library aggregation (some of the libraries were bought from non-Microsoft vendors). In fact, ask any third-party reverse engineer who worked with the legacy formats and they'll tell you Microsoft essentially added angle brackets around the binary serialization in legacy formats. This makes for a very cool XML-based file format, not an international standard.

ECMA 376 documents do not appear spontaneously. Microsoft has arranged a few migration scenarios, especially from its legacy formats (i.e. Word 97-2003, Excel 97-2003, PowerPoint 97-2003), none of which are documented. The mappings are left for one to discover; it is not possible to read the ECMA 376 paper and infer them.

ECMA 376 documents also support round-trip scenarios, in which "bits" of the new formats are stored as secret records within legacy formats, kept as is, and reappear as objects if those legacy files are migrated again to the newer formats. None of those secret records are documented, making it impossible for a vendor outside Microsoft to infer them from reading the ECMA 376 paper.

ECMA 376 formats contain many US-centric coordinate systems and units of measure, which by definition can survive neither analysis nor the needs of an international standard.


Minor nit: "standard ultimately wrests with one organization" - should that be "rests"?

Jason G

The amount of FUD in this "paper" might be forgivable if it had been written 12 months ago. However, all the points raised have been met. There is nothing new here. Just a vigorous rehashing of the same old religious arguments we've been hearing for the last year from the ODF "side." Why not just go read the Weir blog if you want to read this drivel? I didn't check, but maybe the paper is just a selective cut and paste job from the same?

Funniest moment of the whole thing? Talking about the lack of technical specificity on OOXML. Mr. Pot, meet Mr. Kettle.

With the focus on "open," I'm surprised a detailed look at the Sun IPR over ODF wasn't offered. What about a look at Sun's promise not to sue? Talk about "hole-y..."

Sam Hiser


Thanks for your comments, and I'm glad you've been following the issue closely.

While the paper is an intentional recapitulation of old arguments, I think the quality of the arguments rules out your FUD claim.

"Much of the information in this paper has appeared before, but not in a synthesis on the openness theme."

Yes, Rob Weir gets a single reference in the document -- one very apt piece he wrote is referenced among the 32 other endnotes. No doubt this is an important negotiating chip for IBM; and Rob originated some of the better arguments against OOXML, but he did not originate all of them, nor common sense itself.

It would be useful if you would comment on the merit of each of the arguments. That would add quite a lot here.

The comments to this entry are closed.
