"Trying to pipe into ODf as it's currently written is the equivalent of trying to pour 100 gallons of jet fuel into an 85 gallon tank."
ODf isn't the target of MS-OOXML. HTML is.
- Gary Edwards
Gary Edwards, Founder & President of the OpenDocument Foundation, made this response to a discussion yesterday on ZDNet.
It's a long post, but highly informative on this topic where good information is not easy to find. It's the way things went from Gary's, the Foundation's, point of view. [I note (**) where I've added a few helpful links to our old materials that support the narrative. -SH] ...
*******************************************************************
da Vinci here
Thank you John. Just because we are garage challenged doesn't mean we can't find the back door to the big house :)
The larger issue at stake here is not whether or not we have a garage, or what our contribution to ODF has been over the course of five years as active members of OASIS ODF. What it really comes down to is the implementation of ODF in the real world.
The chickens came home to roost when Massachusetts started a year long pilot study regarding the implementation of ODF. The study began shortly after the OASIS approval of ODf 1.0, and ended in May of 2006. The results were nothing short of a disaster for ODF.
Not wanting to give up on ODF, Massachusetts came up with another plan: a Hail Mary cry for help known as the RFi,** the Request for Information concerning the feasibility of an ODF plug-in for MSOffice.
They couldn't implement ODF using the many OpenOffice variations that participated in the year long pilot study, so they came up with the idea of an ODF clone of the MS-OOXML Compatibility Pack plug-in for MSOffice. The thinking being that if Microsoft could convert documents, applications and processes to MS-OOXML using a plug-in, then maybe it's possible to do the same with ODF?
The ODf vendors had of course advised Massachusetts that such a plug-in could not be done. But then-CIO Louis Gutierrez and his staff persisted, releasing the public plea for help. We responded to the RFi based on an interest that I and some other Foundation members had dating back to the initial release of the MSOffice 2003 beta, when we were first amazed that an XML conversion could actually be done using the plug-in architecture.
The one thing about the Massachusetts ODf implementation problem that people refuse to admit is that the hard barrier was not political. Sure, the politics were nasty. But Louis and his staff were determined, even though their entire budget was cut out from under them. The hard barrier that was uncovered during the pilot study was that of a wide proliferation of MSOffice bound workgroup-workflow business processes. Processes demanding the high fidelity exchange of documents, and commonly requiring perfect application version synchronization throughout the chain. It was too disruptive to rip out and replace MSOffice because of these bound business processes. And it was too costly to re-engineer the business processes for ODf ready alternatives. Hence the idea of an ODf plug-in for MSOffice.
Even though we proved it was possible to write an ODf clone of the MS-OOXML Compatibility Pack plug-in, and demonstrated this on June 19th, 2006 at a day long event hosted by IBM, there still remained the problem of adapting ODf to a situation it was not designed to deal with. Because we were dealing with business processes, there was an absolute need for a high fidelity “round trip” capable conversion process.
"PLUGIN: Transparently Open a File" ** | fr0mat.net | 17 July 2006
This statement that ODf was not designed to meet the Massachusetts market requirements comes as a shock to the faithful. Yet, most would admit that if Microsoft were to join the OASIS ODf TC, there would no doubt be a massive wave of compatibility eXtensions and application specific eXtensions required for MSOffice apps to implement ODf. From our point of view though, we only needed five generic eXtensions to get the job done.
"PLUGIN: Transparently Save a File" ** | fr0mat.net | 18 July 2006
Doesn't sound like a big deal does it? But it is. The issue of eXtending ODf to meet the compatibility – interoperability needs of MSOffice desktops was destined from day one to tear the ODf community to shreds. It's kind of like the issue of slavery was to the original framers of the Constitution. They couldn't agree as to how to deal with the issue when the country was founded, and it later came back to tear the Republic apart in a horrific civil war. I'll try to explain, so bear with me.
To start with, this much must be understood. The core Massachusetts market requirements are expressed as compatibility with existing file formats (including Microsoft binary documents), and interoperability with existing applications (including MSOffice apps). There is a third requirement called grand convergence, but we'll save that for later. Others, like Sun with their ANSI – ISO vote in favor of MS-OOXML, expressed this same issue as: “We wish to make it completely clear that we support DIS 29500 becoming an ISO Standard and are in complete agreement with its stated purposes of enabling interoperability among different implementations and providing interoperable access to the legacy of Microsoft Office documents.”
The ODf compatibility-interoperability with Microsoft documents and applications has a long and troubling history at OASIS. The very first instance of this issue came on December 16th, 2002, at the first OASIS Open Office XML TC meeting, when the proposed charter came up for vote. Keep in mind that the first phase OASIS ODf group was overwhelmingly comprised of enterprise publication, content, and archive management systems representatives. Representatives from Stellent, ArborText, Boeing, Corel, SpeedLegal, the Australian National Archives, and the Society of Biblical Literature were prominent members. The desktop office suites were a minority.
Throughout those first 15 months of phase one work, the interests, influence and persuasive arguments of this “enterprise” side of the original OASIS ODF group greatly influenced the Foundation's goals of a single universal file format. A vision that went beyond the limits of desktop office productivity suites, and into the realm of what has been called the “grand convergence” of desktop, server, device and web systems.
"PLUGIN: Default to ODF" ** | fr0mat.net | 18 July 2006
So what happened at that first meeting that is so significant to the later events in Massachusetts? When the proposed charter was brought up for discussion, Phil Boutros, the legendary conversion - reverse engineering - Über document expert from Stellent, proposed that the charter be amended to include as a primary objective, “compatibility with existing file formats”. Including Microsoft file formats.
Phil argued that if we didn't include in the charter this important objective, in the end our work might prove to be irrelevant to the needs of the marketplace. Compatibility with existing file formats would be absolutely essential to the process of converting those documents to ODf XML. So spoke the legend.
There was wide agreement on this. But this being a phone conference, all I can say is that only one member attending put forward an objection. Sun's Michael Brauer, who was also the Chairman, postured that we had to be careful with the wording of this “compatibility” objective. His concern being that people might think we were specifically referring to Microsoft binary documents, and that would compromise our goal of providing an application independent specification.
Phil argued that of course we were referencing MS documents because those are exactly the documents that the world would be most interested in converting!
The issue of amending the charter was tabled pending future review. A review that never came even though it was proposed and discussed on many occasions. In later years, whenever the issue of compatibility with Microsoft file formats or interoperability with Microsoft applications came up, which was often, the final word ending discussion was invariably, “That's outside the charter and out of scope”.
And there you have it. So we knew full well, as Louis and his staff explained the facts of life, that we would have a very difficult time getting any compatibility-interoperability with Microsoft eXtensions through the OASIS ODf TC. Louis took the position that the ODf community and vendors would not allow ODf to fail in Massachusetts. He fully expected everyone to support what came to be called the ODf iX proposals.
Louis of course was wrong. The ODf community didn't even show up. Massachusetts was hung out to dry. But he did spend that summer of 2006 doing everything he possibly could to successfully implement ODf using our da Vinci plug-in. Including signing off on the first three ODf iX proposals submitted to OASIS for discussion in July-August of 2006. Between July of 2006 and February of 2007, there were a total of five major iX proposals submitted to OASIS for discussion. Another very important, but alternative set of RDF metadata iX initiatives was approved by the OASIS ODf Metadata SC as part of the metadata requirements in August of 2006.
The long and the short of it is that ODf could not meet the Massachusetts requirements without the five generic eXtensions known as ODf iX. The five generics addressed the OpenOffice – MSOffice application specific differentials concerning lists, tables, fields, sections, and page dynamics. It is the application unique implementation models for these document structures that accounted for most of the problematic conversion fidelity loss.
Many people believe that the only way ODf can establish high fidelity “round trip” conversion compatibility – interoperability with Microsoft documents, applications and processes is to have the secret binary blueprints. The thing is, our da Vinci plug-in was designed as a clone of the MS-OOXML Compatibility Pack plug-in, and, as such, leveraged the internal conversion process native to MSOffice apps. We don't need the secret blueprints. Microsoft does a good job of converting the binaries for us ;-). We even released ACME 376 as proof that we could in fact hit a conversion fidelity the equivalent of the MS-OOXML plug-in.
The important thing to understand is that an internal conversion process differs greatly from trying to import and break an external stand alone MS binary document. There are two aspects to internal conversion – both of which are critical to conversion fidelity. The first aspect is that of triggering, capturing and decoding the internal application conversion of in-memory-binary-representations. The second is that of piping the decoded structure into the target file format.
For short we call these two aspects “conversion” and “piping”. And we have cute names for the actual components that perform these tasks within da Vinci.
ACME 376 is the first aspect of the da Vinci process. The second aspect is called “InfoSet”, and it is here that the piping into a target file format occurs.
ACME 376 can do the job it was intended to do. But without ODf iX, we are unable to pipe the results into ODf without loss in the areas of our five document structures. Trying to pipe into ODf as it's currently written is the equivalent of trying to pour 100 gallons of jet fuel into an 85 gallon tank.
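To make the jet fuel analogy concrete, here is a minimal sketch of the two stages, written in Python purely for illustration. It is not da Vinci's code: the class names, the `pipe` function, and the feature names are all hypothetical, and a file format is modeled only as the set of features it can represent for each of the five document structures.

```python
from dataclasses import dataclass, field

# All names here are hypothetical; this is not da Vinci's actual code.
FIVE_STRUCTURES = ("lists", "tables", "fields", "sections", "page_dynamics")


@dataclass
class DecodedDocument:
    """Output of the 'conversion' stage: content plus the features each structure uses."""
    content: str
    features: dict = field(default_factory=dict)  # structure name -> set of feature names


@dataclass
class TargetFormat:
    """A target file format, modeled only by which features it can represent."""
    name: str
    supported: dict = field(default_factory=dict)  # structure name -> set of feature names


def pipe(doc, target):
    """The 'piping' stage: split each structure's features into kept and lost."""
    kept, lost = {}, {}
    for structure in FIVE_STRUCTURES:
        wanted = doc.features.get(structure, set())
        can_hold = target.supported.get(structure, set())
        kept[structure] = wanted & can_hold
        dropped = wanted - can_hold
        if dropped:
            lost[structure] = dropped
    return kept, lost


# Invented feature names, purely for illustration.
doc = DecodedDocument(
    content="Quarterly report",
    features={"lists": {"numbering", "restart_across_sections"}},
)
narrow = TargetFormat("narrow", {"lists": {"numbering"}})
wide = TargetFormat("wide", {"lists": {"numbering", "restart_across_sections"}})

kept_narrow, lost_narrow = pipe(doc, narrow)
kept_wide, lost_wide = pipe(doc, wide)
print(lost_narrow)  # the feature the narrow target cannot hold
print(lost_wide)    # empty: nothing was dropped
```

The point of the sketch is that the conversion stage can be perfect and fidelity is still lost if the target cannot hold what was decoded. Widening the target, which is roughly what the five generic eXtensions were meant to do, is the only way to get the loss report to come back empty.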
The amazing thing is that we can pour that 100 gallons of jet fuel into the W3C's CDF, with plenty of room to spare. We don't need any eXtensions or special exceptions to do this. As this dawned on us after months of testing, we were speechless. If you've spent any time converting MSOffice binaries and xml, CDF will take your breath away. All I can say is that the W3C CDF Workgroup has done an amazing job of writing a flexible and expansive framework able to handle everything we know MSOffice has to throw at it. Even with the years of business development on the MSOffice platform that must be accommodated.
And oh yeah, the 130 critical workgroup test documents that ODf failed with in Massachusetts are not a problem for the W3C's CDF. Wow. If only we knew this in July of 2006.
The failure of ODf in Massachusetts has had a devastating world wide impact. People accuse us of failing in Massachusetts. We did. No question about it. Yet one has to wonder, where were the faithful when ODf hung by a thread?
On October 4th, 2006, with the resignation of CIO Louis Gutierrez, ODf was branded in those hushed CIO conversations where pragmatic solutions trump both ISO and legislative mandate efforts every time, as difficult and perhaps impossible to implement. Just as the legend Phil Boutros had warned nearly four years earlier.
Following the resignation, we stopped all work on ODf da Vinci, and went back into the OASIS ODf process to finish the iX work we had started. The da Vinci plug-in and ODf iX were inextricably tied. Neither could move without the other. But the subsequent events at OASIS turned out to be one disaster after another. By April of 2007 we had failed with the “List Enhancement Proposal”, our metadata requirements were dropped and discarded, our interoperability issues brushed aside, and even the OpenDocument Foundation's purpose for being had been totally gutted.
This ended our work on ODf (April of 2007). And the search for a worthy alternative to MS-OOXML began.
Our feeling was that the marketplace of 550 million MSOffice bound workgroup desktops would migrate to MS-OOXML unless we were able to offer them an alternative. The thing is, the world is going to migrate to XML, regardless of what happens at ISO. It's a pragmatic reality. The only question is, “Which XML?”
The hard truth is that ODf was not designed for the conversion of existing MSOffice documents, applications and processes. MS-OOXML on the other hand, was designed exactly for this purpose. And that's the challenge we're up against. We need an alternative to MS-OOXML that meets the same critical market requirements. Otherwise, we might as well all pack it in as we stand by and watch Microsoft migrate those MSOffice bound business processes over to the core of the emerging MS Stack, the Exchange/SharePoint developers hub.
We believe the W3C's CDF can fill that role as an effective, easy to implement and transition to alternative. We know for certain that ODf cannot fulfill this task, as events in Massachusetts proved.
We also believe that we need to neutralize and repurpose MSOffice with CDF for another reason. The Exchange/SharePoint juggernaut is now at 65% marketshare and accelerating. The advantage E/S has over all other competitive alternatives is superior integration into existing MSOffice bound business processes through the systems level binary <> MS-OOXML conversion function. If E/S has 65% marketshare, so does MS-OOXML.
Furthermore, it's becoming ever more obvious that W3C Web platform technologies are being relegated to the bottom by a whole raft of higher level MS Stack proprietary “foundation” replacements. ODf isn't the target of MS-OOXML. HTML is. With the prize being the future of the open Internet.
[Steve Ballmer: "We will win the Web...We will win the Web!" | "Trouble Exists at Microsoft" | BusinessWeek | 26 Sept 2005 ** ]
We might not be able to rip out and replace MSOffice, but at least we can neutralize and repurpose the business process mainstay using CDF and the available plug-in architecture.
My last point is that there were three primary market requirements in Massachusetts: compatibility with existing file formats, including Microsoft documents; interoperability with existing applications, including MSOffice; and the grand convergence, of desktop, server, device, and web systems.
Of these requirements, the first two relate to the legacy situation much of the world finds itself having to deal with. The third requirement of grand convergence however represents the Massachusetts vision of the future. And it is here that the real strength of CDF comes into play. Grand convergence is the sweet spot for CDF, exactly what it's designed for. What we have to now do is enable the great herd of 550 million desktops so that they can easily, effortlessly, and without disruption, consider the transition to CDF as a fully capable alternative to MS-OOXML.
Because so much of the world's attention was focused on Massachusetts, we were privileged to have had many conversations with CIOs around the world. We know the reality of these requirements is undeniable and universal. So much so that we often have to answer the question as to how it came to be that ODf was not designed to meet these requirements. In recent conversations concerning our work with the W3C's CDF, one analyst commented that, “ODf is a fine format for an alternative universe where MSOffice doesn't exist”. And while many wonder how it came to be, we really don't have the luxury of time to figure out how it is that five years of work on ODf could have missed the mark for 550 million desktops. But miss we did.
IMHO, the reasonable course of action going forward has to be an all out effort to provide pragmatists, who find themselves bound to MSOffice workgroups and business processes, with an alternative to MS-OOXML. Otherwise, it doesn't matter what happens at ISO. MS-OOXML will be the default file format for the majority of businesses and organizations going forward. If the W3C's CDF can be used as an effective alternative to MS-OOXML, this ought to be done yesterday. If those who disagree with Massachusetts and the market requirements continue to insist that ODf can be implemented as a non-disruptive alternative to MS-OOXML, then they really need to get out of the blogosphere and into the trenches where software does the talking.
Hope this helps.
~ge~
Hey buddy can you spare me a garage?
*******************************************************************
Please, give me a break, stop your rhetoric and just *DO* things (disclaimer: I'm a computer and office software user and "true open standards" fan, and I support ODF).
When you were spreading the "binary key" myth all around the world, I sent you an email and asked where you had concrete proof of what you were saying (why? because I feared that this binary key thing could backfire), and you answered me:
google for gary edwards and you will find the binary key proof ( !!??? )
Conclusion: you are a dishonest person and I don't trust your words. More: you are a dangerous person to open standards (Microsoft is smiling and thanking you for all this anti-ODF blah blah ...)
Do me a favor: get a life.
dario
Posted by: dario | November 08, 2007 at 02:29 PM
Hi dario,
Thanks for responding. The “binary key” is not a myth. When Microsoft released the first beta of MSOffice 2003, it included the earliest instances of what we now know as MS-OOXML; wordprocessingML and spreadsheetML.
With these early instantiations of XML, Microsoft separated the content from the presentation (or what ODf would call styles).
The thing is, the content package could be opened in a text editor and the markup examined, but the presentation package could not. This is where the “binary key” reference applies. Only Microsoft applications were able to open the presentation package. They were said to have had the “binary key”.
Microsoft had a very rational explanation for the situation. They argued that their XML was useful to enterprise publication, content and archive management systems in that the content was fully accessible, and these systems would apply their own presentation anyway. This argument is actually very accurate. When documents are repurposed, or content objects reused, the content remains unchanged as higher level applications apply their own presentation templates. Separation becomes very useful wherever reports and searches of large document libraries are important to particular business processes.
The same arguments are made for ODf's separation of content and styles. Maybe even more so. Document processing experts have this view of the world where each application represents a specific processing purpose. Preserving content is one thing. Replacing the presentation aspects of a document another. They fully expect and anticipate that each application in a processing chain will implement their own presentation.
This is why the public's expectation of PDF quality preservation of a document's presentation isn't recognized by ODf document processing experts as an achievable objective. If each application is expected to NEED to change a document's presentation for repurposing or document object reuse, as determined by a particular processing business requirement, then you design the file format to meet those expectations and needs. Which they did.
The public's expectation of PDF presentation quality comes a bit after the fact. Capisce?
Getting back to the “binary key”. Contrary to popular opinion, I didn't coin the term “binary key”. The Valoris Group did.
As I'm sure you're well aware, the Valoris Group conducted the three year study for the EU-IDABC concerning the construct of a highly interoperable information infrastructure for the future. They focused on XML, SOA, and the emerging Web Platform, and the best way of making the transition. They also created the fictional “Open Document” as a means of describing how their vision of this highly interoperable – SOA – SaaS – web ready world would work.
When the Valoris Group saw Microsoft's first run at XML, they wanted both the content and the presentation markup packages accessible. They wanted the “binary key” removed. Which actually did happen. By the time MSOffice 2003 was released, both packages were open and accessible. End of controversy.
The odd thing is that the web was subsequently scrubbed of any Microsoft defense of this application lock. That's too bad, because those arguments are very relevant to today's very difficult interoperability discussions.
The traditions of file formats are that they are bound to specific applications. The first run at XML, whether you see it from the Microsoft view point or the ODf viewpoint, recognized that content could easily be separated from specific applications. But presentation could not.
This is as true today as it was back in 2002-2003. The presentation aspects of a document are very much dependent on, and bound to, the internal in-memory-binary-representation layout model of any particular application. Hence, converting content is easy. But converting presentation always entails a loss of fidelity due to application specific layout differentials.
In Massachusetts it was proven that anywhere MSOffice workgroups dominated a particular business process, the round trip fidelity presentation issues were an impossible problem for ODf. End users responsible for critical day to day business processes demand near PDF presentation quality.
What we did in Massachusetts was try to figure out what changes could be made to ODf to meet these MSOffice bound requirements. For the most part this means identifying those application specific presentation or layout differentials between OpenOffice – ODf and MSOffice binary / xml.
It turns out we could solve the problem with the use of five generic elements for the structural elements of lists, tables, fields, sections and page dynamics. We had two approaches.
The first approach was called ODf iX, and consisted simply of adding the five generics to ODf. The second approach however was far more elegant and ingenious. This involved using the then new metadata RDF/XML approach.
In August of 2006, with ODf hanging by a thread in Massachusetts, California and the EU-IDABC, we actually got our metadata model approved as part of the ODf Metadata SC requirements. It turns out this wasn't enough to save ODf in those governments, but we thought the concept worthy enough to pursue long after the ODf collapse that followed the October 4th, 2006 resignation of CIO Louis Gutierrez.
The metadata approach was based on the concept that all presentation is just metadata for particular content objects. We thought that the only way we could reconcile the public's PDF expectations with the model pursued by document processing experts would be to shift the binding burden of presentation from application specific packaging to a content object metadata model. It seemed to us that the RDF – RDFa properties model would be ideal for this transition. We also believed that this approach was critical to solving the feature set differential problems between lightweight web platform applications, and heavy office suite legacy applications. And, the model had a gradualism aspect to it that might actually work in a world where legacy applications and emerging web platform oriented applications were certain to clash.
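A rough way to picture the idea, sketched in Python with plain dictionaries standing in for RDF properties. This is not RDF or RDFa syntax and not the model that went to the ODf Metadata SC; the class, the `render` function, and the property names are hypothetical. The one assumption baked in is that an application which does not understand a presentation property carries it along untouched instead of discarding it.

```python
from dataclasses import dataclass, field


@dataclass
class ContentObject:
    """A piece of content with its presentation attached as metadata properties."""
    object_id: str
    text: str
    presentation: dict = field(default_factory=dict)  # property name -> value


def render(obj, understood):
    """Split properties into those this application applies and those it only preserves."""
    applied = {k: v for k, v in obj.presentation.items() if k in understood}
    preserved = {k: v for k, v in obj.presentation.items() if k not in understood}
    return applied, preserved


# Hypothetical property names, for illustration only.
heading = ContentObject(
    object_id="h1",
    text="Budget Summary",
    presentation={"font-weight": "bold", "keep-with-next": "true"},
)

# A lightweight web application understands fewer properties than an office suite.
web_applied, web_preserved = render(heading, understood={"font-weight"})
suite_applied, suite_preserved = render(heading, understood={"font-weight", "keep-with-next"})
print(web_applied, web_preserved)
print(suite_applied, suite_preserved)
```

Because the presentation travels with the content object and not inside an application specific package, the lighter application can apply what it understands and hand the rest back intact for the heavier one to use later. That is the gradualism aspect in miniature.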
Anyway. Needless to say, by April of 2007 it was very clear that we would never get our ODf iX or metadata changes through OASIS. End of story. Time to move on.
Public awareness of the presentation issue perhaps began with the infamous binary key discussion. It continues however because the issues of application – file format dependencies have yet to be resolved.
Hope this helps,
~ge~
Hey buddy can you spare me a garage?
Posted by: Gary Edwards | November 08, 2007 at 06:10 PM