Is Open XML a one-way specification for most people?

I have been accused in the past of using a “weight” argument against the Open XML specification because it is several thousand pages long. While some people may think that is cute or funny, it is a real concern and an obvious problem that programmers recognize. It is hard enough to implement any standard correctly, and one that long will be virtually impossible to get right. Mark my words on this and let the buyer beware.

Who will implement Open XML correctly and fully? Maybe Microsoft. Why? Since it is essentially a dump into XML of all the data needed for all the functionality of their Office products, and since those products are proprietary, only they will understand any nuances that go beyond the spec. The spec may illuminate some of the mistakes that have been made and are now being written into a so-called standard for all to have to implement, but I’m guessing there might be a few other shades of meaning that will not be clear.

Fully and correctly implementing Open XML will require cloning a large portion of Microsoft’s product. Best of luck doing that, especially since they have a head start of more than a decade. Also, since they have avoided using industry standards like SVG and MathML, you’ll have to reimplement Microsoft’s flavor of many things. You had better start now. I therefore conclude that while Microsoft may end up supporting most of Open XML (and we’ll have to see the final products to see how much and how correctly), other products will likely end up supporting only a subset.

That means that other products and software, in practice, will NOT be able to understand arbitrary Open XML that might be thrown at them. There is just too much. Therefore they will create only the bit that they need and send that off. Send it off to whom? To the only software that might understand it, namely Microsoft Office.

So this is how I see this playing out: Open XML will be nearly fully read and written by Microsoft products, but only written in subset form by other software. This means that data in Open XML form will be largely sucked into the Microsoft ecosystem but very little will escape for full and practical use elsewhere.

In my opinion, suggesting “choice” between ODF and Open XML to governments that are seeking control, true choice, and interoperability is nothing more than maintaining the status quo: a requirement for Microsoft products under the guise of supporting a “standard.”

Not all standards are the same, and supporting one standard is not the same as supporting another. Think about the implications of what is going on here. Open XML is not about interoperability in general; it is about moving or keeping a vendor’s products at the center of a universe. Nice marketing tactic. And that’s what this is about: we hear nice words about “open” and “xml” and “standard” but the reality and problems of producing real implementations are left to technical folk. Listen to what your technical people tell you before you make policy decisions that are not open and not particularly practical.

To be clear, people can choose to implement Open XML or not. I think some will try. Let me know after you do.

Finally, do this little thought experiment: imagine how thick a ream of paper, or 500 sheets, is. Double that to get the thickness of a thousand pages, then make that four times thicker to see how thick 4000 pages is. That’s how many pages were in the last draft of the Open XML spec. How many people will you need to read that, much less implement it fully and correctly? I believe the final version is around six thousand pages (correction?). I think we’re already past feasibility for most people unless you’ve already implemented and debugged the software over a period of years.
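
For anyone who would rather compute than imagine, here is a rough sketch of the same arithmetic. The sheet thickness of about 0.1 mm is my assumption for ordinary office paper, not a figure from the spec, and the function names are made up for illustration. Like the thought experiment, it assumes one page per sheet unless told otherwise.

```python
# Assumption: an ordinary sheet of office paper is roughly 0.1 mm (100 micrometres) thick.
SHEET_THICKNESS_UM = 100
SHEETS_PER_REAM = 500


def sheets_for(pages: int, pages_per_sheet: int = 1) -> int:
    """Number of sheets needed to print the given number of pages."""
    return -(-pages // pages_per_sheet)  # ceiling division


def stack_thickness_cm(pages: int, pages_per_sheet: int = 1) -> float:
    """Approximate thickness, in centimetres, of the printed stack."""
    return sheets_for(pages, pages_per_sheet) * SHEET_THICKNESS_UM / 10_000


def reams_needed(pages: int, pages_per_sheet: int = 1) -> float:
    """How many 500-sheet reams the printout uses."""
    return sheets_for(pages, pages_per_sheet) / SHEETS_PER_REAM


if __name__ == "__main__":
    # Page counts mentioned in the post: one ream, a thousand pages,
    # the last draft, and my guess at the final version.
    for pages in (500, 1000, 4000, 6000):
        print(pages, "pages:", reams_needed(pages), "reams,",
              stack_thickness_cm(pages), "cm")
```

Under those assumptions the 4000-page draft is eight reams, a stack about 40 cm high. Printing double-sided (`pages_per_sheet=2`) halves that, which still leaves a pile nobody reads on a weekend.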

Also See: An “OOXML is a bad idea” blog entry compendium


  1. (sarcasm: But heeey! It’s XML! Everyone is crazy after it! Just write some XSLT and transform it into something that other programs can interpret!)

    MS might actually be doing everybody a favour with this. People will finally realize that XML is NOT the panacea it is being presented as. And finally, programmers will MAYBE realize that there are often better alternatives than XML (I wrote about this on my blog). Let XML die the slow and painful death it deserves.

  2. I loathed XML for a while. I started coming around, though, after a while — as long as it’s used in the right place (and the data format for an IM protocol is NOT the right place). Then, I noticed the similarity in structure between S-expressions and XML, and collected that together with the greatly increased difficulty of parsing XML (by eye or technologically) and, worse, writing it using automated tools. Now, I’m back to loathing XML. As such, I kinda sympathize with Zeljko in this case.

    As for the OpenXML spec, I agree that a document spec longer than War And Peace, Atlas Shrugged, and Cryptonomicon combined is a bit obscene and is probably intended in part to discourage complete compliant implementation. That sort of technical motivation is right up Microsoft’s alley.

  3. Bob, there’s nothing wrong with a government requiring Open XML compliance for purchasing, as long as they keep with the historical tradition of government acquisition: create a compliance suite that tests for every single line item in the specification, then reject anything that doesn’t comply exactly.

    IBM played that game at Grand Master level for decades. I don’t see any reason why OpenDocument and Office Open XML shouldn’t both be allowed, assuming products _fully_ comply with the relevant specifications.

  4. We often tell people what we did a long time ago, and how we learned from our mistakes.

    I don’t think governments need to require multiple standards in this area. Interoperability and choice of applications are what matter here. Having two standards unnecessarily complicates things. As I pointed out, having Open XML alone overly complicates things. ODF should win, then, and we should leave Open XML to what it really is, a shadow of Microsoft’s huge and sole implementation of their office products. The goal is not to have one product suite that implements one standard. The goal is to have a standard that can be widely implemented and then to give the government a choice among the implementations.


  5. You know what they say: “XML is like violence. If at first it doesn’t solve the problem, then use some more”


  6. If ISO were so mistaken as to ratify MOOX, then ISO would be in fundamental jeopardy. Microsoft’s new XML format is so far from meeting a reasonable man’s criteria for a standard that it calls the standards process itself into question.

  7. @apotheon, XML is part of a great evolution of human/machine readable protocols (TCP/FTP/HTTP/SMTP/HTML). It’s not the best language for humans, and it doesn’t have the best performance for computers, but it surely helps with Rapid Application Development (with more developers, platforms or systems). And of course it is very open in its nature. I don’t know exactly why you failed in your XML endeavour, but I certainly enjoy it. Your quarrel with XML might be because you tried to use the wrong tools. I can say that on Windows MSXML is perfect for creating, reading and transforming XML. With PHP you can use SimpleXML, which is also very easy. And of course if you have any experience with generating valid (X)HTML, you should also be able to generate XML (they are just strings!).

  8. Well, but what is the solution? What would you suggest Microsoft should do? Remove features from Office?!? That seems the only solution to make the spec of the file format shorter, isn’t it?

  9. @davidacoder

    What we expect (I think that’s the wrong word) is that they try to go as much as possible towards standardised functionality. They should, for example, eliminate their own graphics format, and use SVG. Where their graphics format has something that doesn’t fit SVG they should make a clear minimalist extension which provides that functionality.

    This would a) make the standard’s text much shorter (since just saying “use SVG” / “use Unicode” / “use XXX” would cover a large chunk of the standard), b) mean other companies could reuse their existing code, and c) make a contribution to extending SVG to be more generally useful.

    Now, of course, it’s true that some of their extra little standards would probably not be accepted by everyone; possibly other people would have a better way to do the same thing but for their next release they could work towards that by agreeing with those people.

    A good way to start off would be to adopt ODF and then propose extensions to it that allow all of the functionality they need for Word. It might take a little time, but it would get us to one single global standard.

    Now, I wonder if I used the wrong meaning of expect. I thought you meant “think Microsoft should do”. If you meant “think Microsoft was likely to do”, then the answer is: exactly what they are doing; try to lock people into a proprietary format so that they can leverage their illegal monopoly (as described in US vs Microsoft) to eliminate other products from the market.

  10. It sort-of depends. If you think Microsoft’s definition of innovation is ‘extend the standard’ (currently running to 6000 pages) then the purpose of that is to allow for prettier, more-highly-differentiated, but shorter-lived office documents.

    If you think ‘PowerPoint’, then that describes quite well what a salesman wants from his sales pitch. He doesn’t need longevity, or standards, or interoperability.

    Pitch, sell, and leave.

    Next month’s pitch can use Microsoft’s new features. The faster the feature set changes, the better … for that purpose.