In the United States we have a magazine called Consumer Reports. Their mission is:
Consumer Reports® and ConsumerReports.org® are published by Consumers Union, an expert, independent nonprofit organization whose mission is to work for a fair, just, and safe marketplace for all consumers and to empower consumers to protect themselves. To achieve this mission, we test, inform, and protect. To maintain our independence and impartiality, CU accepts no outside advertising, no free test samples, and has no agenda other than the interests of consumers. CU supports itself through the sale of our information products and services, individual contributions, and a few noncommercial grants. Consumers Union is governed by a board of 18 directors, who are elected by CU members and meet three times a year. CU’s President, James Guest, oversees a staff of more than 450.
If you are thinking about buying a refrigerator, a home theater system, or even a car, you get a Consumer Reports and read their ratings. There you will get an idea about safety, energy efficiency, quality, and cost. There are other similar services. Even the US government has a website that will give you the latest information about automobile fuel efficiencies.
If you’ve ever bought anything from Amazon, you’ve probably seen product ratings by people who have presumably bought them, along with comments. This is particularly useful if you are comparing, say, two books on the same topic or two CDs from a musical artist whose work you have never before purchased.
Let’s say you are considering buying a house and you are comparing two of them. What are your criteria? Assuming they are in the same neighborhood and school district, you look at the architecture, the quality of construction, and the maintainability of the buildings and landscaping. There are other factors, of course. You might not be especially quantitative in your analysis, but you would probably have a home inspector look at the buildings to tell you if there were any problems, especially if you are not an expert. The inspection reports that I have seen are checklists that systematically rank the quality and current state of the house from top to bottom.
We have nothing like this for standards.
What would it possibly mean for something to be a “one star standard” versus something that is a “five star standard”?
We have folks who do standards compliance testing for a business, but this is not evaluating the quality of the standards themselves. I will note that we do have the Web Services Interoperability Organization which looks at existing standards and best practices in using them, and then recommends both profiles for deploying the standards well and future changes to the standards that will improve them.
We have thousands of standards and no clear way to decide which of them are good and which are not. Instead, we more or less go by the organizations that create the standards, whether we are actually required to implement them (say, by law or customer requirements), or whether the market leaders use them.
I’m going to tackle the issue of quality and standards organizations in a future entry, but let me say that
- Standards organizations are not all equal in quality, though it doesn’t seem that everyone knows that.
- A given standards organization can produce two standards of wildly divergent quality.
- In my opinion, the key measurement of a standards organization is not the quantity of standards produced but the quality of standards produced.
As a disclaimer, I’m very aware that when IBM is involved in the creation of a standard, we probably want people to use that. The same goes for everybody else.
In some cases there may be only one standard for a particular purpose. Do we just accept that or can we apply some set of metrics to it to help the maintainers evolve it into something better?
Let me propose some possible criteria for measuring standards quality. Please comment on these or suggest your own, though see my rules about comments at the close. Note that in any given criterion it may be reasonable to consider a numeric value between 0.0 and 1.0 for ranking how a particular standard does. That is, do not assume answers can only be “yes” or “no.” (You don’t get to abstain …)
- Openness criteria:
- Was the standard developed by an independent community of experts in a way that did not advantage any one software provider’s products or projects?
- Will the standard be actively maintained by an independent community of experts in a way that will not advantage any one software provider’s products or projects?
- Is the standard document freely available to everyone?
- Is the standard itself freely implementable by everyone?
- Can a subset of the standard be used or can the standard be used as a component of another standard?
- Is the standard designed to maximize use of pre-existing high quality standards?
- Is the standard well factored to allow modular implementation and maximum re-use of common components? For example, we have only one way for representing an address or bold text. That is, does the standard have the right granularity?
- Are modern best practices for design used, or are legacy formats employed even though better, equivalent, and more advanced alternatives are available?
- Is the standard designed to maximize its use by reference or inclusion in other high quality standards?
- Does the standard have a well designed extension architecture?
- Does the standard adequately represent the semantics of the information encoded within it?
- Are the methods for describing metadata complete and themselves based on high quality standards?
- Can information represented via the standard be processed efficiently by modern tools?
- Does the standard encode product-specific attributes that others are expected to implement?
- Does the standard have gaps or ambiguities that will lead to divergent interpretations and incompatible implementations?
- Does the standard make reference to required or likely use of other standards or proprietary specifications that may not be generally available or considered to be high quality?
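As a rough illustration of the 0.0 to 1.0 idea, here is a minimal Python sketch that combines per-criterion scores into one weighted number. The criterion labels, the weights, and the choice to flip negatively phrased questions (where a “yes” is bad, like the one about gaps and ambiguities) are all assumptions made for illustration, not a settled method.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str            # hypothetical short label for one question
    score: float         # answer to the question as asked: 0.0 (no) to 1.0 (yes)
    weight: float = 1.0  # hypothetical relative importance
    higher_is_better: bool = True  # False for questions where "yes" is bad


def overall_score(criteria):
    """Weighted average of criterion scores, in the range 0.0 to 1.0."""
    total_weight = sum(c.weight for c in criteria)
    if total_weight <= 0:
        raise ValueError("at least one criterion needs a positive weight")
    weighted = 0.0
    for c in criteria:
        if not 0.0 <= c.score <= 1.0:
            raise ValueError(f"{c.name}: score must be between 0.0 and 1.0")
        # Flip negatively phrased questions so that 1.0 always means "good".
        value = c.score if c.higher_is_better else 1.0 - c.score
        weighted += c.weight * value
    return weighted / total_weight


# Made-up scores for an imaginary standard.
example = [
    Criterion("document freely available", 1.0),
    Criterion("well designed extension architecture", 0.5),
    Criterion("gaps or ambiguities", 0.25, higher_is_better=False),
]
print(round(overall_score(example), 3))
```

A single number hides a lot, so in practice one would probably want to report the per-category scores alongside it.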
I’m sure there are others; please add your suggestions and I’ll do a future entry that pulls them all together.
One criterion which I did not mention explicitly is “Is the standard elegant?”. I mean this in the way mathematicians use the term about theorems and other results: well designed, minimal, clever, and absolutely obvious once you see it. That is, just right.
Here are the rules about the comments:
- Keep them civil.
- Mention no standards by name. I do not want this to devolve into a “this is why that standard by those people is terrible” discussion.
- Talk about qualities and best practices that we should apply to standards today so that they will be useful into the future.
- Think about and discuss metrics and other aspects of quantitative analysis that can apply to the criteria.
I’m less interested in preserving the status quo of standards creation than setting up a framework for understanding the quality of what we have and improving the quality of what we have yet to build.


How about the following suggestions:
Has the standard been fully implemented by several independent parties?
Has the standard been proven to ensure interoperability or portability, whichever the case may be? (Note: This relates to the gaps and ambiguities question but goes beyond by relying on testing of actual implementations.)
Does the standard come with a test suite that developers can use to help them ensure compliance?
Building on Arnaud’s comments:
Does the standard have multiple independent implementations _from the spec_ for _every_ feature?
Does the standard have a test suite so that developers, standards bodies and end-users can test compliance and interoperability?
Do sub-setting, extensions and unusual/non-compliant inputs cause well-defined behaviour?
Is the standard a point solution to a specific problem, or is it a general solution for the problem space, which will handle future requirements?
Was the right standard development method used?
Was the standard developed rapidly using “rough consensus and running code” to be refined / expanded later after experience gained using the standard?
Or is it the product of a slower one-representative-from-each-company/country committee that includes _every_ feature that _any_ member wants?
* Is this an “enabling” standard (i.e. infrastructure, suitable for ISO standardization and convergence) or an “application” standard (suitable for industry consortia and competition)?
* Does the standard fulfill its stated goals?
* Does the standard have clear scoping that allows it to be distinguished from other standards in the same space?
* Does the standard have a pre-existing implementation? (a la RFCs and W3C)
* If it is a standard for an existing technology, how well does it describe that technology?
* If it is a standard that brings together existing approaches, what proof is there that all parties (technologies) were represented?
* What anti-trust and anti-cartel procedures were in place in the committee or organization that made the standard?
* If it is a standard for something new, are the use cases adequately specified?
* What provision is there for objective verification: if it is an application standard, does it have (abstract) test suites; if it is a document standard, does it have schemas?
* For a document standard, are constraints beyond those expressible by simple schema languages expressed using a higher-level rule language (e.g. ISO Schematron) or just as text/diagrams?
* Are syntaxes modeled using ISO BNF?
* Are references adequate and complete?
* Does it have a security statement or policy? Has this been audited?
* Does it have an accessibility statement or policy? Has this been audited?
* Does it have an internationalization statement or policy? Has this been audited?
* Does it have adequate formal data model or formal activity diagrams (or other UML: state diagrams, etc) or formal mathematical or logic description?
* Does it have adequate informal descriptions?
* Does the format give a mechanism to support plurality, so that subsequent layers of technology can be swapped with experience and time?
* Does the format make profiles easy? Does it clearly identify extensible points that work against interoperability? Do schemas have unnecessary required elements that would make profiling difficult?
* Are multiple levels of conformance clearly spelled out and named?
* If for a document standard, is the standard clearly organized with a view to best “cohesion and coupling”? Are namespaces given separate parts of the standard? Are schemas modularized to reduce coupling?
‘Mainframe Linux at SHARE’ is reported here http://www.linux.com/feature/118624 .
So, about a good document representation standard. “Could I do something useful with an office productivity document, if I had a lot of them on an IBM mainframe ?”
The mainframe won’t have a screen, and it may not be on the same continent as its users; but you still might like to analyse the contents of all the documents, to work out what they mean, to take better business decisions based on what they say.
Of course, interoperability cuts both ways. “Could I do something useful with my airline ticket pricing system, if I wanted to run it on a Personal Computer ?”
Still, we take the rough with the smooth. That’s business.
A simpler set of questions could be:
1. How well does the standard meet its scope?
2. How well does the standard meet its purpose?
3. How specialised is the standard to meet its purpose?
4. How well written is the standard?
5. How well does the standard ensure interoperability?
6. How appropriate and relevant are other standards used by the standard?
7. How well does the standard/standard process ensure a fair playing field for all?
All these questions can be rated in a numeric scale of:
0.00 – Could be better
0.25 – Fair
0.50 – Good
0.75 – Very Good
1.00 – Excellent
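To make the scale concrete, here is a small Python sketch that takes one rating per question, averages them, and snaps the average back onto the five-step scale. The function names and the sample ratings are made up for illustration; a plain unweighted average is just one possible way to combine the seven answers.

```python
ALLOWED_STEPS = (0.0, 0.25, 0.5, 0.75, 1.0)  # the five steps of the scale


def average_rating(ratings):
    """Average a list of ratings, each of which must be one of the five steps."""
    if not ratings:
        raise ValueError("need at least one rating")
    for r in ratings:
        if r not in ALLOWED_STEPS:
            raise ValueError(f"{r} is not on the scale")
    return sum(ratings) / len(ratings)


def nearest_step(value):
    """Snap an average back onto the five-step scale."""
    return min(ALLOWED_STEPS, key=lambda step: abs(step - value))


# Hypothetical ratings for questions 1 to 7, in order.
sample = [0.75, 0.75, 0.5, 0.25, 1.0, 0.5, 0.75]
avg = average_rating(sample)
print(avg, nearest_step(avg))
```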
I suggest splitting the various criteria into two groups: those which describe the standards organization and those describing the standard itself. You alluded to this when you said that an organization can produce standards of differing quality, and I’ll suggest further that it would be difficult for a “bad” organization to produce a good standard.
For the organization and the committee process which governs the development and approval of standards, the accepted principles of openness, balance, transparency, consensus, due process, etc. should all apply. These principles are defined in the U.S. Standards Strategy (www.ansi.org), the U.S. government Office of Management and Budget Circular A-119 (http://www.whitehouse.gov/omb/circulars/a119/a119.html), the World Trade Organization’s agreement on Technical Barriers to Trade (http://www.wto.org/english/tratop_e/tbt_e/tbt_e.htm), and elsewhere.
The standard itself should reflect the various criteria suggested by the other responders above.
I’ve written at length on this topic of best practices for standards organizations in my blog at http://www.kavi.com/blog.
Very good! I have been discussing this with my peers for some time. What really matters is to get a multidimensional concept of standard evaluation.
Difficult to quantify, but usually the standard has to ensure that for any article claimed to be conforming or compliant, a set of test measures can be produced that will test that conformance or compliance.
In short, the strength of the standard is known by the test suite. If none such exists, one has a rough but adequate measure of the consumability of the articles and this is the actual answer one needs.
I couldn’t agree with Bob’s points more.
Within many of the standards organizations in which I participate, there has been a gradual but steady progression towards producing higher quality specifications. At OASIS, there is a criterion for becoming an OASIS Standard that there be three statements from TC members that they were “using” the specification. The running joke was that the meaning of “use” could be that you were “using” the specification to raise a leg on your kitchen table so it didn’t wobble. We’ve come a long way since those days. Unfortunately, the OASIS TC process hasn’t improved this requirement (despite my constant rantings that they should :-). However, it is now typical that a TC will hold one or more interoperability events to test multiple independent implementations of the specification(s) prior to advancing to OASIS Committee Specification status. It would be really nice if OASIS would make such interoperability testing a requirement, and change the requirement from a “Statement of Use” to something a bit more substantial, such as the requirements that the W3C imposes on specifications advancing to W3C Recommendation status.
Within the W3C there is a requirement for advancing a specification to W3C Recommendation: there must be (typically) a minimum of two interoperable implementations of each feature in the specification. Those features that the working group feels might not meet the requisite exit criteria for the CR phase (also known as the Call for Implementations phase) are marked as “at risk” and must be removed from the specification should they not achieve the stated exit criteria.
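The exit rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the W3C’s actual process or tooling: the function, the feature names, and the implementation names are hypothetical, the default threshold of two comes from the “typically” above, and treating a failing feature that was not marked “at risk” as blocking advancement is an assumption.

```python
def check_exit_criteria(implementations, at_risk, min_impls=2):
    """Sort features into passed, removed, and blocking.

    implementations: dict mapping a feature name to the set of independent
        implementations that interoperably support it (hypothetical data).
    at_risk: set of feature names the working group marked "at risk".
    min_impls: how many interoperable implementations each feature needs.
    """
    passed, removed, blocking = [], [], []
    for feature, impls in sorted(implementations.items()):
        if len(impls) >= min_impls:
            passed.append(feature)
        elif feature in at_risk:
            removed.append(feature)   # at-risk features that fall short get dropped
        else:
            blocking.append(feature)  # assumption: the spec cannot advance yet
    return passed, removed, blocking


# Made-up implementation report.
report = {
    "feature-a": {"impl-1", "impl-2"},
    "feature-b": {"impl-1"},
    "feature-c": set(),
}
passed, removed, blocking = check_exit_criteria(report, at_risk={"feature-b"})
print(passed, removed, blocking)
```

Raising `min_impls` to 5 gives a stricter bar for bodies that ask for more implementations.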
In the WS-I, the Board recently adopted a resolution that in order to advance to WS-I Final Material status, a Profile had to have 5 independent interoperable implementations of the profile. Some have made a stink about this new requirement, but IMO, it is a “Good Thing(tm)” because for a Profile to improve interoperability, it needs to be broadly adopted/implemented.
But, I digress…
The addition of test cases/assertions, as suggested by Arnaud and others above, that can be used to measure conformance to the specification is, IMO, a key to delivering a higher quality standard. I’d like to see more of this across the standards landscape. This is something that I have been pushing for in WS-I Profiles, and something that will manifest itself when we publish the WS-I Basic Profile 2.0 draft (real soon now, I hope).
Without a more specific case, I would say that 99% of the time, “standards for standards” places an undue burden on the process. Generally, the companies involved in standards committees and consortia have too much interest in the possibility of “hijacking” standards, which in certain rare cases is a giant jackpot.
A sad example is Microsoft’s history with the Open Group’s Distributed Computing Environment, including DCE/RPC, which is used by Microsoft in their MSRPC. In fact the D in DCOM (which MS graciously “donated” to the Open Group, without the specifications of those libraries that make it useful) comes from utilization of DCE/RPC.
I stumbled upon this blog when noticing that Consumer Reports has neglected to mention the “open source” ClamWin antivirus in its reviews. I Googled CR and “open source”. This page is one of the top results. In fact, Consumer Reports, an otherwise excellent publication, consistently gears its software reviews for the lowest common denominator. They barely ever mention “open source” products.
One thing that helps is professional support such as ISO editing and management. While there are the usual terror tales of company hijacks, immature specifications seeking status, amateur efforts, etc., these are not typically ISO efforts. ISO doesn’t work in a vacuum. The arrangement I’ve seen that has worked well is when a vendor consortium teams with a professional standards organization. The consortium has participation agreements that specify the IP conditions for submissions and can independently create consortium specifications under those conditions to enable work to move faster. Once the specifications are mature and testable (big emphasis on test suites developed for specifications), they can begin the ISO process. This inner wheel/outer wheel separation of concerns preserves the most options and enables the smoothest administration.
For example, XML binaries are a sore subject for some, given other preferences for open transparent human-readable texts and the claims that XML verbosity and parsing don’t have a significant impact on performance. There are isolated cases where the common wisdom doesn’t apply, and graphics files can be one of these. To work this problem, the Web3DC, responsible for the real-time 3D graphics standards for ISO and in liaison with the W3C, enables X3D with three encodings over a common abstract object model, thus addressing the polar requirements of rendering fidelity and real time behavior. The first two encodings are for Classic VRML (curly bracket and terse) and XML (pointy bracket and verbose but DOM compatible). The third, a binary for cases where verbosity and size are a real performance penalty, just received IS status.
“ISO/IEC JTC 1/SC 24
Information technology—Computer graphics, image processing and environmental data representation—Extensible 3D (X3D) encodings—Part 3: Compressed binary encoding advances status to International Standard”
This has been worked professionally with cooperation from vendors, consortia members and multiple standards groups. Sometimes the cause of a standard is NOT helped by the loudly yelping crowd that is incapable or unwilling to do the due diligence required of a technical standard. It is much easier to go to the blog, put stickers on pages, yelp in the press and use the horde effect (what the wisdom of uninformed crowds of endorphin charged zealots becomes). So I advise groups to go about the business of standards making quietly and directly without the need to get buy-in from the web; in effect, to understand that the commons is uncommon and is, in reality, lots of little information ecosystems with local rules.
Waiting for the web or MS or IBM or Oracle or Sun or Red Hat to incorporate a technology or create a fair standard is like waiting for bears to make your breakfast. It is better to crack the eggs yourself and keep the bears out of the fridge.
In my opinion, the main items missing from the list are:
“Is the standards document well written and free from errors?”
“Was the standard thoroughly reviewed by enough parties before acceptance?”
In an ideal world, these would be unnecessary. A standard ought to be well written and correct in the interest of all parties, and its review should always be conducted with care to result in a good and well-endorsed standard. However, certain recent events suggest that this criterion can be applied to filter out some particularly bad apples which have been forcefully rushed through the system, e.g. for commercial reasons. In fact, this could be given more emphasis than other criteria. I see no reason to tolerate such a thing as a badly and incorrectly described standard, or a standard which has not been subject to proper review before acceptance.
As long as the parties are just ‘reviewing’ and there is some accepted protocol for submitting and responding to comments, that is well and good. When it becomes a process of ladling more requirements in an effort to stall progress, or to submit comments which have little substance but delay a formal protocol/process, it isn’t good.
IOW, it is not what you will tolerate. It is what the standards organization members and the cooperating liaisons such as other standards organizations and the vendor consortia have declared is tolerable. Just as the standard requires a test suite, a process requires a judge. If the judge is the chair and the chair is biased, you get what you get. If the chair is fair but the administrators are biased, you get what you get. That is why transparency and testability are the key requirements. That is why this is different from a network (opaqueness and validity).
Let me add one more vote for compliance tests. One of the better things about the IEEE 1394 spec (from when I was on the committee) was its C-language reference implementation for all of the levels above the physical layer. That’s admittedly not an easy act to follow for anything claiming visual fidelity.
However, I’ll generalize that to a larger principle: “Are all normative aspects of the specification objectively falsifiable?”
If you can’t objectively determine whether an implementation complies with a part of the spec, it’s just waste verbiage and should be left out.
–
D. C. Sessions,
past chair JEDEC JC-16
Compliance tests and testability are crucial. I like Sessions’ idea of falsifiability (Karl Popper defined science that way), but I think the following is more realistic:
1. How much effort is required to certify that a product is fully compliant?
Also, I would add the following:
2. How much effort is needed to implement the standard compared to its scope?
3a. Can the standard be separated into near-orthogonal components?
3b. If 3a, how difficult would it be to replace a component?
3c. If 3a, how difficult would it be to use a component in a separate standard?
4. How mature is the technology covered by the standard? (Immature technologies should not be standardized IMO.)
One approach to making sure that you have objectively verifiable statements is, for a document language, to generate the standard from the schema. For a demonstration of this approach, see http://www.oreillynet.com/xml/blog/2007/08/autogenerating_standards_from.html
One thing that has become apparent from the recent discussions on certain standards is that there is a divide between what is needed to conform to a standard and what is needed to implement a standard. And there is a difference between ‘required’ in the sense of ‘not optional’ (e.g., the usage of the ISO Directives) and ‘required’ in the sense of ‘including mandatory and discretionary behaviours’ (e.g., the legal sense used in some licenses). These gaps are in part differences in domain, but they also lead to confusion and talking at cross-purposes when people on different sides of the divide discuss standards.
So there is a key question “How well does this standard accord to the expectations of people in that it provides the kind of information they expect a standard to provide?” And that question is highly unanswerable, because the people may be smart or idiots, the standard may be aimed at a niche different from them, etc. To judge a standard on mass-market acceptability is to judge a book or film by its blockbuster status: not a very reliable guide. So with standards, we need to identify the target use group (or groups).
The “openness criteria” should not be part of a set of standards *quality* criteria, unless the purpose of the standard in question is, in part, to be open. Some standards are proprietary. In some scenarios, proprietary standards are not inherently evil.
I understand the frustrations over OOXML, but let’s not shoe-horn an agenda into what should be a broader establishment of general practices that should exist in all standardization bodies for ensuring the production of a meaningful, high-quality standard created by a “best practices” process of standardization.
Alphadog, I don’t think it’s as simple as that. A supposedly high quality standard that is also highly proprietary and only available at extortionist royalties fails the test as far as I am concerned. By the same token, a bad but free standard also fails.
I think of openness as one category of how people measure standards. You might not care about that category, but other people passionately do. If it can be analyzed and measured, people can decide for themselves how important it is to them.
Given the number of whitepapers I have seen trying to justify the value of “proprietary standards,” it is clear to me that there is an active agenda at play there as well.
“Alphadog, I don’t think it’s as simple as that. A supposedly high quality standard that is also highly proprietary and only available at extortionist royalties fails the test as far as I am concerned. By the same token, a bad but free standard also fails.”
Fails what test? Looks like two tests to me: quality and openness. The fact that you can easily imagine a high-quality, proprietary standard supports the notion that quality and openness should not be conflated. My point is that openness is orthogonal to quality. Can both co-exist in one standard? Obviously. Can neither? Again, yes. Would I prefer an open standard to a proprietary one, given similar “five-star quality”? Obviously yes.
The problem I see with combining your laudable goals of establishing better processes and practices for generating quality standards from standards organizations with that of generating open standards is that of scope creep, muddled focus and/or limiting your audience.
If you can pull it off, kudos to you though! :)
Openness is not a test or quality of a standard. It is a test of the goal of the organizations that create the standard. Propriety is an issue of IP and IP can only be contractually assigned by agreements that set conditions for participation or contribution. IP is a legal constraint not a characteristic of the artifact.
If one doesn’t get this, one isn’t really studying the problem or the solution. One is begging a question for the sake of creating controversy. Keiretsu such as the Web3D/W3C/ISO use the participation agreements of their consortia to constrain the actions of contributors. ISO can then operate free of controversy over proprietary restrictions. Now it is possible to focus on the qualities of testability and applicability without the distractions of intellectual property.
Rick gets this. Bob knows it. We have to walk the walk or the talk is just market politics as usual. Standards are the outcomes of such political choices.
We live in a time when the reach of the media creates an immediacy of understanding revealing all attempts to obscure understanding. As a result, the semantic and semiotic games we play create cynicism and paradoxically passionate but fruitless debate. If we continue to use our skills to create a fog around these issues for the sake of competitive advantage, we lose both opportunity and credibility in pursuit of the goals of open products and fair markets.