Office XML's IP-Infringement Specter, I: Copyright (long)
- Consult <http://orcmid.com/writings/W050601c.htm> for the current status and electronic copies of the full material.
- This material is at the point where there’s enough about copyright use cases to make a blog entry that others can review. Meanwhile, I will continue with the patent-license case for follow-up in a later blog entry.
Professor von Clueless in the Blunder Dome: Microsoft Cracks Open the Word, Excel, and PowerPoint Formats in XML. I’m excited about the June 2 announcement that the Office XML Open Format will become the default Microsoft Office document format. It is particularly pleasing that support for the format will be retrofitted to Office 2000, XP, and 2003.
I’m also concerned that the Office 2003 XML Reference Schemas license may not cover use cases that are important to me and, I think, that may be even more important to Microsoft. In this lengthy entry, I unwind as much as I can figure out about how the copyright elements of the license can be made to work and where the restrictions of the license seem to pinch too much.
I end with the observation that it’s great that Microsoft is opening up its approach so far ahead of the next Microsoft Office release. There is a sizable window where everyone’s concerns can be addressed before the formats and their license are locked down.
1. Introduction: The Specter Thing
I’m fascinated by the opportunity for other applications and services to interwork with the “OX” formats (my term for the new DOCX, XLSX, and PPTX). The same license also covers the current but non-OX XML formats usable with Excel, InfoPath, OneNote, Project, Research Services (from Office applications to research resources, including Encarta), Visio, and Word (i.e., WordML). The OX formats are judged by their developers to be good enough and rich enough to become the default formats in place of the current DocFiles of key Microsoft Office applications Word, Excel, and PowerPoint. Some of the other XML formats in Office are for specialized usage (e.g., imports into OneNote).
What’s this specter thing? I find myself having a weird emotional reaction (“feeling dirtied” off-and-on) over the license, and it appears to be because its language raises the specter of infringement. By that I mean that I am now wary of the prospect of committing an infringing act. That’s despite the license’s excusing of actions that could be infringing acts in the absence of the license. That’s what I mean about the IP-Infringement Specter.
The prospect of unintended (or IP-ignorant) infringement is the same as it always has been. Yet the fact that Microsoft provides a royalty-free, conditional and limited license that points out the prospect in its terms actually raises my anxiety level. My gut reaction is to keep my distance and not embark on something that would have me be tainted by the license strictures. (This aversion led me finally to destroy the CD that comes with the Shared Source CLI Essentials book, once I realized I could be tainted by examining its contents.) I have no idea how I end up in that mood, but I will bet small sums that I am not the only one who reacts in this way, and some will accompany that with speculations of dastardly Microsoft conduct and perpetuation of property-centered evils.
Because I think OX is a big deal, and I would love to find out that it is safe to play with the formats and the opportunities they represent for novel application, extensions, and variants, I want to slay this demon of mine.
I propose to take the license apart, piece by piece, and satisfy myself that I can work with it. This post deals with the copyright license. The patent license is thornier for me, and I'll address that in a later post.
Microsoft Senior Vice President Steven Sinofsky identifies the intended benefits in his Microsoft PressPass Q&A:
“We have used [XML] as the foundation for the new Office XML Open Format, which is an open, published document format. In addition, we are publishing with it a royalty-free license, so any customer or technology provider can use the file format in its own systems without financial consideration to Microsoft. This will ensure that the new file format can be used by everyone to create, access, and modify documents in this format.”
I have no idea what there is about a document format that requires a license, and that’s all right. There’s not much harm in Microsoft being over-generous and providing a royalty-free license to something that maybe can’t be owned. I as licensee then don’t have to worry about whether or not there is a property right and which bits of the whole happen to be that property. I am very fond of licenses that allow me to remain unworried and have very simple compliance conditions.
Microsoft presents the offer as an important one, so it is useful to find out what it provides for. I’m aligned with the stated outcome: ensuring the ability to use the format(s) to create, access, and modify documents. I am learning, however, that “in this format” is actually limiting in unexpected ways.
My natural inclination is to discover the actual terms of the license, its limitations, and any conditions that must be satisfied. So I have gone looking. Here’s what I make of it. You can check the original sources, review my rationale, and apply your own yardstick.
When I think about licenses having to do with software, my first thought is about copyright and the range of licenses that interest the developer community. I include the range of Creative Commons licenses as well as those that satisfy the Open Source Definition (1.9).
Microsoft reserves all copyright in the specifications of the Office XML formats and of the XML Schema Definitions for those formats.
Along with Microsoft’s copyright notice, there is the grant of a perpetual, non-exclusive, limited copyright license that begins:
Permission to copy, display and distribute the contents of this document (the “Specification”), in any medium for any purpose without fee or royalty is hereby granted, provided that you include the following notice on ALL copies of the Specification, or portions thereof, that you make.
The specified notice is exactly the same as the one that Microsoft uses in its own copies, linking to the same license document. The required-notice exhibit is followed by this additional stipulation, making it clear what is not being granted, too:
No right to create modifications or derivatives of this Specification is granted herein.
The simplest well-known license with comparable limitations (based on copyright alone) is the Creative Commons Attribution-No Derivatives license.
I have no problem with honoring these conditions exactly. Well, the specter comes up and I have to keep reminding myself that it is an illusion. As I work through some of these cases, I notice that the specter is fading.
There are some difficulties when I want to use the material in ways that I think Microsoft wants to encourage and where literal preservation doesn’t work. For those cases, a statement akin to the Creative Commons Attribution-Share Alike license would make me more confident that I don’t need to negotiate a specific license for each such occasion. In a situation where someone isn’t willing to share derivatives, they are no worse off with a share-alike license than with the Microsoft no-derivatives version. Also, because Microsoft owns the copyright, a share-alike provision does not constrain Microsoft’s use of its own material in any way whatsoever. And if Microsoft isn’t willing to accept the share-alike offerings of others, it is no worse off than anyone else in that situation. (For that matter, Microsoft is in a far better position to negotiate alternative licenses than are many smaller operations. If there's a non-specter downside to this for Microsoft, I'm not the one who can say what it is.)
I want to illustrate the practices that are required where these Microsoft-licensed materials are distributed as part of collective/composite works covered by different over-all licenses. I'm drawing on my own practices:
- For the literary aspect of digital materials, I prefer to offer the Creative Commons Attribution license. An example of how I do that is at the bottom of each page here, and the specific practices and their motivation are described in an InfoNote here.
- For software, I prefer to apply the open-source BSD License. An example of such application is here. One motivation for this license has to do with honoring community contributions that preserve the ability of that generous community to make proprietary use of their own work and its upgrades (as in the case of ODMA). I also want to make it easy for people to act in ways that do not raise the specter of copyright infringement. There is more about that in an incomplete license drafting here.
I want people to be able to make use of my works and, by applying very simple practices, to be unconcerned whether or not their use constitutes creation of a derivative work.
I am not lobbying for the adoption of this approach by others. These choices are merely a single instance of the wide range of exclusive rights that copyright holders can exercise as they see fit. I bring up my approach here because it provides a grounded, worked set of comparative examples that I am completely familiar with.
When materials having different licenses are commingled in an electronic packaging, it is easy for a recipient to overlook the additional restrictions that may accompany some of the items. There is a fair amount of carelessness about that in open-source packages (and some commercial ones) that I have encountered. I don't want any recipient of my work to be led astray by how I package materials together. I apply two complementary practices for making differences in licensing of companion materials very clear:
- Materials with different licenses and conditions that must be noticed can be isolated in a way that makes the existence and applicability of the separate conditions clear and unambiguous. One way is to use an embedded container (such as a Zip archive) that contains a separate, well-identified license statement along with links to more-detailed information. I would do this, for example, if I wanted to distribute a selection of the Office XML Reference Schemas that are used in some package of mine that allows derivative works. I would also make sure that the manifest and license statements for the overall package emphasized and identified the portions that carried different licenses, especially more-restrictive ones.
- Substantial restrictions can be made known and offered for review before the related content is accessed in any way. I want to avoid stealth exposures of recipients to materials whose license conditions might be unacceptable to them. I already make package manifests and usage requirements available for independent review before anyone elects to download a package of materials. This same device could be applied for redistribution of a full set of the Microsoft XML Reference Schema materials, for example. I would take the same precautions in making a CD-ROM compilation of materials. I would include the manifest and license restriction information, along with instructions for how to obtain the materials. I would not include the more-restricted material on the CD-ROM at all.
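The first practice, isolating differently-licensed items in an embedded container, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the file names (`MANIFEST.txt`, `restricted.zip`, `LICENSE-NOTICE.txt`) and the notice wording are hypothetical placeholders, not anything the Microsoft license prescribes.

```python
import io
import zipfile

# Hypothetical notice text; a real package would carry the licensor's own notice.
RESTRICTED_NOTICE = (
    "The files in this archive are under a separate, more restrictive license.\n"
    "No modifications or derivatives are permitted. See the linked license terms.\n"
)


def build_package(own_files, restricted_files, notice=RESTRICTED_NOTICE):
    """Return the bytes of an outer Zip whose restricted items sit in an inner Zip.

    own_files and restricted_files map archive names to file contents.
    """
    # Inner archive: the restricted files travel with their own license notice.
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        inner.writestr("LICENSE-NOTICE.txt", notice)
        for name, data in restricted_files.items():
            inner.writestr(name, data)

    # Manifest for the whole package, calling out the differently-licensed part.
    lines = ["MANIFEST", ""]
    lines += [f"{name}: package license" for name in sorted(own_files)]
    lines.append("restricted.zip: SEPARATE LICENSE, see LICENSE-NOTICE.txt inside")

    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        outer.writestr("MANIFEST.txt", "\n".join(lines) + "\n")
        for name, data in own_files.items():
            outer.writestr(name, data)
        outer.writestr("restricted.zip", inner_buf.getvalue())
    return outer_buf.getvalue()
```

The same manifest text could be published on its own, ahead of any download, which is the spirit of the second practice.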
Finally, when I am not authorized to redistribute material (or the recipient is not similarly entitled; the redistribution right is not transferred), I will apply the same methodology that I use in public software-development efforts that rely on available but non-redistributable materials. For example, the ActiveODMA development tree is being organized with a nodist subsection
- that identifies materials that licensees or recipients are not permitted to (re-) distribute,
- that are required in order to verify, duplicate, and rebuild the work, and
- that are freely available (e.g., the Microsoft Visual C++ Toolkit 2003 and the Microsoft Platform SDK February 2003 releases).
The nodist subsection is for instructions on how to obtain the material and install it in a way that works for the open-source ActiveODMA development projects.
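As a rough sketch of how such a nodist listing might be represented and checked, here is a small Python example. The two product names come from the list above; the `obtain` text and the directory names are placeholders I made up for illustration, and the actual ActiveODMA tree may be organized quite differently.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class NodistItem:
    name: str          # what the material is
    obtain: str        # how a builder gets it (placeholder text here)
    expected_dir: str  # hypothetical install location, relative to the tree root


# Product names are from the post; everything else is an assumed placeholder.
NODIST = [
    NodistItem("Microsoft Visual C++ Toolkit 2003",
               "download from the vendor", "nodist/vctoolkit2003"),
    NodistItem("Microsoft Platform SDK February 2003",
               "download from the vendor", "nodist/platformsdk"),
]


def missing_items(tree_root, items=NODIST):
    """Return the nodist items that are not yet installed under tree_root."""
    root = Path(tree_root)
    return [item for item in items if not (root / item.expected_dir).is_dir()]


def instructions(items):
    """Format obtain-and-install instructions; the materials are never bundled."""
    return "\n".join(
        f"- {item.name}: {item.obtain}; install into {item.expected_dir}"
        for item in items
    )
```

Nothing restricted is ever copied into the tree by this code; it only reports what a builder still has to fetch for themselves.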
In this regard, the Microsoft license presents no greater burden for me than any other license that permits redistribution and is materially different from the one the containing contribution is under. I have the same difficulty packaging software that is created with the GNU General Public License (GPL) in my distributions as I do packaging material under the Microsoft Office XML Reference Schema license.
The prohibition on creation of derivative works is a different problem. There is nothing I can do, short of negotiating a separate license (one that would likely not be transferable to recipients of my work) if I have a compelling interest in creating a derivative work. Unless there is some sort of share-alike provision, I can't pass it on and I would be reluctant to engage in such an arrangement. I would have to step out of the Open Source Definition and I am unwilling to do that.
There is no bind here without a compelling interest in making derivative use (including derivative works). In the case of OX and the Office XML Reference Schemas, I think there are three important cases:
- Appeal to elements of the Office XML Reference Schemas, and their namespaces, in derivative metadata based on Office XML elements. Search results delivered as XML (non-OX) documents are a simple case. Where OX-format elements are being provided in the search result, it would be great to identify them as such in the (software-derived) schema for the search result itself. I think it would take extraordinary, counterproductive contortions to avoid creation of a derivative work. Interoperability and coherence in reliance on OX would be undermined.
- Support for micro-content based on OX-format elements. There is excitement about the ability to exploit the internal structure of OX documents in applications involving semi-structured and micro-structure content. This comes through in Brian Jones's Channel 9 presentation and on his blog and in the announcement materials. Again, one wants to preserve the OX nomenclature, identification, and schema elements in preserving the nature, formatting, and distribution of such micro-content material. It is difficult to avoid having this be seen as creation of derivative works in persistent, recorded forms.
- Creation of Specialized, Custom Application-Specific Documents. Users of Microsoft Office applications are accustomed to the idea of templates as a device for casting documents in particular forms that fit into work processes and specialized activities. The introduction of OX formats creates a potential for specialized software applications to produce and ingest Microsoft Office-compatible documents that
- have derivative XML schemas that limit and simplify the content,
- have derivative XML schemas that force particular limited structure on documents of the application, and
- introduce application-specific extension elements of the XML in ways that preserve the ability to employ the documents in Microsoft Office under the OX format specifications.
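To make the first case concrete, here is a minimal Python sketch of a search result that carries an Office XML element inside a non-Office wrapper. Since the OX namespaces are not yet published, it uses the Word 2003 XML namespace, which falls under the same license. The `sr` namespace and the `result` and `hit` elements are hypothetical, invented only for this example. Whether producing such a document, or a schema describing it, amounts to creating a derivative work is exactly the question I can't settle.

```python
import xml.etree.ElementTree as ET

# Word 2003 XML (WordML) namespace.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
# Hypothetical namespace for an application's own search-result wrapper.
SR_NS = "urn:example:search-results"

ET.register_namespace("w", W_NS)
ET.register_namespace("sr", SR_NS)


def make_result(source_uri, paragraph_text):
    """Wrap one WordML paragraph (w:p/w:r/w:t) in a search-result element."""
    result = ET.Element(f"{{{SR_NS}}}result", {"source": source_uri})
    hit = ET.SubElement(result, f"{{{SR_NS}}}hit")
    p = ET.SubElement(hit, f"{{{W_NS}}}p")
    r = ET.SubElement(p, f"{{{W_NS}}}r")
    t = ET.SubElement(r, f"{{{W_NS}}}t")
    t.text = paragraph_text
    return result


def wordml_elements(result):
    """List the elements that are identified as WordML by their namespace."""
    return [el for el in result.iter() if el.tag.startswith(f"{{{W_NS}}}")]
```

Keeping the original namespace on the embedded elements is what lets a consumer recognize them as Office XML content, which is the interoperability benefit at stake.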
It strikes me that these are valuable cases that can be claimed to involve the creation of derivative works. It might not be in Microsoft's interest to discourage some or all of these cases. I favor allowance of derivative works with an appropriate weakening of the license's restrictions. In particular, that would provide a zone of safety for developers who are unclear when and whether a use constitutes creation of a derivative work.
Whatever the concerns that Microsoft has in this area, I notice two important opportunities that result from the early announcement and discussion:
- It is easier to relax a narrow license than it is to retract a too-liberal one. There is opportunity to evaluate relaxations in the copyright license that address Microsoft concerns around preservation of the integrity of the OX formats and any other concerns that are identified. Even experimental, limited relaxations can be undertaken simply to confirm where the ideal fit might be and to mitigate speculative risks.
- There is ample time to explore and experiment with the OX formats, identifying the important use cases that are in Microsoft's self-interest to encourage by liberalized copyright licensing terms.
Microsoft proposes to make the OX formats available under the same terms as the existing (non-OX) Office XML documentation and schemas. Those materials and their license information can be accessed and downloaded in a couple of ways:
- Office 2003: XML Reference Schemas (version 4 of 2005-01-14 in a 5.56 MB Windows Installer file, xsdref.msi). The package installs on Windows 2000 SP3 and later. There is no registration requirement. No EULA is presented. Click-through acceptance of a license is not required. Installation on my Windows XP configuration adds a folder at “Start | All Programs | Microsoft Office 2003 Developer Resources | Microsoft Office 2003 XML Reference Schemas” which organizes 24 XML Schema Definitions (in folders of .xsd files) along with one documentation file in Microsoft HTML Help (.chm) format.
- Office 2003 XML Reference Schemas License Overview (undated version accessed 2005-06-04T05:01Z). This provides a link to the 2003-11-17 legal-notice instructions for use of the schemas and their related specifications, to the patent license agreement, and to an undated license-update memorandum from Microsoft Senior Director of XML Architecture, Jean Paoli. The Paoli memo provides some minor clarifications on the license without material impact on the aspects that have my attention.
- Office 2003 XML Reference Schemas Product Information page (undated version accessed 2005-06-04T05:23Z). In addition to other elements, this “portal” onto the reference schemas provides links to an overview of the reference schemas, to a Frequently-Asked Question (FAQ) page on the reference schemas, and to a Jean Paoli memorandum celebrating the announcement of the Microsoft Office XML Open Formats (what I have been calling OX here).
- Word 2003 XML Software Development Kit (January 2005 version). This is an on-line SDK reference on the use of the Word 2003 XML format. There is a version available for download (version 0205 of 2005-02-04 in a 3.85 MB Windows Installer file, wdxmlsdk.msi).
At the bottom of the Office XML Software Development Kit web pages and the HTML Help pages, there is a consistent notice:
- A Microsoft Corporation copyright notice.
- Accompanying text:
“Permission to copy, display and distribute this document is available at: http://msdn.microsoft.com/library/en-us/odcXMLRef/html/odcXMLRefLegalNotice.asp”
In the XML Schema files themselves, there is also license text. The text is the same as that in the reference HTML version in the on-line MSDN Library linked just above. The following declaration is typical (from visio.xsd dated 2004-03-04–10:39):
<xsd:annotation> <xsd:documentation> Permission to copy, display and distribute the contents of this document (the “Specification”), in any medium for any purpose without fee or royalty is hereby granted, provided that you include the following notice on ALL copies of the Specification, or portions thereof, that you make:
Copyright (c) Microsoft Corporation. All rights reserved. Permission to copy, display and distribute this document is available at: http://msdn.microsoft.com/library/en-us/odcXMLRef/html/odcXMLRefLegalNotice.asp?frame=true.
No right to create modifications or derivatives of this Specification is granted herein.
There is a separate patent license available to parties interested in implementing software programs that can read and write files that conform to the Specification. This patent license is available at this location: http://www.microsoft.com/mscorp/ip/format/xmlpatentlicense.asp.
THE SPECIFICATION IS PROVIDED "AS IS" AND MICROSOFT MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THE SPECIFICATION ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.
MICROSOFT WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR RELATING TO ANY USE OR DISTRIBUTION OF THE SPECIFICATION.
The name and trademarks of Microsoft may NOT be used in any manner, including advertising or publicity pertaining to the Specification or its contents without specific, written prior permission. Title to copyright in the Specification will at all times remain with Microsoft. No other rights are granted by implication, estoppel or otherwise.
</xsd:documentation>
</xsd:annotation>
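Since the notice has to accompany every copy of a schema file, a redistributor might want a quick check that a copy still carries it. The Python sketch below is one way to do that, under my own assumption that matching two phrases from the notice is a reasonable proxy for "including the notice". It is a convenience check, not a legal test.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Phrases taken from the notice quoted above; matching on them is an assumption.
REQUIRED_PHRASES = (
    "Copyright (c) Microsoft Corporation. All rights reserved.",
    "Permission to copy, display and distribute this document is available at:",
)


def documentation_text(xsd_source):
    """Collect all xsd:documentation text, with whitespace runs collapsed."""
    root = ET.fromstring(xsd_source)
    docs = root.iter(f"{{{XSD_NS}}}documentation")
    return " ".join(" ".join("".join(d.itertext()).split()) for d in docs)


def has_required_notice(xsd_source):
    """Return True if every required phrase appears in the schema's annotations."""
    text = documentation_text(xsd_source)
    return all(phrase in text for phrase in REQUIRED_PHRASES)
```

Running this over each `.xsd` file before packaging would catch a copy whose annotation block was stripped by some tool along the way.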
There are a variety of on-line resources with coverage of the Microsoft Office XML Open Format (OXOF or OX for short, and yes, I have been known to read Piers Anthony).
Brian Jones: Office XML Formats. Discussions about XML in Office and the Microsoft Office Open XML File Formats. An MSDN Blog.
Microsoft Makes XML the File Format for the Next Version of Microsoft Office. Q&A: Senior Vice President Steven Sinofsky explains how making XML the default file format is likely to help customers cut costs for data storage and bandwidth, improve security and boost data recovery. Microsoft Press Pass, June 1, 2005. The sidebar provides useful links to available white papers and the initial press release.
Brian Jones – New Office file formats announced. MSDN Channel 9 video interview by Robert Scoble. 2005 June 1, 21:18 pdt. This video conveys much of the excitement of the developers for what is being accomplished in this work.
Office 2003 XML Reference Schemas Frequently Asked Questions. Microsoft Office System Product Information, Office XML Reference Schemas Licensing. 2005 January 17 update.
The Future of Microsoft Office: Be the First to Know. Microsoft Office Online. Being set up in time for the 2005 June 6 kick-off of the Microsoft Tech-Ed 2005 Conference, this site will accumulate more material over time. The available white papers and an extensive FAQ on the new format are already available.
[updated: 2005-06-05T15:08Z The page needs to be republished in order to enable anonymous comments and correct the form of datestamp on comment postings. I promise not to do this again, and to complete this in smaller pieces.]
posted by orcmid
at 6/5/2005 12:49:53 AM