Almost a year ago, Google seemed to have hammered out a settlement about its prior scanning of books that were covered by copyright as part of its Google Books effort. Google had been sued by individuals and groups representing authors and publishers. The settlement would have established payments, a clearinghouse for handling who gets what, and fees for book portions already viewed. I spelled out what all this means in an earlier article, dated 2008-10-29.
At the time, I thought the settlement could be a win for all the parties involved because copyright holders would establish a right of control over digitizing copies of their works, and Google would widely disseminate many books that are otherwise unavailable without significant effort. Authors and publishers would accrue additional incremental revenue, earning money from page views and downloads, but also likely additional print book sales.
But three big issues have emerged in the intervening months that might scuttle the arrangement, and I might support the settlement collapsing.
Google Adopts Orphaned Works That Actually Have Parents -- The settlement gives Google what is essentially a court-provided license that allows the firm to scan all books without explicit opt-in permission from the authors and publishers who own the rights. Google can also make snippets and downloads available, while charging fees, collecting its share, and paying the remainder to a book rights clearinghouse.
Authors and publishers involved in the settlement may have an option to include or exclude specific works or their entire oeuvres or catalogs. However, for the vast majority of books covered, those who own rights will have to opt out explicitly, if they so wish. (Congress has carved some compulsory licenses out of copyright law that require copyright holders to allow certain uses of their works, but those licenses are limited and narrow.)
This includes all orphaned works, which are out-of-print titles for which the rights holders are not readily known even though the works remain under copyright protection.
Marybeth Peters, the Register of Copyrights, a job I was unaware existed until a few days ago, said in testimony before the U.S. House of Representatives Committee on the Judiciary on 10-Sept-2009 (the testimony is all available in PDF form):
Although Google is a commercial entity, acting for a primary purpose of commercial gain, the settlement absolves Google of the need to search for the rights holders or obtain their prior consent and provides a complete release from liability.
While a court can't forestall any lawsuits by authors and publishers who claim such orphaned (or just ignored) works, the settlement would make it far more difficult for any individual or small publisher to sue Google for scanning and selling works without permission.
Exclusive Rights to Scanned Works -- If the settlement is approved, then Google winds up lucking into a deal that would be impossible otherwise: getting an agreement with thousands of parties all at once, and receiving a copyright exemption - albeit from a court that wouldn't seem to have the right to grant such an exemption.
As an Amazon exec, Paul Misener, said at the congressional hearing mentioned above:
If a potential competitor to Google engaged in "massive copyright infringement" in the hopes of getting sued by the same plaintiffs in the Google litigation and making the same settlement deal, why would the rightsholders settle on the same terms when they already have a distribution partner and would stand a reasonable chance of obtaining massive statutory damages?
Even massive firms like Microsoft, Sony, or Amazon would have to work on thousands or tens of thousands of individual deals to achieve similar results - without the inclusion of orphaned works. Microsoft, lest we forget, had its own massive book-scanning project at one time, but worked nearly entirely with titles that fell outside copyright protection, or where the issues were at least murky, providing it some cover. Even so, Microsoft dropped the project, probably because the works it scanned just didn't offer enough potential value given the high cost of scanning and processing.
The exclusivity means that Google would have de facto ownership of the idea of turning printed books into digitally available versions, offering a library far larger than any potential competitor could create.
The settlement imposes pricing tiers and structures on books offered for searching and download, with the clearinghouse in charge of that. But that puts control of pricing in the hands of the clearinghouse board, which will be dominated by authors and publishers who may use Google's monopoly to set artificially high, non-competitive prices. That's not good for readers, nor for authors and publishers who want alternatives for their works while still making them widely available. (Amazon's Misener described the clearinghouse as "a cartel of rightsholders that, for sales of books to consumers, would set prices to maximize revenues to cartel members.")
Copyright Settlement without Borders -- Finally, this settlement applies just in the United States, and there's a simmering and growing anti-settlement sentiment emerging outside America. Many authors based outside the United States sell book rights to U.S. publishers. The settlement would ostensibly sweep books published under those rights into an agreement that non-U.S. authors had no part in shaping.
Google might feel justified in scanning and offering books that are orphaned and out of print in the United States, even if the rights holders were reachable elsewhere in the world, but not through U.S. agents. Google has agreed not to include any books that are in print in Europe, but that's likely not enough.
Dreaming of an Opt-In Future -- In the little cloud kingdom in which I apparently live, the logical course would be for authors and publishers to choose a different route. Instead of allowing Google to own the digital scans the company makes and make all decisions about dissemination, the settlement should allow Google and any other parties to scan as much as they want, but the ownership of the scans would remain in the hands of the copyright holders and be held in trust (for those that opt in) by a clearinghouse. Whoever scanned the works would receive some kind of compensation or royalty to offset the work performed; otherwise there would be no incentive to continue to scan.
While this might seem insane for Google to hand off its hard-wrought work, the original lawsuit was partly over whether Google had the right at all to scan works to which it lacked copyright, extract text and images, and make snippets available for searching over the Internet. That right was never established, and a settlement could involve a transfer of ownership, while Google would retain its own copies and all the work performed. Google still gets first-mover advantage, and it doesn't have to delete any work it has carried out to date.
The process should also be opt-in, requiring copyright holders to agree to participation. The issue of orphan works is much larger than Google Books. The U.S. Copyright Office and its Register, Peters, have backed legislation that would address orphan works with a clear process in place. Requiring opt-out registration wouldn't suffice.
As Peters said in her congressional testimony, "Under copyright law, out-of-print works enjoy the same legal protection as in-print works. To allow a commercial entity to sell such works without consent is an end-run around copyright law as we know it."
Again, a clearinghouse that was the repository of digital copies could become the central authority for making good-faith efforts to reach authors and others. Peters said, "a compulsory license for the systematic scanning of books on a mass scale is an interesting proposition that might merit Congressional consideration."
This approach would allow any party with sufficient resources to register and pay reasonable cost-recovery fees to acquire the original scans for later OCR or other kinds of processing, such as image extraction or even typographical analysis. It would be neither fair nor sensible to require firms to hand over both scans and converted text, as the OCR will be part of the value added, and represents a fair amount of cost.
Different for-profit and non-profit organizations might carve out parts of a large collection, and negotiate different terms for payment based on lending, rental, and sale models, instead of a one-size-fits-all payment model.
The clearinghouse, too, should be an independent non-profit with interests of commercial firms, authors, publishers, libraries, and academic institutions represented; the current clearinghouse planned, while a non-profit organization, is focused on stakeholders without a broader sample of representatives who focus on the public good.
Decisions over pricing should be non-discriminatory, but in the best interests of balancing both mass dissemination and the rights holders' desires (whatever they may be) in earnings. Rights holders could set absolute terms or allow the clearinghouse to set terms based on policies.
Days after I wrote the first draft of this story, Google - in the above-mentioned Congressional hearing on digital bookselling competition - offered its competitors, or anyone, the ability to resell any of the orphaned works it has scanned. Not quite what I envisioned, but an interesting offer. Amazon was dismissive about the offer.
Will any of my blue-sky ideas come to pass? It's hard to tell. The Google settlement will likely not proceed to conclusion in its current form; or, in the event a judge approves, will be subject to additional lawsuits and regulatory involvement worldwide.
The fact that tens of millions of pages of human thought remain so inaccessible seems a crime, but any decision made has to look at the widest benefit to all the parties involved, not just anointing Google king and authors and publishers as a council of nobles.
[Disclosure: The Authors Guild is a party to the settlement, and I have been a member of the guild for several years. Because of a number of the guild's recent actions, including its support for this settlement, I have chosen to let my membership expire this year.]