How Google Book Search transformed from impossible to inevitable

Google Digitization signs are all over the Michigan engineering library. (Photo credit: Wikipedia)

In a widely reported copyright fair use decision, Judge Denny Chin ruled that the Google Books program constituted fair use, denying claims of the Authors Guild that the scanning of 20 million library books and posting snippets of those works online infringed the rights of authors.

The litigation history reflects the transformation that has taken place on the internet in the past decade. In 2004 Google entered into agreements with several universities, beginning with the University of Michigan.

Google began the process of digitizing books at the nation’s great libraries, starting at the University of Michigan, the alma mater of company co-founder Larry Page. “Even before we started Google, we dreamed of making the incredible breadth of information that librarians so lovingly organize searchable online,” said Page. A 2005 lawsuit resulted in three years of negotiation and a proposed settlement in 2008. That settlement collapsed amid antitrust concerns and questions about whether the class representatives fairly represented the plaintiffs’ sub-classes.

As the Google Books program evolved, two discrete projects operated. In the Partner Program “works are displayed with the permission of the rights holder.” The rights holders had the ability to opt out of the scanning, but in 2012 the Association of American Publishers settled with Google. According to the decision, “As of early 2012, the Partner Program included approximately 2.5 million books, with the consent of some 45,000 rights holders.” The participation suggests an industry voting with its feet.

Under the publisher agreement, Google stopped displaying ads with the publisher’s books. In turn, the publishers provide Google with the books. This settlement, even more than the two district court decisions, effectively ended the dispute – leaving the two lawsuits as mop-up activities.

In the HathiTrust litigation, Judge Harold Baer determined that Google’s Library Project partners who comprised the HathiTrust partnership were entitled to fair use protection for the digitization of the 20 million volumes copied and used by the libraries. The decision highlighted the benefits to visually impaired students and researchers, who gained access to content not previously available through audio readers or braille; the benefits of digital search functionality; and the importance of protecting the library collections from physical harm and erosion.

In both opinions, the courts highlighted the new research opportunities created by the digital database:

Mass digitization allows new areas of non-expressive computational and statistical research, often called “text mining.” One example of text mining is research that compares the frequency with which authors used “is” to refer to the United States rather than “are” over time. Quoting the brief of the Digital Humanities amicus, “it was only in the latter half of the Nineteenth Century that the conception of the United States as a single, indivisible entity was reflected in the way a majority of writers referred to the nation.”

The Google decision followed the same path, highlighting the benefits of digital search, the limits placed on commercial exploitation by Google, and the pro-market effects agreed to by the publishers. “Google Books expands access to books.” With this simple sentence, the court highlights the essence of the eight years of litigation. In looking at the transformative nature of the fair use test, the court explained, “Google Books does not supersede or supplant books because it is not a tool to be used to read books.”

The court does not discuss the tremendous value the Google Books program provides to the search engine, speech recognition and other algorithms operated by Google. It also dismisses the intermediate copying as a necessary step in enabling the research and archival functions. But it does highlight that Google “does not run ads on the About the Book pages that contain snippets” and that Google “does not engage in the direct commercialization of copyrighted works.”

Google’s settlements and decisions not to commercialize the Google Books program likely tipped the scales with the publishers and may have strongly influenced the courts. Unlike Judge Baer, Judge Chin does not even discuss the potential to license the digitized database to Google. Baer rejected the potential to license the database as speculative. Moreover, since new works are added by voluntary participation with the publishers, the licenses for new works are included.

The decision appears to be a simplistic fair use summary that could lead casual observers to wonder why it required eight years of litigation. But changes to the conduct of both parties are what really led to this simple decision. Google adapted its behavior to limit its commercialization of the works. Publishers shifted their position from demanding opt-in, ex ante control to recognizing that the opt-out partnership met their needs. Eight years of experience did not produce significant evidence that authors were harmed by snippet searches replacing library purchases of academic texts.

In addition, the role of digital texts has changed. The Amazon Kindle and Apple iPad have paved the way for a fundamental shift in the relationship authors have with electronic texts. Market forces proved that Google correctly anticipated a highly restructured book industry. Google was only one of the players bringing about this change.

Both the HathiTrust litigation and the Authors Guild v. Google litigation will likely be appealed, but there is little appeal in undoing the transformations to publishing that the Google Books program began.
