Protecting and Licensing Internet Content Databases
Eric Goldman
Marquette University Law School

1. Introduction
- The challenge of protecting non-copyrightable data in a digital era
- Information aggregators and scraping, harvesting and extraction
  - The age-old build v. buy question; but here building means constructing a way to steal
- For complete protection, clients need to consider law, technology and business models
  - Licensing requires foresight

2. Legal Protection: Copyright
- Copyright protects original works of authorship, not facts or ideas
- Some cases find copyright in data that is a product of judgment (e.g., CDN v. Kapes)
- Even if individual items aren’t copyrightable, should be able to protect the compilation (selection, arrangement, coordination)
- Ways to “manufacture” copyright protection:
  - “Meta” info (classifications/taxonomies)
  - Software for formats or transfers
  - Copyright management information (17 U.S.C. § 1202)

3. Legal Protection: Hot News
- Misappropriation of intangible information is usually preempted by copyright
- But the hot news doctrine may apply when:
  - Information is generated/collected at some expense
  - Information is highly time-sensitive
  - Defendant free-rides on plaintiff’s efforts
  - Defendant’s use directly competes with plaintiff
  - Free-riding reduces production incentives so as to substantially threaten production
- Examples: Headlines, scores, weather, prices?

4. Legal Protection: Contract
- Contracts can provide excellent protection (except against after-acquirers)
- Online formation: mandatory non-leaky clickthrough
  - Bootscreen process should work
  - Other placement can work if notice and call to action are done carefully
- Subject to all standard contract defenses
  - Incapacity, unconscionability, public policy

5. Legal Protection: Trespass
- Protect information by protecting the servers
- Trespass:
  - Use or intermeddling
  - Dispossession, impairment, deprivation or harm
  - Notification and self-help?
- Computer Fraud & Abuse Act:
  - Accessing protected computer without authorization (or in excess of authorization)
  - Taking info or causing damage
- Proactive steps: onsite notice, email notice, robot exclusion headers, IP address blocks
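
One of the proactive steps above, robot exclusion, can be illustrated with a short sketch. It assumes that "robot exclusion headers" refers to the Robots Exclusion Protocol (a robots.txt file) and uses Python's standard urllib.robotparser to show how a compliant robot would read such a file. The robot name, paths and URLs are hypothetical. An exclusion file is only a request: it gives notice but does not technically stop a robot that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a database publisher might serve. The robot name
# and paths are illustrative assumptions, not taken from the outline.
ROBOTS_TXT = """\
User-agent: ExampleAggregatorBot
Disallow: /

User-agent: *
Disallow: /database/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if a compliant robot with this name may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named aggregator is excluded from the whole site.
    print(is_allowed("ExampleAggregatorBot", "https://example.com/about.html"))
    # Every other robot is excluded only from the database pages.
    print(is_allowed("PoliteBrowser", "https://example.com/database/prices.html"))
    print(is_allowed("PoliteBrowser", "https://example.com/about.html"))
```

Naming a specific robot in the file, as in this sketch, may also help document that the operator was told its access was unwelcome.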

6. Non-Legal Protections
- Anti-robot techniques:
  - IP address blocks; exclusion headers
  - Dynamically-created pages
  - Password protection
  - Monitor data served; limit amount served to any one user
- Encryption envelopes
- Provide custom interface rather than licensing entire database
- Sell freshness/currency
- Sell organizing info/implementation ease
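
Two of the anti-robot techniques above, IP address blocks and limiting the amount served to any one user, can be sketched as follows. This is a minimal illustration, not a production design: the class name ServingMonitor, the blocked network, the quota and the window length are all assumptions chosen for the example, and the outline itself specifies no thresholds.

```python
import ipaddress
import time
from collections import defaultdict, deque

# Illustrative values only; none of these come from the outline.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation range
MAX_RECORDS = 100      # records any one user may receive per window
WINDOW_SECONDS = 3600  # length of the sliding window, in seconds


class ServingMonitor:
    """Tracks records served per user and refuses requests over quota."""

    def __init__(self, max_records=MAX_RECORDS, window=WINDOW_SECONDS):
        self.max_records = max_records
        self.window = window
        self._log = defaultdict(deque)  # user id -> (timestamp, count) entries

    def allow(self, user_id, ip, requested, now=None):
        """Return True if `requested` records may be served to this user."""
        now = time.time() if now is None else now

        # IP address block: refuse anything from a listed network.
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in BLOCKED_NETWORKS):
            return False

        # Drop log entries that have aged out of the sliding window.
        entries = self._log[user_id]
        while entries and entries[0][0] <= now - self.window:
            entries.popleft()

        # Per-user limit: refuse if this request would exceed the quota.
        served = sum(count for _, count in entries)
        if served + requested > self.max_records:
            return False

        entries.append((now, requested))
        return True
```

A caller would check monitor.allow(user_id, ip, n) before returning n records. The per-user log also gives the site a record of who received what, which supports the "monitor data served" step.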

7. License Grants
- What intellectual property is being licensed?
  - Copyright
    - Software, entire database, taxonomy?, teaser portions?, individual items?
    - The challenge of weak collection practices
  - Trade secret
    - Software, proprietary codes, usually NOT the entire database or individual items
  - Trademark
    - Logos
- Redistribution, co-branding, framing and content serving
- “Derivative works” (edits, summaries, abridgements and commingling)
- Display rules
- Post-termination rights
  - Replacement data
  - Replacement taxonomy
  - Compliance enforcement?

8. Other Licensing Issues
- Transfer protocols and service levels
  - Data dump (electronic or physical), on-demand calls or joint page serving; sales tax implications
  - Data refreshing/caching
- Anti-scraping obligations
- Pass-throughs to end user
  - Contract restrictions against extraction
  - Liability disclaimers
- Indemnity
  - Being the cheese in a sandwich
  - 47 U.S.C. § 230