We’re looking at data rights and how ownership intersects with Big Data. First, we looked at what data is, and why source and context matter. Then we looked at what you cannot own to understand the tension between what’s unique and ownable, and what’s common knowledge or just needs to be available to everyone so the system can work.
In this post, we explore intellectual property rights and fair use because those are the laws currently being applied to technology. Next, we’ll explore ownership principles in connection with data use, applications, access and sharing.
Things You Can Own
Most of what you can own is either land or things. The law distinguishes between real property (land) and personal property (things); each has its own set of laws.
Then there’s intellectual property, which is neither animal, vegetable, nor mineral. But even intellectual property is based on tangible things, with limited copies, and methods that develop slowly over time. Data, and the things you do with it, are intangible, with potentially infinite copies that change constantly. Trying to attach property rights to a moving target is kind of like trying to own ocean waves. But intellectual property is what we’ve got to work with, so far anyway.
Intellectual Property
Intellectual property comes in two main flavors: industrial and artistic.
Industrial rights are generally patents and trademarks that apply to inventions, scientific discoveries, and commercial use of trade names and logos. The purpose of industrial rights is to protect against unfair competition, and to give the inventor or first creator an initial protected right (for patents, usually 20 years) to make money from the creation. At the rate technology is changing, 20 years might as well be forever.
Artistic rights are protected by copyright — including print, music, plays, artwork, film and recordings, and digital works such as computer programs or databases.
In order to own intellectual property, you have to be able to identify what it is. This is easy with a painting, or the recording of a performance, or the design for a flying car. It’s much harder when what you are trying to own is hard to pin down, takes new forms, and changes constantly. “How do you catch a cloud and pin it down?”
A great example is tweets. A tweet is a series of words in a particular order, easily attributable to the person who wrote it. So it seems like tweets should be well suited to copyright. And they are, in theory. But in practice, tweets are at most 140 characters long, and many people tweet the same words in the exact same order, especially when they are tweeting the title of a blog post with a link. Other tweets coincidentally have the same words in the same order because people talk about current events and make the same comments. So it is hard to identify a tweet as owned by a particular person when it’s so easy for many people to come up with the same one.
Fair Use
When someone owns a copyright, you have to get permission to use the copyrighted material (or immaterial, if it’s digital; even our language is based on tangible things). Depending on the value and the use, sometimes you have to buy it. Other times, asking nicely is enough.
Then there’s fair use, which allows you to use part of a copyrighted work without permission. So, you can quote an excerpt from a copyrighted work. (But you still need to give attribution by saying who created it to avoid plagiarism.) Other fair use includes using a portion for news reporting, or for illustration of a point in teaching or other educational purposes.
What is an excerpt or a portion depends on what’s reasonable under the circumstances (a classic legal standard that is difficult to apply, and therefore keeps lawyers and courts in business).
It’s pretty straightforward to figure out how to quote a few sentences from a blog post, or even several paragraphs from a book. But how do you excerpt a tweet without destroying its meaning? You can’t. And if anyone quoting it has to use the whole thing, then any copyright is practically meaningless.
So even information that seems like it can be copyrighted can be really hard to own as a practical matter.
Intellectual Property and Digital “Property”
One of the first big tests of copyright and digital photographs involved the thumbnail images that come up when you do an image search on Google. You get the whole image. It’s smaller, but you still see the entire thing.
How do you excerpt a photograph?
The case was Perfect 10 v. Google. Perfect 10, a porn site (the kind of site that makes the most money online with photographs), sued Google for copyright infringement over thumbnail images. (No, I don’t know what the search was.) The court decided that thumbnails are an excerpt of the full-sized image. It also said that a search response with the images “transformed” them from artistic expression to indexing or information retrieval. This is a stretch, since the images were not really an “excerpt,” and the website was also selling thumbnail-sized images for use on cell phones.
So what really happened was the court looked at how the internet functions, and how search operates, and realized that if it found a copyright infringement, it would really mess up how Google and the internet work.
But it also means that indexing or data retrieval may not be something you can own.
Another factor for the court was that Google never stored the full-sized images on its computers. The case was decided in 2007. The opinion has an excellent description of how the internet works, except that it’s five years later, and the internet doesn’t really work like that anymore.
The more recent fight over Java APIs between Google and Oracle came out similarly. APIs are instructions for how software components talk to each other. The judge, who knew how to code, went home and tried to do what Google did without using the Java APIs. He decided it couldn’t be done. But instead of giving Oracle the rights to everything that contained those APIs, the court said the code was fundamental to how things work. It was instructions, not content.
So when organized data is functional instead of just content, it’s closer to common knowledge or a standard design that should not be owned by one company.
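To make the instructions-versus-content distinction a little more concrete, here is a minimal sketch in Python (not Java, and not code from the case). Every name in it (Stack, ListStack, LinkedStack, reverse_words) is made up for illustration. The Stack class plays the role of the API: it only says what operations exist. The two classes after it are independent pieces of working code that follow those same instructions.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple


class Stack(ABC):
    """The 'API': names and signatures only, no working code."""

    @abstractmethod
    def push(self, item: str) -> None: ...

    @abstractmethod
    def pop(self) -> str: ...


class ListStack(Stack):
    """One hypothetical implementation, backed by a Python list."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def push(self, item: str) -> None:
        self._items.append(item)

    def pop(self) -> str:
        return self._items.pop()


class LinkedStack(Stack):
    """An independently written implementation, backed by linked nodes."""

    def __init__(self) -> None:
        # Each node is an (item, next_node) tuple; None marks the bottom.
        self._head: Optional[Tuple] = None

    def push(self, item: str) -> None:
        self._head = (item, self._head)

    def pop(self) -> str:
        if self._head is None:
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item


def reverse_words(stack: Stack, words: List[str]) -> List[str]:
    """Caller code written only against the API, not any implementation."""
    for word in words:
        stack.push(word)
    return [stack.pop() for _ in words]
```

The caller, reverse_words, works the same with either implementation because it depends only on the shared instructions. That is roughly why the declarations look more like a common standard than like one company’s content: everyone who wants their software to talk to the same callers has to use the same names in the same way.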
This tension between information and functionality is going to be at the core of whether or not someone can own data.
Here’s the rest of the series on data ownership:
- Who Owns Data 1: Overview
- Who Owns Data 2: What You Can’t Own
- Who Owns Data 3: Intellectual Property
- Who Owns Data 4: Ownership Interests
- Who Owns Data 5: Privacy
- Who Owns Data 6: Data Principles