Saturday, May 2, 2009

Open Source vs. Open Access

I've reached a point in my development project at which I'd like to go ahead and release FromThePage as Open Source. There are now only two things holding me back. I'd really like to find a project willing to work together with me to fix any deployment problems, rather than posting my source code on GitHub and leaving users to fend for themselves. The other problem is a more serious issue that highlights what I think is a conflict between Open Access and Open Source Software.

Open Source/Free Software and Rights of Use

Most of the attention paid to Open Source software focuses on the user's right to modify the software to suit their needs and to redistribute that (or derivative) code. However, there is a different, more basic right conferred by Free and Open Source licenses: the user's right to use the software for whatever purpose they wish. The Free Software Definition lists "Freedom 0" as:
  • The freedom to run the program, for any purpose.
    Placing restrictions on the use of Free Software, such as time ("30 days trial period", "license expires January 1st, 2004") purpose ("permission granted for research and non-commercial use", "may not be used for benchmarking") or geographic area ("must not be used in country X") makes a program non-free.
Meanwhile, the Open Source Definition's sixth criterion is:
6. No Discrimination Against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
Traditionally this has not been a problem for non-commercial software developers like me. Once you decide not to charge for the editor, game, or compiler you've written, who cares how it's used?

However, if your motivation in writing software is to encourage people to share their data, as mine certainly is, then restrictions on use start to sound pretty attractive. I'd love for someone to run FromThePage as a commercial service, hosting the software and guiding users through posting their manuscripts online. It's a valuable service, and is worth paying for. However, I want the resulting transcriptions to be freely accessible on the web, so that we all get to read the documents that have been sitting in the basements and file folders of family archivists around the world.

Unfortunately, if you investigate the current big commercial repositories of this sort of data, you'll find that their pricing/access model is the opposite of what I describe. Both Footnote.com and Ancestry.com allow free hosting of member data, but both lock browsing of that data behind a registration wall. Even if registration is free, that hurdle may doom the user-created content to be inaccessible, unfindable or irrelevant to the general public.

Open Access

The open access movement has defined this problem with regard to scholarly literature, and I see no reason why their call should not be applied to historical primary sources like the 19th/20th century manuscripts FromThePage is designed to host. Here's the Budapest Open Access Initiative's definition:
By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.
Both the Budapest and Berlin definitions go on to talk about copyright quite a bit; however, since the documents I'm hosting are already out of copyright, I don't really think those passages are relevant. What I do have control over is my own copyright interest in the FromThePage software, and the ability to specify whatever kind of copyleft license I want.

My quandary is this: none of the existing Free or Open Source licenses allows me to require that FromThePage be used in conformance with Open Access. Obviously, that's because adding such a restriction -- requiring users of FromThePage not to charge for people reading the documents hosted on or produced through the software -- violates the basic principles of Free Software and Open Source. So where do I find such a license?

Have other Open Access developers run into such a problem? Should I hire a lawyer to write me a sui generis license for FromThePage? Or should I just get over the fear that someone, somewhere will be making money off my software by charging people to read the documents I want them to share?

10 comments:

Sharon said...

You could look at a Creative Commons licence, since they have an option for 'no commercial exploitation'. I know they don't recommend using CC for software source code, but they do at least provide a model you can borrow.

sgillies said...

Ben, don't mess with freedom zero. It'll cost you users, including the really passionate open source users a project needs. Best you can do is teach about and recommend OA through your software and build ties to OA communities.

gavin said...

I agree with sgillies. Freedom Zero -- to use the program for any purpose -- is as crucial as the other tenets of free/open source software.

Especially given the ability to modify and redistribute, it's hard to predict what uses people could find for your contributions down the road. Locking them to a particular use is counterproductive to that.

In addition, we can imagine how exceptions to Freedom Zero don't scale. You want to prevent users from using your software to produce closed-access literature; someone else wants to exclude military uses; someone else wants to exclude commercial uses; someone else wants to exclude use by racist organizations; someone else wants to exclude use by religious organizations... and on and on. We develop a patchwork set of restrictions about how "free" software can be used -- restrictions which derivative works (depending on the license) may have to maintain.

That's why the only restrictions on use which any major FOSS licenses have adopted are restrictions on uses that limit the user's freedom to exercise the other rights granted by the license (e.g. to prevent Tivoization).

I'm an advocate both for FOSS and for OA. But adding restrictions to your software's license isn't the way to promote OA.

By comparison, look at the software Open Journal Systems. It's FOSS designed for publishing OA journals. Some users have adapted it to publish subscription journals. But that doesn't harm the intended use. Meanwhile, the developers actively promote OA and work closely with the OA community to make their software work best for them.

If you make software that's useful to the OA community, that's great. If it's also incidentally useful to non-OA users, well, that doesn't harm the OA community. Meanwhile, you can use your soapbox to promote OA, and focus on developing software that's best for your intended uses.

Ben W. Brumfield said...

Sean and Gavin, do either of you have opinions on software released under non-commercial use licenses, as Sharon suggests? It's still a non-free approach, but might be a step I could take to open up the code a bit while I mull this over.

I am (slowly) becoming convinced, but the preponderance of pay-access family history sites out there makes me worry that the momentum is in the wrong direction.

Anonymous said...

"I'd really like to find a project willing to work together with me to fix any deployment problems,..."

I still don't see why you absolutely need to Open Source the code (for now). How would _you_ like to see the code being used as of today?

My opinion:
Find the right collaborators for a project and deploy it. I can see you collaborating with a university (for hosting, credibility, etc). Work out a custom license with them directly, with specific clauses to keep it Open Access. Once the deployment has reached a steady-state, decide what you'd like to do from that point on.

Perhaps when enough universities have jumped on-board with their own Open Access deployments, you can _then_ open-source it.

-waqas

Don Marti said...

You could always release a trademarked build of the software under a EULA, put OA-only into the ToS of the support forum, and have the actual git tree under a conventional Free Software license. Copyright licenses don't give users the right to use your trademarks.

pfctdayelise said...

I think Don has the right idea. Just get your service provider partner to make it a term of *service* to let them publish the transcriptions.

A comparison: Laconica is a free software microblogging platform (alternative to Twitter). http://identi.ca/ is the authoritative instance of that, a public service, which requires that its users license their messages CC-BY. If you aren't happy with that, you can set up and run your own Laconica instance. But the market-first position of identica, plus the valuable network/community effects, mean for most people it's far easier to accept this small imposition than go to the bother of setting up an instance for themselves.

Incidentally Laconica uses the Affero GPL (published by the Free Software Foundation, same as the normal GPL) which is suited to "software as a service" type software, which sounds like it might suit your case as well.

Ben W. Brumfield said...

Thanks to Don and Brianna for your comments. The one problem I see with the identi.ca/Laconica analogy is that with an active user base of around 12, FromThePage is in no way a market leader for hosting historical documents. So there would be little point in one of the historical document providers like Ancestry or Footnote changing their business model just to feature my logo.

That doesn't mean that an authorized/certified provider model won't work, nor that technical provisions couldn't be added to FromThePage that make the software a bit balkier if you tried to run it behind a pay wall. But I can't count on being able to leverage my name or market position for much.

However, Waqas's suggestion -- license the software to non-profits for free (gratis/semi-libre) -- might help me establish that sort of position, with a larger user base and an accepted way of doing things.

These are all great ideas, and I really appreciate all the advice!

Peter Murray-Rust said...

This is a common and difficult problem. I have had occasion to try to do this where I didn't want modifications of the code to be used without agreement, because the results might be "wrong". For example, modified code for a square root routine might simply give the wrong result. I ultimately fell back on community norms (a formal request to the community to behave in a given way).

However, I think there is a better way. Get the software to output files which, by default, include an Open Data (http://www.okfn.org) tag. This requires the data to be Open/libre. And tell the user that, if they wish, they can manually excise this tag. I suspect that the labour and the culture will lead to general compliance.

Ben W. Brumfield said...

Peter, that's utterly brilliant!

I was thinking of doing CC licensing with the PDFs I generate, but it had never occurred to me to do that with the content itself.

I think you've given me the solution: license the software as Open Source and add the appropriate open notices to the display.
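
Here is a rough sketch of what "open by default" could look like. It is written in Python purely for illustration and is not FromThePage's code: the function name, the `include_open_notice` flag, and the notice wording are all made up, and the notice is a plain paragraph rather than any official Open Data tag. The only point is that the notice appears unless an operator deliberately turns it off.

```python
from html import escape

# Hypothetical notice text. The wording is an assumption for illustration,
# not an official Open Data tag; only the URL comes from the discussion above.
OPEN_NOTICE = (
    '<p class="open-notice">This transcription is open data: '
    'free to read, copy, and reuse. '
    '<a href="http://www.okfn.org">About Open Data</a></p>'
)


def render_transcription(title: str, text: str,
                         include_open_notice: bool = True) -> str:
    """Return a minimal HTML page for one transcription.

    The notice is included by default; leaving it out requires
    explicitly passing include_open_notice=False.
    """
    parts = [
        "<html><body>",
        f"<h1>{escape(title)}</h1>",
        f"<pre>{escape(text)}</pre>",
    ]
    if include_open_notice:
        parts.append(OPEN_NOTICE)
    parts.append("</body></html>")
    return "\n".join(parts)


# Placeholder title and text, invented for the example.
page = render_transcription("Sample diary page", "Rained all day <illegible>")
```

Calling `render_transcription(..., include_open_notice=False)` is the "manual excision" Peter describes: possible, but it takes a conscious step.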