So, what is open source, really?

Open source is ubiquitous, and it's a fundamental part of our lives. We may use it every single day, but how often do we ask ourselves exactly what it means, and what it entails, for a piece of software to be open source? Quite rarely, I bet. Indeed, defining what open source software is and tracking down where it all comes from is quite tricky, and it may get confusing, especially since software whose source code we can read goes by several names, like “open source”, “free software” or “source available”. Fear not, though: we are going to clear up some of that confusion right away.

When did "open source" originate?

To make things very simple, the licenses we currently consider to be open source have to respect standards decided by an organization called the OSI, which stands for “Open Source Initiative”. The OSI is a California non-profit, public-benefit corporation with a Board of Directors, and it fundamentally holds the soft power and authority to decide what “open source” is or is not.

Whether the OSI holds a trademark or any other exclusive right over the term “open source”, and whether it deserves one, is the subject of a lot of discussion. Officially, the “open source” campaign was launched in 1998 by a group of people including Eric S. Raymond, a name that will sound familiar to many when talking about open source. ESR is the author of “The Cathedral and the Bazaar”, an essay, later expanded into a book, that to this day is considered one of the most influential and defining pieces of literature ever written about open-source software development. It compares two development models an open source project might follow: the closed, centralized “cathedral” and the open, distributed “bazaar”.

Anyway, this group of people wrote a document called “The Open Source Definition” (OSD for short), which serves as a basic standard for what should be considered open source. The document was based on the Debian Free Software Guidelines, which predate the OSD. However, the group was unable to secure a trademark for the term “open source”, and there is an open debate on whether the concept of open source had already been defined before 1998.

From then on, though, things were anything but smooth, and the OSI as an institution has faced quite a bit of controversy. The main criticism it has faced, time and time again, is a lack of real commitment to freedom. A recent example happened in 2020, when OSI co-founder Bruce Perens left the organization over its acceptance of a new license, the Cryptographic Autonomy License, that many believed to violate basic principles of freedom. At about the same time, founding member Eric S. Raymond was banished from the organization entirely, to the point where he was no longer allowed to subscribe to its mailing list.

From time to time, other pieces of drama about the OSI still surface. A very recent example happened just a few days ago, when OSI Board candidate Luke Faraone was barred from running for the OSI Board of Directors on the basis of a deadline that had not been announced beforehand. This episode generated quite a bit of discussion on the fairness of the election process.

Anyway, aside from the drama, which is likely too complex to cover in one go, let's get back to the original point. Remember the “Open Source Definition” I talked about just a few paragraphs above? To understand what we are talking about, it would be a good idea to take a look at what's inside. This is the specification a software license has to comply with to be considered open source, after all, so it's quite a big deal.

So, what exactly is open source?

The Open Source Definition deals with defining, in practical and objective terms, what criteria a piece of software needs to satisfy to be called “open source” in the first place. This document consists of ten points, which we will go over:

  1. Free Redistribution: To be considered open source, a license may not restrict how or whether the software is redistributed, and it may not require a royalty or fee for doing so. For example, even if you provide a server from which to obtain a copy of the software and its source code, you may not prevent users from making copies and sharing them through other channels.
  2. Source Code: Perhaps the most important part of the definition itself. For software to be open source, the code files from which it was compiled must be available freely and without limitations on their redistribution. The code must also be in the form a programmer would actually work on, so that it can be understood and modified: deliberately obfuscated code is not allowed, and neither is intermediate output such as that of a preprocessor or translator.
  3. Derived Works: It must be possible for anybody to take the source code and create something else on top of it, and the license must allow that derived work to be redistributed under the same terms as the original. For example, the GPL has quite a strict set of rules that licensed software needs to comply with. When you make a derived work from such software (a fork is one common form), you must make sure you stay within those rules.
  4. Integrity of the Author's Source Code: A license may protect the identity of the original work. It may require that modifications be distributed as patch files alongside the unmodified source, or that derived works carry a different name or version number from the original, as long as distributing software built from modified source code remains explicitly permitted.
  5. No Discrimination Against Persons or Groups: This one is pretty self-explanatory.
  6. No Discrimination Against Fields of Endeavor: One of the most important points of the entire document. The software must be usable for any and all purposes, with no restrictions. This point has been quite controversial, and a lot of people wouldn't expect it. For example, if you add a non-commercial clause to your project, which means you dictate that nobody is allowed to make a financial profit from your software, then technically your software is not published under an OSI-compliant license, and it must not be considered open source. The same would happen if you added a clause as simple as “this software must be used for good, and not for evil”. While apparently innocuous, a clause like this seeks to dictate what the software can and cannot be used for, which can turn into a huge problem when the definition is lax and easily malleable. Technically, a piece of open-source software explicitly needs to be usable for any purpose, whether that purpose is making money using the software or creating nuclear weapons.
  7. Distribution of License: The rights attached to the software must apply to everyone it is redistributed to, without those recipients needing to agree to an additional license.
  8. License Must Not Be Specific to a Product: The rights attached to the software must not depend on it being part of a particular software distribution.
  9. License Must Not Restrict Other Software: The license has no say over other software that is distributed along with the licensed program.
  10. License Must Be Technology-Neutral: No provision of the license may depend on any individual technology or style of interface.

As you can see, this definition is quite strict, and it does a good job of ensuring that a license that follows it cannot easily escape the overall spirit of open source.
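To make the checklist a bit more tangible, here is a minimal sketch of how the criteria could be applied to a license. It is a toy model, not anything the OSI actually runs: the restriction flag names (`non_commercial_only`, `no_evil_clause` and so on) are made up for illustration, and only a handful of the ten criteria are covered.

```python
# Hypothetical restriction flags a license might contain, each mapped to the
# OSD criterion it would conflict with. The flag names are invented for this
# sketch; the real review process is done by people reading the license text.
OSD_CONFLICTS = {
    "forbids_redistribution": "1. Free Redistribution",
    "withholds_source": "2. Source Code",
    "forbids_derived_works": "3. Derived Works",
    "excludes_some_people": "5. No Discrimination Against Persons or Groups",
    "non_commercial_only": "6. No Discrimination Against Fields of Endeavor",
    "no_evil_clause": "6. No Discrimination Against Fields of Endeavor",
    "restricts_bundled_software": "9. License Must Not Restrict Other Software",
}


def osd_violations(restrictions):
    """Return the sorted, de-duplicated OSD criteria the given flags conflict with."""
    unknown = set(restrictions) - set(OSD_CONFLICTS)
    if unknown:
        raise ValueError(f"unknown restriction flags: {sorted(unknown)}")
    return sorted({OSD_CONFLICTS[flag] for flag in restrictions})


def looks_open_source(restrictions):
    """A license passes this toy check only if it carries none of the flags."""
    return not osd_violations(restrictions)


# A license with both a non-commercial clause and a "no evil" clause
# trips criterion 6 (reported once).
print(osd_violations({"non_commercial_only", "no_evil_clause"}))
```

Note how both the non-commercial clause and the “good, not evil” clause end up pointing at the same criterion, which is exactly why point 6 surprises so many people.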

Relationship between open source and free software

At this point, it would be perfectly understandable to ask oneself: “I have heard at length about free software, is it the same thing?” Well, the answer to that question is... yes and no. While practically all free software licenses are also OSI-compliant, “free software” is not quite the same thing as open-source software, though some of the differences tend to be more philosophical than practical.

The “Free Software movement” is another take on software whose source code is shared freely. It was not invented or led by the OSI, though: it has the Free Software Foundation behind it, and it was founded by Richard Stallman, a former programmer at MIT.

Unlike open-source software, free software as a concept was very much born in academia. Long before the late 90s, freely shared software was a thing, though it was more popular in academic circles. Software that fits the definition of “open source”, and then some, has existed since the 50s.

At the beginning of computing, sharing the source code behind distributed products was quite the norm. It was only in the 1970s and 1980s that companies such as Microsoft and IBM started to buck this trend, distributing software without the attached source code and even becoming openly hostile towards hobbyists who wanted to study their software. This new standard quickly took hold, naturally, because it allowed software houses to turn higher and higher profits.

Things continued to go this way until something that is admittedly quite funny happened in a lab at MIT. Richard Matthew Stallman, a programmer at the Artificial Intelligence Laboratory at MIT, was growing quite frustrated with the new laser printer that had recently been installed in the lab, a Xerox 9700. The older printer that was in use in the lab came with a copy of the source code behind its software, which Stallman had modified to notify users when their jobs were complete and when the printer had jammed. This was very important back in the day: laser printers weren't the relatively small Brother desktop units we use today, but giant beasts that sat on another floor of the lab and were shared between several people rather than being personal. The new printer, however, did not come with a copy of the source code, so Stallman could not enhance it with the same modification, which compromised productivity in the lab. That was the point at which Stallman contacted the manufacturer, explained the issue, and asked for a copy of the source code. He was denied, because the source code was covered by an NDA and the company was not open to releasing it to its customers.

This was an eye-opening moment for Richard Stallman, who — becoming rightfully very frustrated — began to see what was wrong with the new trend of distributing software, and he wanted to do something about it. Shortly after, he founded the non-profit organization that we still refer to today — the Free Software Foundation.

What initially came out of that organization were two things: first, the concept of copyleft, a legal mechanism to make sure that a piece of software released as free software remains free software through all of its iterations and can never turn proprietary; and second, an initial license to implement this mechanism, called the General Public License, or “GPL” for short. Although it has gone through several revisions, and the version we use nowadays (GPLv3) is not the same as the original iteration, the core concept behind it, and what sets it apart from other licenses, is the same: not only do the ten points about open source software that we talked about above hold for it, but it is also what is often called a “viral” license. A permissive open source license can, and usually does, allow anyone to create a derivative work from a project, credit the original owners, and release the result without also releasing the source code. The GPL fundamentally changes that: if you make a derivative work from a piece of software that is covered by the GPL, and you decide to release it, not only do you have to credit the original authors, but you are also obligated to release it under the same license and to make its source code, improvements included, available to everyone who receives it. This ensures that projects that start under a copyleft license will never turn into closed-source proprietary programs, because the license simply does not allow that.
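The difference between a license that lets derivatives go closed and a copyleft license can be sketched in a few lines of code. This is a deliberately simplified model and certainly not legal advice: the license table is an assumption of mine, each license is reduced to a single `copyleft` flag, and real-world details such as license compatibility between different copyleft licenses are ignored.

```python
# Toy license table. The entries and flags are illustrative assumptions,
# not a legal classification.
LICENSES = {
    "MIT": {"copyleft": False, "open_source": True},
    "GPL-3.0": {"copyleft": True, "open_source": True},
    "Proprietary": {"copyleft": False, "open_source": False},
}


def may_release_derivative(original, derivative):
    """Return True if a derivative of `original` may be released under `derivative`."""
    if original not in LICENSES or derivative not in LICENSES:
        raise KeyError("unknown license name")
    if not LICENSES[original]["open_source"]:
        # In this model, a closed license grants no right to make derivatives.
        return False
    if LICENSES[original]["copyleft"]:
        # Copyleft: the derivative has to stay under the same terms.
        return derivative == original
    # Permissive: any license is acceptable, including a closed one.
    return True


for original, derivative in [("MIT", "Proprietary"), ("GPL-3.0", "Proprietary")]:
    verdict = "allowed" if may_release_derivative(original, derivative) else "not allowed"
    print(f"{original} -> {derivative}: {verdict}")
```

The only branch that really matters is the copyleft one: once a project is under such a license, every path to a closed derivative is cut off.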

As you might have guessed, this is already one of the main things that make free software special. A piece of software that is under a copyleft license, in fact, not only respects all 10 points that any OSI-compliant license is forced to comply with, but goes further. Free software is, fundamentally, open source software, but even stronger. The FSF, in particular, feels strongly about it, and maintains that the open-source movement is off-track and that free software should be considered the way to go.

The main difference, oversimplifying it a lot for the sake of brevity, is that while the Open Source movement only tries to focus on a practical and objective point of view, the Free Software movement deeply cares about user freedom, and has put safeguards in place to ensure that software that starts off free stays free.

The Free Software Definition

Much like the OSI has its own document, the Open Source Definition, the FSF has its own Free Software Definition. Written by Richard Stallman, the Free Software Definition already looks, at first glance, much shorter and more direct than the Open Source Definition. Rather than defining 10 different points, it only defines 4, known as the “Four Freedoms of Free Software”. They are numbered from 0 to 3, in a way that is reminiscent of how programmers count array indices starting from zero.

To be considered free software, a piece of software needs to comply with the following four freedoms, which I will cite verbatim below:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Another thing that is made clear in the document is how the word “free” should be understood. The community has come up with two ways to tell its possible meanings apart, “free as in beer” and “free as in freedom”.

  • A piece of software that is “free as in free beer” is available for free, and it does not require a payment to run, but it is not necessarily free software per se;
  • A piece of software that is “free as in freedom”, or “free as in free speech”, respects the freedom defined by the Free Software Definition. No implication is made on the price.

A program may be either of these things, both, or neither. The FSF explicitly says that a piece of free software does not have to be gratis: it is very much possible to require money to obtain it. Moreover, it is not strictly necessary for the developers to share the source code publicly on some online platform like GitHub or Codeberg. According to the GPL, the important thing is that those who obtain the software are also able to obtain its source code. In practice, keeping the source private like this is rare, and most Free Software has its source code available publicly and freely, even in cases where the precompiled binaries are paid for: the benefits of collaboration and contributions that stem from having the code available publicly are too good to pass up.
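Since the two meanings of “free” are independent of each other, they are easy to model as two separate checks. The sketch below is only an illustration: the `Program` type and the two example programs are made up, and a real license obviously cannot be reduced to a set of four numbers.

```python
from dataclasses import dataclass

# Freedoms 0 to 3: run, study and change, redistribute, share modified versions.
FOUR_FREEDOMS = frozenset({0, 1, 2, 3})


@dataclass(frozen=True)
class Program:
    name: str             # made-up example name
    price: float          # what a user pays to obtain a copy
    freedoms: frozenset   # which of freedoms 0..3 the license grants


def is_gratis(program):
    """Free as in beer: obtaining a copy costs nothing."""
    return program.price == 0


def is_libre(program):
    """Free as in freedom: all four freedoms are granted."""
    return FOUR_FREEDOMS <= program.freedoms


def describe(program):
    gratis = "gratis" if is_gratis(program) else "paid"
    libre = "free software" if is_libre(program) else "not free software"
    return f"{program.name}: {gratis}, {libre}"


# Hypothetical examples: a freeware game you may only run, and a paid tool
# that grants all four freedoms.
freeware_game = Program("freeware-game", 0.0, frozenset({0}))
paid_gpl_tool = Program("paid-gpl-tool", 20.0, FOUR_FREEDOMS)

print(describe(freeware_game))  # freeware-game: gratis, not free software
print(describe(paid_gpl_tool))  # paid-gpl-tool: paid, free software
```

The price never enters the `is_libre` check, and the freedoms never enter the `is_gratis` check, which is the whole point of the beer and freedom distinction.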

To summarize: no, free software and open-source software are not quite the same thing, and the two movements are driven by different levels of idealism and different political ideals. While their definitions overlap significantly, they are very much not the same, and a license that is compliant with the OSI definition is not necessarily also compliant with the FSF's. It's a subtle difference, but an important one. The devil is in the details here. When a piece of software is under a license that respects both specifications, we usually call it “FOSS”, which stands for “Free and Open Source Software”. That's it, that's what this ubiquitous acronym that you keep hearing about really means!

Those insignificant-looking details are, indeed, more critical than they seem. Not-so-pretty things happen when they are not observed.

What happens when software does not respect the OSD or the FSD?

As of late, there has been a trend of releasing software where the source code is distributed along with the binaries, but under a custom license that respects neither the OSI's nor the FSF's specification. Those custom licenses are usually more restrictive, and they put hard limits on what the user can and cannot do with the software. For example, they often forbid making derivative works from the project, effectively making it proprietary software; they put limits on whether and how the software can be distributed; or they make distinctions between personal and professional use.

There are many names for this new trend. The more objective one is “source available software”, a definition that is weaker than “open source” and simply implies that the source code is readable. The more opinionated one is “fauxpen source”, a derogatory name given to this kind of software because it is typically believed to disguise itself as free and open-source code.

While there is a worrying trend of companies trying to move from open source to source available software to increase profits, so far, those companies who have tried it have seen some negative repercussions from it. What ends up happening is that, if the software is relevant enough, the community creates a derivative work (a “fork”) from the latest version of the software that is not encumbered by the new effectively closed-source license and continues development there. A lot of people end up migrating to that new open-source fork, especially since major cloud providers, like AWS, will migrate to that.

Leaving open source behind backfires!

For example, DevOps giant HashiCorp has recently turned its leading IaC platform, Terraform, into source-available, non-free software; but a lot of its original users have since migrated to OpenTofu, a free software fork managed by the Linux Foundation that has already been adopted by a lot of important companies. The move has essentially backfired: in an attempt to increase profits, HashiCorp ended up losing a lot of users and relevance, especially because OpenTofu is maintained by a committee that has the financial resources and the technical capability to keep it running.

Something similar has happened to Redis, a very popular in-memory key-value data store that is commonly placed between the backend code and the main database as a cache, and is also used as a message broker and session store. When Redis moved to a source-available license, it was not much of a problem for its users: many of the companies and people who originally relied on it moved on to Valkey, a FOSS fork maintained by the Linux Foundation.

The same thing happened to the ELK stack by Elastic (Elasticsearch, Logstash, and Kibana), a very useful software suite that implements an entire logging, visualization, vector database, and semantic search solution: after gaining enough adoption, Elastic abandoned the original Apache License in favor of its own custom license, but many users, including AWS, quickly moved their products to OpenSearch, an Apache-licensed fork of Elasticsearch and Kibana. Just a few months ago, Elastic admitted defeat and added the free AGPLv3 license as an option for a portion of its code. Sadly, that was a case of “too little, too late”. The damage is done, and most people did not move back to Elastic, since the fork is as healthy as ever and is under active development.

This model has been more successful with smaller programs that launch with it from the start. An example is GitButler, a popular frontend for Git, which started off with a source-available license and has not been replaced by anything yet.

Conclusion

The world of free and open source software can be quite complicated. I hope you now have a more complete picture of what is out there, and that you truly understand what Free and Open Source software is and why it is so important.