Protecode: Software License Management is Like Any Other Quality Management Process

Mahshad Koohgoli, CEO, Protecode

Someone rightly observed: “All code is guilty, until proven innocent.” Ask the members of any software development team, and they will agree that tracking open source software components that have gone into building a software product, understanding the associated licenses and related legal obligations, and shipping a legally “clean” product to market — all within the stipulated time — is a task easier said than done.

LINUX For You caught up with Mahshad Koohgoli, CEO, Protecode, to gauge the complexity in the open source license landscape, and learn the best practices that both software development teams and independent developers must adopt, if they want to have clean code at the end of the product development cycle.

To begin with, could you tell us how many types of open source licenses there are, and what their related obligations are?

Broadly speaking, we divide open source licenses into two categories: permissive licenses and restrictive licenses. Permissive licenses place very few restrictions on how you modify the code and use the modified version. Good examples are BSD (Berkeley Software Distribution), or New BSD, Apache version 2, and Mozilla Public License (MPL). These licenses don’t put onerous restrictions on users, except that the license generally has to be cited, and a copy of the text of the license needs to be distributed along with the code. BSD, or the modified version of BSD, is also very simple, and any company that’s not looking for copyleft licenses can usually opt for this one.

The other type of license is the restrictive open source license. We call these copyleft licenses, as opposed to copyright licenses. These are also sometimes called viral licenses. A good example is the GNU GPL (General Public License). Copyleft licenses try to keep the code open source, along with any modification of the code, or any product based on it. So if you build on code released under a license such as the GPL, the terms of the license bind you to release your own code as open source under the GPL as well.

This license can potentially be unsuitable for some businesses. For example, if you are operating in an environment where you want to keep your code proprietary, the use of this license may not be in line with the business objectives of your firm. However, sometimes it is fine to use such licenses, when your business is not based on selling the product — when your real value-add comes from the services that you give to support the product, facilitate its installation, or effect changes to the code. That’s a legitimate business too. Most open source companies practice this model. SugarCRM is a good example. There are companies that have built a whole business around supporting SugarCRM.

So, there is no question of what is good or bad when it comes to licenses. The choice depends entirely on the kind of requirements a business has.

You once said, “The Open Source Initiative has an approved list of nearly 70 open source licenses. However, Protecode has catalogued more than 3,000 variations of these licenses.” Why are there so many open source licenses, and so many versions of them? Could you explain this statement?

As regards open source licenses, there is a formal open source organisation called the Open Source Initiative (OSI), which has a formal definition of what constitutes an open source license. Based on this definition, it has recognised certain licenses as legitimate open source licenses. OSI has an approved list of nearly 70 open source licenses. However, Protecode has catalogued more than 3,000 variations of these licenses. People have created these versions in order to suit the specific requirements of their open source projects. However, when you look at the actual usage, there are mainly 6 to 10 licenses that cover 90-95 per cent of open source software.

People who work on software development projects, and who usually adopt open source licenses, modify and change these. This creates a proliferation of different licenses, which in turn becomes more and more complex, requiring one to take legal advice to understand the implications of the changes, and so on. So, as much as possible, I would recommend that development teams adopt one of the OSI-approved licenses.

Many “pundits” like to claim that open source licenses put restrictions on making money from products based on open source software. What’s your take on this statement?

There is a misconception that using open source to build your products can curtail your chances of making money from them. This is absolutely untrue. There is absolutely no restriction imposed by any of the OSI-approved open source licenses regarding what to charge for your software. You can choose to charge, and people can choose to pay or not to pay for your product, whether you are using open source or not.

As I mentioned, based on the open source licenses governing the software product, there will be certain licensing obligations, but none of them are monetary. The obligations are generally very mundane. For example, in some cases you may be required to just cite the license you are using. In addition, you may have to display the license, if your product has a user interface. For example, on your smartphone, you would find a tab where all the software components and their corresponding licenses are listed.

However, copyleft licenses, depending on how you are using them, will require you to open the source code of your software, as in the case of GPL or GPLv3.

One of the things that I have observed is that the licensing terms are difficult to understand for developers, or even their managers. Hence, it is always best to have access to a licensing expert or IP lawyer. While these days we have solutions that do indicate the licensing obligations associated with software, and generate an action-item report to be considered by the project team, this is not meant to replace legal experts.

With the existence of so many versions of open source licenses, doesn’t it make the whole process of tracking the obligations related to these versions an unmanageable task? How can one discover license violations?

I will give you an example to illustrate the level of complexity that is inherent in any software product development cycle. Consider that you have to create a presentation. To do so, you may refer to your previous presentations. You may have a folder of your favourite icons, you may go to the Web, from where you may find a piece of useful text and a picture, and you add your own creativity around it. Once you have added various elements and finished making the presentation, if somebody asks you what’s yours and what’s not, you may not know, as you have not been keeping track of the origin of each element incorporated into your presentation.

Similarly, when it comes to software development, it is practically impossible to keep track of the licenses and related obligations manually, because in almost any software development environment more than one person is involved. Even when one person is developing a project, this task is difficult to accomplish.

However, this doesn’t mean that you shouldn’t leverage open source software code. The best developers in the world don’t write code from scratch — that is just so very inefficient. There is so much software code freely available that you can use, and add your own creativity to it. There is nothing wrong in doing this, as long as you are doing all this in a managed way.

While manually keeping track of different pieces of code is extremely difficult, if not impossible, there are license-management solutions available these days that can help organisations in this task. These solutions work automatically in the background, without disturbing the existing development processes, keeping track of the software components that go into the project, and identifying if they are open source or not. If some pieces of open source code are identified, the solution indicates the associated licenses and related obligations.
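As a rough illustration of what such a solution reports, the sketch below maps each detected component to a license category and a list of obligations. The table, the component names and the `obligations_report` helper are hypothetical and deliberately incomplete; they paraphrase the obligations described earlier in this interview and are not legal advice or any vendor's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, incomplete lookup table: license id -> (category, obligations).
# The obligations paraphrase the interview text; they are not legal advice.
LICENSE_INFO = {
    "BSD-3-Clause": ("permissive", ["cite the license", "ship the license text"]),
    "Apache-2.0": ("permissive", ["cite the license", "ship the license text"]),
    "GPL-3.0-only": (
        "copyleft",
        ["cite the license", "ship the license text", "release derived code under the same license"],
    ),
}


@dataclass
class Component:
    name: str
    license_id: Optional[str]  # None means no open source license was detected


def obligations_report(components):
    """Return {component name: (category, obligations)} for a list of components."""
    report = {}
    for comp in components:
        if comp.license_id is None:
            report[comp.name] = ("no open source license detected", [])
        elif comp.license_id in LICENSE_INFO:
            report[comp.name] = LICENSE_INFO[comp.license_id]
        else:
            # Unknown or modified licenses are exactly the cases that need an expert.
            report[comp.name] = ("unrecognised", ["refer to legal review"])
    return report


if __name__ == "__main__":
    found = [
        Component("json-parser", "BSD-3-Clause"),
        Component("report-engine", "GPL-3.0-only"),
        Component("billing-core", None),
    ]
    for name, (category, duties) in obligations_report(found).items():
        print(f"{name}: {category}; {', '.join(duties) or 'nothing to do'}")
```

A real tool would detect the licenses by scanning the code itself; here the license identifiers are simply supplied by hand.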

Could you give a brief synopsis of your whitepaper on The 8 Step Open Source Software Adoption Process (OSSAP) Guide, which is aimed at making the process of open source software adoption foolproof and transparent?

Based on interactions with over 100 companies about the best practices followed by them during the process of software development, we have devised an eight-step software adoption process, which can make the process of software development free of legal hassles. We call it the open source software adoption process (OSSAP). We have also captured these practices in the form of a whitepaper.

It contains eight steps or practices, some of which we found were being followed by all companies. We have termed these necessary practices. Other practices were followed by only a few companies but were good; we have termed these optional practices. Organisations engaged in software development must follow these practices to ensure they are not violating any of the open source software licenses. [For more, refer to the information given in the box item at the end of the interview.]

What are the hazards of not adopting any license management process?

License management is like any other quality management process. Many companies scan the product just before it is to be shipped out. In case there are issues found with licensing at this stage, the company may have to recall the product, which can waste a lot of productive time — and may even prolong the time to market for a software product. Apart from this, it could lead to legal hassles if the product is shipped without complying with the necessary licensing terms.

Through its Open Compliance Program, the Linux Foundation, along with some other companies, is currently working on a standard for software packages and licenses named the Software Package Data Exchange (SPDX) standard. Since Protecode, too, is a member of this initiative, could you give a few insights related to this standard, and its relevance for software development companies and teams?

SPDX is aimed at formalising, in a standard way, the information about the components of a software package, including details such as a description of what’s included in the package, what third-party content is included (if any), the licenses and copyright ownership attributes of the components, and so on.

The SPDX file, which holds this information, is always meant to travel with the software package. It is almost like a bill of materials or components, the ingredients of something that you buy off-the-shelf. Having this as a standard is a significant boon to managing inventory and compliance in an automated manner.

There are solutions (like the ones that Protecode offers) that can detect the presence of the SPDX file, read it and augment it based on the scanning result, update it if needed, and regenerate an updated bill of materials that can then be distributed with your software.
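To make the bill-of-materials idea concrete, here is a minimal sketch of the "augment and regenerate" step. The field names (`SPDXVersion`, `DataLicense`, `PackageName`, `PackageVersion`, `PackageLicenseDeclared`, `PackageCopyrightText`) are borrowed from the SPDX tag-value format, but the output is a simplified subset and not a complete, valid SPDX document. The functions `merge_scan` and `render_bom`, the dictionary layout and the sample components are assumptions made for illustration only.

```python
def merge_scan(bom_components, scanned_components):
    """Add components found by a scan that the existing bill of materials lacks."""
    known = {comp["name"] for comp in bom_components}
    merged = list(bom_components)
    for comp in scanned_components:
        if comp["name"] not in known:
            merged.append(comp)
            known.add(comp["name"])
    return merged


def render_bom(package_name, components, spdx_version="SPDX-2.3"):
    """Render a simplified, SPDX-style tag-value listing (illustrative subset only)."""
    lines = [
        f"SPDXVersion: {spdx_version}",
        "DataLicense: CC0-1.0",
        f"DocumentName: {package_name}-bom",
    ]
    for comp in components:
        lines += [
            "",  # blank line between package entries for readability
            f"PackageName: {comp['name']}",
            f"PackageVersion: {comp['version']}",
            f"PackageLicenseDeclared: {comp['license']}",
            f"PackageCopyrightText: {comp['copyright']}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    existing = [
        {"name": "zlib", "version": "1.2.13", "license": "Zlib", "copyright": "NOASSERTION"},
    ]
    scan_result = existing + [
        {"name": "libfoo", "version": "0.4", "license": "Apache-2.0", "copyright": "NOASSERTION"},
    ]
    print(render_bom("demo-product", merge_scan(existing, scan_result)))
```

For real use, a document would also need the identifiers and creation information that the SPDX specification requires, which this sketch leaves out.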

The first version of SPDX was released in August 2011. A new version with additional capabilities is on the slate already.

Summing up, what would your advice be to software developers and development firms?

To anyone involved in software development, it is important to accept the fact that the occurrence of third-party open source content in the code cannot be ruled out, as developers don’t write code from scratch. Hence, it is important to make sure you have a policy in your company, in terms of what is acceptable and what is not acceptable, with regard to licenses. This policy should be known by everyone in the organisation.

Apart from this, try to make sure that you have a good and proven software adoption process in place. Code scanning tools should also be made available to developers. They are affordable, and significantly reduce and ease the effort required in identifying the different licenses and related obligations. They create certificates that indicate what components you have in your software. However, these certificates are only indications to prove that the company practices a quality policy related to scanning and identifying the use of licenses in their projects.

The eight-step open source software adoption process
Mahshad recommends an eight-step process for effective open source license management.

  • Establish a software licensing policy: The first step, which is a necessary practice, involves establishing a software licensing policy. Establish what is acceptable for your organisation, and what is not. Typically, determine what kind of licenses or license stamps you believe will be acceptable for this project. List the stakeholders who will define the policy. People from different departments of the business (such as the licensing or legal group, the engineering department, and the business division) can be involved, as they understand the various business requirements. Also, determine what you would do in case a software licensing violation occurs.
  • Software package preapproval: The second step, which can be optional, involves defining a process that allows developers to request that certain off-the-shelf open source software be used in their project. If somebody wants to use a software package from outside, what is the code they are going to use? How are they going to use it? Is it in binary or source code format? Determine how it will be modified, and so on. These requests will be compiled, tagged and reviewed. The package being requested for use is examined, and related licenses are assessed and matched against the established software policy; based on this evaluation, the request is either accepted or rejected.
  • Existing portfolio assessment: The third step, which is necessary, involves establishing a baseline about what you already have in your company. Basically, it means analysing your existing content and making sure that anything that violates your existing software adoption policy gets flagged, and if necessary, removed.
  • Incoming third-party software assessment and regular software assessment: These are the fourth and fifth steps. Both of these, again, are necessary and relate to analysing the content that comes into your company from outside: from contractors, from outsourcing partners, or purchased from a third party. You need to ensure that such sourced content is clean, and complies with the company’s software adoption policy. It is wise to analyse code automatically to make sure that there are no surprises in the end, when the product is ready for the market. We know of companies that do this scanning on a daily basis.
  • Real-time library check-in assessment: This step involves checking the content that exists in the content libraries of the organisation, to determine that each component in the repository complies with the established software adoption policy.
  • Real-time automated assessment: These days we have solutions that work in the background, like an anti-virus solution, and detect any piece of open source software that is added to any workstation through USB or via the Web. The developer immediately gets an alert if any violation of any licensing policy is found. The developer then has the choice of either removing the piece of code entirely, or adding a comment stating that it will be used only for testing and removed later, and then continuing with the work.
  • Preshipment software assessment: This involves the analysis of the final artefact. If you have followed the earlier steps, there should be no surprises.

This is what we call a structured open source software adoption process.
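The policy, preapproval and preshipment steps above could be sketched roughly as follows. Everything here is hypothetical: the `POLICY` contents, the `review_request` and `preshipment_violations` helpers, and the extra "needs review" outcome for licenses the policy does not mention (the box itself only speaks of requests being accepted or rejected).

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    NEEDS_REVIEW = "needs review"  # assumption: unlisted licenses go to a human


# Step 1: a hypothetical policy agreed on by legal, engineering and business.
POLICY = {
    "allowed": {"BSD-3-Clause", "Apache-2.0"},
    "denied": {"GPL-3.0-only"},
}


def review_request(license_id, policy=POLICY):
    """Step 2: match a requested package's license against the policy."""
    if license_id in policy["denied"]:
        return Decision.REJECTED
    if license_id in policy["allowed"]:
        return Decision.APPROVED
    return Decision.NEEDS_REVIEW


def preshipment_violations(components, policy=POLICY):
    """Final step: list components in the artefact whose license is not approved."""
    return [
        name
        for name, license_id in components.items()
        if review_request(license_id, policy) is not Decision.APPROVED
    ]


if __name__ == "__main__":
    print(review_request("Apache-2.0").value)
    print(preshipment_violations({"json-parser": "BSD-3-Clause", "report-engine": "GPL-3.0-only"}))
```

Running the same check at request time and again before shipment reflects the point made in the box: if the earlier steps were followed, the final scan should find nothing new.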

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.