Enterprises must understand the obligations that underlie OSS licences and adopt proactive measures to meet them while using AI-generated code. Otherwise, they may end up in legal entanglements.
Artificial intelligence is a technique used to train machines to work and behave like humans. It encompasses various methods such as machine learning, neural networks and deep learning, natural language processing, and more.
As the field of artificial intelligence (AI) advances, generative AI (GAI) techniques are gaining enormous attention for their ability to generate code and provide suggestions based on prompts that are provided by developers, thereby accelerating the software development process.
These AI models are typically trained on billions of lines of code, most of which are open and freely available to the public. Based on the developers’ prompts, these AI tools generate fully functional code or code snippets.
As more developers use AI tools to generate creative outputs, GAI will fundamentally change the way software is built. Whether they are developing or using AI tools, organisations must understand the opportunities and risks involved, and evolve policies, processes, and governance to address them. These must also cover open source software code.
A glimpse of open source software licences
Open source software (OSS) has become a major pillar of modern software development, fostering collaboration and innovation. With the increase in the availability of generative AI models, it has become essential to address the complexities of the code governed by open source licences.
An OSS licence allows users to access, use, modify, and distribute or redistribute the open source software. However, OSS licences impose different obligations and conditions on the use and distribution of the software. These can range from providing attribution to requiring that modified OSS code be released under the same licence.
Failure to comply with these obligations can amount to IP infringement, which may lead to litigation and the loss of the right to use the open source software.
OSS compliance issues with AI-generated code
The legal concerns around AI-based code generation tools stem from the use of publicly available OSS code to train the AI models. This code is covered by licences, some of which carry obligations.
To avoid the possibility of a lawsuit, some organisations have taken the approach of restricting their development teams’ use of generated code. However, such restrictions will prevent developers from obtaining the benefits of AI code generators.
AI models used for code generation are dynamic and continuously evolving. This makes it challenging to ensure compliance with OSS licences, because model behaviour keeps changing and can introduce new compliance issues. There are other challenges too.
- Training data may consist of a vast amount of OSS, including OSS libraries or snippets and sources that are incomplete or lack proper licensing information. It may contain software under different OSS licences, making it challenging to ensure these licences are compatible with each other.
- AI-generated code can contain snippets or patterns from open source projects. Identifying the licence for this generated code can be difficult, making it challenging to determine the compliance and associated obligations.
- Developers may wonder whether they are required to license new software applications created from AI-generated code under the same open source licence.
In light of these complexities, let’s explore the practical remedies to mitigate legal risks.
- Merely using open source code to train a model is unlikely to constitute infringement. However, the obligations must be considered based on the licences under which the open source code is made available. If the obligations under a licence are not satisfied, this can create IP risk.
- Use tools whose models have been trained on reputable sources. Training models on larger sets of data is expensive and requires significant computing power, which larger companies typically possess.
- Avoid using the output generated by the tools as it is. Instead, treat the output as a reference and revise the generated code as per the requirements. This approach will help increase the chances of the output being protected under copyright.
- It is always good practice to segregate AI-generated code from code not generated by AI tools.
- Periodically scan the project using software composition analysis tools (e.g., Black Duck, Code Insight) to identify the open source software libraries or the snippets that are under restrictive licences in the project.
- Many AI code generation tools (e.g., GitHub Copilot) come with features that help users filter out suggestions matching public code. Using these features helps to reduce liability.
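The advice above on segregating AI-generated code can be sketched as follows. This is only an illustration under a stated assumption: the team has agreed to put a header comment containing the marker `AI-GENERATED` near the top of every file that contains AI-generated code. The marker and the function names are hypothetical, not a feature of any tool.

```python
from typing import Dict, List, Tuple

# Hypothetical team convention: files with AI-generated code carry this
# marker in a header comment within the first few lines.
AI_MARKER = "AI-GENERATED"


def is_ai_generated(source: str, head_lines: int = 5) -> bool:
    """Return True if the marker appears in the first `head_lines` lines."""
    head = source.splitlines()[:head_lines]
    return any(AI_MARKER in line for line in head)


def split_by_origin(files: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split {path: source} into (ai_generated_paths, other_paths), sorted."""
    ai_paths = sorted(p for p, src in files.items() if is_ai_generated(src))
    other_paths = sorted(p for p in files if p not in ai_paths)
    return ai_paths, other_paths
```

A list produced this way could then be used to give the tagged files extra review or a more frequent snippet scan.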
What companies must do
To address AI-generated code challenges, companies must establish clear-cut open source policies. Let’s delve into a few essential steps.
- Establish open source policies and monitor the usage of open source components in the AI-generated code.
- Maintain business accounts, where administrators can configure settings that govern how the AI tools are used.
- Implement a process for licence compliance, which includes internal audits of the applications and approval of the licences based on their obligations.
- Conduct internal audits before the release of the software to ensure that the code is free from components under restrictive OSS licences, that there are no major issues, and that the applications comply with OSS licence requirements.
- Educate and train development teams working with AI code generation tools about OSS licence compliance, licence obligations, attribution practices, and the implications of different OSS licences, especially restrictive ones.
- Consult legal experts specialising in OSS licences and IP laws to ensure compliance with relevant guidelines and regulations. Seek their advice on complex OSS licensing situations and any legal questions that arise in the context of AI-generated code or the use of AI code generators.
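A pre-release licence audit of the kind described above could be sketched like this. It assumes the component inventory (names plus SPDX licence identifiers) has already been exported from a scanning tool. The `Component` type, the function names, and the way licences are sorted into buckets are placeholders for illustration; the real lists would come from the organisation's own policy.

```python
from typing import Dict, List, NamedTuple

# Illustrative buckets only; the actual policy is an organisational decision.
APPROVED = {"MIT", "Apache-2.0", "BSD-3-Clause"}
NEEDS_REVIEW = {"LGPL-2.1-only", "MPL-2.0"}
# Anything else (e.g. GPL-3.0-only, or no licence found) is blocked here.


class Component(NamedTuple):
    name: str
    licence: str  # SPDX identifier, or "" when none was detected


def audit(components: List[Component]) -> Dict[str, List[str]]:
    """Sort component names into approved / needs_review / blocked."""
    report: Dict[str, List[str]] = {
        "approved": [], "needs_review": [], "blocked": []
    }
    for comp in components:
        if comp.licence in APPROVED:
            report["approved"].append(comp.name)
        elif comp.licence in NEEDS_REVIEW:
            report["needs_review"].append(comp.name)
        else:
            report["blocked"].append(comp.name)
    return report


def release_allowed(report: Dict[str, List[str]]) -> bool:
    """Allow release only when nothing is blocked or awaiting review."""
    return not report["blocked"] and not report["needs_review"]
```

Treating unknown or missing licences as blocked is a deliberate choice in this sketch, since incomplete licensing information is one of the challenges noted earlier.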
Generative AI has revolutionised software development by automating the generation of code, thus accelerating developers’ workflows. However, organisations must address OSS compliance when utilising these AI tools.
By understanding OSS licences, implementing best practices, and leveraging automated tools, organisations can handle the code complexity and meet OSS compliance requirements. All these approaches will promote a healthy OSS compliance ecosystem.