GitHub Copilot Is Only Effective Because It Steals Open Source Code


The Software Freedom Conservancy (SFC), a non-profit community of open source advocates, announced its withdrawal from GitHub today in a scathing blog post urging members and supporters to publicly condemn the platform. The SFC’s issue with GitHub stems from allegations that Microsoft and OpenAI trained an AI system called Copilot on data that had been made available under an open source license. Open source code is not a donation box where you can take whatever you want and use it however you like.

It’s closer to photography. Even if a photographer does not charge you to use one of their images, you are still required to give credit where credit is due. According to an SFC blog post, Copilot does not do this when it comes to using other people’s code snippets:

“This harkens to long-standing problems with GitHub, and the central reason why we must together give up on GitHub. We’ve seen with Copilot, with GitHub’s core hosting service, and in nearly every area of endeavor, GitHub’s behavior is substantially worse than that of their peers. We don’t believe Amazon, Atlassian, GitLab, or any other for-profit hoster are perfect actors. However, a relative comparison of GitHub’s behavior to those of its peers shows that GitHub’s behavior is much worse.”

GitHub is the world’s de facto repository for open source code. It’s a cross between YouTube, Twitter, and Reddit, but for programmers and the code they create. Sure, there are alternatives. Switching from one code-repository ecosystem to another, however, is not the same as trading Instagram for TikTok. Microsoft paid more than $7 billion to acquire GitHub in 2018. Since then, Microsoft has used its position as OpenAI’s primary benefactor to collaborate on the development of Copilot. And access to Copilot is only available through a special invitation from Microsoft or through a paid subscription. The SFC and other open source advocates are outraged because Microsoft and OpenAI are effectively monetizing other people’s code while stripping away any attribution to the developers who actually wrote it.

Copilot must be killed. Alternatively, Microsoft and OpenAI could construct a time machine, travel back, and label every single datapoint in Copilot’s training database, allowing them to build a second model that gives proper credit for every output. But it’s always easier to take advantage of people and exploit the Wild West regulatory environment than it is to care about the ethics of the products and services you offer.
