GitHub Copilot Appears To Be In Violation Of Open Source Licences


Matthew Butterick has criticised the program since Copilot’s debut.

Microsoft paid $7.5 billion to acquire GitHub in 2018 and has since integrated the code repository into its developer tools while taking a largely hands-off approach. Matthew Butterick, a writer, attorney, and programmer, has raised concerns about GitHub Copilot, Microsoft’s machine-learning-based coding assistant, and how it appears to mishandle open source licences.

GitHub Copilot, a plugin for Visual Studio and other IDEs, works by offering “suggestions” for code completion as you type. The system is powered by Codex. However, developers like Butterick take issue with how the AI was trained, or more specifically, with what it was trained on.

The issue is that the public repositories on which Copilot was trained are licensed, and those licences require attribution when the code is reused. Microsoft has been evasive about its use of the code, describing it as fair use, yet Copilot can reproduce verbatim portions of code rather than only offering original suggestions.

According to OpenAI, the developer of Codex (which Microsoft licenses), Codex was trained on “tens of millions of public repositories”, including code on GitHub. Microsoft itself has vaguely described the training material as “billions of lines of public code”.

