Copyright & Code: Who Owns AI-Generated Software?

Explore the legal paradox of AI-generated code. Understand the U.S. Copyright Office rulings, the risk of open-source infringement, and how to protect enterprise IP.

Introduction: The Ownership Void

The integration of generative AI into the software development lifecycle has created an unprecedented legal paradox for enterprise IT. Tools like GitHub Copilot, Cursor, and ChatGPT allow developers to generate complex functions, database schemas, and entire microservices in seconds — as we explored in our analysis of secure coding with Copilot. But this hyper-productivity masks a massive strategic risk: Who actually owns the code?

If your engineering team uses AI to write 80% of a proprietary enterprise application, the assumption is that your company owns the intellectual property (IP). However, recent rulings and Copyright Office guidance undercut this assumption. Organizations are increasingly falling into the “Ownership Void”: a legal gray area where they hold all the liability for the code they ship, but may hold little or no copyright protection in it.

1. The Bedrock of Human Authorship

Intellectual property law was written for human beings. In January 2025, the U.S. Copyright Office released a report on the copyrightability of AI-generated outputs, reaffirming that “human authorship is a bedrock requirement of copyright.” This was further cemented in March 2025 when the U.S. Court of Appeals for the D.C. Circuit affirmed in Thaler v. Perlmutter that, under the Copyright Act, an artificial intelligence system cannot be deemed the author of a work.

What does this mean for software development? It means that purely AI-generated code is not eligible for U.S. copyright protection, which in practice leaves anyone free to copy it once it is disclosed.

  • The Prompt is Not Authorship: The Copyright Office has concluded that, with current tools, writing a prompt, even a highly detailed one, does not by itself make you the author of the output. The human does not control the expressive elements of the output with sufficient specificity.
  • The Competitor Threat: If your company builds a revolutionary trading algorithm entirely via an LLM and deploys it, you cannot copyright that specific block of code. If a competitor somehow acquires that source code, you would likely have no basis to sue them for copyright infringement, because no copyright ever attached to that code.

2. The Regurgitation Risk and Open-Source Licensing

While you cannot copyright AI-generated code, you can absolutely be sued for it. This is the liability trap.

Large language models are trained on billions of lines of code scraped from public repositories, including those governed by strict copyleft licenses like the GNU General Public License (GPL). Models can memorize parts of their training data and occasionally reproduce copyrighted snippets verbatim or nearly so.

If an AI assistant suggests a 50-line encryption function and your developer blindly accepts it, they might be unknowingly injecting GPL-licensed code into your proprietary, closed-source application. If discovered, the original author of that open-source code can sue your organization for copyright infringement. In the worst-case scenario, the GPL's copyleft terms could leave your company choosing between releasing the affected proprietary code under the GPL and removing and rewriting everything that depends on the snippet.
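
As a first line of defense, a pre-acceptance check can look for license text that sometimes travels with memorized code. The sketch below is a minimal illustration, not a compliance tool: the function name `find_license_markers` and the two patterns are assumptions made for this example, and it only catches snippets that still carry an SPDX tag or a GPL notice. Memorized code that arrives without any header passes straight through.

```python
import re

# Illustrative patterns only (an assumption for this sketch); a real scanner
# would rely on a maintained license database rather than two regexes.
LICENSE_MARKERS = {
    "SPDX tag": re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)"),
    "GPL notice": re.compile(
        r"GNU (Lesser |Affero )?General Public License", re.IGNORECASE
    ),
}


def find_license_markers(suggestion: str) -> list[str]:
    """Return one finding per line of the suggestion that contains license text."""
    findings = []
    for line_no, line in enumerate(suggestion.splitlines(), start=1):
        for label, pattern in LICENSE_MARKERS.items():
            match = pattern.search(line)
            if match:
                findings.append(f"line {line_no}: {label}: {match.group(0).strip()}")
    return findings
```

A non-empty result is a reason to stop and route the suggestion to whoever handles open-source compliance; an empty result proves nothing about where the code came from.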

3. Documenting the “Human Creative Input”

To secure copyright protection for AI-assisted software, the resulting codebase must contain a “substantial and identifiable” amount of human creative input. The AI must be treated as a tool (like a compiler or a text editor), not the author.

For IT managers and architects, this requires a fundamental shift in DevSecOps documentation and a vigilant approach to managing technical debt. You must be able to prove human intervention in a court of law. Strategies include:

  • Prompt and Diff Retention: Organizations must archive the developer’s original prompts alongside the exact code the AI generated.
  • Documenting Modifications: More importantly, version control systems must clearly highlight the modifications, refactoring, and architectural choices the human developer applied to the AI’s raw output.
  • Granular Registration: When registering software with copyright offices, legal teams must now explicitly disclaim the portions of the codebase that were autonomously generated by AI, claiming protection only for the human-authored architecture and modifications.
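
The retention and documentation strategies above can be sketched as a small audit record. This is a hedged illustration under stated assumptions: `ProvenanceRecord` and `build_record` are hypothetical names, the fields are one possible layout, and the line count is a rough signal of human modification rather than a legal measure of creative input.

```python
import difflib
import hashlib
from dataclasses import dataclass


@dataclass
class ProvenanceRecord:
    """Hypothetical audit entry linking a prompt, the raw AI output, and the shipped code."""
    prompt: str
    ai_output_sha256: str
    final_sha256: str
    human_added_lines: int
    final_total_lines: int
    diff: str


def sha256_hex(text: str) -> str:
    """Hash text so the archived artifacts can later be shown to be unaltered."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_record(prompt: str, ai_output: str, final_code: str) -> ProvenanceRecord:
    """Capture what the AI produced and what the developer changed afterwards."""
    ai_lines = ai_output.splitlines()
    final_lines = final_code.splitlines()
    diff_lines = list(
        difflib.unified_diff(
            ai_lines, final_lines, fromfile="ai_output", tofile="final", lineterm=""
        )
    )
    # Skip the two file-header lines, then count '+' lines:
    # these are the lines the human added or rewrote.
    added = sum(1 for line in diff_lines[2:] if line.startswith("+"))
    return ProvenanceRecord(
        prompt=prompt,
        ai_output_sha256=sha256_hex(ai_output),
        final_sha256=sha256_hex(final_code),
        human_added_lines=added,
        final_total_lines=len(final_lines),
        diff="\n".join(diff_lines),
    )
```

Serializing the record with `json.dumps(dataclasses.asdict(record))` and storing it next to the commit is one way to keep the prompt, the raw output hash, and the human diff together for later review.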

4. Strategic Mitigation: Enterprise Licenses and Indemnification

Relying on developers to manually track every line of AI code is unsustainable. Instead, organizational leadership must mitigate this risk at the procurement and tooling level.

  • Banning Consumer AI Tiers: Using the free, public versions of AI chatbots for enterprise coding is a serious legal liability. These tiers typically offer no IP indemnification and may use your inputs for training by default.
  • IP Indemnification Clauses: IT leaders must exclusively procure enterprise-grade AI coding assistants (like GitHub Copilot Business or enterprise LLM APIs) that include explicit IP Indemnification. These contractual clauses state that if the AI reproduces copyrighted code and your company is sued for infringement, the AI vendor (e.g., Microsoft, Google, or Anthropic) assumes the legal and financial responsibility, subject to the conditions set out in the contract.
  • Anti-Regurgitation Filters: Enterprise tools must be configured to actively block outputs that match known public code. If the AI generates a snippet that exists in a public GitHub repository, the tool should either block the suggestion or immediately flag the associated open-source license, allowing the developer to make an informed compliance decision.
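
To make the filtering idea concrete, the sketch below compares a suggestion against an index of known public code using overlapping token windows. It is an assumption-laden illustration and does not describe how any vendor's filter works: `fingerprints`, `overlap_ratio`, `review_suggestion`, the 5-token window, and the 0.6 threshold are all hypothetical choices for this example.

```python
import hashlib
import re


def fingerprints(code: str, window: int = 5) -> set[str]:
    """Hash every run of `window` consecutive tokens, ignoring whitespace and layout."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    if len(tokens) < window:
        return set()
    return {
        hashlib.sha256(" ".join(tokens[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(tokens) - window + 1)
    }


def overlap_ratio(suggestion: str, public_index: set[str], window: int = 5) -> float:
    """Fraction of the suggestion's fingerprints that also appear in the public index."""
    prints = fingerprints(suggestion, window)
    if not prints:
        return 0.0
    return len(prints & public_index) / len(prints)


def review_suggestion(
    suggestion: str, public_index: set[str], threshold: float = 0.6
) -> str:
    """Return 'flag' when the suggestion largely matches indexed public code, else 'allow'."""
    if overlap_ratio(suggestion, public_index) >= threshold:
        return "flag"
    return "allow"
```

In this sketch the index would be built by calling `fingerprints` on each public file and storing the associated license alongside the hashes, so a flagged suggestion can be shown to the developer together with the license it may be subject to.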

Conclusion

Artificial intelligence is redefining the economic value of writing code, but it is also unsettling traditional notions of software ownership. For IT managers, the goal is no longer just shipping code fast; it is shipping code that you can legally defend. As the EU AI Act continues to reshape the regulatory landscape, understanding the human authorship requirement, implementing strict traceability, and relying on indemnified enterprise tools will be essential to safely navigate the copyright minefield of the AI era.

William Blondel

Senior full-stack web developer and amateur genealogist. Born geek with an Amstrad CPC 6128. PHP & Laravel Expert 🐘