ChatGPT and the legalities of language generation

27.01.2023 Annie Watts, Zeina Milicevic

ChatGPT uses deep learning techniques to generate human-like text. It has sparked concern among copyright owners whose literary works may have been used without their knowledge or permission, raising questions about copyright infringement as well as subsistence and ownership of copyright in the output produced.

Key takeouts

The emergence of ChatGPT, a chatbot recently launched by OpenAI, has sparked concerns among authors and artists that their online content is being reproduced without permission.

Whether the actions of the chatbot amount to copyright infringement (and if so, who the infringer is), and whether copyright subsists in the output produced by the bot are issues that will need to be grappled with in coming years.

It remains to be seen whether OpenAI or ChatGPT users will face a legal challenge over the use of the product, whether for copyright infringement or otherwise.

ChatGPT is a chatbot launched by OpenAI in late 2022. It is described by OpenAI as a large language model that uses deep learning techniques to generate human-like text. Essentially, you can ask the bot to generate text in response to the prompts you feed it. One example, which raises concerns in its own right, is that you could ask the bot to generate a response to an essay question.

In its development, the bot underwent extensive training, which involved it being fed enormous amounts of data retrieved from books, websites, articles, and other literary sources on the internet. It then underwent "supervised learning" and "reinforcement learning" and was fine-tuned by human trainers to be able to respond to human prompts.

All in all, the bot is equipped to draw from the vast amounts of information it has been fed, and distil the information into a succinct response in a matter of seconds.

When copyright and bots collide

The emergence of the bot has sparked concerns among authors and artists whose literary works (articles, blog posts, reports) are hosted online. One concern is that their content may be pulled from the internet and fed to the bot when it is trained. Another is that their literary works may make up, in whole or in part, the content that ChatGPT generates when it responds to a human's prompt. In both cases, this would occur without the creator's knowledge or permission.

While a number of issues arise from the creation and use of ChatGPT, there are three that immediately spring to mind.

Does the training and use of ChatGPT result in an infringement of copyright?

There are two distinct points in time at which an infringement of copyright could occur:

when material is pulled from the internet and "fed" to the bot, likely amounting to a reproduction and possibly a communication of the material; and
when the bot produces an output that reproduces a substantial part of a third party's copyright work (and possibly adapts it) and then communicates it to a user. Whether this actually occurs in practice remains to be seen, but the possibility exists.

If copyright is infringed, then who is the infringer?

Where infringement occurs, the infringer is the person:

who does; or
who authorises the doing of,

the acts comprised in the owner's copyright (see s 36 of the Copyright Act 1968 (Cth)).

In the first infringement scenario, this will ultimately be a factual question. Who is selecting the content and feeding the bot? Is it OpenAI? Is it a contracted software engineer? If the latter, is OpenAI authorising the infringement?

In the second infringement scenario, there does not appear to be a 'person' who does the infringing act(s). It all appears to be done by the bot. But is someone authorising the bot to infringe? And if so, is it the corporate entity that developed and released the bot, OpenAI? Or is it the human who prompts the bot to produce its responses? Or both?

This will again come down to how ChatGPT works in practice. It would be interesting to understand whether the bot has been trained to ensure that it does not reproduce a substantial part of a copyright work. If it has, and it nonetheless does so (whether of its own accord or because a user asked it to), would this change the position on authorisation liability?

Is there copyright in the output?

A separate issue at play is whether copyright subsists in the "output" produced by the bot.

Under Australian copyright law, there needs to be a human author in order for copyright to subsist in a work (see our previous article and podcast, The robots are coming: artificial intelligence, ethics and the law). It follows that copyright is unlikely to subsist in material created by a chatbot.

Nonetheless, the terms of use of the OpenAI product appear to assume that copyright can subsist in the output produced by ChatGPT. The terms state:

"As between the parties and to the extent permitted by applicable law, you own all Input, and subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output."

Should a chatbot be capable of meeting the authorship requirement? If not, how can an entity validly exploit (by assignment or licence) material that a bot creates when there would be no copyright to exploit? Will this issue ultimately turn users away from using the bot?

Considerations for emerging technologies

The issues raised above are just some of the many considerations the legal world will need to grapple with over coming years. Questions are already being raised as to whether university students should be able to use the bot for their assignments, and if so, to what extent this can be controlled or combatted (such as through detection tools like GPTZero).

More recently, we have seen that the bot also has the potential to create reputational issues, with Australian singer Nick Cave voicing his concerns after a fan sent him the results of a ChatGPT response to their prompt to produce lyrics "in the style of Nick Cave". Cave labelled the bot's results a "grotesque mockery" of his lyrics and a "travesty".

At this time, it remains to be seen whether OpenAI or ChatGPT users will face a legal challenge in respect of the use of the product. However, a lawsuit in this space is not inconceivable: Getty Images recently sued the creators of AI art tool Stable Diffusion in the UK for copyright infringement, albeit in respect of artistic works.

Get in touch with us if you'd like to discuss the legal implications of emerging AI technology, and how it may affect you.