AI and Copyright: Lord C-J: “The Government need to take this option off the table”

With huge thanks to Christian Gordon-Pullar for all his work, here is our response to the Government’s consultation on IP and Copyright. We are clear that there is no lack of clarity in UK copyright law that should allow technology companies to scrape the internet and use copyright material for training their AI models without any recompense to creators. We need to introduce clear rules requiring transparency of use and a better enforcement mechanism for breaches of copyright.

I and my Liberal Democrat colleagues fully support the major campaign by the media, artists and the creative industries to demand that the government take its preferred option, a text and data mining exception requiring an opt-out, off the table and ensure that one of the most valuable sectors in the British economy survives and thrives alongside AI.

Here is a link to the Consultation

https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence

And here is our response

Response to Consultation: AI and Copyright on behalf of Lord Clement-Jones and Christian Gordon-Pullar

Context for Response to Consultation

Use of AI clearly offers significant opportunities across the broad canvas of the United Kingdom’s creative industries, and abroad. Creators and associated creative businesses are using AI technology to support creativity, the process of content production or to help personalise content. AI clearly has many creative uses, as Sir Paul McCartney has emphasised. It is one thing, however, to use the technology but another to be at the mercy of it.

The Government consultation[3] itself begins with the sentence:

“Two major strengths of the UK economy are its creative industries and AI sector. Both are essential to drive economic growth and deliver the government’s Plan for Change.”

We support the policy objectives within the consultation and in particular, at a high level, the three objectives set out in the Government consultation, in relation to AI and Copyright, namely:

Supporting right holders’ control of their content and ability to be remunerated for its use.
Supporting the development of world-leading AI models in the UK by ensuring wide and lawful access to high-quality data.
Promoting greater trust and transparency between the sectors.

It is incumbent on any Government to find a true and fair balance for authors, musicians, artists and all creative content creators and owners, not just for foreign and domestic tech and AI companies and tech entrepreneurs, at the expense of the giants on whose creative and historical works their success relies and on whose shoulders their business and technology stand.

The Ministerial foreword reinforces this:

“This consultation sets out our plan to deliver a copyright and AI framework that rewards human creativity, incentivises innovation and provides the legal certainty required for long-term growth in both sectors.”

It is unclear and remains unexplained – in the Consultation – why the Government states:

“AI firms have raised concerns that the lack of clarity over how they can legally access training data creates legal risks, stunts AI innovation in the UK and holds back AI adoption”

It is entirely unclear where or what lack of clarity is being referenced. There is currently clarity and certainty in the Copyright regime in the United Kingdom and additionally the UK recognises Computer Generated Works (see para 51 of the Consultation). In relation to copyright and Intellectual Property (IP), under the current law in relation to content ingestion by AI developers, consent must be secured for the use of rightsholders’ content. The Consultation appears to be creating the distinct impression that copyright owners should be concerned and this is creating uncertainty.

The Consultation also states:

“The creative industries drive our economy, including TV and film, advertising, the performing arts, music, publishing, and video games. They contribute £124.8 billion GVA to our economy annually, they employ many thousands of people, they help define our national identity and they fly the flag for our values across the globe. They are intrinsic to our success as a nation and the intellectual property they create is essential to our economic strength.”

It is unclear however if, and to what extent, the Government has carried out any serious investigation into the financial impact on the creative industries in the preparation of this Consultation, or since its publication. It is however clear that the impact will be significant and very likely greater than the proposed benefits of the data centres and investments offered by Big Tech.

The estimate of benefits to the UK economy used by the AI Opportunities Plan is built on shaky foundations. It is derived from Google’s UK Economic Impact Report, which highlighted that “AI-powered innovation could create over £400 billion in economic value for the UK economy by 2030”. The £400 billion figure cited by Google comes from a report commissioned by Google and compiled by the consultancy firm Public First. This economic impact report was designed to analyse the potential effects of AI adoption on the UK economy by 2030. Public First conducted the research using several methods:

Polling of over 4,000 individuals across every region in the UK
Polling of 1,000 senior business leaders from small, medium, and large businesses across various industries
Traditional economic modelling to measure the economic activity driven by Google products.

The report estimates that AI-powered innovation could create over £400 billion in economic value for the UK economy by 2030, which is equivalent to an annual growth rate of 2.6%.

This figure is based on projections of how AI technologies could boost productivity, create new job opportunities, and drive innovation across various sectors of the economy. It is important to note that this is a projection based on economic modelling and assumptions about future AI adoption and impact. As with any such forecast, it should be viewed as an estimate rather than a guaranteed outcome.

We remain convinced that the current copyright regime is clear and no evidence has been produced to warrant a new and more permissive exception regime to existing copyright laws in the United Kingdom. It is our preferred option that the Government makes a clear statement that the use and/or ‘ingestion’ of content, without consent, to train an AI model capable of being used beyond non-commercial research, constitutes copyright infringement.

Foreword/Summary

Questions surrounding the balance between copyright and text and data mining (TDM) are a major issue for content owners and creatives in the literary, musical and visual arts, not just in the UK but around the world.

Getty and the New York Times are suing in the United States, as are many writers, artists and musicians, and the issue was at the root of the Hollywood actors’ and writers’ strikes last year.

Here in the United Kingdom, as the Government’s intentions have become clearer the temperature has risen. We have seen the creation of a new campaign, the Creative Rights in AI Coalition (CRAIC), across the creative and news industries, and Ed Newton-Rex[4] raising over 30,000 signatories from creators and creative organisations. But with the current Consultation, we are now faced with a proposal regarding a text and data mining exception which we thought was settled under the last Government. It starts from the false premise of legal uncertainty.

As the News Media Association says:

The government’s consultation is based on the mistaken idea—promoted by tech lobbyists and echoed in the consultation—that there is a lack of clarity in existing copyright law. This is completely untrue: the use of copyrighted content by Gen AI firms without a license is theft on a mass scale, and there is no objective case for a new text and data mining exception.

There is no lack of clarity over how AI developers can legally access training data. The applicable law in England and Wales is absolutely clear that commercial organisations – including Gen AI developers – must license the data they use to train their Large Language Models (“LLMs”). Merely because AI platforms such as Stability AI are resisting claims doesn’t mean the law in the UK is uncertain. There is no clear reason, and no need, for developers to find ‘it difficult to navigate copyright law in the UK’.

AI developers have already, in a number of cases, reached agreement with news publishers. OpenAI has signed deals with publishers like News Corp, Axel Springer, The Atlantic, and Reuters, offering annual payments between $1 million and $5 million, with News Corp’s deal reportedly worth $250 million over five years.

More recently, it is clear that the US fair use defence questions have not been settled despite the ruling in Thomson Reuters v. ROSS Intelligence, which involved Thomson Reuters suing ROSS Intelligence for using its copyrighted Westlaw headnotes to train an AI-powered legal research tool. On February 11, 2025, Judge Stephanos Bibas of the Delaware federal district court ruled against ROSS, rejecting its fair use defence and granting partial summary judgment in favour of Thomson Reuters. It is notable, however, that the court emphasised that ROSS’s use was commercial and non-transformative, as it created a competing product using the copyrighted material. This decision is significant as it sets a precedent for AI copyright cases, though it does not address generative AI specifically.

There can be no excuse of market failure. There are well established licensing solutions administered by a variety of well-established mechanisms and collecting societies. There should be no uncertainty around the existing law and the surrounding legal framework. We have some of the most effective collective rights organisations in the world.

The Consultation says that “The government believes that the best way to achieve these objectives is through a package of interventions that can balance the needs of the two sectors”. The government appears to believe we need to achieve a balance between the creative industries and the tech industries. But the Consultation raises the fundamental question as to what kind of balance the government’s preferred option will deliver.

The government’s preferred option is to change the UK’s copyright framework by creating a text and data mining exception where rights holders have not expressly reserved their rights—in other words, an ‘opt-out’ system, where content is free to use unless a rights holder proactively withholds consent. To complement this, the government is proposing: (a) transparency provisions; and (b) provisions to ensure that rights reservation mechanisms are effective.

The government has stated that it will only move ahead with its preferred ‘rights reservation’ option if the transparency and rights reservation provisions are ‘effective, accessible, and widely adopted’. However, it will be up to Ministers to decide what provisions meet this standard, and it is clear that the government wishes to move ahead with this option regardless of workability, without knowing if their own standards for implementation can be met.

A few key overarching points to note:

Although it is absolutely clear that use of copyright works to train AI models without consent is contrary to UK copyright law, the laws around transparency of these activities haven’t caught up. As well as using pirated e-books in their training data, AI developers scrape the internet for valuable professional journalism (even where such articles are protected by © Copyright notices and terms and conditions) and other media, in breach of both the terms of service of websites and copyright law, for use in training commercial AI models.
At present, developers can do this without declaring their identity, or they may use IP scraped to appear in a search index for the completely different commercial purpose of training AI models.
How can rights owners agree – in principle or in practice – to opt out of something they don’t fully understand or even know about? AI developers will often scrape websites, or access other pirated material, before they launch an LLM in public. This means there is no way for IP owners to opt out of their material being taken before its inclusion in these models. Once used to train these models, the commercial value has already been extracted from the third-party IP scraped, without permission, with no practical way to find or delete data from those models.
The next wave of AI models responds to user queries by browsing the web to extract valuable news and information from professional news websites. This is known as Retrieval Augmented Generation (RAG). Without payment for extracting this commercial value, AI agents built by companies such as Perplexity, Google and Meta will effectively free ride on the professional hard work of journalists, authors and creators. At present such crawlers are hard to block.

This is incredibly concerning, given that no effective ‘rights reservation’ system for the use of content by Gen AI models has been proposed or implemented anywhere in the world, making the government proposals entirely speculative.

As the NMA also say:

“What the government is proposing is an incredibly unfair trade-off—giving the creative industries a vague commitment to transparency, whilst giving the rights of hundreds of thousands of creators to Gen AI firms. While creators are desperate for a solution after years of copyright theft by Gen AI firms, making a crime legal cannot be the solution to mass theft.[5]”

We need transparency and a clear statement about copyright. We absolutely should not expect artists to have to opt out. AI developers must: be transparent about the identity of their crawlers; be transparent about the purposes of their crawlers; and have separate crawlers for distinct purposes. Unless news publishers and the broader creative industries can retain control over their data – making UK copyright law enforceable – AI firms will be free to scrape the web without remunerating creators. This will not only reduce investment in trusted journalism, but it will ultimately harm innovation in the AI sector. If less and less human-authored IP is produced, tech developers will lack the high-quality data that is the essential fuel in generative AI.

Amending the applicable Law to address the challenges posed by AI development, particularly in relation to copyright and transparency, is essential to protect the rights of creators, foster responsible innovation, and ensure a sustainable future for the creative industries.

This should apply wherever the scraping of copyright material or the training takes place, if developers market their product in the UK.

It will also ensure that AI start-ups based in the UK are not put at a competitive disadvantage due to the ability of international firms to conduct training in a different jurisdiction. It is clear that AI developers have used their lobbying clout to persuade the government that a new exemption from copyright – in their favour – is required.

In response we will be vigorously opposing the preferred option for a new text and data mining exemption with an opt-out and will be seeking to ensure that the government answers the following key questions before proceeding further:

What led the government to do a u-turn on the previous government’s decision to drop the text and data mining exemption it proposed?
What estimate has it made of the damage to the creative industries from implementing its clearly favoured option of a TDM exception plus opt-out, given there is no robust economic assessment currently in existence?
Is damaging the most successful UK economic sector for the benefit of US AI developers what it means by balance?
Why has it not included the possibility of an opt-in to a TDM exception in its consultation paper options?
What examples of successful, workable opt-outs or rights reservation from TDM can it draw on, particularly for small rights holders? What research has it done? The paper essentially admits that effective technology is not there yet. Isn’t it clear that the EU opt-out system under the Copyright Directive has not delivered clarity?
What regulatory mechanism if any does the government envisage if its proposal for a TDM with rights reservation/opt out is adopted? How are creators going to be sure any new system would work in the first place?

Detailed Response below

Response to Consultation

Copyright – Text and Data Mining

The three stated objectives in the Consultation[6] are set out in para / section 54 of the Consultation:

Supporting right holders’ control of their content and ability to be remunerated for its use.
Supporting the development of world-leading AI models in the UK by ensuring wide and lawful access to high-quality data.
Promoting greater trust and transparency between the sectors.

The Government rightly believe that there is a need to promote and further enable AI development. This must however be balanced with a commensurate and proportionate recognition of the critical importance and value of data as raw material. AI developers rely on high-quality data to develop reliable and innovative AI-driven inventions and applications. Licensing regimes under existing IP law are designed to cater for the needs of AI developers.

By the same token, content and data-driven businesses themselves have seen a rapid increase in the use of AI technology and machine-learning, whether for news summaries, data gathering efforts, translations for research and journalistic purposes or to assist organisations to save time by processing large amounts of text and other data at scale and speed. Digital technologies, including AI, are and will continue to be of critical importance to these industries, helping create content, new products and value-added services to deliver to a broad range of corporate and retail clients. Whether in news media or cross-industry research, publishers are themselves investing in AI; continued collaboration with start-ups and academia is creating tailored materials for wide populations of beneficiaries (students, academia, research organisations, and even marketers of consumer publishing products).

It is of paramount importance to balance the needs of future AI development with the legal, commercial and economic rights of copyright and data-owners and the need to incentivize new AI adoption with recognition of the rights of – and remuneration for – existing content owners.

We have however seen no evidence the existing copyright legislative framework fails to adequately address the current needs of AI developers. Moreover it is particularly important, in our view, to ensure that the development of AI is not enabled at the expense of the underlying investment by copyright and data-owners. (see endnote 1).

If the content owners of underlying data materials withhold the licensing of, or access to, such materials or attempt to price them at a level that is unfair, the answer is for Government via the Competition and Markets Authority/the new Digital Markets Unit (or indeed other regulators who form part of the Digital Regulation Cooperation Forum) to put in place competition measures to ensure there is a clear legal recourse in such situations.

In summary we do not believe that current copyright law creates a disparity between the interests of AI developers and investors and content owners. The existing copyright regime under the CDPA reflects a balance that fairly protects those investing in data creation without giving an unfair advantage to technology companies offering AI-enabled content creation services. In particular the current framework provides a balanced regime for data and text mining and we believe no changes are required at present.

At the very least, AI operators and providers must be able to demonstrate transparency and provide users and regulators with access to clear records of the inputs that the AI technology has used (e.g. sources of content, including copyrighted content). Without this, it will be impossible to satisfy the UK regime as well as basic international standards on cybersecurity, let alone laws on copyright infringement or parallel imports, or to satisfy UK sovereignty principles.

In order, RESPONSES below.

Section C1

Question 1. Do you agree that option 3 is most likely to meet the objectives set out above?

NO, we do not agree.

Creating a more permissive system of copyright is unlikely to incentivise AI developers to obtain consent or license content from rightsholders.
AI developers have shown little appetite to license content at scale and there have been no signals, from what we have seen, that that position would change under any new regime. In the EU, which introduced a new Text and Data Mining (TDM) Exception with an Opt-Out (before the explosion in AI development) there has been no material increase in licensing of content, demonstrating that it is not the law which is preventing such licensing.
As currently drafted in the Consultation, the new exception would also be available to all users, not simply AI developers for training. This would mean any user could copy works and reproduce them for commercial gain unless those rights were reserved. This presents a distinct opportunity for unscrupulous users to deliberately look for works that are not rights reserved and exploit them commercially, which is not possible under the existing copyright system.

Question 2. Which option do you prefer and why?

Ranking Options in order:

We would therefore urge the Government to elect Option 0 – Make no legal change. No other option is currently justifiable given the lack of evidence of an adverse commercial environment preventing access to data or text by AI-enabled content creators. Should the Government or IPO consider that there needs to be increased access to data at lower cost, it should look at other policy levers to stimulate such uptake, such as providing tax incentives for content owners to license content, rather than reducing copyright protection.
We also concur with industry leads who consider that forcing rightsholders to opt in to protection, or opt out of a data mining exception – as suggested in Option 3 – would be complicated and costly for many businesses and industries who own literally millions of works, when licensing is far simpler, and would be against the spirit of international treaties on copyright.
Further, such changes would impact the rights of copyright owners as protected under the Human Rights Act 1998. The Act incorporates rights contained in the European Convention on Human Rights (ECHR) into UK national law. This means that they can be used to challenge the actions and decisions of governments and public bodies in the UK courts. Under the Act, intellectual property rights are protected as part of the broader “right to property” enshrined in Article 1 of the First Protocol, meaning that public authorities cannot interfere with intellectual property without a legitimate legal reason and in the public interest; this includes patents, trademarks, copyrights, and other forms of intellectual property.

Article 1 of the First Protocol states:

“Every natural or legal person is entitled to the peaceful enjoyment of his possessions. No one shall be deprived of his possessions except in the public interest and subject to the conditions provided for by law and by the general principles of international law.

The preceding provisions shall not, however, in any way impair the right of a State to enforce such laws as it deems necessary to control the use of property in accordance with the general interest or to secure the payment of taxes or other contributions or penalties.”

Possessions include any tangible and intangible property.

While the Act protects intellectual property, it does allow for limitations in the public interest, meaning that the government can restrict intellectual property rights under certain circumstances if it is deemed necessary for the greater good. The proposed exception is clearly for the benefit of tech and AI companies, not the greater good of content owners and creative industries across the fields of literary, musical and visual arts, inter alia.

Section C1 (cont.)

Question 3. Do you support the introduction of an exception along the lines outlined above?

RESPONSE: No, this is not necessary under UK law as the copyright owner already holds such rights, and such an exception would not be effective.

Absent a licence, or consent in writing, such rights to control his/her/its copyright are reserved for the copyright owner and no use of that copyright is permitted (except under existing non-commercial research exceptions for academic research, inter alia). Any such unauthorised use would constitute copyright infringement.

Question 4. If so, what aspects do you consider to be the most important? If not, what other approach do you propose and how would that achieve the intended balance of objectives?

RESPONSE: Only applicable if Option 3 is the eventual outcome. If Option 3 were in fact the outcome at the end of the consultation, a presumption (as per existing UK law) should exist that no content is automatically permitted for TDM use by AI/Tech companies or other third parties, and that would be the case even where content available publicly or otherwise does not carry text or machine-readable opt-out language. The presumption must be in favour of the content and copyright owner; otherwise it risks creating costly litigation for SMEs and individuals who cannot reasonably be expected to allocate funds to litigate against foreign and domestic tech companies and other well-funded tech start-ups seeking to use content without consent.

Any new exception would also have to be narrowly drafted to ensure it is limited to AI training, to ensure ill-intentioned users do not exploit the new system to reproduce works for commercial gain outside of the AI environment.

Question 5. What influence, positive or negative, would the introduction of an exception along these lines have on you or your organisation? Please provide quantitative information where possible.

RESPONSE: Any new exceptions would adversely impact creative industries both operationally and financially, as seen from feedback, publications and statements made by the Performing Right Society (PRS)[7][8], Anti-Copying in Design (ACID)[9] and others. (See footnotes for references.)

Content owners would have to spend time and money on legal advice, potentially, to:

Embed Metadata and Watermarks – Add metadata to digital files to indicate copyright ownership and usage restrictions. Watermarks could deter unauthorised use if a robust and easily usable form was readily available. Embedding metadata could be relatively simple and could be done using file properties, specialised software or programming methods (e.g., EXIF for images, or custom fields in JSON or XML). See Appendix 1.
Monitor and Enforce Their Rights – Content owners would have to regularly check for unauthorised use of their copyright work online. If an owner identifies infringements, they would need to contact the offending party to request removal or seek legal advice. However, identifying the offending party remains a significant challenge without a proper system of transparency requirements in place.

For example, a photographer would have to retrospectively opt out thousands of individual works to gain protection which is currently automatic, time that they can ill afford and that would be better spent creating new revenue-generating copyright-protected works. Legal costs to challenge infringement would likely increase, and under a new regime there would have to be a dual track for action, one under the new regime and another under the existing regime, potentially doubling legal costs.
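As a rough illustration of the “custom fields in JSON” approach to embedding metadata mentioned above, the following minimal Python sketch builds a rights record that could be stored alongside a work. The field names (such as `tdm_rights_reserved`) and the example title and owner are hypothetical, chosen for illustration only; they do not correspond to any established standard, and such a record would only be useful if AI developers were required to read and honour it.

```python
import json


def build_rights_record(title: str, owner: str, year: int) -> dict:
    """Return a rights record using hypothetical, non-standard field names."""
    return {
        "title": title,
        "copyright_owner": owner,
        "copyright_year": year,
        # Hypothetical flag: the owner does not consent to text and data mining.
        "tdm_rights_reserved": True,
    }


def to_sidecar_json(record: dict) -> str:
    """Serialise the record as JSON text, e.g. for a file stored beside the work."""
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    record = build_rights_record("Harbour at Dawn", "A. Photographer", 2025)
    print(to_sidecar_json(record))
```

Even in this simplified form, a record of this kind would have to be created and maintained for every individual work, which is the burden on the photographer described above.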

Question 6. What action should a developer take when a reservation has been applied to a copy of a work?

RESPONSE: The developer must seek consent and pay for the content before training AI or technology systems on the content, and without such consent should not train its AI or technology on such content. This applies equally today under the existing law. Most companies ignore such rights because they are not enforced and the consequences are too financially burdensome for content owners; hence the rights should be bolstered, not diluted.

Question 7. What should be the legal consequences if a reservation is ignored?

RESPONSE: Any new system for rights reservation must have the same legal standing as Technical Protection Measures. That is sub-optimal in any event. We propose that a statutory strict liability should be imposed and a presumption of copyright infringement should apply in cases where use is without consent/licence.

Question 8. Do you agree that rights should be reserved in machine-readable formats? Where possible, please indicate what you anticipate the cost of introducing and/or complying with a rights reservation in machine-readable format would be.

RESPONSE: No: any such system should be sufficiently flexible to enable different content owners to opt out for types of works. While machine readable formats would most likely be required, these must be simple and low cost enough for all rightsholders to access; without this, such measures place the burden on the content owners to spend money to defend copyright and IP protection, rights that are fundamentally embodied in existing law and rights already held under the Human Rights Act 1998.
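To make the idea of a machine-readable reservation concrete, here is a minimal Python sketch using the standard library’s `urllib.robotparser` to evaluate a `robots.txt` file, one widely used machine-readable convention for addressing crawlers. The crawler names `ExampleAITrainingBot` and `ExampleSearchBot` and the URL are hypothetical. The sketch assumes that a developer declares a separate, named crawler for AI training, which is the kind of transparency called for earlier in this response, and it says nothing about whether a crawler actually honours the file, since compliance with `robots.txt` is voluntary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: reserve rights against an AI-training crawler
# while still allowing every other crawler (e.g. a search indexer).
ROBOTS_TXT = """\
User-agent: ExampleAITrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/news/article"
    print(may_crawl(ROBOTS_TXT, "ExampleAITrainingBot", url))
    print(may_crawl(ROBOTS_TXT, "ExampleSearchBot", url))
```

A reservation of this kind only works if the crawler identifies itself truthfully and uses a distinct identity for each purpose; a crawler that does not declare its identity cannot be addressed by name at all.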

Section C2: Technical Standards

Question 9. Is there a need for greater standardisation of rights reservation protocols?

RESPONSE: If required at all, standardisation of protocols and standards for such protocols would seem helpful.

Question 10. How can compliance with standards be encouraged?

RESPONSE: Infringement or breach of any such protocols would need to be clearly stated to constitute copyright infringement, with deterrents in place to create a compliant legislative regime. In the absence of such protocols, a statutory strict liability should be imposed or a presumption of copyright infringement should apply.

Question 11. Should the government have a role in ensuring this and, if so, what should that be?

RESPONSE: Establish a Government regulator or unit to enforce such rights, to be paid for by the tech industry – which is demanding additional rights, which derogate from the rights of copyright and IP owners, which already exist under existing UK Copyright legislation and under the Human Rights Act 1998.

Section C3 – Licensing and contracts

Question 12. Does current practice relating to the licensing of copyright works for AI training meet the needs of creators and performers?

RESPONSE: Currently the licensing regime does not expressly address licensing for AI training, but AI training entities should apply the existing legal principles under the existing Law and therefore actually check copyright notices and apply for licensing/consent where no other approach is available.

Question 13. Where possible, please indicate the revenue/cost that you or your organisation receives/pays per year for this licensing under current practice.

RESPONSE: n/a from the authors

Question 14. Should measures be introduced to support good licensing practice?

RESPONSE: There is no presumption that commercial AI training or use of inputs is permitted under UK copyright law. Rights-management societies and professional bodies, including PRS and other licensing organisations, already provide for such good licensing practices and may therefore need to update those for use by AI etc.

See https://www.prsformusic.com/ and also https://www.gov.uk/licence-to-play-live-or-recorded-music and ICO for film licensing – at https://www.independentcinemaoffice.org.uk/advice-support/what-licences-do-i-need/film-copyright-licensing/ and ICMP for Contemporary Music https://www.icmp.ac.uk/blog/understanding-music-copyrights-and-licenses

Question 15. Should the government have a role in encouraging collective licensing and/or data aggregation services? If so, what role should it play?

RESPONSE: No. This should be left to professional collection societies and licensing bodies authorised by each industry. The Government could, however, as an alternative to the preferred approach of robust enforcement, assist content owners by making any unauthorised use enforceable as a statutory liability, or by creating a presumption of infringement if that is not already clear (it seems clear to the authors).

Question 16. Are you aware of any individuals or bodies with specific licensing needs that should be taken into account?

RESPONSE: n/a

Section C4 – Transparency

Question 17. Do you agree that AI developers should disclose the sources of their training material?

RESPONSE: YES. Transparency is vital to the AI ecosystem. By transparency we mean that AI developers must maintain records, at a granular level, of the individual works that their AI systems have ingested.

Question 18. If so, what level of granularity is sufficient and necessary for AI firms when providing transparency over the inputs to generative models?

RESPONSE: As with current law: the source, author and detail of the data or content used, and whether or not it is used under licence. Granularity is crucial. A general statement would not be sufficient to uphold the principle of transparency or to protect creators’ rights under the law.

Question 19. What transparency should be required in relation to web crawlers?

RESPONSE: We should retain the amendments to the Data Use and Access Bill in this respect proposed by Baroness Kidron and passed by the House of Lords on the 28th of January 2025 which provide inter alia for regulations to require disclosure by AI models of

the name of the crawler,
the legal entity responsible for the crawler,
the specific purposes for which each crawler is used,
the legal entities to which operators provide data scraped by the crawlers they operate,
a single point of contact to enable copyright owners to communicate with them and to lodge complaints about the use of their copyrighted works,
the URLs accessed by crawlers deployed by them or by third parties on their behalf or from whom they have obtained text or data,
the text and data used for the pre-training, training and fine-tuning, including the type and provenance of the text and data and the means by which it was obtained,
information that can be used to identify individual works, and
the timeframe of data collection.
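The disclosure items listed above could be captured in a simple structured record. The following is a minimal sketch only: the class name `CrawlerDisclosure`, its field names and the helper `missing_items` are hypothetical labels chosen for illustration, not terms taken from the amendments or from any regulation.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CrawlerDisclosure:
    """One record per crawler, mirroring the disclosure items in the list above."""
    crawler_name: str
    responsible_entity: str
    purposes: List[str]
    data_recipients: List[str]
    contact_point: str
    urls_accessed: List[str]
    data_provenance: str
    work_identifiers: List[str]
    collection_timeframe: str


def missing_items(record: CrawlerDisclosure) -> List[str]:
    """Return the names of any disclosure items left empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    record = CrawlerDisclosure(
        crawler_name="ExampleAIBot",              # hypothetical crawler
        responsible_entity="Example AI Ltd",      # hypothetical company
        purposes=["pre-training"],
        data_recipients=["Example AI Ltd"],
        contact_point="",                         # deliberately left blank
        urls_accessed=["https://example.org/article"],
        data_provenance="text obtained by web crawl",
        work_identifiers=["ISBN 978-0-00-000000-0"],  # placeholder identifier
        collection_timeframe="2024-01 to 2024-06",
    )
    print(missing_items(record))
```

A regulator or rights holder could run a completeness check of this kind before looking at the substance of what has been disclosed.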

Question 20. What is a proportionate approach to ensuring appropriate transparency?

RESPONSE: Unclear, but it must at least involve an effort by AI and tech developers using AI to scrape content that is equal to or greater than the effort being considered for content owners, who would have to add technical measures to their content (e.g. watermarks and machine-readable opt-out notices) and/or bear further technical, legal and operational costs to craft disclaimers or text asserting their (already existing) rights.

Question 21. Where possible, please indicate what you anticipate the costs of introducing transparency measures on AI developers would be.

RESPONSE: Unclear at this stage, but perhaps the Government can broker, as part of its incentive deals, a framework to resolve past copyright infringement issues: a one-off settlement or payment for past infringement that would obviate the need for class actions by creative content owners or individuals.

Question 22. How can compliance with transparency requirements be encouraged, and does this require regulatory underpinning?

RESPONSE: If Option 3 is adopted, then it must be a condition for tech developers and AI companies, at least, to take all reasonable operational measures to ensure that copyright content is licensed or that its input and output use is authorised (under licence or written consent). Such efforts should be equal to or greater than the efforts likely to be required of content owners, who would have to add technical measures to their content (e.g. watermarks and machine-readable opt-out notices) and/or bear further technical, legal and operational costs to craft disclaimers or text asserting their (already existing) rights.

Question 23. What are your views on the EU’s approach to transparency?

RESPONSE: It is very questionable, to say the least, how effective or workable the Working Groups implementing the EU AI Act have found the opt-out provisions. In the meantime, the transparency provisions are a clear benchmark for the UK, which should take note, given that until recently the UK was bound by such rules. To promote consistency and to avoid a mass migration of creatives, the law in the UK should protect UK citizens, content owners and creative owners at least equally, but should not impose unworkable opt-out mechanisms based on an as-yet-untested EU comparison.

Section C5 : Clarification of Copyright Law

Question 24. What steps can the government take to encourage AI developers to train their models in the UK and in accordance with UK law to ensure that the rights of right holders are respected?

RESPONSE: See above responses to Q20 and Q22 – and reiterated here. A statutory strict liability should be imposed or a presumption of copyright infringement should apply, failing which, the Government should make a clear statement, in the form of a Copyright Notice, that the current exception regime does not allow for the use of works, covered by copyright, for commercial purposes, without the consent of the owner of those works.

Section C6

Question 25. To what extent does the copyright status of AI models trained outside the UK require clarification to ensure fairness for AI developers and right holders?

RESPONSE: If an AI company has trained its AI on content that is covered by copyright in the United Kingdom, then making that company’s output or service available in the United Kingdom would still constitute copyright infringement.

At the very least, if AI operators and providers are unable to demonstrate transparency and provide users and regulators with access to clear records of the inputs that the AI technology has used (e.g. whether the sources of content include copyrighted content), it will be impossible to satisfy the UK regime or basic international cybersecurity standards, let alone the rules on copyright infringement or applicable parallel imports laws, in order to satisfy UK sovereignty principles.

Question 26. Does the temporary copies exception require clarification in relation to AI training?

RESPONSE: No, this is no defence. AI training is also no different to the existing approach taken by any computer (an AI is just a software programme and, for now, no different to existing technologies).

Question 27. If so, how could this be done in a way that does not undermine the intended purpose of this exception?

RESPONSE: We are not in favour of any exception but if such an exception were to be considered, then clear guardrails would need to be implemented – to ensure that any such temporary copies create no economic value or advantage.

Section C6 – Encouraging Research and Innovation

Question 28. Does the existing data mining exception for non-commercial research remain fit for purpose?

RESPONSE: YES, it is sufficient and fit for purpose as it currently stands.[10] The Exception received significant Parliamentary scrutiny before being implemented in 2014 and we believe any reform would significantly change the careful balance agreed upon then. Any such reform of the Exception would require significant and separate analysis, as opposed to being mixed in with this consultation.

Question 29. Should copyright rules relating to AI consider factors such as the purpose of an AI model, or the size of an AI firm?

RESPONSE: No. All such instances and uses of copyright content are still governed by the existing UK copyright legislation, and the size or purpose of the firm is irrelevant (unless perhaps it is a true charity, not a charitable front designed by and for a commercial purpose).

Section D – Computer-generated works: protection for the outputs of generative AI

Option 0: No legal change, maintain the current provisions

RESPONSE: Maintain the status quo.

Computer Generated Works (CGWs) distinguish the UK from other countries and defeat the argument that AI needs to ‘own’ IP in the absence of a ‘human author’ for creativity. It does not. AI is a tool in the hands of a company or individual.
CGW protection is necessary to encourage the production of outputs by generative AI or other tools, and any legal ambiguity is likely to be resolved or of little effect. The Courts will resolve any ambiguity as they have done in England and Wales for centuries.
The exception in s9(3) CDPA works. If a work is computer-generated (that is, not authored by a human), then copyright ought to be vested in the person who made the ‘arrangements necessary for the creation of the work’.
AI does not require or deserve any special rights or considerations, and such rights are adequately covered in s9(3) of the CDPA.

Section D2 – Outputs

Question 30. Are you in favour of maintaining current protection for computer-generated works? If yes, please explain whether and how you currently rely on this provision.

RESPONSE: YES. See above regarding Computer Generated Works, and in particular the point that these distinguish the UK from other countries where such a regime does not exist.

Question 31. Do you have views on how the provision should be interpreted?

RESPONSE: It has been clearly interpreted in case law. The Advocate General in Painer[11] took this view, noting that only human creations can be copyright-protected (although the human can employ a “technical aid” like a camera). A similar position has also been taken by the U.S. Copyright Office, which determined that images created using the generative AI model, Midjourney, were not original works of authorship protected by U.S. copyright law because this excludes works produced by non-humans[12]. Caselaw from other countries also reflects this understanding[13]. It is right and proper that the facts of each case should determine the outcome, as was Parliament’s intention[14].

RESPONSE: No changes to CGWs are required

Question 32. Would computer-generated works legislation benefit from greater legal clarity, for example to clarify the originality requirement? If so, how should it be clarified?

RESPONSE: No.

Question 33. Should other changes be made to the scope of computer-generated protection?

RESPONSE: No

Question 34. Would reforming the computer-generated works provision have an impact on you or your organisation? If so, how? Please provide quantitative information where possible.

RESPONSE: Unknown until details of the proposed legislative changes are provided. In any event, the authors consider such reform unnecessary.

Question 35. Are you in favour of removing copyright protection for computer-generated works without a human author?

RESPONSE: NO, for the reasons given above. The UK is fortunate to have a CGW right, which is absent in many legislative frameworks.

Question 36. What would be the economic impact of doing this? Please provide quantitative information where possible.

RESPONSE: Unknown as yet.

Question 37. Would the removal of the current CGW provision affect you or your organisation? Please provide quantitative information where possible.

RESPONSE: Almost certainly, given the licensing arrangements and revenue based on existing legislation. Quantum unknown.

Section D4

Question 38. Does the current approach to liability in AI-generated outputs allow effective enforcement of copyright?

RESPONSE: The law is clear in relation to AI-generated outputs. If a service provided in the UK has been trained on UK material without permission, then the service is infringing and operating illegally. Enforcement of the law is clearly challenging, given the lack of transparency by AI developers about the works they have used to train their models and for what purpose. See the proposals in previous responses on a strict liability regime for AI companies infringing copyright and on alternative enforcement mechanisms.

Question 39. What steps should AI providers take to avoid copyright infringing outputs?

RESPONSE: Comply with the law:

check copyright notices (which is easy with AI tools) and
obtain consent under licence or written permission to use substantial elements of content in which copyright subsists and is claimed and/or owned by a third party under a simple © Notice.

Section D5 – AI Output Labelling

Question 40. Do you agree that generative AI outputs should be labelled as AI generated? If so, what is a proportionate approach, and is regulation required?

RESPONSE: YES and YES

Question 41. How can government support development of emerging tools and standards, reflecting the technical challenges associated with labelling tools?

RESPONSE: Unclear. Labelling is easy with AI and tech tools.

Question 42. What are your views on the EU’s approach to AI output labelling?

RESPONSE: No detailed comment. The EU AI Act, formally adopted by the EU in March 2024, requires providers of AI systems to mark their output as AI-generated content. This labelling requirement is meant to allow users to detect when they are interacting with content generated by AI systems, to address concerns like deepfakes and misinformation. Unfortunately, implementing one of the AI Act’s suggested methods for meeting this requirement, namely watermarking, may not be feasible or effective for some types of media. As the EU’s AI Office begins to enforce the AI Act’s requirements, the Government should closely evaluate the practicalities of AI watermarking.

Section D6: Digital Replicas and other issues

Question 43. To what extent would the approach(es) outlined in the first part of this consultation, in relation to transparency and text and data mining, provide individuals with sufficient control over the use of their image and voice in AI outputs?

RESPONSE: This is an important area that requires a more detailed review of the effectiveness of UK laws. Moral rights and personality or image rights, such as exist in the EU, would help give individuals adequate control over their image, reputation and performance. This is an area that needs further review and, potentially, legislation. Ratification of international treaties on this topic, such as the Beijing Treaty, would be an important first step towards international cooperation on standards and enforcement frameworks.

There are significant limits on the control people have over their image and voice in the UK. To the extent image (or personality) rights are protected at all, it is via a mix of privacy law, data protection, contract law, moral rights and the common law tort of ‘passing off’. The approaches outlined in the first part of the consultation do not materially improve individuals’ position in relation to use of their image and voice in AI outputs. They are directed to the use of copyright works. It does not follow that a copyright work is directly probative of a person’s image and/or voice. Further, it does not follow that the owner of that copyright work is the person in question.

Question 44. Could you share your experience or evidence of AI and digital replicas to date?

RESPONSE: Real-time digital replicas can cause, and have caused, irreparable damage to many, including people we know who have been fooled by sophisticated AI scams. Real-time AI replicas of real people (actors, well-known personalities and even family members) are easily cloned from information available on social media and images shared on the Internet. They can cause irreparable damage to individuals who may be ill-prepared or ill-equipped to address them, and those in the public arena (including actors, artists and even politicians) may suffer financial harm as well as reputational damage.

There have also been examples of deepfake videos of politicians in recent times in the UK, for example of Sadiq Khan and Sir Keir Starmer. A change in the law to explicitly cover acts like these, rather than leaving recourse only to adjacent rights such as defamation or passing off, would, in our view, be advisable.

Section D7 – Emerging Issues

Question 45. Is the legal framework that applies to AI products that interact with copyright works at the point of inference clear? If it is not, what could the government do to make it clearer?

RESPONSE: No comment – question unclear

Question 46. What are the implications of the use of synthetic data to train AI models and how could this develop over time, and how should the government respond?

RESPONSE: It is likely that the outputs and quality of AI tools trained on synthetic data will be degraded compared with those trained on original, real data.

Question 47. What other developments are driving emerging questions for the UK’s copyright framework, and how should the government respond to them?

RESPONSE: None, at present.

Section E

End notes

Lord Clement-Jones CBE[15] is a Liberal Democrat Life peer and the Liberal Democrat DSIT Spokesperson in the House of Lords and, inter alia, the Co-Chair of the All-Party Parliamentary Group on Artificial Intelligence. He was chair of the House of Lords Select Committee on Artificial Intelligence (2017–2018) and is a former member of the Select Committee on Communications and Digital (2011–2015) as well as a former Lib Dem Lords spokesperson on the Creative Industries (2004–10). He is an officer and active member of the All-Party Parliamentary Group on Intellectual Property.
Christian Gordon-Pullar is an IP specialist and an experienced intellectual asset manager with more than 30 years’ experience, ranked in the IAM Top 300 Global IP Strategists in 2020–2024 (inclusive). He has a proven track record in IP in the fields of financial services, pharmaceuticals and life sciences, fintech and e-commerce, working at C level with venture capital and private equity firms across portfolios. Until August 2024, Christian was Chairman of Fox Robotics Ltd, a UK Agritech AI start-up. He has led IP licensing efforts in multinationals across Europe and Asia. Based in Singapore from 2001 to 2019, he also has significant Asia experience, having been head of Tech, Intellectual Property and Corporate Functions Legal, AsiaPac at JPMorgan. Before that, he was global head of intellectual property at Standard Chartered Bank and CEO of Standard Chartered’s global IP licensing entity.[16] Christian was formerly a solicitor in the IP Group (TMT) at Lovell White Durrant, now Hogan Lovells, from 1993 to 1999.

Consent. The individuals named above would be agreeable to being contacted by the Intellectual Property Office (UK IPO) in relation to this consultation.

APPENDICES

Watermarking

Watermarking of copyright content for LLMs is an active area of research and discussion, with several approaches being explored to address copyright concerns in AI training and generation. While watermarking shows promise, its practicality for preventing copyright theft is still strongly debated.

Embedding Watermarks: Researchers have proposed methods to implant backdoors on embeddings, such as the Embedding Watermark method. This technique aims to protect the copyright of LLMs used for Embedding as a Service (EaaS) by inserting watermarks into the embeddings of texts containing trigger words.
Output Watermarking: Some techniques focus on watermarking the text generated by LLMs. These methods can significantly reduce the probability of generating copyrighted content, potentially by tens of orders of magnitude.
Model-Level Watermarking: A novel approach involves embedding signals directly into LLM weights, which can be detected by a paired detector. This method allows for watermarked model open-sourcing and can be more adaptable to new attacks.
Reinforcement Learning-Based Watermarking: A co-training framework using reinforcement learning has been proposed to iteratively train a detector and tune the LLM to generate easily detectable watermarked text while maintaining normal utility[17].
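To make the idea of output watermarking more concrete, the toy sketch below shows one simple keyed "green list" scheme. It is offered as an assumption-laden illustration and is not the specific method of any paper cited in this appendix: there is no real LLM, the vocabulary is invented, and the names `is_green`, `generate` and `z_score` and the key `demo-key` are hypothetical. The generator favours tokens that a keyed hash marks as "green", and a detector holding the same key measures how far the share of green tokens departs from chance.

```python
import hashlib
import math
import random

# Toy vocabulary standing in for an LLM's token set (assumption for illustration).
VOCAB = [f"w{i}" for i in range(50)]


def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Keyed hash of (previous token, token): roughly half of all pairs are 'green'."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(n_tokens: int, watermark: bool, seed: int = 0) -> list:
    """Pick random tokens; when watermarking, choose only from the green list."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(n_tokens):
        candidates = VOCAB
        if watermark:
            green = [t for t in VOCAB if is_green(tokens[-1], t)]
            candidates = green or VOCAB  # fall back if no token happens to be green
        tokens.append(rng.choice(candidates))
    return tokens


def z_score(tokens: list) -> float:
    """Standard deviations by which the green count exceeds the 50% expected by chance."""
    n = len(tokens) - 1
    hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


if __name__ == "__main__":
    print("watermarked z-score:", round(z_score(generate(200, watermark=True)), 2))
    print("plain z-score:", round(z_score(generate(200, watermark=False)), 2))
```

In this toy setting the watermarked text scores far above chance while ordinary text does not. The same sketch hints at the limits discussed below: detection needs the key, and paraphrasing or editing the text would erode the signal.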

While watermarking shows potential, several factors affect its practicality in preventing copyright theft:

Effectiveness: Some studies demonstrate that watermarking can significantly reduce the likelihood of generating copyrighted content. However, the effectiveness varies depending on the specific method and implementation.
Detection Challenges: Detecting watermarks in fully black-box models remains difficult. Some methods, like DE-COP, have shown promise in detecting copyrighted content in training data, even for black-box models.
Trade-offs: There’s an inherent trade-off between watermark transparency and effectiveness. Increased transparency may make watermarks more detectable and modifiable.
Implementation Constraints: Watermarking during the LLM training phase cannot be applied to already trained models, limiting its applicability to existing LLMs[18].
Legal and Ethical Considerations: The use of copyrighted material in training datasets remains a contentious issue, with ongoing legal debates and lawsuits.

In conclusion, while watermarking techniques for LLMs are advancing rapidly, their practicality in preventing copyright theft is still uncertain. These methods show promise in reducing the generation of copyrighted content and potentially tracking its use, but challenges remain in implementation, detection, and legal frameworks. As the field evolves, a combination of technical solutions, legal guidelines, and ethical considerations will likely be necessary to address copyright concerns in AI effectively.

EU Transparency requirements

The EU AI Act requires a “sufficiently detailed summary” of training data for General-Purpose AI (GPAI) models to ensure transparency and protect stakeholders’ rights, such as copyright holders. The required level of granularity includes:

Data Sources and Types: Providers must disclose the origins of datasets (e.g., public or private databases, web data, user-generated content) and specify the types of data used (e.g., text, images, audio) across all training stages, from pre-training to fine-tuning.
Content Description: Summaries must detail dataset size, filtering processes (e.g., removal of harmful content), augmentation methods, and whether copyrighted or personal data is included. This also involves specifying licensing terms for the data.
Narrative Explanations: Clear, non-technical descriptions must accompany technical details to ensure accessibility for both experts and laypersons.

This level of detail is designed to balance transparency with the protection of trade secrets while enabling stakeholders to exercise their rights effectively.

[1] See Section C for details.

[2] See Section C for details.

[3] https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence

[4] https://ed.newtonrex.com/

[5] https://www.lordclementjones.org/2024/12/21/governments-ai-copyright-consultation-is-selling-out-to-the-techbros/

[6] https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence

[7] https://www.prsformusic.com/m-magazine/news/prs-for-music-announces-ai-principles

[8] https://www.prsformusic.com/press/2024/creative-rights-in-ai-coalition-calls-on-government-to-protect-copyright

[9] https://m.facebook.com/100063658326152/photos/1084206480377953/

[10] The Post Implementation Review Process, published in 2020 found (in relation to the series of exceptions introduced in 2014), the review has not identified any improvements in the assumptions which would change the original assessment. Based on the largely positive responses from the call for evidence that the original objectives remain valid, and evidence to suggest the exceptions are operating as intended, we find that it would therefore be appropriate for the exceptions to remain in their current form. See https://www.legislation.gov.uk/uksi/2014/1372/pdfs/uksiod_20141372_en_002.pdf

[11] Eva-Maria Painer v Standard Verlags GmbH (C-145/10) C:2011:798 at [89]–[94] at [121]

[12] Second Request for Reconsideration for Refusal to Register Théâtre D’opéra Spatial (Copyright Review Board September 5, 2023). U.S. Copyright Office, Library of Congress. Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence, 16 March 2023 88 FR 16190.

[13] Australia: it is necessary to identify a human author in order for there to be an original literary work (Telstra Corporation Limited v Phone Directories Company Pty Ltd (2010) FCA 44); Singapore: copyright only arises when a work is created by a human author (Asia Pacific Publishing Pte Ltd v Pioneers & Leaders (Publishers) Pte Ltd [2011] SGCA 37).

[14] Bently et al, Intellectual Property Law, 6th Edn at [138].

UK Intellectual Property Office, “Consultation outcome—Artificial Intelligence and Intellectual Property: copyright and patents: Government response to consultation” (GOV.UK, updated 28 June 2022)

[15] https://www.libdems.org.uk/tim_clement_jones

[16] https://www.iam-media.com/strategy300/individuals/christian-gordon-pullar

[17] https://openreview.net/forum?id=r6aX67YhD9

[18] https://arxiv.org/html/2501.02446v1
