Evidence of watermark* generation “in the wild”
Annex 8I obviously contains evidence of real life generation of watermarks*. However, there is also some additional evidence to support Getty Images’ case.
I start with evidence internal to Stability which establishes that Stability was aware from the outset that there were problems with Models producing watermarked* images. In particular:
An internal Stability Chat dated 28 July 2022 in which Bill Cusick, former prompt engineer at Stability states: “There are still some watermarks showing up in archviz renders. Maybe once every 20-25 instances, so not often but when it does, it’s clearly revealing some kind of Getty or similar looking watermark”. This is pre-release of v1.x.
An email notification from Hugging Face to Mr Mostaque at Stability on 15 September 2022 with the subject “CompVis/stable-diffusion-v-1-4-original…Another training data quality issue – iStock watermarks”. The message says “Any prompt that includes the phrase ‘vector art’ will very frequently include a repeated ‘iStock’ watermark on the image. This probably isn’t the desired result and is likely due to an unauthorized scraping of iStock artwork into the training set”.
An email from a user dated 2 February 2023 to Team DreamStudio saying “it’s not the first time i generate images including the getty image watermark”. Team DreamStudio responds (“the DreamStudio Email”):
“Stable Diffusion (the models that DreamStudio runs on) was trained on a wide crawl of the internet and what that means is that occasionally generations will display a watermark due to the fact that watermarks were likely present in some of the dataset that the models were trained on.
While models are being trained, they are learning the features of the images that are in the dataset, so what can happen sometimes is that a certain set of words has a likelihood of displaying watermarks because it is thinking ‘I’ve seen a lot of images in this space, sometimes they have watermarks, so I should attempt to add a watermark to this image’.
This is an unfortunate byproduct that can come about by way of having such a large dataset and some of those images having watermarks in them that it learned from”.
An internal Stability Chat dated 4 February 2023 in which the participants discuss watermark removal and Tim Dockhorn comments: “When training w/o watermark conditioning, the model still generates samples with clearly visible watermarks. It’s in half of the data so I think that’s expected…”. This appears to be a discussion which is taking place during development and training, from which I infer that there was a recognition at that time that watermarked images made up about half of the dataset on which the next iteration of the Model was being trained.
An internal Stability Chat dated 4 March 2023 (prior to the release of SD XL), to which I have already referred, in which the internal team discuss the current “de-watermarking” taking place in respect of SD XL and the fact that the watermark issue is a “blocker for launching”.
An internal Stability Chat dated 10 March 2023 (also at the time that training of SD XL appears to have been taking place) in which a comparison of SD XL ‘alpha’ and ‘Beta’ (both pre-release models) is taking place. Tom Mason says a decision needs to be taken as to whether to “switch in beta for the release”. Conner Ruhl, then a software engineer says “It’s specifically diagonals and the occasional Getty it seems to really be struggling with…”. Joe Penna responds “SDXL alpha (2.2.0) is making a watermark every other generation. I’ve yet to see one single watermark from ‘SDXL beta (2.2.2)”.
A further internal Stability Chat from 10 March 2023 (again during the training of SD XL). Conner Ruhl says this: “Unless I’m specifically being targeted by some ghost in the machine…SDXL is not ready for prime-time or to be consider a blocker for launching…On a constant basis I’m seeing watermarks and furniture”. He then sets out links to examples, pointing out that these are all examples from the last day or two. He then comments “We are going to immediately be slammed by folks on the watermark issue alone”. Mohamed Diab later replies “just got 4 images all with watermarks, using one of the shuffle prompts and isometric style”. He also appears to provide links to examples. Conner Ruhl responds “Getty outta here”. Mr Auerhahn, who was involved in this chat, confirmed in his evidence that this was plainly a pun in recognition of the fact that Getty Images watermarks* were being generated. On the following day, Brian Fitzgerald observes that “results are more consistent than SDXL so far, no watermarks at all” – although it is unclear to what he is referring. Later in the same chat, Conner Ruhl says “Definitely would prefer versus the watermark/furniture model for the [DreamStudio] launch”. Later in the same chat, there is discussion around the fact that the problem with the model producing random furniture appears to be (as Scott Detweiler describes it) “somehow related to the ‘watermark killer’”, which I infer is a filter or additional functionality of some description, designed to remove the potential for the Model to generate watermarks. Scott Detweiler subsequently observes that “2.2 beta has so far not handed me random furniture”.
Pausing there – it is not always easy to understand exactly what is going on in these Chats. However, doing the best I can, I consider it reasonable to infer that v1.x had a tendency during training to generate watermarks*. This tendency was known to Stability prior to the release of v1.x and does not appear to have been ironed out on release, as the 15 September 2022 email notification (relating to iStock watermarks*) indicates. This is entirely consistent with my findings on the evidence in the previous section of this judgment.
Although attempts appear to have been made to filter the data that was being used to train v2.0 for watermarks (as confirmed by the Model Card), those attempts were not entirely successful – again, as my analysis of the evidence in the previous section also confirms. The DreamStudio Email appears to acknowledge as much when it talks about the “Models” on which DreamStudio was running at that time – these can only have been v1.x and v2.0.
It is also clear from this evidence that Stability encountered significant issues with the generation of watermarks in the development and training of SD XL. However, the evidence shows (and I find) that steps were taken to remove images bearing watermarks from the training data at this stage. Consistent with this is the fact that there is no internal Stability evidence indicating either that watermarks* continued to appear on synthetic images during real world use after the release of SD XL, or that there was a problem with later checkpoints of SD XL, or, indeed, v1.6. Accordingly, I infer from this evidence (together with the absence of any other evidence of watermarks* appearing “in the wild” in respect of these Models) that the inadvertent, uncontrived generation of watermarks* ceased to be a problem in real world use in respect of SD XL following its release and was never a problem in real world use in respect of v1.6. I find that the only evidence of the generation of watermarks* in relation to these Models (discussed above) is contrived and unrepresentative.
In closing, Getty Images relied upon the evidence of Mr Auerhahn in cross examination to the effect that “[t]here were some issues with SDXL producing images that appeared to be watermarked” and that he was “possibly” aware of similar issues for “v1 and v2”. Mr Auerhahn accepted that the problem was “too prevalent” with SD XL. Bearing in mind that Mr Auerhahn was a party to various of the Chats to which I have referred above, it is not surprising that he recalled an issue with SD XL. However, his cross examination did not clarify when these concerns arose and, specifically, whether they arose pre- or post-release of SD XL. Mr Auerhahn was taken only to pre-release Chats during his cross examination. This is important because there is no evidence whatever of any watermarked* images being produced “in the wild” post-release and it was not specifically put to Mr Auerhahn that there were ongoing difficulties post-release of SD XL. He was also not asked about the evidence in the Chats which suggests that work was done to filter out images bearing watermarks from the training data and thus to preclude the possibility of the Model producing watermarks*.
Given that Mr Auerhahn was not asked in cross examination to clarify when he understood the problems to have arisen, I can see no basis to find that he was referring to the post-release period of SD XL when he identified the existence of problems with the generation of watermarks*. Much more likely, it seems to me, given the contemporaneous documentary evidence, is that he was referring to the problems that were being encountered pre-release of SD XL, and so I find. That would be entirely consistent with the fact that (i) Getty Images’ “proof of existence” testing was unable to identify a single watermarked* image for SD XL and v1.6; and (ii) its Output Claim testing identified only the Donald Glover Image for SD XL and the Gabba Images for v1.6 – albeit generated (as I have already found) by unrealistic and eccentric use that is unrepresentative of real world use.
Finally, there remain two additional sources of evidence on which Getty Images rely: (i) direct communications between members of the public and Getty Images on GI SalesForce; and (ii) exchanges between members of the public on Reddit, a social news and discussion website (“the Reddit Exchanges”).
Getty Images have chosen to redact the names and contact details of all of the individuals who got in touch with its Salesforce team. They have not chosen to call any of these individuals or to permit Stability to view their details so that it might make a decision as to whether to call them. To my mind (beyond establishing that watermarks* have been generated by real world users) this limits the usefulness of the evidence in the sense that it is impossible to resolve any ambiguities. Instead I must do the best I can on the documents alone. There are three communications, all posted to GI SalesForce, on which Getty Images rely:
A message from a user in the USA dated 28 August 2022 (“the August 2022 SalesForce Message”) attaching the image at 8I, page 9 to which I have already referred, saying:
“Is your leadership team aware that companies behind AI-generated imagery have used your database of imagery to help train their AI systems to generate photographs in order to replace your business model? AI systems like Stable Diffusion, DAL E and Midjourney are using snapshots of the entire internet without the consent of copyright holders. Here is an image of an in-progress AI generated render that shows clearly that the AI was using your stock photo image” [an image with a watermark* is attached].
A message from a user in the USA dated 1 October 2022 (“the October 2022 SalesForce Message”) attaching various images (included in Annex 8I) on which the iStock image appears:
“I’ve been using Stable Diffusion and recently I came across a prompt that results in well over 50% of the images showing an iStock water mark, are these images legal to use in my projects? How does Stable Diffusion go about licensing material from you?”
A message from a user in Latin America dated 6 March 2023 (“the March 2023 SalesForce Message”): “I’d like to report a case of stable diffusion using Getty Images”. The message then provides the prompt used and attaches an image with a watermark, albeit that the image has been generated by Stable Diffusion Online, which does not appear to have anything to do with Stability. Furthermore, this does not appear to be an anglophone user. I can only assume that it is for these reasons that the image is not included in Annex 8I.
These messages take matters little further than Annex 8I. Two of them are messages which attached the Annex 8I images. It is unclear whether the first message relates to Stability and I have already refused to draw an inference as to the Model involved for that reason. The image referred to in the third message certainly does not appear to have anything to do with Stability. I have drawn an inference in relation to the images attached to the October 2022 SalesForce Message and I note for present purposes that the images attached to that message were not the only examples that the user had been able to generate with watermarks*.
Getty Images rely only upon a few exchanges between members of the public on Reddit, two of which feed into the images at Annex 8I and two of which do not appear clearly to relate to images generated using platforms connected with Stability (albeit they evidence that the generation of watermarks on synthetic images in general was being discussed by users):
A post by Antique_Plane_130 from around August 2024 which asks “What’s the best way to remove a watermark from an image?”. However, as Ms Cameron accepted, the relevant post does not mention a Getty Images or iStock watermark. It refers to a “blurry watermark in one of the corners” and says that the user is using “automatic 1111”. It is not clear that this is connected to Stability and there is no accompanying image. There is no reference to a Stable Diffusion Model and no indication that the user is based in the UK. I do not regard this post to be of any assistance.
A post by NealJMD from 27 August 2022 which shows four images (used in Annex 8I at page 3 and illustrated earlier in this judgment at paragraph 163) and says “Looks like Stable Diffusion was trained on watermarked images – when asked for vector art, it put the iStockPhoto watermark all over it”. The prompt is identified lower down in the Chat. I have inferred earlier in this judgment that this is a reference to v1.x, but there is no indication that the user is based in the UK. Other posts on this Chat include (from Musicguy1982) “I’ve gotten a few results with the Getty Images watermark although unreadable”; (from BunniLemon) “I tried the same prompt, and noticed that when I added ‘trending on ArtStation’ only one image showed up with the iStock’ logo…”; and (from namesareunavailable) “lately I experience awful lots of watermarks, even if I put watermark in the negative prompts. To me it has become quite annoying actually”.
A post (from Rare_Negotiation_544) on 31 October 2023: “Stable Diffusion generated a Getty Images watermark”. The embedded image is at page 6 of Annex 8I (and appears later in this judgment at paragraph 386) and the watermark* is very distorted. The model is unknown and there is no indication as to the location of the user.
A post (from lonewolfmcquaid) on 23 February 2023 asking why Stable Diffusion generates “accurate watermarks”, together with an embedded image showing a model and a badly distorted watermark. However, later in the exchange lonewolfmcquaid says that he made it with “playground” – a platform that is unconnected to Stability. In the same Chat, redpandabear77 says “I’ve made a ridiculous amount of images and I’ve never seen anything close to this. But I also don’t use the default models and I never try to make anything that resembles stock images. I just have a hard time believing that somebody got it looking that clear without trying to prompt for it”. Bobi2393 responds by sending a thread from this subreddit six months and three months earlier about other watermarks and other people confirming a similar result. Lonewolfmcquaid comments “i mean i was shocked as hell, it almost felt like winning a lottery cause what are the actual fucking chances lool”. This Chat again does not take matters much further on the threshold issue given that the image was generated on a platform that is unconnected to Stability.
There are a few other Reddit posts which produced various of the remaining images in Annex 8I, but again do not advance the position on the evidence any further.
In closing, Getty Images also relied upon an article by Stephen Jukic dated 18 January 2023 reporting on the commencement of litigation by Getty Images against Stability which includes an image with a distorted Getty Images watermark*. However, there is no indication which version of the Model was used to generate the image.
Conclusion on the Threshold Issue
Having regard to all of the available evidence considered above, I now turn to look specifically at each of the relevant versions of the Model to see whether the threshold question is satisfied.
IL-2023-000007 - [2025] EWHC 2863 (Ch)