Date: 2023-04-21

Three companies that train easy-to-use machine learning art generators are being sued by both Getty Images and three artists for content theft. This raises a number of interesting questions worth exploring.

In my view, the first thing to be said is that the lawsuits reflect a failure to understand how machine learning algorithms work. The companies are accused of storing huge repositories of compressed images extracted from the web without the consent of their creators or owners, when the reality is that they do not store images, either compressed or uncompressed, but instead mathematical representations of them. An algorithm, no matter how much it “looks”, does not “see” an image, but instead registers a map of pixels that it can eventually interpret in a certain way and model mathematically through a series of functions. Nor does the software “recombine” images or function as a “21st century collage tool” that pastes together fragments of existing pictures. It creates images from scratch based on these mathematical representations, much as an artist who takes inspiration from another artist does in their own work.

These mathematical models are what companies store in their repositories, which are perhaps best understood as the equivalent of human memory: we do not store an image, but a set of redundant neural circuits that allow us to evoke it. Trying to prevent an algorithm from being able to go through a page, examine the images contained in it — and we should remember that web scraping is not illegal in most cases — and generate a series of mathematical functions with them would be like trying to prevent visitors to a museum from being able to remember what they have seen, or to be inspired by it to create their own works.
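To make the idea of storing "mathematical functions" rather than the material itself a little more concrete, here is a deliberately tiny sketch in Python. It is a toy analogy under stated assumptions, not a description of how Stable Diffusion or any real image generator is built: the data are made-up points on a line, and the names `fit_line` and `generate` are hypothetical helpers invented for this illustration.

```python
import random


def fit_line(points):
    """Least-squares fit of y = a*x + b. Returns only the two parameters (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def generate(model, x):
    """Produce a new value from the stored parameters alone."""
    a, b = model
    return a * x + b


# Hypothetical "training data": 1,000 noisy samples scattered around y = 2x + 1.
rng = random.Random(0)
training_data = []
for _ in range(1000):
    x = rng.uniform(0, 10)
    training_data.append((x, 2 * x + 1 + rng.gauss(0, 0.1)))

model = fit_line(training_data)  # what gets kept: two numbers
del training_data                # the original samples are discarded

# The model can now produce a value for an x it was never shown.
new_value = generate(model, 12.5)
```

After training, the only thing left is the pair of numbers in `model`, yet `generate` can still produce plausible values, including for inputs that were never in the data. Real generators have vastly more parameters and a very different training procedure, so this should be read as an illustration of the general principle only.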

In this respect, algorithms work like the brain: if we only show a person works created by one author, they will end up creating, if they have the skills to do so, very similar works, which may in fact be plagiarism and be treated as such. If we show an algorithm works by Leonardo da Vinci, considering that his most commonly cited work is the Mona Lisa (“the most known, the most visited, the most written, the most sung, the most parodied work of art in the world”), this oversampling will mean that, when a virtual art assistant is asked to create something in the style of Leonardo da Vinci, it will produce images that reflect the style and content of Leonardo da Vinci.

As we have seen over the years, artists or the heirs of artists regularly exploit copyright law to file lawsuits and request compensation because a work, even if it does not repeat any sequence of notes or instrumental pattern, “evokes its meaning and sound”. We’ve seen the courts award more than $7 million, prompting the same lawsuit, by the same guys, against another artist. There have even been copyright claims on silence, on the purring of a cat, on an image copied from another artist or in the public domain, on the arrangement of food on a plate, and any number of other cases that would be better heard by a psychoanalyst than a judge. Rather than trying to curb machine learning assistants that create images, we should be penalizing people who abuse copyright law.

Artists can ask for the work not to be protected, but the problem is that art is created to be seen, and once it is in the public domain, it will be copied. But by all means, let’s have a debate about the issue.

And by the way, the illustration is not a Banksy. It is what Stable Diffusion generated when I asked for “a graffiti of a robot on a wall in the style of Banksy”.

