Unpredictable Patterns #87: The future of the image
The end of the photograph, truth and evidence, the skotograph, creativity and the future of imagery law
Dear reader,
We are back to school, and so here is a more subject-matter oriented piece about how the rise of large models might affect the nature of the image and our relationship to imagery overall. There is so much happening in the borderland of art, technology, philosophy and law these days - and it is exciting. More to come.
The future of the image
We have come to a breaking point in our relationship with images - again (this was pointed out to me by a colleague a few days back). It has only been around 200 years since the invention of photography, and it now seems as if our relationship with the image is about to change again.1 The reason is simple - it is the rise of the many artificial intelligence-powered models that allow for the creation of both photorealistic images and beautiful art.2 And the development in this field is rapid - the number of images produced by these engines can be expected to grow quickly over the coming years.3 All this contributes to a world in which we are likely to start interacting with images in a very different way.
First, the new technology will shift the meaning of the concept “image”. This is one of the more interesting ways in which technology and language interact - new technologies allow for new uses of concepts, shifting the meaning of these concepts. The concept “game” is different now as compared to before computer games entered our collective consciousness. The concepts “image” and “photo” will change as well - both acquiring new meanings and losing old ones.4
Second, the role of imagery in the production of knowledge will change. In many ways, the camera was seen as an instrument that captured reality (we “take” photos), and so the image became evidence of a state of the world. These new images are not captured, but created and so they cannot be used as evidence in the same way that photographs could. This represents a change in how we view images overall, but also actually suggests that the worries we have about deep fakes are likely to fade over time as we adjust our understanding of how images can be used. The photograph collapses back into the drawing, and where it used to stand out as a category of its own, with an evidentiary value of its own, it now becomes just an image like any other.5
The image is decoupled from its production - we can no longer be sure that it was produced in a certain way and that it therefore has certain characteristics. While we used to know that the image was produced by someone who was there at the time, we can now say nothing about the production of the image based on its quality or format.6
Third, we are going to have to learn to read images differently - they become, in some ways, more like text. This is interesting in a couple of different ways - not least because this hints at what is going on behind the scenes in the tokenisation of the world that these models engage in.7
We have reached a point where our power of representation allows us to compress all things into sequences of tokens, but these tokens, then, exact a price: they also mean that we now have to assume that all things were produced by the re-arrangement of tokens in different ways. Where we might have, as Walter Benjamin suggests, been reading the photograph for the unintended and surreptitiously revealed,8 for something that was captured beyond the intention of the photographer, we now must read the image (it no longer is a photograph in the sense we used that word) as a composition of tokens.
In a hundred years, when we write the history of the image again, we will note that there was a curious window for 200 years where one kind of image - the photograph - was elevated among others as more truthful, and more authoritative, but that this brief period (that began with the photograph and ended with the machine learning models) soon was forgotten as the absolute majority of the imagosphere (the sum total of all images available to us) shifted to the created image over the captured.9
The rise of the skotograph
The name “photograph” is derived from the Greek word for light. This is important for several reasons, and plays a part in our understanding of the nature of the photo: it was created with light and so in some sense inherits its veracity from the light that we have learned to trust so naturally.10 The new images that we produce with different models are made in the dark, so to say, and so could naturally be called skotographs, after the Greek word for darkness.
Now, interestingly this word already exists and refers to spirit photography.11 It was coined by Felicia Scatcherd, who worked at the London-based Society for Psychical Research. A believer in psychic phenomena and spirit photography, she collaborated with Arthur Conan Doyle among others - and edited the Psychic Review.
The theory behind the skotograph was simple: these were photographs where images appeared without any light - and so represented mysterious energies from the spirit world. It was a repurposing of the camera to measure not light but mysterious presences.
Now, it may seem silly to suggest that our new machine-learning models are producing spirit photography, but I want to suggest that we can use the idea of the skotograph to ask a couple of interesting questions about what we are seeing when it comes to the development of new imagery. Or put in a different way - if we persist in viewing these models not just as creative tools, but as some kind of instrument that allows us to capture something - what is it then that the machine learning models are capturing now?
The first answer that comes to mind is that skotographs are capturing our imagination - the images we produce are still prompted by human ingenuity and creativity, so what we have been able to do with these new instruments is to capture not just what is, but also what could be. In art, this was already possible - the sum total of our imagery in art is a search through our collective imagination for the compelling and interesting, the new and astonishing. The addition of photo-realism to this search just means that we now are able to search our imagination in a particular way - in a sense we can now search our dreams, and re-create images that we have only seen in our sleep. We are searching the space of all things that can be seen.
But we can go further and suggest that what this new instrument does is to allow us to search our collective unconscious - the sum total of these new created images is in some way a representation of what we want to see, really see, and that says something about us. Some of it will be disappointing - like the relentless focus on porn and violence - but it still reveals something about us.12 Some of it will be aspirational and allow us to explore possible futures, adding a visual dimension to science fiction in ways that will open new worlds.13
Should we mourn the photograph? Is the loss of the ability to capture reality a loss that will significantly change our societies? We could imagine ways in which the photograph survives - but in a generation or two the presumption of truth that we seem to grant photography will likely have faded away. Perhaps there will be ways to use cryptography to build new authority landscapes in the flat desert of imagery. Several people digitally signing an image to guarantee its veracity - perhaps under liability - could be one way of trying to salvage the photograph, but ultimately such efforts seem doomed to fail.
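To make the idea of several people vouching for an image a little more concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not a proposal: the witness names and keys are hypothetical, and the keyed hash (HMAC) from the standard library is only a stand-in for real public-key signatures, since with HMAC the verifier has to hold the same secret keys as the witnesses.

```python
import hashlib
import hmac


def attest(image_bytes: bytes, witness_key: bytes) -> str:
    """Return a keyed digest binding one witness to these exact image bytes."""
    return hmac.new(witness_key, image_bytes, hashlib.sha256).hexdigest()


def verify_all(image_bytes: bytes, attestations: dict, witness_keys: dict) -> bool:
    """True only if every listed witness has a matching attestation."""
    return all(
        name in attestations
        and hmac.compare_digest(attestations[name], attest(image_bytes, key))
        for name, key in witness_keys.items()
    )


# Hypothetical witnesses and keys, for illustration only.
image = b"raw bytes of a hypothetical photograph"
keys = {"photographer": b"key-a", "editor": b"key-b"}
attestations = {name: attest(image, key) for name, key in keys.items()}

ok_original = verify_all(image, attestations, keys)         # untouched image
ok_tampered = verify_all(image + b"!", attestations, keys)  # one byte added
```

Changing a single byte of the image breaks every attestation at once. The sketch only shows that the image is unchanged since it was vouched for; whether the witnesses themselves deserve trust is exactly the part that code cannot settle.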
And perhaps that is for the better - trust in photography may never have been well-placed, given how easy it is to lie with a camera, after all. It is not as if all photos could be taken as true automatically.
Maybe, then, the collapse of photography into the general category of imagery is a good thing that deprives propagandists of one of their more powerful tools - a tool that has been abused throughout the 20th and 21st centuries. If complemented with a general turn to critical thinking, that is - and that is a big if.
The skotograph could easily become a powerful propaganda tool as well - even if we know that we might be manipulated. We run into an old problem here: we did not evolve to seek truth, we evolved to seek trust in tribes.
The creative man-machine
The rise of the image-producing machine-learning model can also be seen as a part of a wider trend: one in which the tools of creativity become more and more powerful, and more and more accessible to a growing group of people. We have seen this before in music - where the new tools allow more people than ever before to produce music. Logic Pro, Ableton Live, Reason - all of these music creation tools have lowered the cost of producing music enormously, but they have also increased access to creativity by orders of magnitude.
Now, there are different ways we can react to this. One is to say that this cheapens art - because art should not be easy to produce. This view implies that what makes art into art is that so few people can produce it - and so allowing more and more access to creativity will devalue art overall. Those that hold this view may insist that what is created with these tools will always have a certain quality - a “tonality” or “colour” that taints the work and shows that it is not authentic, not created from scratch. They could also argue that there is no such thing as art produced without suffering and pain - an old view that might not be as articulated today, but still simmers under the surface of much of the skepticism that is directed at the explosion of creativity that we are witnessing.
Anything created with these tools will be useless, just because of the tools we use!
The opposing view suggests that what we have now is a much broader culture, with more participation than ever before, and that those who create themselves will be more appreciative of the creative effort - and so creativity is heading into a new renaissance, one in which we will discover entirely new artists and musicians, who otherwise would have been ignored by a stalwart entertainment establishment.
A third view would be to say that we should not expect any greater changes to the way we consume culture - we have always tended to do so in a power law fashion, it is in the nature of our collective experience of art - but maybe the set of contenders will now be broader and more interesting.14 That 20% of the artists in any domain will still get 80% of the attention is a given, but who gets to be part of that 20% is no longer just up to the establishment.
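The 80/20 intuition can be checked with a few lines of arithmetic. The sketch below is only an illustration under assumptions of my own: attention is taken to fall off with popularity rank as 1/rank (a Zipf-like power law), and the field is taken to have 1,000 artists. Neither figure comes from the research cited here.

```python
def top_share(n_artists: int, top_fraction: float, exponent: float = 1.0) -> float:
    """Share of total attention captured by the top `top_fraction` of artists.

    Assumes the artist at rank r receives attention proportional to 1 / r**exponent.
    """
    weights = [1.0 / rank ** exponent for rank in range(1, n_artists + 1)]
    cutoff = round(n_artists * top_fraction)  # number of artists in the top group
    return sum(weights[:cutoff]) / sum(weights)


# Hypothetical field of 1,000 artists with a 1/rank attention curve.
share = top_share(n_artists=1000, top_fraction=0.2)
print(f"Top 20% of artists capture {share:.0%} of attention")
```

Under these assumptions the top fifth ends up with a share close to the familiar four fifths. Note that nothing in the calculation says who occupies the top ranks, which is the part that the new tools might change.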
Another way to view this is to say that the new tools actually require more of the artist than the older ones - because they are so complex, and have such enormous ability to search the creative space in entirely new ways - and so will be even harder to master. This view sees new AI-tools as a new medium (as someone recently suggested to me), like oil or water colours, but one that still requires mastery and deeper understanding. The real value will be found by those that understand the tools well enough to search way outside of the already existing peaks in the cultural fitness landscape.
The real difference is that we can now travel to parts of that landscape that lie intellectual light-years beyond what we have been able to see before. Every new tool in this space is like a new telescope that allows us to see further into the creative space than ever before. The challenge will be not to stay on the ever-more well-trodden paths we have already discovered.
There is a risk here - a funny one - that was outlined by Stanislaw Lem in one of his Sallies in the Cyberiad: one of the inventors invents a virtual poet, and the poetry this poet produces is so beautiful that when a really, really talented poet hears it, well, they simply fall silent and cease writing - but that same fate never befalls the mediocre poets, because they cannot see the beauty as clearly as the great. The result then is the quieting of the excellent, and the amplification of the mediocre - because of the power of this new tool, the virtual poet!
We are a long way from that happening with our large models, I think - but it does raise the question of attraction and retention: will a field that sees a monumental increase in access to creativity attract fewer of the excellent and retain more of the mediocre? The answer to that question depends a lot on your view of humanity, I suspect - and the source of our creativity.
A few words on the law
It seems obvious that the developments we discuss here will also affect legislative discussions. There are already ongoing debates about copyright and new tools15, so we should expect that to continue - and there are others who are better suited to laying out the complexities than I am. But I do think it is interesting to think about one aspect of the legislative equation - and that is the possible long arc of the rights in this space, given the model you have of the foundations of these rights.
In one, admittedly simplified16, model of copyright, the rights awarded were awarded to foster and inspire creativity - but what we have seen in the last couple of decades is that there is an abundance of creativity. Even if we then argue the quality of that creativity is low (a somewhat elitist argument, to be sure) - we remain obliged to answer the question of whether all creativity should be protected in the same way.
There has been interesting research on this, where new models have been explored - one of the most intriguing ones is a model in which the scope and length of copyright protection are awarded in relation to the novelty of the work in question.17 Such novelty-based protection has a curious correlate in different forms of machine learning that reward novelty in the behaviour of a model and could be seen as a reward for a search through a new part of creative space where no one has gone before. To incentivise the exploration of creative space and grant protection to the new would be akin to rewarding the pioneering efforts in the Old West - where you could stake a claim to some part of land that had not previously been explored.
The argument from creative scarcity has been weakened by recent developments, and another argument has gained in strength: the protection of investments. The sui generis right to databases developed in the European Union legal framework for copyright is explicitly a protection - narrow - of a certain kind of investment in a database.18
This model seems now to become more important, and you can imagine a right that distributes protection according to an overall investment analysis: what is the cost of the production of the image - across model training, data set and creative prompting? The shift from incentive to create to protection of investment - should it occur - will be an interesting break with the tradition of copyright in different ways.
Every change in the means of production of art also drives change in the way the property rights are constructed and protected - and such change is important to ensure that the overall social benefit of new technology can be unlocked.
We can also be sure that this new development will raise questions about the right to our resemblance and extended self (including things as ephemeral as our personal style) - beyond the laws that exist in some jurisdictions about your right to your own resemblance in advertising.19 One could imagine a world in which there is a right not to be depicted, but that would be terribly hard to enforce and understand. There could also be a right to the extended self that we all have - and so a right not to be simulated into new media of any kind, but again, that would be hard to enforce. The possible media space for any one individual is enormous - and the spectrum of possible harms is complex.
What seems certain is that we are in for a fundamental reset in the way we think about the image.
Thanks for reading,
Nicklas
The dating of the invention of photography is tricky, but many suggest that photography was invented by Joseph Nicéphore Niépce and that the first surviving photograph dates from around 1826.
Examples include Dall-E, but also Midjourney and others — the market for these image engines seems to be growing fast.
Access to these models is quickly becoming abundant, and with some actors allowing users to download and set up their own services, there are no natural bottlenecks ahead.
One way to put this is to say that “photo” and “image” are slowly becoming synonyms.
It is interesting to see this argument now used in defense of photo-realistic celebrity deep fake pornography. The argument that anyone could draw this or that celebrity nude and no-one would mind if they did is, implicitly, an argument for the flattening of the truth landscape across all images. See this article.
There is information lost here, and we can see this in language too - the idea that a photo is “taken” will likely fade, since we cannot know for sure that this is what happened.
Put simply, many of the new tools for creating images are built on a model where images can be represented as tokens - the image is turned into a kind of language (if we simplify horribly). See for example: https://deepai.org/publication/visual-transformers-token-based-image-representation-and-processing-for-computer-vision
There is a beautiful passage in his history of photography that captures this way of seeing: “However skilful the photographer, however carefully he poses his model, the spectator feels an irresistible compulsion to look for the tiny spark of chance, of the here and now, with which reality has, as it were, seared the character in the picture; to find that imperceptible point at which, in the immediacy of that long-past moment, the future so persuasively inserts itself that, looking back, we may rediscover it. It is indeed a different nature that speaks to the camera from the one which addresses the eye; different above all in the sense that instead of a space worked through by a human consciousness there appears one which is affected unconsciously. […] Photography makes aware for the first time the optical unconscious, just as psychoanalysis discloses the instinctual unconscious.” (Benjamin, W “A Short History of Photography” p.7.)
There is a real question here if this will happen - some assessments suggest that there are more than 750 billion photos on the Internet, and Mary Meeker suggested in 2014 that 1.8 billion photos were uploaded every single day (see here) — so how do we know that the photo will really be replaced by machine-learning generated images? Well, there are two aspects to this question: the first is what proportion of images needs to be created for us to assume that all images are created (10%, 20%?) and the second is what will happen to photos as we get better and better AI-filters. The 1.8 billion photos are already edited, filtered and changed in numerous ways - so maybe the reality is that we do not need that many purely created images to change our view of the imagosphere? There is another aspect of this as well - the production of new imagery with these models changes the imagosphere, so that all future training on a general dataset collected from the web will contain images that are already derivative of what has come before.
It is not crazy to argue that the camera was an instrument and each photo a measurement.
A popular form of parapsychological research in the 19th century, practiced by many who believed that the camera could be used as an instrument to reveal the presence of spirits.
The persistence of Eros and Thanatos is not surprising - but the shift in imagery may actually change our views of depiction of sex and violence. The tension between reality, authority and capturing an act in pornography is in a sense sharper than in general photography - the pornographer’s act of capturing the act is a key component of pornography - someone was there, it happened for real - and so will that genre persist as one in which images need to be captured? Or will the majority switch into completely created imagery - and if so, will our view of violence and sex being depicted shift when there is no-one “real” in those pictures? The horrible challenge of virtual abuse imagery suggests that this requires that we rethink our theories of harm, and understand what depiction in itself means. This also applies to what people may do with images of us - potentially harming our extended self. See e.g. Graw Leary, Mary, The Third Dimension of Victimization (February 1, 2016). Ohio State Journal of Criminal Law, Vol. 13, No. 139, 2016, Available at SSRN: https://ssrn.com/abstract=2733789 or http://dx.doi.org/10.2139/ssrn.2733789
Some artists already do this without AI, but with new technologies that allow for the creation of fantastic imagery. Simon Stålenhag’s images are great examples.
See eg Ola Haampland (2017) Power Laws and Market Shares: Cumulative Advantage and the Billboard Hot 100, Journal of New Music Research, 46:4, 356-380, DOI: 10.1080/09298215.2017.1358285
I purposefully omit the moral rights in this argument, but they are different - their existence has less impact on the nature of content markets, however.
See Parchomovsky, Gideon and Stein, Alex, Originality (September 2009). Virginia Law Review, Vol. 95, 2009, Cardozo Legal Studies Research Paper No. 272, U of Penn, Inst for Law & Econ Research Paper No. 09-10, Available at SSRN: https://ssrn.com/abstract=1361911
See for a comparison of US / EU regimes Smith, Mitchell, A Comparison of the Legal Protection of Databases in the United States and EU: Implications for Scientific Research (May 23, 2010). Available at SSRN: https://ssrn.com/abstract=1613451 or http://dx.doi.org/10.2139/ssrn.1613451
See eg publicity rights as laid out here Johnson, James A., The Right of Publicity - Show Me the Money (October 1, 2013). New York State Bar Assoc.-Torts, Insurance, & Comp. Law Section Journal Vol. 42 No.1, 2013, Available at SSRN: https://ssrn.com/abstract=2388775 - style rights will be different, and hard to construct, but, paradoxically, probably easier now with probabilistic charting of our creative landscapes.
"But we can go further and suggest that what this new instrument does is to allow us to search our collective unconscious"
I've been troubled by the 'collective unconscious' framing for model outputs. I'm not sure we should so readily accept that the training data is comprehensive enough or weighted properly to capture any essential us-ness. Surely it's reflective of *something* but how do we know whether it's an accurate or distorted image? And is it even a philosophically and scientifically coherent idea that there is a collective unconscious, or are we extending misguided folk/early psychology too far?