For most people, the process behind an AI-generated image feels like a kind of magic. You type a sentence, and a few seconds later a fully formed visual appears, polished enough to fit into a magazine spread or a film pitch deck. It feels instantaneous, almost effortless. But behind that illusion is a mathematical ritual that has more in common with the history of human creativity than most people realize. What AI does when it makes an image is not creation in the artistic sense—it is reconstruction. A diffusion model, like the ones behind Midjourney v7, Stable Diffusion XL or Runway Gen-3, begins with pure noise, a static-filled field of possibility where nothing yet exists. It then removes the noise step by step, guided by patterns it has learned from billions of images. The process looks like emergence, but technically it is narrowing. It is not dreaming; it is approximating. And understanding this difference is essential, not because it diminishes the power of AI, but because it clarifies the human role inside the creative equation.
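That narrowing can be sketched in a few lines. The toy loop below is not a real diffusion model—the learned denoising network is replaced by a hand-written function that nudges pixels toward a fixed target pattern, standing in for what a network trained on billions of images would predict—but the shape of the process is the same: start from pure noise, then remove a little of it at every step.

```python
import numpy as np

# Illustrative sketch only: the "learned" denoiser is a hand-written rule,
# and `target` is an arbitrary stand-in for "what images tend to look like".
rng = np.random.default_rng(0)
target = np.linspace(0.0, 1.0, 16)  # a 16-"pixel" stand-in for plausible data

def denoise_step(x, t, steps):
    """One reverse step: drift toward likely pixel values, keep a little noise."""
    guidance = (target - x) / (steps - t + 1)         # pull toward the plausible
    fresh_noise = rng.normal(0, 0.05, x.shape) * (1 - t / steps)  # fades to zero
    return x + guidance + fresh_noise

steps = 50
x = rng.normal(0, 1, 16)            # begin with pure noise
for t in range(steps):
    x = denoise_step(x, t, steps)   # narrow the field of possibility

print("mean abs residual:", round(float(np.abs(x - target).mean()), 3))
```

Nothing here is invented along the way; each pass only discards what does not fit the pattern the system already carries. That is the sense in which the process is narrowing rather than dreaming.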
The idea that images emerge from noise is not new. Jackson Pollock spoke of chaos as a medium. Gerhard Richter embraced blur and distortion as a pathway to meaning. Even early photography relied on chemical reactions that appeared random before they settled into form. But in all these cases, the artist decided what mattered. They interpreted the accidents. They selected the gestures worth keeping. They recognized when a pattern carried emotional resonance. A diffusion model, in contrast, does not interpret anything. It follows probability. It moves toward visual outcomes that are statistically likely to satisfy the prompt, shaped not by intention but by the collective patterns of its training data.
When Ian Goodfellow introduced GANs in 2014, he described them as adversarial systems locked in a constant negotiation between creation and critique. Diffusion models, which followed, operate differently but embody a similar tension: they collapse the infinite into the plausible. They do it with extraordinary skill, but with no curiosity, no taste and no emotional stake in the result. The system generates an image that could exist—not one that must exist, or one that carries meaning within a specific cultural, personal or artistic context.
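That negotiation between creation and critique can be shown in miniature. The sketch below is deliberately collapsed and is not a real GAN: the learned discriminator is replaced by a fixed critic that simply compares sample means, and the "real" distribution N(3, 1) is an arbitrary assumption. What survives is the loop's structure—the generator proposes, the critic objects, and the proposals move.

```python
import numpy as np

# Collapsed adversarial loop: a one-parameter generator and a fixed,
# mean-comparing critic. All distributions and numbers are illustrative.
rng = np.random.default_rng(1)
theta = 0.0   # generator parameter: the mean of its fake samples
lr = 0.05     # step size for the generator's updates

for step in range(500):
    real = rng.normal(3.0, 1.0, 64)           # samples from the "real world"
    fake = theta + rng.normal(0.0, 1.0, 64)   # the generator's proposals
    critique = real.mean() - fake.mean()      # critic: how do the fakes fall short?
    theta += lr * critique                    # creation answers the critique

print("learned mean:", round(theta, 2))       # drifts toward the real mean
```

The generator never decides what the samples should mean; it only moves toward what the critique says is statistically likely. The tension produces plausibility, not intention.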
And this is precisely where Copy Lab’s belief in the sacred human–GenAI partnership becomes visible. AI can produce form, but only humans can produce intention. A diffusion model can reconstruct a landscape that resembles the work of Ansel Adams or a portrait that borrows from the visual language of Annie Leibovitz, but it cannot decide what the photograph should say. It does not know whether the mood is wrong, whether the tension is missing, whether the image feels hollow despite technical precision. It cannot sense the difference between a picture that is correct and one that is alive.
Creators like Wong Kar-wai, Cindy Sherman, James Turrell and Greta Gerwig work within a universe of choices—color, light, gesture, pace, silence. Their artistry lies not simply in their ability to produce an image, but in their ability to reject thousands of plausible images in search of the one that expresses something essential. Generative AI removes the labor of producing those thousands of options, but the responsibility of choosing the one that matters does not disappear. It intensifies.
The ease of generation can even make the process more challenging. When every variation looks convincing, the creator must rely more deeply on their internal sense of meaning, on the emotional calibration that no model can approximate. This is the same challenge faced by novelists when every sentence produced by a language model sounds polished. It is the same dilemma designers face when every composition looks clean. The abundance demands more taste, not less. More interpretation, not less. More awareness, not less.
The way a diffusion model produces an image also reveals the essential distinction between computational creation and human creation. The model begins with noise and removes what doesn’t statistically belong. The human creator begins with experience, memory, desire and contradiction, and adds what gives the work its soul. AI reconstructs. Humans express. AI predicts. Humans intend. The two are not in conflict—they are complementary. And that complementarity is the foundation of Copy Lab’s ideology.
We believe that GenAI is not here to replace the subjective, uncertain, emotionally charged aspects of creative work. It is here to elevate them. To give creators more room to explore, more capacity to iterate, more freedom to focus on the decisions that cannot be automated: the decisions that give the work purpose. The sacred partnership between humans and GenAI is powerful not because AI can generate images, but because humans can decide what those images should mean.
So when AI makes an image, what we see is not magic. It is mathematics performing a reconstruction of what the world has already shown it. The magic—if we choose to use that word—happens afterward, when a human looks at the result and senses what the model cannot: whether the image resonates, whether it reveals something new, whether it belongs in the world at all. That moment of recognition is the true act of creation. And it remains entirely, irreducibly human.
What Really Happens When AI Makes an Image

/Carl-Axel Wahlström, Creative Director, Copy Lab, 2025