Training a generative neural network in the style of the artist Victor Siret

artificial intelligence

neural network

Finalist of the competition

DAFES AWARDS`25 Artificial intelligence

Concept

Victor Siret, House by the Hill, 2025

Victor Siret is a French painter who works at the intersection of textiles, painting, and cultural archaeology. His work consists of intricate embroideries that weave together the visual codes of pop culture, architectural motifs, cartoon imagery, video games, and old television. His inspiration ranges from TV series and B-movies to the aesthetics of the “atomic age” and suburban prosperity. His attention to postmodern architecture and decorative structures makes his work at once ironic, unsettling, and deeply personal. He works slowly, by hand, for many hours a day, creating what he calls a “fragmented landscape”, a visual puzzle of cultural flashbacks.

Victor Siret, Wild World, 2024-2025

A LoRA model based on his work is an attempt to capture the very feeling of a constructed world in which the image is born at the junction of nostalgia, fragment, and craft. The artist says he starts with “scraps of sketches from notebooks,” assembling landscapes where skyscrapers, cartoon silhouettes, ornaments, snippets of TV shows, and architectural details sit side by side. All of this is woven into the fabric of the image as fragments of cultural memory, familiar and strange at the same time, as if they came from a dream recorded on old film.

Final series “Motel Zero”

Desert gas station with vintage cars / Pastel motel with round windows

In this series, I deliberately stepped away from Victor Siret’s usual visual codes to explore the territory of my own associations and anxieties, not through quotations but through atmosphere. Although the model was trained on material steeped in pop culture, it was important for me to use this vocabulary differently: not as entertainment, but as a space of decay and inner tension. The starting point was the motif of lightning as a symbol of a sudden rupture in a stable, tightly stitched reality. In textile art, lightning feels particularly out of place: it reads like a flaw in the weave, a visual intrusion. From there came the desire to create a series where the fabric of the world is charged with current and normality is about to crack.

Retro living room with patterned wallpaper / Brutalist building with colorful windows, lightning strike, floating jellyfish

Abandoned train station with cacti growing through floorboards, ghostly horses and electric storms

Hotel with palm trees, lightning storm over ocean / UFO over lake disguised as carousel roof / Vintage carousel with animals, lightning in dark sky

Laundromat building under lightning storm, colorful clothes hanging from clouds / Blue building with power cable for electricity

I wanted to build my own labyrinth of alienation, where the American dream has given way to a restless void inhabited by a cross. It is as if City of Zero had been filmed not in Kaluga but on the highway between Palm Springs and Las Vegas, to a synthesizer soundtrack and with a storm approaching.

Small guesthouse on checkered floor, giant floating keys in red sky / Moody bedroom with lightning, stitched fabric textures, quiet tension

Phone booth in tall grass at dawn, sky filled with birds and soft clouds / Yellow diner with fried egg roof, checkered awning

Model development process

To train the model, I assembled a dataset of 45 works by Victor Siret; his portfolio currently contains about 60 pieces. The works were selected manually to cover the main features of Siret’s visual language as fully as possible: compositions, architectural motifs, colour balance, and the characteristic textile texture. Some wide scene views were also added.

After selection, the images were manually brought to a common format. All works were converted to a square 1:1 aspect ratio at a resolution of 512×512 pixels, which kept enough visual detail while keeping the amount of training data manageable.
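
The author did this step by hand. For readers who would rather script it, here is a minimal sketch of one way to do it. It assumes a center crop (the article does not say how the square was chosen) and uses the third-party Pillow library. The function names and folder arguments are hypothetical.

```python
from pathlib import Path


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def to_square(src: Path, dst: Path, size: int = 512) -> None:
    """Center-crop one image to 1:1 and resize it to size x size pixels."""
    # Pillow is a third-party dependency, imported lazily so the helper above
    # can be used without it.
    from PIL import Image

    with Image.open(src) as img:
        square = img.convert("RGB").crop(center_square_box(*img.size))
        square.resize((size, size), Image.Resampling.LANCZOS).save(dst)


def convert_folder(src_dir: str, dst_dir: str, size: int = 512) -> int:
    """Convert every .jpg in src_dir and return how many files were written."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.jpg")):
        to_square(path, out / f"{path.stem}.png", size)
        count += 1
    return count
```

A center crop is the simplest choice but can cut off the edges of wide panoramic pieces, which is one reason to pick the crop by hand for a small dataset like this one.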

After processing the images, I used the BLIP model to generate captions automatically. It gave each image a short text annotation describing its visual content. The maximum caption length was limited to 50 tokens. The results were collected in a .jsonl file in which each line pairs an image with its caption. This file was later used as the annotated dataset for training the LoRA model.
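
The captioning step could look roughly like the sketch below. It is not the author's actual code. It assumes the Salesforce/blip-image-captioning-base checkpoint (the article does not name one), the transformers BLIP classes, and the file_name key used by the Hugging Face image-folder metadata convention. The prompt key mirrors the caption column mentioned in the training setup, and the folder name is hypothetical.

```python
import json
from pathlib import Path


def caption_image(path, processor, model, max_length=50):
    """Caption one image with BLIP, capping the output at max_length tokens."""
    from PIL import Image  # third-party, imported lazily

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def write_metadata(captions, out_file):
    """Write one JSON object per line: {"file_name": ..., "prompt": ...}."""
    with open(out_file, "w", encoding="utf-8") as fh:
        for file_name, prompt in sorted(captions.items()):
            row = {"file_name": file_name, "prompt": prompt}
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")


def caption_folder(folder="dataset"):
    """Caption every .png in a (hypothetical) folder and write metadata.jsonl."""
    from transformers import BlipForConditionalGeneration, BlipProcessor

    checkpoint = "Salesforce/blip-image-captioning-base"  # assumed checkpoint
    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForConditionalGeneration.from_pretrained(checkpoint)

    root = Path(folder)
    captions = {
        path.name: caption_image(path, processor, model)
        for path in sorted(root.glob("*.png"))
    }
    write_metadata(captions, root / "metadata.jsonl")
```

Automatic captions describe what is in the image but not the style, so they usually read as plain scene descriptions. That fits this setup, where the style is meant to be carried by the LoRA weights rather than by the caption text.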

I then trained a LoRA on the Stable Diffusion XL (SDXL) base model using the train_dreambooth_lora_sdxl.py script with a set of optimized parameters. The base model was stabilityai/stable-diffusion-xl-base-1.0, with the VAE madebyollin/sdxl-vae-fp16-fix. Training ran on the “victorsiret” dataset, annotated through a prompt column containing the previously generated captions. The image resolution was set to 1024×1024 pixels, which helped preserve fine textile and architectural details.
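
As a rough illustration, the settings named in this article can be assembled into a launch command for the diffusers example script. This is a sketch under assumptions, not the author's exact invocation. The model identifiers are written as they appear on the Hugging Face Hub. The memory flags correspond to the fp16, gradient checkpointing, and 8-bit Adam options the article mentions. The instance prompt and output directory are hypothetical placeholders, and hyperparameters the article does not state for this run (learning rate, step count, batch size) are left at the script defaults.

```python
import shlex


def build_sdxl_lora_command(
    output_dir="victorsiret-lora-sdxl",                  # hypothetical
    instance_prompt="an artwork in Victor Siret style",  # assumed trigger phrase
):
    """Return the argument list for the first (SDXL) training run."""
    return [
        "accelerate", "launch", "train_dreambooth_lora_sdxl.py",
        "--pretrained_model_name_or_path", "stabilityai/stable-diffusion-xl-base-1.0",
        "--pretrained_vae_model_name_or_path", "madebyollin/sdxl-vae-fp16-fix",
        "--dataset_name", "victorsiret",   # dataset name as given in the article
        "--caption_column", "prompt",      # column holding the BLIP captions
        "--instance_prompt", instance_prompt,
        "--resolution", "1024",
        "--mixed_precision", "fp16",
        "--gradient_checkpointing",
        "--use_8bit_adam",
        "--output_dir", output_dir,
    ]


def as_shell(command):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

Calling as_shell(build_sdxl_lora_command()) gives a single line that can be pasted into a terminal. Keeping the arguments in a list makes it easy to change one value between iterations and compare the runs.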

Cozy pink living room with green fireplace, ghost kitten by the fire / Retro living room with mint armchairs and an unexposed base

After the first training iteration, it seemed to me that the model had begun to capture the main features of the style, from interior geometry to the characteristic stitched texture. Look, for example, at this cute ghost cat by the fireplace. This first version is available on Hugging Face. Visually, however, I felt that the stitch count, the texture, and the density of the image were still insufficient. The result looked like a draft: stylistically accurate but technically flat. This led me to consider a more recent, more powerful model architecture that could convey greater depth and the micro-dynamics of textiles.

First iteration of the model on Hugging Face

In the second iteration, I decided to move to Stable Diffusion 1.5 (runwayml/stable-diffusion-v1-5), because it gives a richer, more “tactile” picture that is closer to the feel of textiles (I originally wanted to try FLUX, but even a P100 GPU could not handle it). The image resolution was reduced to 512×512 to match the original embroidery scale and keep the stitches crisp. I also changed the training parameters: I lowered the learning rate to 5e-5, used a cosine scheduler with warmup (lr_warmup_steps=100), and increased the number of steps to 2000 for more stable convergence. All memory optimizations (fp16, gradient checkpointing, 8-bit Adam) were kept. These changes resulted in greater density, depth, and expressiveness in the generated output.
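
To make the scheduler settings concrete, the sketch below computes the learning rate at a given step for a cosine schedule with linear warmup, using the values from this iteration (5e-5 peak, 100 warmup steps, 2000 total steps). It follows the common half-cosine formula used by such schedulers and is meant as an illustration, not a copy of the training script's internals.

```python
import math


def cosine_lr_with_warmup(step, base_lr=5e-5, warmup_steps=100, total_steps=2000):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


def preview(steps=(0, 50, 100, 1050, 2000)):
    """Return (step, lr) pairs for a handful of checkpoints."""
    return [(step, cosine_lr_with_warmup(step)) for step in steps]
```

The rate climbs linearly to 5e-5 over the first 100 steps, is back at half the peak at step 1050 (the midpoint of the decay phase), and reaches zero at step 2000. The slow tail means the last few hundred steps make only small adjustments, which is consistent with the goal of more stable convergence.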

Second iteration of the model on Hugging Face

In the second iteration the style became markedly closer to the original aesthetics: the right shapes, textile density, and recognizable architectural geometry. At this stage, however, the model still produced mixed results: sometimes the image looked bland, as though it were too neat and sterile. Artifacts appeared from time to time, especially in shadows, windows, and object boundaries. In addition, the scenes were at first oversimplified: they lacked the signature visual chaos, weirdness, and slight absurdity inherent in Victor Siret’s work.

So that the final images did not look too flat and digital, I built my own post-processing pipeline. After generating with the Victor Siret LoRA, I ran the images through Topaz Photo AI for gentle noise reduction and an initial upscale, then used HiDiffusion SDXL to add depth, light accents, and complexity, and finally used Clarity Upscaler to bring out sharpness and the textile texture. Sometimes I turned to InvokeAI to manually add, remove, or adjust details when the composition required intervention.

Prompts process

Opera house shaped like a hairdryer, geometric architecture, colorful windows

Another important part of the work was selecting and testing prompts, a process that proved as painstaking as the training itself. It often took hours, sometimes whole nights, of going through dozens of phrasings to get the right result. The same request could produce completely different interpretations: here, for example, when I tried to get an opera house in the shape of a hair dryer, the generation could veer off in completely unexpected directions. It was in these gaps between intent and result that the strangest but most valuable finds were born.

Example of prompt / an artwork in Victor Siret style, feeding a experience in the background of a checkered wall, surreal clouds and colorful root in the background, pixel embroidery texture, retro-futuristic mood, playful chaos, low horizon, textile surrealism
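
Because the same prompt can produce very different images, one practical way to explore it is to sweep several seeds per prompt and pick from the results by hand. The sketch below shows that idea with the diffusers library. It is not the author's actual workflow: the LoRA path, output folder, and helper names are hypothetical, the style prefix is taken from the example prompt above, and a CUDA GPU is assumed.

```python
from itertools import product

STYLE_PREFIX = "an artwork in Victor Siret style"


def build_jobs(subjects, seeds):
    """Pair every subject with every seed and prepend the style phrase."""
    return [
        (f"{STYLE_PREFIX}, {subject}", seed)
        for subject, seed in product(subjects, seeds)
    ]


def generate(jobs, lora_path, out_dir="outputs"):
    """Render each (prompt, seed) job with SD 1.5 plus the trained LoRA."""
    # Heavy third-party imports are kept inside the function.
    from pathlib import Path

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # model id as named in the article
        torch_dtype=torch.float16,
    ).to("cuda")
    pipe.load_lora_weights(lora_path)      # hypothetical local path or Hub repo

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for index, (prompt, seed) in enumerate(jobs):
        # A fixed seed makes each image reproducible for later comparison.
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator, height=512, width=512).images[0]
        image.save(target / f"{index:03d}_seed{seed}.png")
```

For example, build_jobs(["Opera house shaped like a hairdryer, geometric architecture, colorful windows"], range(8)) yields eight jobs that differ only in the seed, which separates the effect of the wording from the effect of random noise.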

For more deliberate work with prompts, I brought in ChatGPT o1 and CLIP Interrogator 2. The latter was particularly helpful at the stylistic analysis stage: I uploaded Victor Siret’s works to CLIP Interrogator and carefully studied which visual associations and descriptions the model offered. This made it possible to see how the AI “reads” the style and which terms actually work. Then I tried to translate these cues into the language I needed, keeping the spirit of the original but steering the generation in my own direction.

Outcome

My main observation from this work is that Victor Siret has a truly remarkable, autonomous visual world that is difficult to formalize into rules or stylistic markers. His compositions are often unpredictable, strange, and unconventional, as if born of intuition rather than logic. That is why neural networks are not able to reproduce this “slightly wrong” feeling. Even with good training, they produce only a stylistic shell: colours, shapes, textile texture. The inner logic of chaos and combinatorial audacity escapes the model. However, with careful prompt work, analysis through CLIP, generation of hundreds of variants, and manual selection, it is possible to achieve a result close to the atmosphere of his work. This is probably the main lesson: neural stylization is not a substitute for an artist, but a tool that can come close to the language of a real artist.

Code + Dataset

Daniel Tulchinskiy