September 16, 2024

Westside People

Complete News World

Hobbyists discover how to insert custom fonts into AI-generated images

An AI-generated example using the Cyberpunk 2077 LoRA, rendered with Flux dev.

In the past week, a hobbyist experimenting with the new Flux AI image synthesis model discovered that it’s surprisingly good at rendering custom-trained reproductions of fonts. While far more efficient ways to render computer fonts have existed for decades, the new technique is useful to AI image enthusiasts because Flux can render accurate depictions of text, and users can now insert words rendered in custom fonts directly into AI image generations.

We’ve had the technology to produce precise, smooth, computer-rendered fonts in custom shapes since the 1980s (and the 1970s in research), so a font reproduced by AI isn’t big news on its own. But the new technique means you can see a specific font appear in AI-generated images of, say, a menu on a chalkboard in a photorealistic restaurant or a printed business card held by a robotic fox.

Shortly after the emergence of mainstream AI image synthesis models like Stable Diffusion in 2022, some people started wondering: How can I include my own product, clothing item, character, or style in an AI-generated image? One answer came in the form of LoRA (Low-Rank Adaptation), a technique introduced in 2021 that allows users to augment the knowledge in a base AI model with custom-trained modular add-ons.

These add-on modules, called LoRAs, allow image synthesis models to create new concepts that were not originally found (or were poorly represented) in the base model’s training data. In practice, image synthesis hobbyists use them to introduce unique styles (everything in chalk art, for example) or subjects (such as detailed images of Spider-Man). Each LoRA must be specially trained using user-provided examples.
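For readers curious about what “low-rank” means here, the following is a minimal sketch of the general idea, not the code used by Flux or by the hobbyists in this story. It assumes the standard formulation, in which a frozen weight matrix is adjusted by the product of two much smaller matrices scaled by alpha / rank. The matrix sizes and the helper names `matmul` and `lora_merge` are illustrative assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(base, lora_b, lora_a, alpha):
    """Return base + (alpha / rank) * (lora_b @ lora_a).

    base:   d_out x d_in frozen weights from the base model
    lora_b: d_out x rank trained add-on matrix
    lora_a: rank x d_in trained add-on matrix
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy sizes chosen for illustration only; real models use far larger layers.
d_out, d_in, rank = 8, 6, 2
rng = random.Random(0)
base = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
lora_a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
# B starts at zero, so an untrained add-on leaves the base model unchanged.
lora_b = [[0.0] * rank for _ in range(d_out)]

merged = lora_merge(base, lora_b, lora_a, alpha=2.0)

full_params = d_out * d_in            # values needed to retrain the whole layer
lora_params = rank * (d_in + d_out)   # values stored in the add-on instead
```

The saving is modest at these toy sizes, but it grows quickly with layer width, since the add-on scales with the sum of the layer dimensions rather than their product. That is why a LoRA can be distributed as a separate file that is much smaller than the base model.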


Before Flux, most AI image generators weren’t very good at rendering accurate text within a scene. If you asked Stable Diffusion 1.5 to render a sign that said “cheese,” it would render gibberish. OpenAI’s DALL-E 3, released last year, was the first major model to render text reasonably well. Flux still makes occasional word and letter errors, but it’s the most capable “in-world text” (as we might call it) AI rendering model we’ve seen yet.

Since Flux is an open model available for download and fine-tuning, this past month has been the first time that training a font LoRA might make sense. That’s exactly what an AI enthusiast named Vadim Fedenko (who did not respond to a request for an interview by press time) recently discovered. “I really like the way it turned out,” Fedenko wrote in a Reddit post. “Flux recognizes the shape of letters in a specific style/font, making it possible to train Loras using specific fonts, styles, etc. I’ll be training more of these soon.”

For his first experiment, Fedenko chose a bubbly “Y2K” style font reminiscent of those that were popular in the late 1990s and early 2000s, and he published the resulting model on the Civitai platform on August 20. Two days later, a Civitai user named “AggravatingScree7189” posted a second font LoRA that reproduces a font similar to the one in the Cyberpunk 2077 video game.

“Text was so bad before that it never occurred to me that you could do this,” wrote Reddit user egg-benedryl in response to Fedenko’s post about the Y2K font. Another Reddit user wrote, “I didn’t know the Y2K magazine was fake until I zoomed in on it.”


Is it overkill?

An example of the Cyberpunk 2077 LoRA, rendered using Flux dev.

It is true that using a deep neural network trained for image synthesis to display a plain font on a plain background may be overkill. You probably do not want to use this method to replace Adobe Illustrator while designing a document.

“This sounds good, but it’s kind of funny that we’re reinventing the idea of fonts as 300MB LoRA files,” wrote one Reddit commenter in a thread about the Cyberpunk 2077 font.

Generative AI is often criticized for its environmental impact, a legitimate concern for large cloud data centers. But we found that Flux can insert these fonts into AI-generated scenes while running locally on an RTX 3060 in a quantized (reduced-size) form (the full dev model can be run on an RTX 3090). That is similar to the power consumption of running a video game on the same PC. The same goes for creating a LoRA: the creator of the Cyberpunk 2077 font trained the LoRA in three hours on a 3090 GPU.

There are also ethical issues surrounding the use of AI image generators, such as how they are trained on data harvested without the consent of the content owners. While the technology is divisive among some artists, there is a large community of people who use it every day and share the results online through social media platforms like Reddit, leading to new applications of the technology like this one.

As of this writing, there are only two custom Flux font LoRAs, but we’ve already heard of plans to create more. While the technique is still in its early stages, it could become essential if AI image synthesis becomes more widely available in the future. Adobe, with its own image synthesis models, will likely be keeping an eye on this.
