
OpenAI’s new image generator aims to be practical enough for designers and advertisers


The new model makes progress on technical issues that have plagued AI image generators for years. While most have been great at creating fantastical images or realistic deepfakes, they’ve been terrible at something called binding, which refers to the ability to identify certain objects correctly and put them in their proper place (like a sign that says “hot dogs” properly placed above a food cart, not somewhere else in the image).

It was only a few years ago that models started to succeed at things like “Put the red cube on top of the blue cube,” a feature that is essential for any creative professional use of AI. Generators also struggle with text generation, typically creating distorted jumbles of letter shapes that look more like captchas than readable text.

Example images from OpenAI show progress here. The model is able to generate 12 discrete graphics within a single image—like a cat emoji or a lightning bolt—and place them in proper order. Another shows four cocktails accompanied by recipe cards with accurate, legible text. More images show comic strips with text bubbles, mock advertisements, and instructional diagrams. The model also allows you to upload images to be modified, and it will be available in the video generator Sora as well as in GPT-4o.

It’s “a new tool for communication,” says Gabe Goh, the lead designer on the generator at OpenAI. Kenji Hata, a researcher at OpenAI who also worked on the tool, puts it a different way: “I think the whole idea is that we’re going away from, like, beautiful art.” It can still do that, he clarifies, but it will do more useful things too. “You can actually make images work for you,” he says, “and not just look at them.”

It’s a clear sign that OpenAI is positioning the tool to be used more by creative professionals: think graphic designers, ad agencies, social media managers, or illustrators. But in entering this domain, OpenAI has two paths, both difficult.

One, it can target the skilled professionals who have long used programs like Adobe Photoshop, which is also investing heavily in AI tools that can fill images with generative AI.


