OpenAI has launched a major update to ChatGPT's image-generation capabilities, integrating advanced tools for creating and editing visuals directly through its GPT-4o model. The rollout, announced Tuesday during a livestream by CEO Sam Altman, represents the first significant leap in ChatGPT's visual functions in more than a year.
The updated model allows users to generate images, modify existing visuals, and design complex materials such as charts, menus, and maps, all through conversational prompts. The features are available immediately for subscribers to OpenAI's $200-a-month Pro plan, and will expand to free and Plus-tier users, along with developers accessing OpenAI's API, in the coming weeks.
OpenAI demonstrated how users can iterate on image requests in real time, such as asking for "a snail in a city," then refining that scene by changing the backdrop or adding accessories. The company said the system can also handle more complicated instructions regarding image composition.
The model's ability to render legible and structured text within images, long a challenge for generative AI, has been notably improved. OpenAI said this enhancement makes ChatGPT better equipped to produce infographics, diagrams, logos, and other professional visuals. The company added that ChatGPT can now generate "a photorealistic image of a custom menu," or even a map, in response to user prompts.
According to OpenAI, the upgraded system takes longer to process image requests than its predecessor, DALL·E 3, but this time investment results in more accurate and detailed outputs. It can also perform "inpainting" on existing images, editing foreground and background elements even when people are present in the photo.
To train GPT-4o's image capabilities, OpenAI told The Wall Street Journal it used "publicly available data," in addition to proprietary content from partners such as Shutterstock.
"We're respecting of the artists' rights in terms of how we do the output, and we have policies in place that prevent us from generating images that directly mimic any living artists' work," said Brad Lightcap, OpenAI's chief operating officer, in a statement to the Journal.
OpenAI also said it offers an opt-out form for artists to request removal of their work from the model's training datasets. The company noted that it honors requests to block its web-scraping bots from collecting training data from websites, including image files.
The feature rollout arrives amid increased scrutiny of generative AI models. Google's recent image-generation update to its Gemini 2.0 Flash model went viral but faced backlash for its lack of safeguards, which allowed users to produce copyrighted or manipulated content such as watermark-free images and fictional characters.
OpenAI acknowledged limitations in its own system. In a blog post, the company said ChatGPT may "make things up when generating images, such as including text with fake country names on a picture." It also noted that smaller text and non-Latin alphabets can be difficult for the AI to render correctly.
According to the blog post, image generation can take up to a minute with the new features. "It takes longer because the images are more detailed," Altman said during the livestream.