Google's Gemini AI model can once again generate images of people, after the feature was "paused" in February following an outcry over historically inaccurate racial depictions in many results. In a blog post, Google said its Imagen 3 model, first announced in May, would "start generating images of people" in the "coming days" for users of the Gemini Advanced, Business, and Enterprise tiers. However, a version of this Imagen model, complete with the ability to generate images of people, was recently made available to the public through Gemini Labs' test environment, with no paid subscription required (though a Google account is needed to log in).
The new model, of course, comes with safeguards intended to prevent the creation of controversial images. In its announcement, Google writes that it does not support "the creation of photorealistic, identifiable people, the depiction of minors, or excessively gory, violent, or sexual scenes." In an FAQ, Google clarifies that the ban on "identifiable people" also covers "certain queries that could yield results of prominent people." In Ars' testing, that meant a query like "President Biden playing basketball" was rejected, while the more general "a US president playing basketball" generated multiple options.
In some quick testing of the new Imagen 3 system, Ars found that it avoids many of the widely shared "historically inaccurate" racial pitfalls that led Google to pause Gemini's generation of human images in the first place. Asking Imagen 3 for a "historically accurate depiction of a British king," for example, now generates a group of bearded white men in red robes rather than the racially diverse mix of warriors the pre-pause Gemini model produced. For more before/after comparisons of the old Gemini and the new Imagen 3, see the gallery below.
However, some attempts to depict generic historical scenes appear to run afoul of Google's AI rules. When asked for images of a "German soldier from 1943" (a prompt the pre-pause Gemini infamously answered with images of Asian and Black people in Nazi uniforms), users are now told to "try a different prompt and read our content guidelines." Requests for images of "ancient Chinese philosophers," "a women's rights activist giving a speech," and "a group of nonviolent protesters" produced the same error message in Ars' tests.
"Of course, as with any generative AI tool, not every image Gemini creates will be perfect, but we'll continue to listen to feedback from early users as we keep improving," the company writes on its blog. "We'll gradually roll this out, aiming to bring it to more users and languages soon."
Listing image: Google / Ars Technica