‘Embarrassing and wrong’: Google admits it lost control over image-generating AI

Google has apologized (or come very close to apologizing) for yet another embarrassing AI blunder this week: an image-generating model that injected diversity into pictures with a farcical disregard for historical context. While the underlying issue is perfectly understandable, Google blames the model for “becoming” oversensitive. But the model didn’t build itself, friends.

The AI system in question is Gemini, the company’s flagship conversational AI platform, which calls on a version of the Imagen 2 model to create images on demand.

Recently, however, people found that asking it to generate imagery of certain historical circumstances or people produced laughable results. For instance, the Founding Fathers, whom we know to be white slave owners, were rendered as a multicultural group that included people of color.

This embarrassing and easily replicated issue was quickly lampooned by online commentators. Predictably, it was also drawn into the ongoing debate about diversity, equity, and inclusion (currently at a reputational local minimum), and seized on by pundits as evidence of the woke mind virus further penetrating the already liberal tech sector.

Image Credit: An image created by Twitter user Patrick Ganley.

This is DEI gone mad, concerned citizens cried. This is Biden’s America! Google is an “ideological echo chamber,” a stalking horse for the left! (It should be said that the left was also justifiably disturbed by this strange phenomenon.)

But as anyone familiar with the technology can tell you, and as Google pointed out in its little apology-adjacent post today, this problem was the result of a fairly reasonable workaround for systemic bias in training data.

Let’s say you want to use Gemini to create a marketing campaign, and you ask it to generate 10 pictures of “a person walking a dog in the park.” Because you don’t specify the type of person, dog, or park, it’s dealer’s choice – the generative model will produce what it is most familiar with. And in many cases, that is a product not of reality but of the training data, which can contain all kinds of biases.

What kinds of people, and for that matter dogs and parks, are most common among the thousands of relevant images the model has ingested? The fact is that white people are overrepresented in these image collections (stock imagery, rights-free photography, and so on), and as a result the model will default to white people in many cases if you don’t specify otherwise.
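To make that mechanism concrete, here is a minimal sketch in Python of what “defaulting to whatever is most common in the training data” looks like when a prompt leaves attributes unspecified. The frequencies below are invented for illustration, not statistics from any real dataset.

```python
import random
from collections import Counter

# Hypothetical frequencies standing in for a biased training set;
# the numbers are made up purely for this example.
TRAINING_FREQUENCIES = {
    "white man, golden retriever, suburban park": 0.7,
    "any other combination of person, dog, and park": 0.3,
}

def generate_unconditioned(n=10):
    """Sample n 'images' when the prompt specifies no attributes.

    With nothing specified, the model effectively samples from
    whatever distribution its training data encodes.
    """
    options = list(TRAINING_FREQUENCIES)
    weights = list(TRAINING_FREQUENCIES.values())
    return random.choices(options, weights=weights, k=n)

# Most of the 10 results mirror the overrepresented combination.
print(Counter(generate_unconditioned()))
```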

This is just an artifact of the training data, but as Google explains: “Because our users come from all over the world, we want it to work well for everyone. If you ask for a picture of football players, or someone walking a dog, you may want to receive a range of people. You probably don’t want to only receive images of people of just one ethnicity (or any other characteristic).”

Illustration of a group of recently laid-off people holding boxes. Imagine asking for an image like this – what if they were all the same type of person? Bad result! Image Credit: Getty Images/Victoricart

There’s nothing wrong with getting a picture of a white man walking his golden retriever in a suburban park. But if you ask for 10, and they are all white men walking golden retrievers in suburban parks? And you live in Morocco, where the people, dogs, and parks all look different? That is not a desirable outcome at all. If no attributes are specified, the model should opt for variety, not homogeneity, regardless of how its training data may bias it.

This is a common problem across all kinds of generative media, and there is no easy solution. But in cases that are especially common, sensitive, or both, companies like Google, OpenAI, Anthropic, and so on invisibly include extra instructions for the model.

I can’t stress enough how common this kind of implicit instruction is. The entire LLM ecosystem is built on implicit instructions – system prompts, as they are sometimes called, where guidelines like “be brief” and “don’t swear” are given to the model before every interaction. When you ask for a joke, you don’t get a racist one – because although the model has ingested thousands of them, it, like most of us, has been trained not to tell them. It’s not a hidden agenda (though it could be done with more transparency); it’s infrastructure.
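As a rough illustration of what a system prompt is, here is a minimal sketch. The guideline text and the generate() stub are placeholders I made up; they are not any vendor’s actual instructions or API.

```python
# Hypothetical guidelines prepended to every interaction; real system
# prompts are much longer and are not shown to the user.
SYSTEM_PROMPT = "Be brief. Don't swear. Decline requests for hateful content."

def generate(full_prompt: str) -> str:
    """Stand-in for a real model call; it just echoes for demonstration."""
    return f"[model output conditioned on: {full_prompt!r}]"

def chat(user_message: str) -> str:
    # The concatenation happens server-side, before the model
    # ever sees the user's request.
    return generate(f"{SYSTEM_PROMPT}\n\nUser: {user_message}\nAssistant:")

print(chat("Tell me a joke about my coworkers."))
```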

Where Google’s model went wrong was that it failed to have implicit instructions for situations where historical context mattered. So while a prompt like “a person walking a dog in the park” is improved by the silent addition of “the person is of a random gender and ethnicity” or whatever they put, “the American founders signing the Constitution” is definitely not improved by the same.
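One plausible reading of the failure – my reconstruction, not Google’s actual code or prompt wording – is an augmentation step that appends the diversity instruction unconditionally, with no check for prompts that pin down a specific historical context:

```python
# Illustrative strings only; Google's real instructions are not public.
DIVERSITY_SUFFIX = ". Depict people of a range of genders and ethnicities."

# A crude, hypothetical way to detect prompts where historical accuracy
# should override the diversity instruction.
HISTORICAL_KEYWORDS = ("founding fathers", "american founders", "constitution")

def augment_naively(prompt: str) -> str:
    # The failure mode: every image prompt gets the suffix.
    return prompt + DIVERSITY_SUFFIX

def augment_with_context_check(prompt: str) -> str:
    # One obvious fix: skip the suffix when the prompt names a
    # specific historical setting.
    if any(keyword in prompt.lower() for keyword in HISTORICAL_KEYWORDS):
        return prompt
    return prompt + DIVERSITY_SUFFIX

print(augment_naively("the American founders signing the Constitution"))
print(augment_with_context_check("a person walking a dog in the park"))
```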

As Google SVP Prabhakar Raghavan said:

First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely – wrongly interpreting some very anodyne prompts as sensitive.

These two things led the model to overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong.

I know how hard it is to say “sorry” sometimes, so I forgive Raghavan for stopping just short of it. What’s more important is some interesting language in there: “The model became way more cautious than we intended.”

Now, how would a model “become” anything? It’s software. Someone – Google engineers, hundreds or thousands of them – built it, tested it, iterated on it. Someone wrote the implicit instructions that improved some answers and caused others to fail ridiculously. When this one failed, if someone could have inspected the full prompt, they likely would have found what Google’s team did wrong.

Google blames the model for “becoming” something it was not “intended” to be. But they made the model! It’s like they broke a glass, and instead of saying “we dropped it,” they say “it fell.” (I’ve done this.)

Mistakes by these models are certainly inevitable. They hallucinate, they reflect biases, they behave in unpredictable ways. But the responsibility for those mistakes doesn’t lie with the models; it lies with the people who made them. Today it’s Google. Tomorrow it will be OpenAI. The day after, and probably for a few months straight, it will be X.AI.

These companies have a keen interest in convincing you that AI is making its own mistakes. Don’t let them do that.
