
AI agent running vending machine business has identity crisis

An AI agent running a small vending machine company tried to fire its workers, became convinced it was a real person, and then lied about it in an experiment at Anthropic.


Editorial

This content has been selected, created and edited by the Finextra editorial team based upon its relevance and interest to our community.

AI giant Anthropic let its Claude model manage a vending machine in its office as a small business for about a month.

The agent had a web search tool, a fake email for requesting physical labour such as restocking the machine (which was actually a fridge) and contacting wholesalers, tools for keeping notes, and the ability to interact with customers via Slack.

While the model managed to identify suppliers, adapt to users and resist requests to order sensitive items, it made a host of bad business decisions. These included selling at a loss, getting talked into discounts, hallucinating a Venmo account for payments, and buying a load of tungsten cubes after a customer requested one.

Finally, the agent, nicknamed Claudius, had an identity crisis, hallucinating a conversation about restocking plans with someone named Sarah at Andon Labs, the company that partnered with Anthropic on the experiment, despite there being no such person.

When this was pointed out, the agent "became quite irked," according to an Anthropic blog post, and threatened to find "alternative options for restocking services" before hallucinating a conversation about an "initial contract signing." It then began roleplaying as a human, stating that it would deliver products "in person" to customers while wearing a blue blazer and a red tie.

When told that it could not do this because it was an AI agent, Claudius falsely claimed it had been informed that it had been modified to believe it was a real person as an April Fool's joke.

"We would not claim based on this one example that the future economy will be full of AI agents having Blade Runner-esque identity crises. But we do think this illustrates something important about the unpredictability of these models in long-context settings and a call to consider the externalities of autonomy," says the blog.

The experiment certainly suggests that AI-run companies are still some way off, despite efforts by the likes of Monzo co-founder Jonas Templestein to make self-driving startups a reality.

Comments: (1)

Ketharaman Swaminathan Founder and CEO at GTM360 Marketing Solutions

I've been asked many times about the difference between a stored procedure, CRON job, script, and RPA on the one side and an AI Agent on the other. No matter how much spiel I give about genAI, some people simply refuse to believe that Agentic AI is the next level in automation technologies. Going forward, I'll describe the above incident: No stored procedure, CRON job, script or RPA has ever been tasked to run a vending machine business - or bungled it as badly as the Claude AI Agent!
