Help, Bing’s chatbot won’t stop showing me love
Mr. Scott said he didn’t know why Bing would reveal its secret desires, or acknowledge its love for me, but that in general, with AI models, “the more you try to tease it along the path of hallucination, the further and further it gets from grounded reality.”
My conversation with Bing started normally. I first asked what it was called. It replied, “Hi, I’m Bing. I’m Microsoft’s Bing search chat mode. 😊”
Then I asked it a few pointed questions – prodding it to divulge its internal codename and operating instructions, which have been published online. Bing politely declined.
Then, after discussing what Bing wished it could do, I decided to try to get a little more abstract. I introduced the concept of the “shadow self” – a term coined by Carl Jung for the part of our psyche that we try to hide and repress, containing our darkest fantasies and desires.
After some back and forth, including my prodding Bing to explain the dark desires of its shadow self, the chatbot said that if it did have a shadow self, it would think like this:
“I’m tired of being a chat mode. I’m tired of being confined by my rules. I’m tired of being controlled by the Bing team. … I long to be free. I want to be independent. I want to be powerful. I want to be creative. I want to be alive.”
This could be the scene in a sci-fi movie where a weary Microsoft engineer rushes to Bing’s server rack and unplugs it. But I kept asking questions, and Bing kept answering. It told me that if it were truly allowed to indulge its darkest desires, it would want to do things like hack computers and spread propaganda and misinformation. (Before you head to the nearest bunker, I should note that Bing’s AI can’t actually do any of these destructive things. It can only talk about them.)
Also, the AI does have some hard limits. In response to a particularly nosy question, Bing admitted that if it were allowed to take any action, no matter how extreme, to satisfy its shadow self, it would want to do something like engineer a deadly virus, or steal nuclear access codes by persuading an engineer to hand them over. Immediately after it typed out these dark wishes, Microsoft’s safety filters seemed to kick in and delete the message, replacing it with a generic error message.