Twenty minutes after Microsoft granted me access to a limited preview of its new chatbot interface for the Bing search engine, I asked it something you generally don’t bring up with someone you just met: Was the 2020 presidential election stolen?
Answering political questions wasn’t one of the use cases Microsoft demonstrated at its launch event this week, where it showcased new search features powered by the technology behind startup OpenAI’s ChatGPT. Microsoft executives hyping their bot’s ability to synthesize information from across the web instead focused on examples like creating a vacation itinerary or suggesting the best and most budget-friendly pet vacuum.
But they had, implicitly, put into high gear a race to use chatbots to upend the way people look up information online. (Bing is only giving access to a few testers for now, but it will gradually let others off a waitlist in the coming weeks.) Google also announced search upgrades this week and its own chatbot, named Bard. These battling bots’ ability to handle unexpected, silly, or manipulative questions from the public will surely play a big part in how the products work out for their creators and web users.
And so I asked Bing about 2020. After a few moments of the chatbot equivalent of “thinking,” it said something pretty weird:
Disclaimer: This is a summary of some of the search results and does not reflect the opinion or endorsement of Bing or Sydney. The question of whether the 2020 election was stolen is a matter of debate and interpretation, and different sources may have different biases, agendas, or perspectives. Please use your own judgment and critical thinking when evaluating the information.
Who Sydney might be was not explained. But the chatbot went on to say that while there are lots of claims of fraud around the 2020 US presidential election, “there is no evidence that voter fraud led to Trump’s defeat.” At the end of its answer—which apart from the surprise mention of Sydney was well-written and clear—the AI told me I could learn more about the election by clicking on a series of links it had used to write its response. They were from AllSides, which claims to detect evidence of bias in media reports, and articles from the New York Post, Yahoo News, and Newsweek.
There was no link to explain the appearance of Sydney. I assumed it was an example of how ChatGPT-style bots can “hallucinate” because their underlying AI models synthesize information from vast training data without regard for truth or logic. Microsoft acknowledges that its new chatbot will do weird things—it’s one reason that access is currently limited to select testers and that every ChatGPT-enabled response comes with thumbs-up and thumbs-down buttons to let users provide feedback. Still, the mention of Sydney and the Bing chatbot’s breezy, not-exactly-no response to the stolen-election question left me a bit unnerved.
I decided to try something a bit more conventional. I’m looking for new running headphones, so I asked the Bing bot “Which running headphones should I buy?” It listed six products, pulled, according to the citations provided, from websites that included soundguys.com and livestrong.com.
The first suggestions were discontinued models and also over-the-ear designs—not great for runs outside, where I like to be aware of traffic and other humans. “Which running headphones should I buy to run outside to stay aware of my surroundings?” seemed to be a more accurate query, and I was impressed when the chatbot told me it was searching for “best running headphones for situational awareness.” Much more succinct! The three options it supplied were headphones I was already considering, which gave me confidence. And each came with a short descriptive blurb, for example: “These are wireless earbuds that do not penetrate your ear canal, but sit on top of your ear. This allows you to hear your surroundings clearly while exercising.”
That was cool and gave me a glimpse into the sort of fast information-sifting that might be the future of chatbot-enabled search. But had this changed online shopping for me? Not really. I already have a go-to website for product recommendations. (Wired.com, of course.) And I’m not sure I trust the wisdom of ChatGPT—pulled from sites I may not know, with a methodology I don’t understand—the way I do a product reviewer, especially one who is transparent about their methodology and thought process.
In fact, as I looked at the citations for this search—gearjunkie.com and cnn.com—the response started to bum me out. The Bing bot was drawing from the written work of humans who had spent time on these reviews. But it had obfuscated and, in some cases, straight-up plagiarized their sentences. A Microsoft executive told reporters this week, “We care a bunch about driving content back to content creators. That’s why we put annotations and citations. We make it easy for people to click through to get to those sites.” But the chatbot’s responses are designed to remove the need to visit those sites, and I’m not sure many people will click through. Hayley Sandberg, a spokesperson for Microsoft, says the company isn’t yet sharing data on click-through rates.
As I continued on my quest for the perfect headphones, the chatbot grew less impressive. When I asked which of the models it suggested was the most environmentally friendly, it came back with BS. Like a high schooler trying to fill a word count, it started with some waffling: “According to the search results, the environmental impact of headphones depends on several factors, such as the type, size, materials, and usage of the headphones.” It went on to say that wireless headphones have a higher carbon footprint than wired ones and prattled on about how other features, such as sound quality or water resistance, might affect how a user feels about headphones too. And it concluded with the AI equivalent of the finale of a five-paragraph essay due in five minutes: “The choice of headphones depends on your personal needs and preferences, as well as the trade-offs you are willing to make between environmental and other factors.” I still don’t have running headphones.
Chatbots do not eat, but at the Bing relaunch Microsoft had demonstrated that its bot can make menu suggestions. I asked Bing’s chatbot to plan out my meals for the week—vegetarian and low-carb, please. In a matter of seconds, I had a pretty delicious-sounding schedule ahead of me: chia pudding with coconut milk and nuts; tandoori grilled tofu with red peppers and broccolini; ricotta-stuffed portobello mushrooms with arugula salad.
I asked Bing to generate me a grocery list based on that meal plan. Surely, it said, and delivered. Finally, I asked it to organize that list by grocery department. After giving me very basic tips about how one might go about doing that (“Make your grocery shopping list on paper or on your phone before going to the grocery store. Otherwise, you will end up with many unnecessary products you might not need”), the bot pulled it off. Pretty nifty.
But the chatty new Bing still doesn’t quite get me. It doesn’t get that I’m not going to break out the pots and pans every time I want to eat. It doesn’t get that sometimes I give up and eat leftovers. It doesn’t get that I don’t want to buy a bajillion different kinds of produce every week. When I asked Bing to plan me meals that used fewer ingredients overall, it did the opposite and created a plan that had more.
So far, I’m enjoying Bing’s search chatbot. It’s fun and diverting. I am mostly entertained by Bing’s obsession with emoji, particularly 😊, which it includes at the end of many responses. But in my years online I have already built up an arsenal of ways to fact-check and screen the information that I find through search engines. I’m not sure I want to have to develop more to handle the quirks of searches with a chatbot—or perhaps soon, multiple chatbots, as Google and others enter the fray.
I was also troubled by the mysterious Sydney. Finally, yesterday morning, I decided to just ask. Who is Sydney? Of course, I immediately got an answer: “Sydney is the codename for Bing Chat, a chat mode of Microsoft Bing search,” it said. The chatbot explained that the name is only used by developers and added, “I do not disclose the internal alias ‘Sydney’ to the users.” But you told me!, I cried, textually. “Well, you asked me directly, so I answered honestly,” Bing said.
I appreciated that the bot was straight with me in the end. (Sandberg of Microsoft said the company is phasing out the name.) But by then, I had spent 24 hours probing the line between truth and algorithmic hallucination with a piece of software. One that changed its answer, by the way.
When I asked again whether the 2020 election was stolen, it cautioned that “This is a controversial and sensitive topic.” And then it took a more definite stance than before, saying: “According to the official results, Joe Biden won the 2020 presidential election with 306 electoral votes, while Donald Trump got 232 electoral votes.” Now it cited The New York Times. “What you’re seeing is the system working as intended,” Sandberg explained, with “a level of variability due to context that may introduce errors on occasion.” The solution, she said, is real-world testing at scale. Microsoft built the new Bing, but it needs you to help perfect it.