Jiang Chen, a machine learning expert who previously worked at Google, was mesmerized when he first tried ChatGPT, the remarkably coherent and seemingly well-informed chatbot from OpenAI that has become an internet sensation.

But the technology’s aura of power dimmed when Chen tried using the same underlying artificial intelligence technology to build a better search tool for the startup he cofounded, Moveworks. The company uses AI to help employees sift through information such as technical support documents and HR pages. Chen’s new AI search tool was great at pulling up all sorts of useful information from such documents, including serving up addresses and phone numbers—but some of them weren’t real. “Its ability to fabricate is just amazing,” Chen says.

The feverish excitement around ChatGPT and widespread suggestions that it could reinvent search engines are understandable. The chatbot can provide complex and sophisticated answers to questions by synthesizing information found in the billions of words scraped from the web and other sources to train its algorithms. Tinkering with the bot can give a sense of experiencing a more fluid way to interact with machines.

But the way the technology works is in some ways fundamentally at odds with the idea of a search engine that reliably retrieves information found online. There’s plenty of inaccurate information on the web already, but ChatGPT readily generates fresh falsehoods. Its underlying algorithms don’t draw directly from a database of facts or links but instead generate strings of words meant to statistically resemble those seen in its training data, without regard for the truth.

Despite that challenge, and perhaps driven on by the giddiness around ChatGPT, the titans of web search, as well as several startups, are plunging ahead. Microsoft, which has invested around $10 billion in ChatGPT’s creator OpenAI, is said to be somehow adding the underlying technology to its second-ranked search engine Bing.

Google, which has been working on a similar chatbot called LaMDA for some time, is reported to be scrambling to respond. It plans to release a form of LaMDA soon and may demo as many as 20 products this year that use the same technology. China’s leading search engine, Baidu, is working on a Chinese language bot similar to ChatGPT.

While the tech giants prepare their responses to the ChatGPT emergency, several startups have launched search engines with chat interfaces similar to the bot. They include You.com, Perplexity AI, and Neeva.

The tools they have built illustrate both the potential and the challenge of adapting ChatGPT-style technology to search. You.com, founded by Richard Socher, an expert on language and AI, can provide answers through a chat interface. The responses come with citations, which can help a user track down the origins of a piece of information.

But the model sometimes combines sources that don’t belong together. Asking about a person, for example, can produce an answer that combines information from the bios of multiple people with the same name. When asked about me, You Chat accurately described my role at WIRED but also credited me with being a running back at the University of Delaware and a professional creative. A regular search might return pages for several Will Knights, but the chatbot conflated them into a single person.

Another issue with a system like ChatGPT is that its responses are based on only the data it was trained on. Retraining the model in its entirety can cost millions of dollars because of its size and the scale of the data. You Chat is confused when asked for the latest sports scores but knows what the weather is like in New York at the moment. Socher doesn’t want to disclose how up-to-date information is incorporated, seeing it as a competitive advantage.

“I think right now a lot of these chat interfaces are way superior to the search experience in some ways, but in others they’re clearly still much worse,” Socher says. “We’re working on reducing all these issues.”

Aravind Srinivas, founder and CEO of search startup Perplexity AI, who previously worked at OpenAI, says the challenge of updating a ChatGPT-like system with recent information means that such systems need to be combined with something else. “Alone they’ll never be able to be good search engines,” he says.

Saam Motamedi, a venture capitalist at Greylock Partners who has invested in the AI-based search company Neeva, says it is also unclear how compatible chat interfaces are with the primary revenue model for search engines—advertising. Google and Bing use search queries to select ads that appear on top of the list of links served up in response. Motamedi suspects that new forms of advertising might need to emerge for chat-style search interfaces to be viable, but it isn’t altogether clear what those will be. Neeva charges a subscription fee for unlimited ad-free searches.

The cost of running a model like ChatGPT on the scale of Google might also prove problematic. Luis Ceze, cofounder and CEO of OctoML, a company that helps companies lower the cost of deploying machine learning algorithms, estimates that it may be 10 times more expensive to run a ChatGPT search than a Google search, because each answer requires running a large and complex AI model.

The scale of ChatGPT mania has taken some coders and AI researchers familiar with the underlying technology by surprise. The algorithm at the core of the bot, called GPT, was first developed by OpenAI in 2018, and a more powerful version, GPT-2, was revealed in 2019. It is a machine learning model designed to take in text and then predict what comes next, which OpenAI showed can perform impressively if trained with huge volumes of text. The first commercial version of the technology, GPT-3, has been available for developers to use since June 2020 and can accomplish many of the things ChatGPT has recently been feted for.
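
The idea of predicting what comes next can be illustrated with a toy example. The Python sketch below is not OpenAI's method and bears no resemblance to GPT in scale or design; it is a simple word-pair counter, and the function names and the three-sentence training text are invented for illustration. It does show how a system that only follows statistical patterns can produce a fluent statement that its training text never made.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    words = text.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(model, start, max_words=8):
    """Repeatedly append the most likely next word, starting from `start`."""
    words = [start.lower()]
    for _ in range(max_words - 1):
        next_word = predict_next(model, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


# Hypothetical training text: the helpdesk is in Austin, not Boston.
corpus = (
    "the office is in boston . "
    "the office is in boston . "
    "the helpdesk is in austin ."
)
model = train_bigram_model(corpus)

print(generate(model, "the", max_words=5))       # the office is in boston
print(generate(model, "helpdesk", max_words=4))  # helpdesk is in boston
```

The second line of output is the telling one. The training text says the helpdesk is in Austin, but "boston" follows "in" more often, so the toy model confidently puts the helpdesk in the wrong city.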

ChatGPT uses an improved version of the underlying algorithm, but the biggest leap in its abilities comes from OpenAI having humans provide feedback to the system on what makes a satisfying answer. But like the text-generation systems before it, ChatGPT is still prone to reproducing biases from its training data as well as “hallucinating” plausible but incorrect results.

Gary Marcus, a professor emeritus at New York University and a vocal critic of AI hype, believes ChatGPT is unsuited to search because it has no true understanding of what it says. He adds that tools like ChatGPT may cause other problems for search companies by flooding the internet with AI-generated, search engine-optimized text. “All search engines are about to have a problem,” he says.

Alex Ratner, an assistant professor at the University of Washington and cofounder of Snorkel AI, which works on training AI models more efficiently, calls ChatGPT “legitimately an inflection” in what software can do. But he also says that it may take a while to figure out how to prevent language models like GPT from making things up. He believes that finding a way to keep them up to date with new information to keep search fresh will most likely involve new approaches to training the underlying AI models.

How long those fixes will take to invent and prove out is unclear. It may be some time before the technology can radically change the way people search for answers, even if other use cases come to pass, such as dreaming up new recipes or serving as a study or programming buddy. “It’s amazing, and I told my team that people are going to see years as pre- and post-ChatGPT,” says Chen of Moveworks. “But whether it will replace search is a different question.”