

Imagine a collection of books, maybe millions or even billions of them, haphazardly tossed by publishers into a heaping pile in a field. Those books are brimming with knowledge and answers. But how would a seeker find them? Lacking organization, the books are useless. This is the raw internet in all its unfiltered glory. Which is why most of our quests for “enlightenment” online begin with Google (and yes, there are still other search engines). Google’s algorithmic tentacles scan and index every book in that ungodly pile. When someone enters a query in the search bar, the search algorithm thumbs through its indexed version of the internet, surfaces pages, and presents them in a ranked list of the top hits. This approach has proven so useful, in fact, that it hasn’t fundamentally changed in over two decades.

But now, AI researchers at Google, the very company that set the bar for search engines in the first place, are sketching out a blueprint for what might be coming up next. In a paper on the arXiv preprint server, the team suggests the technology to make the internet even more searchable is at our fingertips. They say large language models (machine learning algorithms like OpenAI’s GPT-3) could wholly replace today’s system of index, retrieve, then rank. When seeking information, most people would love to ask an expert and get a nuanced and trustworthy response, the authors write. Though search engines surface (hopefully quality) sources that contain at least pieces of an answer, the burden is on the searcher to scan, filter, and read through the results to piece together that answer as best they can. Like when you get sucked down a panicky, health-related rabbit hole at two in the morning.

Search results have improved by leaps and bounds over the years. There are question-and-answer tools, like Alexa, Siri, and Google Assistant. But these tools are brittle, with a limited (though growing) repertoire of questions they can field. Though they have their own shortcomings (more on those below), large language models like GPT-3 are much more flexible and can construct novel replies in natural language to any query or prompt. The Google team suggests the next generation of search engines might synthesize the best of all worlds, folding today’s top information retrieval systems into large-scale AI. It’s worth noting machine learning is already at work in classical index-retrieve-then-rank search engines. But instead of merely augmenting the system, the authors propose machine learning could wholly replace it. “What would happen if we got rid of the notion of the index altogether and replaced it with a large pre-trained model that efficiently and effectively encodes all of the information contained in the corpus?” Donald Metzler and coauthors write in the paper. “What if the distinction between retrieval and ranking went away and instead there was a single response generation phase?” One ideal result they envision is a bit like the starship Enterprise’s computer in Star Trek. Seekers of information pose questions, and the system answers conversationally (that is, with a natural language reply as you’d expect from an expert) and includes authoritative citations in its answer. In the paper, the authors sketch out what they call an aspirational example of what this approach might look like in practice.
