
Looking for Search

Command Line Heroes Team | Tech history


About the episode

The web was growing quickly in the ‘90s. But all that growth wasn’t going to lead to much if people couldn’t actually find any web sites. In 1995, an innovative new tool started crawling the web. And the search engine it fed opened the doors to the World Wide Web.

Elizabeth Van Couvering describes trying to find websites before search engines, and how difficult it was becoming in the early ’90s to keep track of them all. Louis Monier talks about having to convince others how important search engines would become—and he showed them what a web crawler could do. Paul Cormier recounts taking the search engine from a research project to a commercial one. And Richard Seltzer wrote the book on search engines, helping the rest of the world see what a profoundly vital tool they would become.

Command Line Heroes Team Red Hat original show


Transcript

It's the winter of 1995. You're an early adopter cruising the internet for all the latest details on, I don't know, Pixar's new movie Toy Story, or maybe NASA's Galileo Space Probe. The world wide web is your oyster because a brand new tech has arrived that lets you search for anything online. It's unlike anything that came before. There's a flood of information at your fingertips, and now you can navigate it with total ease. This website for searching other websites is going to change the world. It's got a weird name, but who cares? You type it in and smile: altavista.com. All season we've been looking into the tech breakthroughs that made 1995 one of the most extraordinary years on record. It was the year the dot-com bubble began and 16 million internet users suddenly showed up. With their arrival came a flood of new content, and with that flood came a pressing new need, the need for navigation. How do you follow every thread in a world wide web? How do you find that crucial piece of information you've been hunting for? At each point in history, we've had to invent new ways to organize our data. The more data we got, the more creative we had to get. In ancient times, we invented alphabetical order, indexes, and tables of contents so we could find what we needed in books. More modern inventions, like the Dewey Decimal System organized huge libraries of knowledge. But digging through something as giant as the world wide web required an invention more powerful than anything that had come before. And 1995 was the year that invention arrived. I'm Saron Yitbarek and this is Command Line Heroes, an original podcast from Red Hat. As of the year 2020, there were close to 5 billion internet users on the planet. 5 billion people, posting content, tagging themselves, streaming videos, writing posts, and researching term papers. To get it all done they searched through more than 4 billion web pages. That's a lot of content. 
For perspective, the New York Public Library system has just 55 million items. Our ability to sort through all that content isn't just convenient, it's fundamental. Without a way to search the web, the web as we know it would not work. In fact, before 1995, there really wasn't a good way to search. And I actually even have a book that I used to write down websites in so that I could help people at the cyber cafe find things that they wanted. Elizabeth Van Couvering is an Assistant Professor in Media and Communications at the University of Karlstad. She's been studying the history of search engines for about as long as they've been around. And she describes the early web as a flat and static place with a bunch of directories to help you find your way. The first directory was made in 1991 by Tim Berners-Lee himself. He called it the Virtual Library. Sounds impressive, but it was honestly just a list of sites. New websites were sent to Berners-Lee and he posted links on his library page, organizing them into categories like Anthropology and Bio Sciences. He was getting, at most, a hundred visits a day. But early web directories like that? They were updated manually. In Yahoo's early state, there were editors who kept a list. There was also InfoSeek, where webmasters manually submitted their pages for inclusion. But web search, as we know it, hadn't yet evolved. It soon became apparent that the work was not very possible actually. It simply wasn't possible to do it on a human scale because you just needed too many people and there still was no funding. Van Couvering's point about funding is an important one. In the early days, most people didn't understand what search was going to become, so there wasn't much money behind the idea. The internet was a niche, not quite a global phenomenon. And the idea you could get enough traffic at a search site to make money through ads, which maybe seems obvious today, was not so obvious back then. 
That early site InfoSeek, for example, tried a subscription model, selling access to their list of websites for 10 bucks a month, and you were limited to 100 searches each month. These companies were searching all the time for what was going to be the business model. That problem, finding the right business model for search, was going to take a long time to solve. We'll get back to it later. But in the meantime, there was another problem. How do you get a technology that's useful enough? That's good enough to actually sell? Too much demand and not enough supply. That's Louis Monier. He's a computer scientist, and back in 1995, he was working at a computer company called Digital Equipment Corporation, DEC for short. Monier saw the problem that was emerging on the web. There were too many web pages and people wanted to find information and they could not. So clearly we needed a search engine. Monier wasn't the only person thinking about search. Teams at Stanford and Carnegie Mellon were also working on search engines. But the approach that Monier took was different. He wasn't trying to create a better directory or a more organized list. He was trying to create something that could automatically crawl the web, indexing everything it found. I decided to build a crawler, which is a program that automatically goes from web page to web page and downloads them and then indexes them so that you can search them later. The crawler was called Scooter, and it was remarkably ambitious for its time. Instead of relying on humans to submit and categorize websites, Scooter would automatically discover and index web pages. The web crawler was maybe a couple hundred lines of code, very simple. But it was going out to all the websites that existed at the time, which were about a million websites, and downloading all the pages and making them searchable. A million websites might not sound like much today, but in 1995, it was everything that existed on the public web. 
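For readers who want to see the crawl-then-index idea in concrete form, here is a minimal Python sketch. It is not Scooter's code, which the episode does not show. Everything in it is an assumption made for illustration: the toy in-memory "web" (`TOY_WEB`), the `.example` page names, and the helper names `crawl`, `build_index`, and `search`. A real crawler would fetch pages over the network and parse links out of HTML.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# This stands in for real HTTP fetching so the sketch runs offline.
TOY_WEB = {
    "a.example": ("toy story pixar movie", ["b.example", "c.example"]),
    "b.example": ("galileo space probe nasa", ["a.example"]),
    "c.example": ("pixar studio news", ["d.example"]),
    "d.example": ("nasa mission news", []),
    "e.example": ("unlinked page nobody finds", []),
}


def crawl(seed, fetch):
    """Breadth-first crawl from seed; return {url: text} for each page reached."""
    pages = {}
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        result = fetch(url)
        if result is None:  # page does not exist; skip it
            continue
        text, links = result
        pages[url] = text
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Build a full-text inverted index: word -> set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


pages = crawl("a.example", TOY_WEB.get)
index = build_index(pages)
print(sorted(search(index, "pixar")))
print(sorted(search(index, "nasa news")))
```

Note that `e.example` never shows up in the results: no crawled page links to it, so a crawler that only follows links cannot discover it.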
And Scooter was indexing all of it. This was the beginning of what would become AltaVista. AltaVista launched in December 1995, and it was an immediate hit. On the first day, we had 300,000 queries. By the end of the first year, we were handling 19 million queries per day. Those numbers were unprecedented. No one had ever seen demand for search like that before. It proved that there was a massive, pent-up need for web search that the directory-based approaches just couldn't satisfy. People loved it because for the first time, they could search the entire web. They didn't have to know which directory to look in or which category their topic might be filed under. They could just type in what they were looking for and find it. AltaVista's success was a revelation. It showed that automated search wasn't just possible—it was essential. And it established many of the principles that still guide search engines today. We were the first to do full-text indexing of the entire web. We were the first to do real-time search where the index was updated continuously as new pages were discovered. We basically invented what we now think of as web search. But even as AltaVista was proving the value of automated search, the company was struggling with bigger questions about how to turn search into a sustainable business. I joined DEC in 1995, right around the time AltaVista was launching. And I watched as this amazing technology struggled to find its place within a hardware company that didn't really understand what it had. Paul Cormier is now the CEO of Red Hat, but back then he was a young engineer at DEC. He saw firsthand how the company failed to capitalize on its search engine breakthrough. DEC was a hardware company. They made computers and servers. They didn't understand software as a service, they didn't understand advertising models, they didn't understand the internet economy. To them, AltaVista was just a way to show off how powerful their servers were. 
This was the fundamental problem that would plague AltaVista throughout its existence. The technology was groundbreaking, but the business model was unclear. We had millions of users, we had the best search technology in the world, but we had no idea how to make money from it. The internet advertising market barely existed, and even when it started to develop, DEC didn't have the expertise to capitalize on it. Meanwhile, other players were entering the search market. Yahoo was evolving from a directory into a search engine. Excite, Lycos, and other competitors were emerging. But none of them had solved the fundamental business model problem either. The late 1990s were a time of great experimentation in search. Everyone knew that search was important, but no one had figured out how to make it profitable. There were subscription models, there were attempts to charge websites for inclusion, there were various advertising experiments. During this period of uncertainty, AltaVista's technical leadership began to erode. The company went through multiple ownership changes, first being sold to Compaq when they acquired DEC, then later being spun off and eventually acquired by Yahoo. Each change in ownership brought new priorities and new confusion about what AltaVista was supposed to be. Was it a search engine? Was it a portal? Was it a technology demo? No one seemed to know for sure. As AltaVista struggled with these questions, two Stanford PhD students were working on a different approach to search. Larry Page and Sergey Brin were developing what would become Google. What Page and Brin brought to search was a fundamentally different algorithm. Instead of just looking at the content of web pages, they looked at the link structure of the web. They treated links like academic citations, using them to determine which pages were most authoritative. 
This PageRank algorithm, as it came to be known, provided more relevant search results than the keyword-based approaches that previous search engines had used. The PageRank algorithm was a real breakthrough. It solved the problem of relevance in a way that we hadn't been able to do with pure keyword matching. When you searched for something on Google, you were much more likely to find what you were actually looking for. But PageRank wasn't the only advantage that Google had. They also solved the business model problem that had stumped everyone else. Google figured out how to make advertising work with search in a way that was both effective for advertisers and non-intrusive for users. Their AdWords system was brilliant—it showed relevant ads based on what people were searching for, and it only charged advertisers when people actually clicked on the ads. This combination of superior technology and a working business model allowed Google to quickly dominate the search market. By the early 2000s, Google had become synonymous with search. Google's success wasn't just about having better technology. It was about understanding that search was a business, not just a research project or a way to show off hardware. They built their entire company around search, while other companies treated it as a side project. Looking back, it's clear that AltaVista had all the pieces needed to become Google before Google existed. They had the technology, they had the user base, they had the brand recognition. It's a bit sad to think about what might have been. We had such a head start, such advanced technology. But in the end, having great technology isn't enough if you don't have the right business strategy to support it. The rise and fall of AltaVista is a reminder that in the technology industry, being first doesn't guarantee being best. 
Success requires not just technical innovation, but also business acumen, strategic vision, and sometimes just being in the right place at the right time. The lessons from AltaVista are still relevant today. Great technology is necessary but not sufficient. You need to understand your market, your business model, and your customers. You need to be able to execute not just on the technical side, but on the business side as well. But despite its ultimate fate, AltaVista's contribution to the development of the web cannot be overstated. It proved that automated search was possible and necessary. It established many of the technical principles that still guide search engines today. I'm proud of what we accomplished with AltaVista. We solved a fundamental problem—how to find information on the web—and we did it in a way that made the entire internet more useful for everyone. Today, search is so fundamental to our online experience that it's hard to imagine the web without it. We search for everything—restaurants, news, people, products, answers to random questions that pop into our heads. Search has become the primary way that most people navigate the internet. It's not just a tool for finding information—it's become a gateway to the entire digital world. And the evolution of search continues. Modern search engines use artificial intelligence to understand not just what we're searching for, but what we really mean. They can answer questions, not just find web pages that contain certain keywords. The future of search is probably going to be even more conversational, even more intelligent. We're moving toward a world where you can have a natural language conversation with a search engine, where it can understand context and nuance in ways that would have seemed like science fiction in 1995. But all of these advances build on the foundation that was laid in 1995 by pioneers like Louis Monier and the AltaVista team. 
They showed that it was possible to automatically index and search the entire web, and in doing so, they made the modern internet possible. When we launched AltaVista, we knew we were solving an important problem, but I don't think any of us anticipated how central search would become to the internet experience. It's gratifying to see how the principles we established have continued to evolve and improve. The story of search is ultimately a story about making information accessible. In a world where knowledge is power, search engines democratize access to that knowledge. They level the playing field, allowing anyone with an internet connection to find virtually any information they need. Search engines have fundamentally changed how we relate to information. We no longer need to memorize facts or keep extensive personal libraries. We can rely on being able to find any information we need, when we need it. But that transformation has also raised new questions and concerns. When one company controls how billions of people access information, that's a significant concentration of power. There are ongoing debates about search bias, filter bubbles, and the responsibility that search engines have to society. At this point in the late 90s, nobody knew for sure how search was going to be monetized. Marketing teams, for example, were looking to tweak results for their clients. They were poking at AltaVista, and they were poking at Lycos and they were poking at InfoSeek. We now had pressures to potentially make some people look good on results, or how can we give people what they want, or should we sell links? And there was a pressure to include paid-for links in the index, and it became a very conflicted space. Van Couvering's research suggests that the tech of AltaVista was perfectly positioned to succeed, but the executives in charge were not. AltaVista had a technically superior search engine, but they did not manage it correctly for the time. 
In other words, while there was more and more interest in search, the field was anybody's for the taking. AltaVista could have made a comeback, but that opening was about to snap shut. When Larry Page and Sergey Brin founded Google in 1998, they weren't thinking about what the marketing department wanted or how they could support a hardware team. They were thinking about the way academics organize their information. Citation analysis. A way to see which material is more cited, more relevant. And they built that approach into their PageRank algorithm. What Brin and Page did was to take that kind of logic and say, "Okay, we are going to look at not just the content of the page." AltaVista and the other search engines up to that time had looked at what was on the page. What was the title of the page? What were the words on the page? But Google looked at the link structure behind the page and identified certain websites as hubs of knowledge. The indexing was still organic and free in one sense, but it also tracked the value of different pages. It ranked them. This brought an elegance to search, and a usefulness that it never had before. And Google brought something else, too. Dollar signs. PageRank aside, Cormier reminded us that Google cracked the funding problem AltaVista and the rest never quite figured out. Finding the business model, the advertising-based business model, was the thing that made it explode. And as people used it more for free, you needed more everything, more R&D, more systems, more software, more datacenters, everything. And so I think once the advertising model that Google brilliantly put in place, that's really what it took for it to really become what it is today, because it's a huge capital investment and somehow you have to pay for it. It's really easy to say now what a dumb mistake to not come up with an advertising model, but when it's never been done before, it's different, right? 
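The citation-style link analysis described above can be sketched in a few lines of Python. This is a simplified textbook version of the PageRank idea, not Google's production system. The toy link graph, the page names, the damping value of 0.85, and the fixed iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate link-based importance scores by repeated redistribution.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages all "cite" the hub.
toy_links = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
ranks = pagerank(toy_links)
best = max(ranks, key=ranks.get)
print(best)
```

In this toy graph the page that every other page links to ends up with the highest score, while the page nobody links to ends up with the lowest, regardless of what words appear on either one.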
There was a moment that you'll remember, if you're old enough. That moment in the early 2000s, when somebody said, "Hey, you're using the wrong search engine," and then they pointed you toward Google. Converts raced to the new king of search and soon enough it became literally synonymous with search. You didn't just search anymore, you Googled. Nobody was AltaVistaing. Those early years of competition faded into memory. Here's Van Couvering one more time. Google's dominance has been so complete that it's easy to forget there was ever serious competition in search. But AltaVista's story reminds us that technological leadership doesn't automatically translate to business success. The legacy of that competition, though, is the incredible search technology we have today. AltaVista proved it was possible, Google perfected it, and now we take it for granted. Search was always going to be so much more than a side project. So much more than a curiosity in a computer company's research lab. And after Google, we all understood that. In a massively connected world flooded with content, search is what everybody's looking for. In 2010, Google Instant arrived and the search engine started predicting what you were searching for before you finished typing. Advances in AI continue to make the experience more and more intuitive. And it's really become impossible to imagine online life without these tools. But the fundamentals of today's search technology were there in AltaVista. They were imagined into being back in 1995 by pioneers like Louis Monier. He understood that no simple catalog could ever keep up with our sprawling curiosity. We probably used search engines a few hundred times just while putting this episode together. Just today, I wanted to check up on our guests and in a second, I see Louis Monier just did an interview where they call him the "father of search." And Larry Page's net worth is, I'm not even going to tell you. 
And Paul Cormier is, all right, he's the CEO of Red Hat, the company that brings you this podcast. Want to learn more about the history of search engines? There's a stack of bonus material waiting for you over at redhat.com/commandlineheroes. Next time, we're going global, traveling to France, Argentina, and China to discover how different countries built their own on-ramps to the internet. Until then, I'm Saron Yitbarek and this is Command Line Heroes, an original podcast from Red Hat. Keep on coding.

About the show

Command Line Heroes

During its run from 2018 to 2022, Command Line Heroes shared the epic true stories of developers, programmers, hackers, geeks, and open source rebels, and how they revolutionized the technology landscape. Relive our journey through tech history, and use #CommandLinePod to share your favorite episodes.