Search engine history: web search before Google

Has Google always dominated the web search market? In the second of three posts on the history of search engines, I look at the pioneers of the early search market, including the first web crawler, WWW Wanderer. Did you know that Disney used to be one of the biggest players in the business? Or that AltaVista was more technically advanced, in many ways, in 1998 than Google is now? Keep reading!

Pioneering web search engines

Modern search engines really began to appear after the development and popularization of the Mosaic browser in 1993. In 1994, Internet Magazine was launched, along with a review of the top 100 websites billed as the “most extensive” list ever to appear in a magazine. A 28.8 Kbps modem was priced at $399 and brought the Internet within reach of the masses (albeit slowly)!

At this point, and for the next 4-5 years, it was just about possible to produce print and web-based directories of the best sites that were genuinely useful to consumers. However, the rapid growth in the number of websites (from 130 in 1993 to over 600,000 in 1996) began to make this effort seem as futile as producing printed yellow pages for every business, media outlet, and library in the world.

While WAIS was not a lasting success, it highlighted the value of being able to search and click through the full text of documents across multiple Internet hosts. Nascent Internet magazines and web directories further highlighted the challenge of keeping up with an Internet that was growing faster than any human being could catalog it.

In June 1993, Matthew Gray of MIT developed WWW Wanderer, a Perl-based web crawler. Initially this was conceived simply as a tool to measure the growth of the World Wide Web using “collection sites”. Later, however, Gray (now working for Google) used the crawled results to create an index called “Wandex” and added a search interface. In this way, Gray developed the world’s first web search engine and the first autonomous web crawler (an essential feature of all modern search engines).

Although Wanderer was the first to send a robot to crawl websites, it did not index the full text of documents (as WAIS had). The first search engine that combined these two essential ingredients was WebCrawler, developed in 1994 by Brian Pinkerton at the University of Washington. WebCrawler was the search engine on which many of us early pioneers first explored the web, and it will be fondly remembered for its (at the time) attractive graphical interface and the remarkable speed with which it returned results. 1994 also saw the launch of Infoseek and Lycos.
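To make the "two essential ingredients" concrete, here is a toy sketch in Python of a crawler that follows links and indexes the full text of every page it reaches. It is only an illustration under stated assumptions, not how WebCrawler itself was built: the in-memory `TOY_WEB`, its example URLs, and the function names `crawl_and_index` and `search` are all made up for this example.

```python
from collections import deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP instead.
TOY_WEB = {
    "a.example/": ("welcome to the mosaic browser fan page", ["b.example/", "c.example/"]),
    "b.example/": ("modem prices and browser reviews", ["a.example/"]),
    "c.example/": ("usenet newsgroups archive", []),
}

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crawl_and_index(seed, web):
    """Breadth-first crawl from seed, indexing every word of every page reached."""
    index = {}          # word -> set of URLs whose text contains it
    seen = {seed}       # URLs already queued, so no page is visited twice
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        for word in tokenize(text):
            index.setdefault(word, set()).add(url)
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = crawl_and_index("a.example/", TOY_WEB)
print(sorted(search(index, "browser")))
```

Starting from a single seed page, the crawl discovers the other two pages through links alone, and the word-to-pages index then answers queries without rereading any page. That combination, at vastly larger scale, is what separated a full-text engine from a hand-built directory.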

However, the sheer growth of the web was beginning to put indexing out of reach for the average university IT department. The next big step required a capital investment. Enter, stage right, the (then huge) Digital Equipment Corporation (DEC) and its ultra-fast Alpha 8400 TurboLaser server. DEC was an early adopter of web technologies and the first Fortune 500 company to establish a website. Its search engine, AltaVista, was launched in 1995.

Founded in 1957, DEC led the minicomputer market through the 1970s and 1980s. In fact, most of the machines that the first ARPANET hosts ran on were DEC PDP-10s and PDP-11s. By the early 1990s, however, DEC was a troubled business. In 1977, its then CEO, Ken Olsen, said that “there is no reason for a person to have a computer in his house.” Although taken out of context at the time, this quote was in part symptomatic of DEC’s slow response to the rise of personal computing and the client-server revolution of the 1980s.

At the time AltaVista was being developed, the company was under siege from all sides by HP, Compaq, Dell, Sun, and IBM and was losing money like it was going out of style. Louis Monier and his research team at DEC were touted internally as the latest public relations coup: the entire web captured, and searchable, on a single computer. What better way to show the company as an innovator and demonstrate the lightning-fast speed and 64-bit storage of its new baby?

During 1995, Monier unleashed a thousand web crawlers on the young web (an unprecedented feat at the time). By the site's launch in December, AltaVista had indexed more than 16 million documents comprising several billion words. In essence, AltaVista was the first web search engine built at commercial scale. It received nearly 300,000 hits on its first day alone, and within nine months it was serving 19 million requests per day.

AltaVista was, in fact, far ahead of its time technically. The search engine pioneered many technologies that Google and others took years to catch up with. The site featured natural language queries, Boolean operators, machine translation (Babel Fish), and image, video, and audio search. It was also very fast, at least initially, and unlike other engines it was well suited to indexing legacy Internet resources, in particular Usenet newsgroups, which were still popular.

After AltaVista, Magellan, and Excite (all launched in 1995), a host of other search engine companies debuted, including Inktomi & Ask Jeeves (1996) and Northern Light & Snap (1997). Google itself was launched in 1998.

Each of these early engines enjoyed its own enthusiastic following and a share of the then-nascent search market. Each also had its own relative strengths and weaknesses. Northern Light, for example, organized its search results into folders labeled by topic (something arguably still in need of improvement today) and acquired a small but enthusiastic following as a result. Snap pioneered ranking search results, in part, by what people clicked on (something Yahoo! and Google are only playing around with now!).

In January 1999 (at the beginning of the dot-com boom), the largest sites in terms of market share were Yahoo!, Excite, AltaVista, and Disney, which together accounted for 88% of all search engine referrals. Market share was not closely related to the number of pages indexed, a measure on which Northern Light, AltaVista, and a then relatively unknown Google led the pack:

Search engine – Percentage of search referrals (December 1999)

Yahoo! – 55.81%
Excite Properties (Excite, Magellan and WebCrawler) – 11.81%
AltaVista – 11.18%
Disney Search Properties (Infoseek & Go Network) – 8.91%
Lycos – 5.05%
GoTo (now Overture) – 2.76%
Snap / NBCi – 1.58%
MSN – 1.25%
Northern Light
