Search Engine History – Web Search Before Google

Did Google always dominate the web search market? In the second of three posts on the history of search engines, I look at the pioneers of the early search market, including the very first web crawler, WWW Wanderer. Did you know that Disney used to be one of the biggest players in the business? Or that AltaVista was more technically advanced, in many ways, in 1998 than Google is now? Read on!

The pioneering Web Search Engines

Really, the point at which modern search engines first began to appear was after the development and popularisation of the Mosaic browser in 1993. In 1994, Internet Magazine was launched, together with a review of the top 100 websites billed as the ‘most extensive’ list ever to appear in a magazine. A 28.8 Kbps modem was priced at $399 and brought the internet within the reach of the masses (albeit slowly)!

At this point, and for the next four to five years, it was just about possible to produce printed and web-based directories of the best sites that were genuinely useful to consumers. However, the rapid growth in the number of websites (from 130 in 1993 to over 600,000 in 1996) began to make this endeavour seem as futile as producing a printed yellow pages of all the businesses, media and libraries in the world!

Whilst WAIS was not a lasting success, it did highlight the value of being able to search – and click through to – the full text of documents on multiple internet hosts. The nascent internet magazines and web directories further highlighted the challenge of being able to keep up with an internet which was growing faster than the ability of any human being to catalogue it.

In June 1993, Matthew Gray at MIT developed the Perl-based web crawler, WWW Wanderer. Initially, this was simply devised as a tool to measure the growth of the world wide web by “collecting sites”. Later, however, Gray (who now works for Google) used the crawled results to build an index called “Wandex” and added a search front-end. In this way, Gray developed the world’s first web search engine and the first autonomous web crawler (an essential feature of all modern search engines).

Whilst Wanderer was the first to send a robot to crawl web sites, it did not index the full text of documents (as had WAIS). The first search engine to combine these two essential ingredients was WebCrawler, developed in 1994 by Brian Pinkerton at the University of Washington. WebCrawler was the search engine on which many of us early pioneers first scoured the web and will be remembered with affection for its (at the time) attractive graphical interface and the incredible speed with which it returned results. 1994 also saw the launch of Infoseek and Lycos.
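To make those two ingredients concrete, here is a minimal sketch in Python of a robot that follows links and a full-text index built from what it finds. It is purely illustrative and is not how Wanderer or WebCrawler were actually written: the in-memory TOY_WEB, its example URLs and the function names crawl, build_index and search are all assumptions invented for this example, and nothing here touches the real network.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP; this toy data is an assumption.
TOY_WEB = {
    "http://a.example/": ("early web search engines",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("full text search of documents",
                          ["http://a.example/"]),
    "http://c.example/": ("web crawler robots follow links",
                          ["http://b.example/", "http://d.example/"]),
}


def crawl(web, seed):
    """Breadth-first crawl from seed; returns {url: text} for reachable pages."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to fetch
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Full-text inverted index: every word maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query, sorted by URL."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


pages = crawl(TOY_WEB, "http://a.example/")
index = build_index(pages)
print(search(index, "web"))
print(search(index, "full text"))
```

The crawl discovers pages without any human cataloguing them, and the index covers every word on each page rather than just titles or hand-written descriptions. That combination is what set the full-text engines apart from the directories.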

However, the scale of growth of the web was beginning to put indexing beyond the reach of the average university IT department. The next big step required capital investment. Enter, stage right, the (then huge) Digital Equipment Corporation (DEC) and its super-fast Alpha 8400 TurboLaser processor. DEC was an early adopter of web technologies and the first Fortune 500 company to establish a web site. Its search engine, AltaVista, was launched in 1995.

Founded in 1957, DEC had led the mini-computer market during the 1970s and 1980s. In fact, most of the machines on which the earliest ARPANET hosts ran were DEC PDP-10s and PDP-11s. However, by the early 1990s, DEC was a business in trouble. In 1977, their then CEO, Ken Olsen, famously said that “there is no reason for any individual to have a computer in his home”. Whilst somewhat taken out of context at the time, this quote was in part symptomatic of DEC’s slow response to the emergence of personal computing and the client-server revolution of the 1980s.

By the time AltaVista was being developed, the company was besieged on all sides by HP, Compaq, Dell, Sun and IBM and was losing money like it was going out of fashion. Louis Monier and his research team at DEC were “discovered” internally as the ultimate PR coup: the entire web captured – and searchable – on a single computer. What better way to showcase the company as an innovator and demonstrate the lightning-fast speed and 64-bit storage of their new baby?

During 1995, Monier unleashed a thousand web crawlers onto the young web (at that time an unprecedented achievement). By the site's launch in December, AltaVista had indexed more than 16 million documents comprising several billion words. In essence, AltaVista was the first commercial-strength, web-based search engine system. It enjoyed nearly 300,000 visits on its first day alone and, within nine months, was serving 19 million requests a day.

AltaVista was, indeed, well ahead of its time technically. The search engine pioneered many technologies that Google and others later took years to catch up with. The site offered natural language search queries, Boolean operators, automatic translation services (Babel Fish) and image, video and audio search. It was also lightning fast (at least in the beginning) and, unlike other engines, coped well with indexing legacy internet resources (particularly the then still popular Usenet newsgroups).

After AltaVista, Magellan and Excite (all launched in 1995), a multitude of other search engine companies made their debut, including Inktomi and Ask Jeeves (1996) and Northern Light and Snap (1997). Google itself launched in 1998.

Each of these early engines enjoyed its own enthusiastic following and a share of the then nascent search market. Each also had its own relative strengths and weaknesses. Northern Light, for example, organized its search results in specific folders labeled by subject (something arguably still to be bettered today) and acquired a small but loyal following as a result. Snap pioneered search results ranked, in part, by what people clicked on (something Yahoo! and Google are only toying with now!).

In January 1999 (at the beginning of the dotcom boom), the biggest sites (in terms of market share) were Yahoo!, Excite, AltaVista and Disney, which together accounted for 88% of all search engine referrals. Market share was not closely related to the number of pages indexed (where Northern Light, AltaVista and a then relatively unknown Google led the pack):

Search engine – share of search referrals (Dec 99)

Yahoo! – 55.81%

Excite Properties (Excite, Magellan & WebCrawler) – 11.81%

AltaVista – 11.18%

Disney Search Properties (Infoseek & Go Network) – 8.91%

Lycos – 5.05%

Go To (now Overture) – 2.76%

Snap / NBCi – 1.58%

MSN – 1.25%

Northern Light