Posts

Showing posts from September, 2014

The transition of web search - From open to fragmented and proprietary

This post is a retrospective on how web-indexing approaches emerged during my time building and distributing search engines. Over 25 years I worked at LookSmart, Overture, Yahoo! and Collecta. Some of my colleagues had asked me to summarize the shift I'd seen across the web over this period.

The beginning: open-web indexes

Web indexing (also called crawling or "spidering", after the web-like structure of HTML inter-linking) was a means of creating a private index of published HTML documents on the web in the 1990s. Popular search engines would use keyword position in the document, frequency of keyword mention, font style and meta-tag data to determine a public document’s relevance to a given keyword query. This manner of indexing and searching was the first wave of search engine mechanics. Crawlers could start at the top node of any public domain and follow any link from t...