This post is a retrospective on how web-indexing approaches emerged during my time building and distributing search engines. Over 25 years I worked at LookSmart, Overture, Yahoo! and Collecta. Some of my colleagues asked me to summarize the shift I have seen across the web over this period.

The beginning: open-web indexes

In the 1990s, web indexing (also called crawling, or "spidering" after the web-like structure of HTML inter-linking) was a means of creating a private index of the HTML documents published on the web. Popular search engines used keyword position in the document, frequency of keyword mention, font style and meta-tag data to determine a public document's relevance to a given keyword query. This manner of indexing and searching was the first wave of search engine mechanics. Crawlers could start at the top node of any public domain and follow any link from t...