Inverting the Web 

We use search engines because the Web does not support accessing documents by anything other than URL. This puts a huge amount of control in the hands of the search engine company and those who control the DNS hierarchy.

Given that search engine companies can barely keep up with the constant barrage of attacks, commonly known as "SEO", intended to lower the quality of their results, a distributed inverted index seems like it would be impossible to build.

@freakazoid What methods *other* than URL are you suggesting? Because it is simply a Uniform Resource Locator (or Identifier, as URI).

Not all online content is social / personal. I'm not understanding your suggestion well enough to criticise it, but it seems to have some ... capacious holes.

My read is that search engines are a necessity born of the Web's lack of any intrinsic indexing-and-forwarding capability; such a capability would render them unnecessary. THAT still has further issues (mostly around trust)...

@freakazoid ... and reputation.

But a mechanism in which:

1. Websites could self-index.
2. Indexes could be shared, aggregated, and forwarded.
3. Search could be distributed.
4. Auditing against false/misleading indexing was supported.
5. Original authorship / first-publication was known.

... might disrupt things a tad.
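A rough sketch of what the first two points might look like, purely as an assumption about one possible format: each site publishes a small inverted index (term to page paths), and an aggregating node merges the indexes it fetches from peers. The function names and the structure are illustrative, not any existing protocol.

```python
# Hypothetical sketch: sites self-publish a tiny inverted index
# (term -> sorted list of page paths); a forwarding node merges them.
from collections import defaultdict

def build_index(pages):
    """pages: dict of path -> text. Returns term -> sorted list of paths."""
    index = defaultdict(set)
    for path, text in pages.items():
        for term in text.lower().split():
            index[term].add(path)
    return {term: sorted(paths) for term, paths in index.items()}

def merge_indexes(a, b):
    """Aggregate two site indexes, as an intermediary node might."""
    merged = defaultdict(set)
    for idx in (a, b):
        for term, paths in idx.items():
            merged[term].update(paths)
    return {term: sorted(paths) for term, paths in merged.items()}

site_a = build_index({"/post1": "distributed search is hard"})
site_b = build_index({"/essay": "distributed indexing and trust"})
combined = merge_indexes(site_a, site_b)
print(combined["distributed"])
```

The auditing and authorship points (4 and 5) are the genuinely hard part and are not addressed by this sketch at all; a merged index like this is exactly where false or misleading entries would be injected.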

Somewhat more:

NB: the reputation bits might build off social / netgraph models.

But yes, I've been thinking on this.

@dredmorbius This is not a fully fleshed out idea yet, but the "L" was the important bit. People generally don't care about the location of the content. They care about the content of the content, and other stuff about the content like the author, etc.
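One way to make the "content of the content" the address, rather than the location: identify a document by a hash of its bytes, so any host serving identical bytes serves the "same" document. This mirrors content-addressed systems such as IPFS; the `cid:` prefix here is just an illustrative label, not a real scheme.

```python
# Minimal content-addressing illustration: the identifier depends only
# on the bytes, never on which server holds them. The "cid:" label is
# a made-up placeholder, not an actual URI scheme.
import hashlib

def content_id(data: bytes) -> str:
    return "cid:" + hashlib.sha256(data).hexdigest()

doc = b"The content of the content matters, not where it lives."
cid = content_id(doc)
print(cid)
```

Identical bytes always yield the identical identifier, which is what decouples retrieval from location; metadata like authorship would still need a separate layer on top.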

Just think about how people generally navigate the web these days. They don't type a URL into their address bar or click a bookmark. They type a search query into their address bar, which will generally bring up Google results.

@freakazoid Re: navigation.

1. Google are trying hard to kill off the URL.

2. There may be user-pattern based reasons to do just that.

3. URLs and DNS map ... poorly ... to meatspace notions of locality and identity. In large part due to the actions of websites, search engines, browser devs, SEO, and domain registrars.

4. A namespace with at _least_ a half-million entities and little sensible structure ... is far beyond human scale.

5. It's mostly reputation.

@dredmorbius I agree that killing off the URL is a worthy goal, which makes it a perfect weapon for Google to deal its final killing blow to the open Web.

As for scale, IIRC you can serve 90+% of web search requests with coverage of only about 5% of the space. Something like 99% of Google's results are served entirely from RAM. They don't even expect to serve useful results from their largest index; it exists primarily to give the impression of completeness.
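A back-of-the-envelope check of why a small hot subset can cover most traffic: if query popularity is roughly Zipf-distributed (the exponent of 1.0 below is an assumption, not measured data), the most popular few percent of distinct queries account for the large majority of requests.

```python
# Assumed Zipf(1.0) popularity over distinct queries; both the
# distribution and the counts are illustrative, not Google's data.
N = 1_000_000                      # distinct queries (illustrative)
weights = [1 / (rank + 1) for rank in range(N)]
total = sum(weights)
top = int(0.05 * N)                # "cache" the most popular 5%
coverage = sum(weights[:top]) / total
print(f"top 5% of queries covers {coverage:.0%} of requests")
```

With exponent 1.0 this lands around four-fifths of requests; a steeper exponent (real query logs are often steeper) pushes coverage higher still, consistent with the 90+% figure above.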
