Inverting the Web 

We use search engines because the Web does not support accessing documents by anything other than URL. This puts a huge amount of control in the hands of the search engine company and those who control the DNS hierarchy.

Given that search engine companies can barely keep up with the constant barrage of attacks, commonly known as "SEO", intended to lower the quality of their results, a distributed inverted index seems like it would be impossible to build.

@freakazoid What methods *other* than URL are you suggesting? Because it is simply a Uniform Resource Locator (or Identifier, as URI).

Not all online content is social / personal. I'm not understanding your suggestion well enough to criticise it, but it seems to have some ... capacious holes.

My read is that search engines are a necessity born of no intrinsic indexing-and-forwarding capability which would render them unnecessary. THAT still has further issues (mostly around trust)...

@freakazoid ... and reputation.

But a mechanism in which:

1. Websites could self-index.
2. Indexes could be shared, aggregated, and forwarded.
4. Search could be distributed.
5. Auditing against false/misleading indexing was supported.
6. Original authorship / first-publication was known

... might disrupt things a tad.
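The first few items in that list (self-indexing, aggregation, search over the merged result) can be illustrated with a minimal sketch. It assumes each site publishes a plain term-to-page mapping; the site names, the `build_index`, `merge_indexes`, and `search` helpers, and the whitespace tokenizer are all hypothetical, and nothing here addresses auditing, authorship, or trust.

```python
from collections import defaultdict

def build_index(site, pages):
    """A site indexing itself: term -> set of (site, path) postings."""
    index = defaultdict(set)
    for path, text in pages.items():
        for term in text.lower().split():
            index[term].add((site, path))
    return index

def merge_indexes(*indexes):
    """Aggregate several self-published indexes into one inverted index."""
    merged = defaultdict(set)
    for index in indexes:
        for term, postings in index.items():
            merged[term] |= postings
    return merged

def search(index, query):
    """Return the postings that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Two hypothetical sites publish their own indexes; anyone can merge them.
a = build_index("a.example", {"/cats": "cats and dogs", "/web": "the web index"})
b = build_index("b.example", {"/dogs": "dogs and the web"})
combined = merge_indexes(a, b)
print(sorted(search(combined, "dogs web")))  # [('b.example', '/dogs')]
```

Because a merged index is just a union of postings, a site can claim any term it likes, which is exactly where the auditing and reputation items come in.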

Somewhat more:

NB: the reputation bits might build off social / netgraph models.

But yes, I've been thinking on this.

@enkiv2 I know SEARX is:

Also YaCy as sean mentioned.

There's also something that is/was used for Firefox keyword search, I think OpenSearch, a standard used by multiple sites, pioneered by Amazon.

Being dropped by Firefox BTW.

That provides a query API only, not a distributed index, though.

@freakazoid @drwho

@dredmorbius @enkiv2 @freakazoid YaCy isn't federated, but Searx is, yeah. YaCy is p2p.
@dredmorbius @enkiv2 @freakazoid Also, the initial criticism of the URL system isn't entirely there: the DNS is annoying, but isn't needed for accessing content on the WWW. You can directly navigate to public IP addresses and it works just as well, which allows you to skip the DNS. (You can even get HTTPS certs for IP addresses.)

Still centralized, which is bad, but centralized in a way that you can't really get around in internetworked communications.

@kick HTTP isn't fully DNS-independent. For virtualhosts on the same IP, the webserver distinguishes between content based on the host portion of the HTTP request.

If you request by IP, you'll get only the default / primary host on that IP address.

That's not _necessarily_ operating through DNS, but HTTP remains hostname-aware.
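A rough sketch of that hostname awareness, assuming a server that hosts two made-up sites on one IP address. `build_request` and `select_site` are hypothetical helpers, not any real webserver's API; they only show that the choice of site rides on the Host header rather than on the IP that was dialled.

```python
def build_request(host_header, path="/"):
    """Build a minimal HTTP/1.1 request; the Host header names the site wanted."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def select_site(request_bytes, sites, default):
    """Pick content the way name-based virtual hosting does: by Host header."""
    head = request_bytes.decode("ascii").split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return sites.get(value.strip().lower(), default)
    return default

# Hypothetical sites sharing one IP address.
sites = {"blog.example": "blog content", "shop.example": "shop content"}

print(select_site(build_request("shop.example"), sites, "default site"))
print(select_site(build_request("192.0.2.10"), sites, "default site"))
```

The first call returns the shop; the second, addressed by bare IP, falls through to the default site.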

@enkiv2 @freakazoid

@dredmorbius @kick @enkiv2 IP is also worse in many ways than using DNS. If you have to change where you host the content, you can generally at least update your DNS to point at the new IP. But if you use IP and your ISP kicks you off or whatever, you're screwed; all your URLs are now invalid. Dat, IPFS, FreeNet, Tor hidden sites, etc, don't have this issue. I suppose it's still technically a URL in some of these cases, but that's not my point.

@freakazoid Question: is there any inherent reason for a URL to be based on DNS hostnames (or IP addresses)?

Or could an alternate resolution protocol be specified?

If not, what changes would be required?

(I need to read the HTTP spec.)

@kick @enkiv2

@dredmorbius @kick @enkiv2 HTTP URLs don't have any way to specify the lookup mechanism. RFC3986 says the part after the // and optional authentication info followed by @ is a "registered name" or an address. It doesn't say the name has to be resolved via DNS but does say it is up to the local system to decide how to resolve it. So if you just wanted self-certifying names or whatever you can use otherwise unused TLDs the way Tor does with .onion.
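As a sketch of "the local system decides how to resolve it", the snippet below picks a resolution mechanism from the host part of a URL. Python's `urllib.parse.urlsplit` does the parsing; the `SPECIAL_TLDS` table, the `resolver_for` helper, and the returned labels are assumptions made for illustration, not part of any standard.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical local policy: which otherwise-unused TLDs get a non-DNS resolver.
SPECIAL_TLDS = {"onion": "tor"}

def resolver_for(url, special_tlds=SPECIAL_TLDS):
    """Name the mechanism this (made-up) local system would use for the host."""
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return "literal"          # an address, no name lookup needed
    except ValueError:
        pass
    tld = host.rsplit(".", 1)[-1]
    return special_tlds.get(tld, "dns")

print(resolver_for("http://user@exampleaddress.onion/page"))  # tor
print(resolver_for("https://example.org/"))                   # dns
print(resolver_for("http://192.0.2.1/"))                      # literal
```

Adding a self-certifying naming scheme under this approach is one more entry in the table; the URL syntax itself does not change.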

@freakazoid Hrm....


There are alternate URLs, e.g., irc://host/channel

I'm wondering if a standard for an:

http://<address-proto><delim><address> might be specifiable.

Onion achieves this through the onion TLD. But using a reserved character ('@' comes to mind) might allow for an addressing protocol _within_ the HTTP URL itself, to be used....

@kick @enkiv2

@dredmorbius @kick @enkiv2 @ is already reserved for the optional username[:password] portion before the hostname.
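A quick way to see the clash, using Python's standard `urllib.parse`: anything placed before an '@' in the authority is read as user info, so a would-be protocol tag lands in the username field. The URL and the `split_authority` helper are made up for illustration.

```python
from urllib.parse import urlsplit

def split_authority(url):
    """Return (username, password, hostname) as a standard parser sees them."""
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname

# The hoped-for reading is "protocol@address"; the parser sees "user@host".
print(split_authority("http://altproto@some-address/path"))
# ('altproto', None, 'some-address')
```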

@freakazoid @dredmorbius @enkiv2 Is ! still reserved (! may be a DNS thing actually, thinking about it further)?

@dredmorbius @kick @enkiv2 @freakazoid
If you mean ! as in the routing control, isn't that even worse? We probably want to specify *less* irrelevant information by default.

@enkiv2 Bang simply as available notation. Now that I think of it, it might make a good routing _mechanism_ specifier:



Again, I'm not sure this is better than individual protocols.

Another option would be to specify some service proxy, which could then handle routing. URI encoding doesn't seem to directly provide that; apps/processes define their own proxy use.

@kick @freakazoid

@dredmorbius @enkiv2 @kick @freakazoid
Bang was used in usenet addresses to separate a series of hosts in order to specify a routing, since UUCP would be done by machines calling specific other known machines nightly over landline phones. You'd see bang routing in usenet archives as late as the early 90s. I'd be surprised if it's not still theoretically supported in URLs.
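For anyone who never saw one, a bang path is just the relay route spelled out in order. A minimal sketch with made-up host names; `parse_bang_path` and `next_hop` are hypothetical helpers, not taken from any UUCP implementation.

```python
def parse_bang_path(address):
    """Split 'hosta!hostb!user' into (relay hosts in order, final user)."""
    *hops, user = address.split("!")
    if not user or any(not hop for hop in hops):
        raise ValueError(f"malformed bang path: {address!r}")
    return hops, user

def next_hop(address):
    """What each relay does: peel off the first host, forward the rest to it."""
    hops, user = parse_bang_path(address)
    if not hops:
        return None, user  # no hosts left: deliver locally
    return hops[0], "!".join(hops[1:] + [user])

print(parse_bang_path("bigsite!midsite!leaf!alice"))
# (['bigsite', 'midsite', 'leaf'], 'alice')
print(next_hop("bigsite!midsite!leaf!alice"))
# ('bigsite', 'midsite!leaf!alice')
```

The sender has to know the whole route in advance, which is the "irrelevant information" objection raised earlier in the thread.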

@enkiv2 Email also.

I used (though understood poorly) bang-path routing at the time.

So yes, I'm familiar with the usage and notation. The question of whether or not it's appropriate here is ... the question.

At present, HTTP URLs *presume* DNS.

The problem is that DNS itself is proving problematic in numerous ways, that ... don't seem reasonably tractable. The dot-org fiasco is pretty much the argument I've been looking for against the "just host your own domain" line.

@kick @freakazoid

@enkiv2 That's at best worked with difficulty for large organisations -- domain lapses, etc., occur with regularity.

Domain squatting, typosquatting, and a whole mess of other stuff, is a long-standing issue.

In that light, Google's killing the URL _might_ not be _all_ bad, but they've been Less Than Clear on what their suggested alternative is. And I trust them less far than I can throw them.

For individuals, persistent online space is a huge issue.

@kick @freakazoid

@enkiv2 Then there's the whole question of how many spaces is enough. There are arguments for _both_ persistence _and_ flexibility / alternatives, and locking everyone into a _single_ permanent identity generally Does Not End Well.

The notion of a time-indexed identity might address some of this. Internet Archive's done some work in this area. Assumptions of network immutability tend to break. In time.

@kick @freakazoid

@dredmorbius @enkiv2 @kick @freakazoid
Yeah. Any immutability needs to be enforced because when the W3C declared that changing web pages is Very Rude all the scam artists & incompetents did it anyway. Content archival projects like waybackmachine become easier if you have static addresses for static content & some kind of mechanism to repoint at a different set of static documents (like IPFS+IPNS).
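The split between static addresses and a repointable name can be sketched in a few lines. This is a toy in-memory model, assuming SHA-256 content addresses and a plain dictionary for names; the `ContentStore` class is hypothetical, and real IPFS/IPNS use different address formats and signed name records that are not modelled here.

```python
import hashlib

class ContentStore:
    """Toy model: immutable content addresses plus mutable name pointers."""

    def __init__(self):
        self._blobs = {}   # address -> bytes; never rewritten
        self._names = {}   # name -> current address; may be repointed

    def put(self, data: bytes) -> str:
        """Store content under the hash of its bytes and return that address."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch by address; the same address always yields the same bytes."""
        return self._blobs[address]

    def point(self, name: str, address: str) -> None:
        """Repoint a mutable name at an existing static address."""
        if address not in self._blobs:
            raise KeyError(address)
        self._names[name] = address

    def resolve(self, name: str) -> bytes:
        """Follow a name to whatever it currently points at."""
        return self._blobs[self._names[name]]

store = ContentStore()
v1 = store.put(b"first draft")
store.point("homepage", v1)
v2 = store.put(b"second draft")
store.point("homepage", v2)

print(store.resolve("homepage"))  # b'second draft'
print(store.get(v1))              # b'first draft'
```

Repointing the name does not touch the old content: anyone holding `v1` can still fetch the first draft, which is what makes archiving straightforward.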

@enkiv2 I'd argue that there's a place for redacting content -- see the Bryan Cantrill thread from 1996 previously referenced. That's ... embarrassing. Not particularly useful, though perhaps as a cautionary tale.

There's a strong argument that most social media should be fairly ephemeral and reach-limited.

There are exceptions, and *both* promoting *and* concealing information can be done for good OR evil.

@kick @freakazoid

@dredmorbius @enkiv2 @kick @freakazoid
In terms of negative feedback -- I don't consider redaction of already-published material to be the best or most useful form. We see problems that could be solved by this, if mirroring & wayback machine & screenshots didn't exist. I'm more hopeful about solving the dunking problem with norms.
Reach is a lot more nuanced & powerful. Permanent & reach-limited like SSB feels like the right thing for nominally-public stuff.

@enkiv2 For ordinary citizens, the ability to unpublish / recall content seems fair -- that's the EU's RTBF.

For organisations, governments, highly significant individuals, criminals, and others with significant social obligation or power, the ability to capriciously unpublish is much more problematic.

The nature of online communications makes what were previously _streams_ into _records_, which can have tremendous durability. Everything needn't last forever.

@kick @freakazoid

@dredmorbius @enkiv2 @kick @freakazoid
I find RTBF problematic because it's not very useful in the absence of norms against personal archiving (itself a problematic thing). We're better off developing norms about carefully checking the context around claims of wrongdoing before acting on or spreading those claims -- something that becomes easier when public information cannot be modified after publication. That's a tangent even by the standards of this thread tho

@enkiv2 @dredmorbius @freakazoid I need an image macro for "That technical problem is too hard! Let's change the world instead!"

NB. You may very well be right, but it still feels very comical in some way.

@kick @enkiv2 @dredmorbius @freakazoid
It's more a matter of: the social problem cannot be fixed by a technical change, so we should employ a social change instead. No matter what we do on a technical level, we can't really move the needle on this.

@enkiv2 @kick @dredmorbius @freakazoid
Changing norms is harder than employing technical systems because power is not as lopsided. To change norms, you need buy-in from most participants; to change tech, you just need to be part of the small privileged group who controls commit access. This is why it's so important, though. Norms aren't set in stone but they'll only change if you can actually convince people that changing their habits is a good idea!

@enkiv2 @kick @dredmorbius @freakazoid
Most people online have had bad experiences with people weaponizing out-of-context information -- that's why technical solutions like RTBF exist. RTBF not actually working, while simultaneously pushing power into the hands of centralized corporate services, is obvious to most people too. Saying "it's impolite to dogpile on somebody without checking whether or not you've been misled first" is way less extreme.

@enkiv2 @kick @dredmorbius @freakazoid
Re: the speed at which norms can change, consider content warnings. In a matter of ten years they went from something only a handful of folks with PhDs, trying to work out experimental ways to avoid meltdowns in extreme circumstances, had even heard of, to something that everybody is aware of & only jerks believe are never justified. We still argue about when they're justified but there isn't a serious contingent against using them at all.

@enkiv2 @kick @dredmorbius I'm not arguing for RTBF. I'm arguing for not making it impossible to unpublish content.

CWs are nowhere near universal and the fact that they're not proves my point quite nicely.

@enkiv2 @kick @dredmorbius There's also the fact that people deliberately exploit immutable systems to publish stuff that's damaging. For example, there's kiddie porn in the Bitcoin blockchain.

@freakazoid @enkiv2 @kick @dredmorbius
This is a fair point, though I wouldn't pick CP as a good example of infohazard. Depending on one's model, CP is contraband either because a market for it incentivizes abuse or because exposure to it incentivizes abuse. Under the former model, having it on the blockchain lowers abuse potential. Obviously a complex & emotionally charged topic (even more so than "if you burn a million dollars does the value of a dollar bill go up or down")

@enkiv2 @freakazoid @kick @dredmorbius
The risk profile of putting contraband or blackmail material on a blockchain is basically the same as the risk profile of keeping a copy on paper in a safety deposit box & periodically mailing out photocopies -- except that this latter *only* works for people with an incentive to store info indefinitely. In other words, it puts the power to select what gets remembered in the hands of whoever thinks they will want to distribute it in the far future.

@enkiv2 @freakazoid @kick @dredmorbius
Really, norm-based solutions can't work unless practically everything is immutable either. If everything is immutable then context can be retrieved in the future even if nobody thought to preserve it at the time. This functionally defangs blackmail because lies-by-omission are not backed up by layers of friction between everybody & whatever information was omitted.

@enkiv2 @kick @dredmorbius I don't see how things' not being unpublishable could defang blackmail. Blackmail will just apply to information that hasn't been published in the first place.

This goes beyond mere disagreement; this is a system I would kill to stop.

@enkiv2 @kick @dredmorbius This is the argument 4channers make against outlawing revenge porn. "Women just need to learn to stop allowing boyfriends to photograph them naked, or accept that naked pictures of them are going to be on the Internet."

No. We live in a society. You publish shit that hurts someone else, you get hurt yourself.

@freakazoid @enkiv2 @dredmorbius Why is it always 4chan users who get blamed for bad culture on the internet? It's literally the queerest place on the entire network, yet without exception it gets blamed for the things that redditors are primarily responsible for.
@freakazoid @dredmorbius @enkiv2 *No. We live in a society. You publish shit that hurts someone else, you get hurt yourself.*

This is a slippery and stupid slope, and it justifies what's currently happening to people like Snowden, Manning and Assange, despite them not doing anything that was actually morally wrong. I'd accept a claim like this with reduced scope, but as it stands that's way too wide.

@kick @enkiv2 @dredmorbius "sometimes people get punished for things we don't think they should be punished for" is not an argument in favor of not having any limits at all, so I'm not super interested in debating it.

@kick @enkiv2 @dredmorbius Super uninterested in a 4chan vs Reddit debate. I couldn't care less about 4chan getting blamed for terrible shit they aren't actually responsible for given all the terrible shit they (or rather the shitheads they allowed to take over) were responsible for.

@kick @enkiv2 @dredmorbius Actually I'm being too generous. The folks there were plenty comfortable with racist, homophobic, and transphobic language from the very beginning. If Moot had deliberately set out to build a Nazi indoctrination camp, I have no idea what he would have done differently.

@freakazoid @enkiv2 @dredmorbius

You're being kind of ridiculous, which is kind of frustrating to see from someone who otherwise has been mostly at least together, view-wise.

There's a board dedicated to queer people (three of them if you include boards dedicated to queer anime/manga/etc), 90% of boards have zero political discussion (I'm not joking about this, some boards even ban it if I recall correctly), and Moot wasn't "comfortable" with any of that stuff; he's not a Nazi nor Nazi sympathizer, hell, he works at Google now.

He (rightfully) believed that spaces where people can interact without identifying themselves are important, which is the correct view to have.

@freakazoid @enkiv2 @kick @dredmorbius
I'm not opposed to ramifications for bad behavior. I'm trying to figure out how to encourage punishment to be equitable. Part of that is preventing motivated misrepresentation (and power asymmetry in misrepresentation). Right now would-be blackmailers choose what gets to become history, so they can spin anything as a sin.

@freakazoid @enkiv2 @kick @dredmorbius
I'm not sure, in that case, what risk profile you're talking about. Are we talking about a case where someone publishes something about themselves that they later regret? Where someone publishes something about themselves & another party takes it out of context? Or where someone publishes information about someone else without permission?

@enkiv2 @freakazoid @kick @dredmorbius
I can't think of an example of a problem that being able to unpublish only things that you yourself have published will reliably solve, in a world where backups & blackmailers exist. (It solves the pseudo-problem of deciding that a post you've published is potentially risky and undoing it before it has actually caused a problem. I don't think that's what you're talking about, though.)

@enkiv2 @freakazoid @kick @dredmorbius
And, on the other hand, unpublishing what *other people* have published doesn't appear to be on the table. It has a lot of issues and complications, & is generally handled by lawsuits or by corporate simulations of lawsuit-style deliberation. It can be handled by admin fiat in federated systems but scaling to distributed systems means it becomes a per-post version of transitive blocking. (Cancel messages, etc.)

@enkiv2 @kick @dredmorbius My goal is to make it easy to indicate to people who don't want to publish stuff against the will of folks who are impacted by it that you'd like them to take it down.

@enkiv2 @kick @dredmorbius The situation is one where, even though they will take stuff down on request, you have to separately ask them and everyone else.

Yes, there will be attempts to abuse such a system, which is why it should not be legislated into place by government but built by people who want to have a robust publication system that at least makes an attempt to minimize harm.

@enkiv2 @kick @dredmorbius I think the big issue here is reachability vs discoverability. This was an issue Mark Zuckerberg did not understand when designing graph search, until Facebook employees practically revolted and told him that it was a bad idea to let people bypass permissions like friends list visibility just because it was possible to construct someone's friends list by scraping others' pages. It's also encountered when public records go online.

@freakazoid @enkiv2 @dredmorbius Graph search lasted for six years with full functionality, and it doesn’t seem like it was that bad of a solution for Facebook.

Also, it wasn’t designed by Zuckerberg, it was designed by Google employees.

(And further, it was a great idea. So much was dug up on politicians because of it that the world was in an undeniably better spot.)

@freakazoid @enkiv2 @kick @dredmorbius
Absolutely! I've sort of been arguing for this. When I pushed transitive blocking over unpublishing, it's because I think the biggest issue is the flatness of addresses/access: folks outside your group, who do not share your norms, can read your messages and force replies on you.

@freakazoid @enkiv2 @kick @dredmorbius
OK. I'm fine with that, and most mature systems for static content have facilities for that (ex., IPFS has a hash blacklist for both fetching & forwarding that's basically the same as a killfile, along with mechanisms for folks to share these blacklists with each other).

@enkiv2 @kick @dredmorbius It's not a question of unpublishing what others have published. It's about supporting the ability to ask that others unpublish things they have published. It need neither be reliable nor perfect in order to reduce harm. But it needs to exist.

@enkiv2 @kick @dredmorbius At any rate I feel that I've given conclusive proof that this needs to exist. If you remain unconvinced then there seems to be little point in my expending additional effort trying to convince you.

@freakazoid @enkiv2 @kick @dredmorbius
OK, yeah, I'm perfectly fine with this as harm reduction. I wouldn't call it 'unpublishing' because on a technical level, on a service that otherwise supported static content, it would be implemented as a blacklist of addresses (which eventually would become un-hosted as the number of nodes with a copy approached zero).
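A minimal sketch of that blacklist-of-addresses approach, assuming SHA-256 content addresses. The `Node` class and its methods are hypothetical and do not reflect any real IPFS interface; the point is only that a takedown request becomes a shareable list of addresses that a node stops storing and serving.

```python
import hashlib

def address_of(data: bytes) -> str:
    """Content address: the SHA-256 hex digest of the bytes."""
    return hashlib.sha256(data).hexdigest()

class Node:
    """Toy node that refuses to store or serve denylisted addresses."""

    def __init__(self):
        self.blobs = {}        # address -> bytes this node hosts
        self.denylist = set()  # addresses this node will not hold or serve

    def store(self, data: bytes):
        """Host content unless its address is denylisted; return the address."""
        address = address_of(data)
        if address in self.denylist:
            return None
        self.blobs[address] = data
        return address

    def import_denylist(self, shared):
        """Merge someone else's list and drop any copies already held."""
        self.denylist |= set(shared)
        for address in list(self.blobs):
            if address in self.denylist:
                del self.blobs[address]

    def fetch(self, address: str):
        """Serve content only if it is hosted and not denylisted."""
        if address in self.denylist:
            return None
        return self.blobs.get(address)

node = Node()
addr = node.store(b"post someone asked to have taken down")
node.import_denylist([addr])
print(node.fetch(addr))  # None
```

Nothing is deleted network-wide; the content simply becomes un-hosted as fewer cooperating nodes keep a copy, which matches the 'ask nicely' framing.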

@freakazoid @enkiv2 @kick @dredmorbius
CWs are not universal, but they are near-universal in communities in which they exist at all. The risk of immutability of public material is mostly around blackmail (which only works within one's in-group or in groups where one is forced to operate) & centralized enforcement (which can't be performed against sufficiently large groups). When immutability & good norms around context coincide, outsiders are largely irrelevant -- unless it ultimately loses.

@freakazoid @enkiv2 @kick @dredmorbius i feel like making it possible to unpublish information requires criminalizing private information sharing and archival and anonymous communication; is there a less severe way?

@kragen @freakazoid @enkiv2 @kick @dredmorbius
This is sort of my point -- either we have an 'ask nicely' or we have a state-enforced 'ask nicely', & 'ask nicely' without state enforcement is a social norm thing.

@enkiv2 @kragen @freakazoid @dredmorbius For certain classes of published stuff, in Europe we already have legal ways to demand unpublication and the sky has not fallen as a result. And I think it's forcing techies in EU to (usually grudgingly) accept that technical choices are rarely free of social consequences.

@cathal @enkiv2 @freakazoid @dredmorbius I am skeptical; we will see how much longer the sky of open societies remains standing. Brexit, BoJo, the yellow jackets, Erdogan, Orban, and the Ukraine crisis do not seem like promising developments.

@cathal @enkiv2 @kragen @freakazoid @dredmorbius
RTBF has some issues, mostly because it's a mechanism where the EU deals directly with Google. It's hard to see how it would apply to somebody running a site off their home internet connection, let alone a p2p system. It's not like it doesn't do some good, but because of its structure it's limited & increases the de-facto power of the stacks.

@enkiv2 @kragen @freakazoid @dredmorbius Well, the RTBF got codified and generalised significantly by the GDPR - the right to demand the amendment of false information, to require delisting or deletion of personally identifying data that is not in the public interest, all that looks like RTBF to me.

@cathal @enkiv2 @kragen @freakazoid @dredmorbius
Fair enough. I always considered the primary result of the GDPR to be more reasonable defaults about data collection -- nobody will actually *agree* to all their traffic being vacuumed up the way it has been, given the choice, & it can't be justified as necessary, so if you want to do business in the EU you just delete logs more often and ditch the tracking pixels.

@enkiv2 @dredmorbius @freakazoid CWs were an organic change, first-propagated in relatively insular communities where norms permitted (non-religious) preaching & crowd-shaming. Not sure how that change would go if not sparked in a similar community, but there doesn't seem to be a similar community at the moment.

@kick @enkiv2 @dredmorbius @freakazoid
Well, the communities I had in mind were IPFS and SSB (mostly SSB -- IPFS is much more tech-libertarian with a civil-libertarian streak, while the core SSB developers are very interested in the problem of community norms & how to deal with an environment of high speed off-the-cuff communication with permanent messages).
