@alcinnz A huge amount of JS could be eliminated if basic HTML had some notion of a post/comment response tree, relation, and interactions (scoring, filtering, replying, muting, blocking). That should be a fundamental part of the epistemic (content-based) Web.
I've also argued that the Web should be divided into four roles:
- Text/content and interactions.
- Commerce, including payment and trust mechanisms.
- Multimedia: video/audio playback.
- Apps beyond these.
@alcinnz The Web *began* as a content delivery / publication system (though lacking Critical Bits such as Search and Archival).
Media got bolted on via the <img> tag (later audio and video), and commerce was later added. Both remain problematic.
The absence of sane styling defaults, of a recognised set of standard page formats (index, article, gallery/catalogue, discussion, stream, etc.), and of uniform formatting is a huge part of the problem -> CSS and JS just paper over it.
@dredmorbius Wow! That's plenty to think about!
After next week I'll start exploring tag-based bookmark management, and I'll be keen to get your feedback on my take.
I can't say I'm 100% against CSS (I prefer webdevs to apply style that way rather than using HTML), but at the very least userstyles need to be significantly more prominent, whether or not we want to block author styles.
@dredmorbius Certainly I do want to separate apps out into their own platform, for which I generally like proposing an offline Elm sandbox.
And it strikes me that, when creating a browser for a voice-assistant (and maybe eventually Smart TV) form factor, I'm tending to separate forms (as used in checkout) into their own UI.
So I think we have some very similar ideas here!
@dredmorbius Oh, can you elaborate on how you imagine this comment tree working in HTML?
@dredmorbius P.S. I've linked to your post from my page advising webdevs on how to help transition to a less bloated Web.
And I'm keen to describe the web experiences I want to design in a few days, for feedback.
@alcinnz So a comment tree in straight HTML might require some thinking. There'd have to be _some_ smarts, client or server side.
Basically: you need elements. And/or relationships. I think you can even skip specific <article> vs. <comment> elements if you have instead <parent> <child> or <in-reference-to> / <citations> type relations. See email threading and specifically Mutt and jwz's threading model for Netscape.
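To make the threading point concrete, here's a minimal sketch of jwz-style threading in Python: rebuild the reply tree from nothing but (id, in-reference-to) pairs, the way Mutt and Netscape thread email. The names and data shape are hypothetical, not a proposed standard.

```python
# Minimal jwz-style threading sketch: build a reply tree from
# (msg_id, parent_id_or_None) pairs. No <article>/<comment>
# distinction needed; the relation alone carries the structure.
def build_thread(messages):
    children = {}   # parent id -> list of child ids, in arrival order
    roots = []
    for msg_id, parent in messages:
        if parent is None:
            roots.append(msg_id)
        else:
            children.setdefault(parent, []).append(msg_id)

    def subtree(msg_id):
        return {msg_id: [subtree(c) for c in children.get(msg_id, [])]}

    return [subtree(r) for r in roots]

tree = build_thread([("a", None), ("b", "a"), ("c", "a"), ("d", "b")])
# tree == [{"a": [{"b": [{"d": []}]}, {"c": []}]}]
```

Everything else (ordering, scoring, moderation) can layer on top of this one relation.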
@alcinnz Then there's the question of _what elements are actually within a particular comment thread or tree_. We've relied entirely on server-side, parent-centric, site-specific models for that to date. Basically: everyone codes their own and they don't interoperate.
Also: we've had interoperable systems (Usenet, LISTSERV, Mailman), and ... they had ... issues.
Lots of motherloving spam.
So: the parent might compile a list of validated responses as comments. Option # 1.
@alcinnz A feature (problem / benefit depends on your viewpoint) is that parents could determine what responses they do or don't want to acknowledge. If you're mitigating stalkers / hackers / trolls, that's good. If you're trying to deceptively manipulate conversation / narrative, maybe not so much.
So option # 2 is for freestanding references to be able to cite what they're referencing. The tree can be acquired from the branches rather than root.
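To make option # 2 concrete: a freestanding reply could declare its parent with a link relation, letting a crawler reassemble the tree from the leaves. This is a hypothetical sketch, not any existing spec (the rel value is made up):

```html
<!-- Hypothetical child document; "in-reference-to" is not a
     registered rel value, it just names the relation above -->
<article>
  <link rel="in-reference-to" href="https://example.com/posts/42">
  <p>My Considered, Reasoned, and Illuminating Thoughts...</p>
</article>
```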
@alcinnz Oh, and inherent in this is that parents and children are explicitly distinct documents with relationships. Rather than chunks of stuff within a single page, though they might be _represented_ that way.
Mind: you could (and ... some poor lost disillusioned soul might) try to throw it all in one bucket. But That Would Be WRONG!!!
Within a rendered page ... some sort of inclusion reference. Which does, natch, raise all kinds of further issues (security, snooping, etc., see IFRAMEs).
@alcinnz Any time you allow content from multiple authorities / entities to be intertwingled, you're going to want to set some sort of baseline of maximum allowed features. Say, just straight text and styling, perhaps.
(And just having a limited feature set doesn't preclude misusing that feature set, though it may make misuse easier to identify.)
@alcinnz Then there's interactions.
The idea of a comment thread is ... well, it's _dynamic_. At the very least, it's additive: others can add their Considered, Reasoned, and Illuminating Thoughts.
There's the question of ranking and rating, ordering, filtering. Possibly editing (and questions of versioning). Deleting. Moderation.
Some minimum featureset should be specifiable here, and generally, implemented through HTML, HTTP, and browser features.
@alcinnz And given that We Have Been Doing This Shit For Nigh On Forty Years, there might even be some sort of rough reasonable consensus as to What Said Features Might Be.
(Apply the exponential doubling cost rule to any proposed extensions as with page schemas.)
@alcinnz But again, in brief:
Within page layout: a parent-child hierarchy.
Between documents: a citations / in-reference-to relationship. Children know their parents, parents know some of their children.
For elements: a set of defined actions: reply, rate, hide/dismiss/flag, classify. Moderation and reputation features. Spam classification.
@dredmorbius I really don't have much of an opinion on what the best approach would be, but from my perspective of being deep in the weeds attempting to implement new browser engines:
It'd be reasonably easy for the browser to support inserting HTML snippet HTTP responses into arbitrary elements on the page. Many existing forum/commenting systems could easily be recreated with that...
I would simplify HTML error recovery, and allow no additional CSS and, of course, no JS.
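If a browser did grow that ability, the inclusion could stay declarative; the attribute below is purely hypothetical (htmx does something similar via JS today):

```html
<!-- Hypothetical: an attribute naming the element that should receive
     the HTML fragment the server returns — no script involved -->
<a href="/comments/42/replies" insert-into="#replies-42">Load replies</a>
<section id="replies-42"></section>
```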
@alcinnz Easy != Good Idea.
Look at where <iframe> got us. Yes, you can embed entire Web pages inside other Web pages. The trust / risk implications turn out to be ... immense.
This gets to a general model of coming up with some kind of document complexity and capabilities model, and granting capabilities based on trust. uMatrix is effectively one (fairly crude) tool for doing this. I've taken a few stabs at defining this, none are especially satisfactory.
Starting at base levels: there are characters, sentences, linefeeds, paragraphs, images, styling (fonts, colours, sizes), structured docs (metadata, titles, sections, lists, tables, references, equations), explicit layout blocks, multimedia (video/audio), programmatic content, remote / cross-site includes.
Each carries capabilities but also risks. Defining, say, a "comment trust level" of "text, paragraphs, bold, italic, superscript, subscript" might be a decent base level.
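A sketch of enforcing that base comment trust level with Python's stdlib parser; the allow-list is just the tags named above, and a production sanitizer would need far more care (entities, nesting, URL schemes):

```python
from html.parser import HTMLParser

# "Comment trust level" filter sketch: keep only an allow-list of
# tags, drop every other tag and ALL attributes.
ALLOWED = {"p", "b", "i", "sup", "sub"}

class TrustLevelFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            self.out.append(f"<{tag}>")   # attributes deliberately dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

def filter_comment(html):
    f = TrustLevelFilter()
    f.feed(html)
    return "".join(f.out)

filter_comment('<p style="x">Hi <b>there</b><img src=x></p>')
# '<p>Hi <b>there</b></p>'
```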
@alcinnz Elements such as images, audio, and links would have to be granted based on specific trust relations, though whether those are explicitly granted by a site (or page) author / moderator, through software heuristics, or some mix, might be open to discussion.
Mind that abusers can abuse virtually anything.
@alcinnz I mean, Usenet had MMF (Make Money Fast spam):
My general view is:
- Focus on *behaviour* rather than *features*
- Recognise that *complexity* enables both more bad behaviour, and can mask it.
- Reputations matter. Rather than identify *content*, track the *creators*, both good and bad.
Effective trust networks tend to be *small*. A few tens, *possibly* hundreds, of actors. Trust scales poorly.
@alcinnz Complexity: more pieces, more types, more relationships, more types of relationships, more change over time.
Certainly, if/when I do, I should start from the behavior they're intended to preserve and reimplement that without JS. Make sure they have a use!
But also I'm keen to maintain graceful degradation so, amongst other things, I can always drop those features if they're not worth it.
@alcinnz Some discussion, "Platform Types":
@alcinnz Also, "Features & Capabilities":
@alcinnz Both of those links are fairly rough, but draw from much experience and several sources. It's a wiki, those can be edited.
@dredmorbius True. While I do think it's a nice evolution of hyperlinks to cover many/most use cases for Ajax, we probably shouldn't be embedding third-party HTML that way or any other.
Though I certainly wouldn't include that in the voice-assistant form factor I mentioned, it doesn't fit the experience I'm designing.
@alcinnz CSS isn't *necessarily* bad, but it *often* is. It's a near-necessity because browser defaults are so poor AND there's no reasonable set of default templates.
As the article says: "Web design isn't the solution, Web design is the problem."
What I often resort to is using some sort of "reader-mode" tool *and then restyling that to my preferences* for an optimal reading experience. The Reader Mode at least creates a consistent base state I can work from.
When you distribute design...
@alcinnz ... you end up with a bunch of different sources all reinventing the wheel, poorly and with a limited or nonexistent grasp of end-user needs, experiences, and interests.
If the default appearance of an unstyled Web page were closer to a Reader Mode page (margins, limited line length, decent fonts, images generally fitting to display width), we'd be a lot better off.
A catalogue of standard layouts which could have standard client-side stylings or optional site-supplied CSS would help.
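A user stylesheet approximating that reader-mode baseline can be tiny; the values here are personal taste, not a proposal:

```css
/* Reader-mode-ish defaults: centred column, readable measure,
   media constrained to the viewport width */
body {
  max-width: 36em;
  margin: 0 auto;
  padding: 0 1em;
  line-height: 1.5;
}
img, video { max-width: 100%; height: auto; }
```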
@alcinnz Note that that last _still_ allows sites to choose their own style *if they want*, but makes it far easier (and more uniform) for users to override that.
Leaving some stock styling options (header/footer colour schemes, logo and art) would allow personality and "branding" to be applied, but within sane limits.
Then you've got idiots (and tools) with absolutely no grasp of what HTML is or how it should be used, and ... I have seen things. Things you wouldn't believe. Ships on fire...
An old newspaper columnist who wraps everything in <h4> tags. Presumably because <p> is too small for their old eyes.
A site where *every motherloving paragraph* had explicit pixel-precise location placement.
Another old writer who manually inserts leading whitespace to indent paragraphs.
And don't get me started on the crap in RSS feeds.
Obfuscated "minified" Closure JS and CSS from Google and others. Utterly nonsemantic crap.
NYTimes' homebrewed JS <table> alternate.
@alcinnz There's the fact that stock HTML lacks equation support.
(One of my dreams is to scrap HTML for LaTeX and call it a day.)
Or reasonable endnote / citations / footnote support (also in LaTeX.)
I know a lot of this because I restyle a ton of sites *for my own use and ability to even motherloving tolerate them* through Stylus. Pretty damned much learned CSS by fixing Google+'s craptacular shit.
Again: sane defaults would make much of this go away. But we don't have those.
@dredmorbius There's MathML, but everyone other than WebKit and Igalia apparently had better things to maintain than that. Yuck!
@alcinnz MathML already puts you in JS land.
@dredmorbius Except if you exclusively target WebKit browsers. That's what my comment about browsers apparently having better things to maintain was referring to.
@alcinnz There've been some attempts. The hNews Microformats are one. I think that's languished if not dead, but also somewhat exemplifies the problem: it's *way* possible to go overboard with specifications.
I think we might need to front schema specs something like 16 free elements, and then charge $100 for the next, doubling the charge for each additional element. Want 10 more elements? That's $102,300, please!
Want 100 elements? $1.27 * 10^32.
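The doubling rule above works out to a simple closed form, assuming the surcharge starts at $100 and doubles per element beyond the free allowance:

```python
# Cumulative fee for n elements beyond the free allowance:
# $100 + $200 + ... + $100 * 2**(n-1) = $100 * (2**n - 1)
def schema_fee(extra_elements):
    return 100 * (2 ** extra_elements - 1)

schema_fee(10)   # 102_300
schema_fee(100)  # ~1.27e32
```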
Keep. It. Fucking. Simple.
@dredmorbius Certainly, I've seen plenty of that.
I'll look into hNews...
There are so many shit websites lately we need a certification of some kind. Call it "No Bullshit Certification". To earn this cert, a website must:
* function to some reasonable extent with j/s disabled
* all j/s must conform to #librejs
* be usable in lynx
* respect #netneutrality and thus not use #cloudflare
* be #CAPTCHA-free, or at least respect ppl w/disabilities
Then we need a search engine with scoring that puts "No BS Certified" results at the top.
@mathew Elements are good, but mechanism and enforcement matters.
By the mid 1990s, Usenet had a whole mess of newsgroups for governments, groups, topics, etc. Structure and elements. But they weren't used, or were spammed to death, or most often: both.
There was a saying for a while that "Google is a blind user". That favoured accessibility for a while. Google's gotten both less blind, and less significant with social / algorithmic discovery.
@mathew Reputation, _appropriate_ use of tools, penalties for misuse & misrepresentation, publishing tools that follow the fucking specs, and some means for dragging the old into the New World, would help.
Then again, look at books. We don't rely on publishers to do cataloguing correctly, we have librarians and cataloguing specialists.
Or did. That seems to be a dying art, in favour of full-text search and various Majickal Diskovery Tuuls.
@mathew I look at, oh, say, to pick an entirely random example, PDFs.
Here's the "pdfinfo" dump from a randomly selected PDF I've downloaded:
Producer: Adobe Acrobat 9.13 Paper Capture Plug-in
CreationDate: Mon Sep 22 17:12:13 2003 CDT
ModDate: Wed Sep 2 12:32:37 2009 CDT
Note conspicuously: no author, title, or publication date.
@mathew That is, as it happens, the "IBM System/360 Principles of Operation" manual, published by IBM in 1964, posted online by some Good Soul, and recently shared to Hacker News.
None of which is apparent from either the metadata or filename.
(I've been trying to formulate a reasonably consistent, useful file-naming format for a few years. Surprisingly difficult, as it happens. Generally:
Author-Name_Title-words_Publisher_Date.extension is a good start. Not all fields may be present.)
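That naming scheme is easy to mechanise; a sketch, assuming unknown fields are simply skipped and spaces within a field become hyphens:

```python
# Build Author-Name_Title-words_Publisher_Date.extension, omitting
# any fields that are unknown (None). Underscores separate fields,
# hyphens separate words within a field.
def make_filename(author=None, title=None, publisher=None,
                  date=None, ext="pdf"):
    fields = [f.replace(" ", "-")
              for f in (author, title, publisher, date) if f]
    return "_".join(fields) + "." + ext

make_filename(author="IBM",
              title="System 360 Principles of Operation",
              date="1964")
# 'IBM_System-360-Principles-of-Operation_1964.pdf'
```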
@mathew PDF *HAS* the structure to capture metadata.
And not only isn't it used, it's a motherfucking pain in the ass to update, correct, or even *find* this fucking information. Particularly at scale.
Again, it's often the pirates (Libgen, ZLibrary) or librarians (Internet Archive) who Get This Mostly Right.
@dredmorbius @alcinnz I use a handy Mac app called PDF Attributes which lets you edit core metadata for any PDF easily.
@mathew But yeah, basically "we've got the fields / structure / metadata formats / microformat specs for that" DOES NOT SOLVE THE MOTHERFUCKING PROBLEM.
I've watched that approach be trotted out for about 35-40 years now. GO READ SOME MOTHERFUCKING HISTORY OF INFORMATION. Seriously.
(Yes. I'm slightly unhinged. But the field / discipline is *MASSIVELY* motherfucking delusional or intentionally ignorant.)
You need some form of Cataloguing Authority, with balls and licence to kill.
@mathew Churn (organisational, technological, standards, etc.) is a major challenge here.
One of the big issues is that you pretty much *cannot* have a Consistently Applied Well-Formed Accurate and Useful index. You're going to have a smattering, probably best described by a sort of maturity levels index model:
- Algorithmic / AI/ML
- Expert review & contextualisation
- General domain consensus (multi-expert)
I have numerous hand-scanned books I still need to convert to PDF or djvu format.
The latter is nice, but the toolchain is still very immature. Sadly, we're going to be stuck with PDF for another decade or more, IMO.
Or I've hand-typed (or corrected) docs, e.g. https://pastebin.com/raw/Bapu75is from https://archive.org/details/technologicaltre1937unitrich/page/39
@mathew Many public / school libraries have these, either dedicated units or integrated into copy machines. Can save as jpeg or other format to a flash drive.
I prefer the face-down bed scanners to the face-up versions (flatter, more uniform results). Similar to this: