A large chunk of the Fediverse was scraped; your posts are being “released” 

@tastytea Fact remains that what's publicly accessible will all but certainly be accessed. And much of what's posted on Mastodon *is* public.

I'm not addressing the moral, ethical, or research dimensions. And I honestly don't know where GDPR falls on this. But access to public (and private) content can be highly socially beneficial, see and generally.

It's also been a huge part of the work *against* resurgent fascism and media manipulation, and others.

@dredmorbius @tastytea

I guess, but honestly it still feels pretty creepy to me.

What measures were/are in place to prevent them from helping themselves to stuff I specifically locked or sent as DMs? Anything? Thanks in advance.

@xenophora An honestly good question.

"Intent" would be a large part of that. Posting publicly (or unlisted) *provides access to anyone to read*. Posting "followers only" (a pretty bad design option, honestly, given that *anyone* can follow an account, unless locked), or DM, would be a *strong signal* of intent that "this is not public".

A huge problem is that people really don't get this or think like this, at least in many cases. Which argues against the rule, not the people.


@xenophora FWIW, I don't think that the argument "you've got no expectation of privacy for what happens in public" is in the least bit valid. Evidenced in large part by *published documentation* from intelligence agencies on how to *seek* privacy *by going to public spaces* (parks, restaurants, museums, etc.).

I also see *disaggregated public information* as hugely different from *aggregated public information*, due to the vastly lower search costs.

Which means specific details matter.


@xenophora Simple hashing of user identifiers is *not* sufficient for privacy guarantees, in virtually all instances, but especially where the search space is relatively small. Across a few hundred Mastodon instances, the active public profiles number in the 1,000s - 10,000s, I suspect, which is a *hugely* tractable search space, no matter how fine your hash.
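To make the search-space point concrete, here is a minimal sketch, assuming the release replaced each handle with an unsalted SHA-256 digest. The handles and the function names `pseudonymize` and `reverse_lookup` are made up for illustration and do not describe any actual dataset.

```python
import hashlib


def pseudonymize(handle: str) -> str:
    """Unsalted SHA-256 of an account handle, as a naive anonymizer might compute it."""
    return hashlib.sha256(handle.encode("utf-8")).hexdigest()


def reverse_lookup(hashed_ids, candidate_handles):
    """Recover handles by hashing every known public handle and matching digests."""
    table = {pseudonymize(h): h for h in candidate_handles}
    return {hid: table.get(hid) for hid in hashed_ids}


# Hypothetical public handles; a real attacker would simply enumerate
# the public profile directories of the instances involved.
known_handles = [
    "alice@example.social",
    "bob@example.social",
    "carol@another.example",
]

released = [pseudonymize("bob@example.social")]
recovered = reverse_lookup(released, known_handles)
```

With only thousands of candidates the lookup table is built in well under a second, and a longer or stronger hash function does not change that.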

Rehashing over time, say, every day or week, to allow tracking threads, but not personal activity over time ... maybe.
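One way the rehashing idea could look, as a rough sketch only: derive a fresh key per ISO week from a secret that whoever releases the data keeps to themselves (an assumption added here), and use that key to compute each pseudonym. The name `period_pseudonym` and the weekly interval are illustrative choices.

```python
import hashlib
import hmac
from datetime import date


def period_pseudonym(handle: str, day: date, secret: bytes) -> str:
    """Pseudonym that is stable within one ISO week and unlinkable across weeks
    for anyone who does not hold the secret."""
    year, week, _ = day.isocalendar()
    period = f"{year}-W{week:02d}".encode("utf-8")
    # Derive a per-week key, then key the handle's digest with it.
    period_key = hmac.new(secret, period, hashlib.sha256).digest()
    return hmac.new(period_key, handle.encode("utf-8"), hashlib.sha256).hexdigest()


secret = b"hypothetical-secret-never-published"
same_week_a = period_pseudonym("bob@example.social", date(2020, 1, 6), secret)
same_week_b = period_pseudonym("bob@example.social", date(2020, 1, 8), secret)
next_week = period_pseudonym("bob@example.social", date(2020, 1, 15), secret)
```

This only helps if the secret is never released or leaked. If the handle were merely hashed together with the week number, the same enumeration attack as with plain hashing would still work.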


@xenophora Another option might be to disaggregate content into small-n tuples, at the word level, so phrases of 2-4 words (analysis rapidly explodes above, or even at, this level), and come up with a *per conversation* or *per thread* identifier, rather than persistent individual identifiers over time.

This preserves much ability for meaningful analysis *and* a lot of privacy. I'm not sure that it's sufficient.
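A minimal sketch of that disaggregation, assuming whitespace tokenization and a random identifier generated once per thread. The names `word_ngrams` and `disaggregate_thread` are hypothetical, and nothing here shows that the result is private enough.

```python
import secrets


def word_ngrams(text, n_min=2, n_max=4):
    """All word-level n-grams of length n_min..n_max, shortest first."""
    words = text.lower().split()
    return [
        tuple(words[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


def disaggregate_thread(posts):
    """Turn the posts of one thread into (thread_id, ngram) records.

    The identifier is random and per thread, so records carry no
    author identifier and cannot be joined across threads by ID.
    """
    thread_id = secrets.token_hex(8)
    return [(thread_id, gram) for post in posts for gram in word_ngrams(post)]


records = disaggregate_thread(["hello there world", "ok"])
```

Distinctive phrases can still identify an author even without an identifier, which is one reason the approach may not be sufficient on its own.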


@xenophora Keep in mind too that resistance to scraping tends to come from two specific constituencies:

1. The powerful or anti-social, who seek the shield of being able to comment or post content with impunity and protection from copying. I include the publishing industry, but also most establishment power.

2. The weak and oppressed, with genuine, or at least credible, fears of being attacked either now or in future based on comments.

Very different groups, similar concerns.


@dredmorbius @tastytea

I'm kind of flabbergasted at the thought that these strangers whose ultimate motives I don't even know could be archiving my damn artwork longer than .Art itself does. 🖕


I believe it's images that expire after a year or so, though the posts they're attached to stay on.

@dredmorbius @tastytea

Yeah, I'll be honest: even if I send a DM it's always in the back of my mind that someone who isn't supposed to get hold of it could still get hold of it. It's a necessary thought but I still hate having to think it.

@xenophora Very much this.

It's front-of-mind for me, but then, that's how I roll.

I don't expect everyone to think similarly, though I really wish they would.


@dredmorbius @xenophora @tastytea IMO the default should be self destructing posts unless explicitly marked as permanent. i feel this would closer match how people seem to expect social posting to work. as close as you can get, anyway, since people’s mental models of how this works are highly self contradictory.

@zensaiyuki Yes, this.

I think we're going to have a Come to moment on this eventually, based on ... a whole mess of stuff. But basically the non-tenability of saving Everything, All the Time.

Digital Information Archival reminds me a hell of a lot of Borges' Map. At some point, 1:1 correspondence (or worse, n:1) of record-to-reality loses utility.

My concept might be an approach: nuke most things, save the best + some sample ("best" is time-contextual).

@xenophora @tastytea

@zensaiyuki I've toyed with the idea of phasing in and out identities over time.

Might be a "lives for a year, sleeps for a year, is destroyed", or a longer or shorter interval.

Retention periods and policies have Been A Thing in business for decades, mostly paralleling the adoption of computer tech, and the notion of (legal) document discovery:


@xenophora @tastytea

@dredmorbius @xenophora @tastytea as valuable as old posts are, and how annoying it is to see an awesome old post or thread disappear, that should be weighed against the potential harm of old postings being discovered from years ago and pulled out of context to ruin people’s lives. it’s a nice weapon against those in power… sometimes. but it’s much more potent against vulnerable people.

@zensaiyuki And *really smart people* have been *absolutely wrong* about the potential negative use of infotech. Herbert Simon very pointedly, claiming the Nazis operated without "mechanized data processing":



But they didn't: ibmandtheholocaust.com/

@xenophora @tastytea

@zensaiyuki @dredmorbius @xenophora @tastytea you can't force this technically. Instances would save data forever for even non-malicious reasons, like backup.

Forgetting is really a thing endemic to human memory, it can only work on computers by a mild convention and requires a lot of trust.

The closest thing to forgetting for computers is drowning data in oceans of other data in hopes that indexing bots have limited cataloguing capacity.

@isagalaev @dredmorbius @xenophora @tastytea you’re not wrong, necessarily, but there’s no reason an instance couldn’t be set to automatically cull old posts. as you say, it wouldn’t prevent scrapers from making backups, but it would drastically reduce the opportunity to do so.
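As a rough sketch of what automatic culling on a well-behaved instance might look like, assuming a one-year default lifetime and a per-post `permanent` flag. The `Post` type and `cull_expired` are invented for illustration and are not Mastodon's actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Post:
    id: int
    created_at: datetime
    permanent: bool = False  # hypothetical opt-in flag set by the author


def cull_expired(posts, now, max_age=timedelta(days=365)):
    """Keep posts that are marked permanent or are no older than max_age."""
    return [p for p in posts if p.permanent or now - p.created_at <= max_age]


now = datetime(2020, 2, 1, tzinfo=timezone.utc)
posts = [
    Post(1, datetime(2018, 1, 1, tzinfo=timezone.utc)),                  # old, culled
    Post(2, datetime(2018, 1, 1, tzinfo=timezone.utc), permanent=True),  # old, kept
    Post(3, datetime(2020, 1, 1, tzinfo=timezone.utc)),                  # recent, kept
]
kept = cull_expired(posts, now)
```

This only covers the instance's own live store. Copies in backups, on federated servers, or held by scrapers are untouched.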

@isagalaev @dredmorbius @xenophora @tastytea and although “the internet is forever” is a meme, keeping data alive and backed up actually does take an enormous amount of human effort. it is *not* the default for computers to remember.

@zensaiyuki @dredmorbius @xenophora @tastytea I'm mostly concerned about the false sense of security from knowing that most servers are well-behaved. You only really need one that isn't.

@isagalaev @dredmorbius @xenophora @tastytea the only solution I can think of for that is don’t be on social media, don’t run a server. none of this is okay or secure, and thinking any of it is, at all, is false.

@isagalaev @dredmorbius @xenophora @tastytea that would be rather extremist though, but no more extremist than concluding everything is fine, change nothing.

@zensaiyuki @dredmorbius @xenophora @tastytea that's a bit absolutist :-) Why not just accept the fact that all your public posts could potentially be linked back to you? For most people it's not a bad thing. What we do need in society is to get rid of stigma associated with it. If someone said a horrible phrase 10 years ago or posted a racy nude, it shouldn't be a reason to ostracize a person and disqualify them from life.

@isagalaev @dredmorbius @xenophora @tastytea that is a position only a privileged white heteronormative cis male can take. lucky if that’s you.

@isagalaev @dredmorbius @xenophora @tastytea and just to coddle your feelings a bit, that’s not dismissing you. it’s just pointing out the literal fact that position only works, in practice, for that one group of people

@zensaiyuki @dredmorbius @xenophora @tastytea my feelings are fine :-) I know from what position I'm talking and thank you for providing a different perspective. And to clarify, I don't think the status quo is fine, I'm just saying that relying on a "should-delete" convention probably won't work.

@isagalaev @dredmorbius @xenophora @tastytea i will concede that it isn’t a perfect solution. but “won’t work” requires a very narrow definition of “work”. I think though, it *would* work, in the sense that it would be an improvement over how things work now, and reduce opportunities for abuse. that it doesn’t completely eliminate abuse doesn’t strike me as a strong argument against it.

@zensaiyuki I'd shift the notion of privileged group to _whatever_ group enjoys protection, latitude, impunity, and immunity. Often but not always cis white males.

More generally: the privileged oligarchy, whatever it may be, as well as various populist supporters, though those may find themselves discarded on little notice.

@isagalaev @xenophora @tastytea

@dredmorbius @isagalaev @xenophora @tastytea naturally. the definition of “white” changes on the convenience of whoever has power.

concerns ethnic prejudice 


@isagalaev "Stigma" is a canard. There is a very real, and unquantifiable, risk.

One element of the movement has been a tremendous shift in the actionability of records (and memories) thought old and dead.

Eric Schmidt's famous "Google's policy is to get right up to the creepy line" quip fails to recognise that *the creepy line is not fixed*. What's considered acceptable or unacceptable changes dramatically with time. We're in the midst of a shift.

@zensaiyuki @xenophora @tastytea

@isagalaev What makes unusual is that instead of disempowered and minority populations or cultures being targeted, it's *specifically* power and its abusers. Not unheard of in history, but also not the usual dynamic.

Skin colour, politics, sexual orientation, past relationships, thoughts, writings, etc., have all proved fatal for some.

Antoine Lavoisier comes to mind.

Witch hunts, purges, pogroms, genocides, reeducation, and other realignments.

@zensaiyuki @xenophora @tastytea

@dredmorbius @zensaiyuki @xenophora @tastytea "What's considered acceptable or unacceptable changes dramatically with time.": this is tremendously important. Whenever something "creepy" floats up from times long past it's worth considering that it might have been downright normal back then. I think this shift you mention should include acceptance of this fact as well. (I'm still not sure I'm communicating my mind well.)

Triggers: pretty much all of them. 

@isagalaev NB: Historians are very familiar with this concept. Historicism / presentism is a constant bias to fight against.

Often unsuccessfully.

@zensaiyuki @xenophora @tastytea

@isagalaev It's possible to mark records for deletion-on-restore, and offline-backup is different from online/nearline access. Though it's fungible with that.

A backups-retention policy where media *are* aged out over time is another option, though that Requires Procedures and Adherence To Them.

Transmission, retention, and destruction of data all rely strongly on trust.

@zensaiyuki @xenophora @tastytea

@xenophora @dredmorbius @tastytea as far as I understand your DMs are safe (as long as you trust the admins of involved instances). @dredmorbius is correct on the whole "intent" thing.

@dredmorbius @tastytea finally someone gets it. anti-copyright and anarchist until someone scrapes your website and now you want to sue