You need to stop using Chrome NOW. It’s not hyperbole: Google just rolled out a change to Chrome that tracks the sites you visit, builds a profile, and shares that with any page you visit that asks.
This is real. It’s not tech bro conspiracy shit.
It’s not just about selling you ads.
Ex: you’re a teenager living in a highly conservative state. You’re visiting sites your ultra-religious family doesn’t want you to visit. Google tracks you NATIVELY IN THE BROWSER and informs 3rd parties of your interest in LGBTQ sites.
You’re NOT SAFE using Chrome.
GOOGLE IS NOT YOUR FRIEND!
GOOGLE DOES NOT HAVE YOUR PRIVACY OR INTERESTS IN MIND!
YOU ARE A COMMODITY THAT THEY WILL BUY AND SELL BY THE POUND!
@publictorsten @semioticstandard Yeah this is what drives me nuts about this whole discourse. The status quo of tracking, which collects 1,000+ data points about you and stores them forever in places you don’t even know about, knows your sexual orientation. Topics/the privacy sandbox doesn’t have the means to ask or know, by design.
But nobody kvetching about it has read the spec, at all.
@MisuseCase @publictorsten it's therefore important to run a bot that issues countless queries to mask your "true" interests.
Finally, a good use for AI
@MisuseCase i personally am not concerned with the contents of the spec as i view the spec as largely a marketing document. half of google's press releases these days are about some security work they're doing in order to give the impression that they care about privacy, especially after the google+ breach. google has established a reputation as a company that will lie whenever possible and i consistently advocated against further integration with their services when i worked for the federal government because of this.
i'm sure many people worked very hard on the spec, and part of why i won't work for google is because i know they won't respect my output unless it aligns with their extremely cynical corporate objectives
@hipsterelectron A spec is not a marketing document or a press release.
But, whatever, you can’t reason someone out of a position they didn’t reason themselves into.
@hipsterelectron @MisuseCase At least from my perspective, the spec is useful to look at because it's what sites/advertisers/adversaries will program to. If it's not in the public spec/API, then sites can't use it.
If we presume the existence of API calls/parameters that are not public then exploitation would either require accidental or intentional exposure on the side of the browser vendor and discovery/collusion between the browser vendor and the site/attacker. While this is eminently possible, the existence or absence of topics doesn't enable (or preclude) said undisclosed API from existing.
As a result, I think it's useful to consider the public API when evaluating this feature because it's what most sites/adversaries will program against. Public APIs don't beget or prevent private APIs from existing, so the potential existence of the latter is disjoint from the danger posed by the former.
@hipsterelectron Looking at the public API, the way it works is that you get a vector of topics about a user, drawn from a predetermined list. Based on this, the obvious risks (assuming that the API is followed, due to the above logic) are:
* Some topics expose dangerous or sensitive information about a user. A few obvious examples would be that there are some "job seeking" topics in the current list. I don't see any obvious health topics, but that leads into...
* Some collections of topics leak information that is not explicitly enumerated by the topics themselves. Suppose that there's a strong correlation between the presence or absence of some subset of topics and some additional property about the user; this could leak additional information, though not as much as...
* A given collection of topics may fairly uniquely identify a user, if the combination carries enough entropy to narrow down or confidently identify a given individual.
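The third bullet can be ballparked with simple counting. A sketch, assuming a taxonomy of roughly 350 topics (a hypothetical round figure, in the ballpark of the draft taxonomy) and the per-page cap of 3 topics:

```python
import math

# Number of distinct unordered 3-topic combinations a caller could observe,
# assuming a ~350-entry taxonomy (hypothetical round figure):
combos = math.comb(350, 3)
bits = math.log2(combos)

print(combos)          # 7084700
print(round(bits, 1))  # 22.8
```

~23 bits is not enough to single out one person among billions (~33 bits) on its own, but it is a real boost when combined with other signals like IP address or user agent.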
@hipsterelectron There's a paper talking about the risk of the third and it makes a fair argument that the current design makes it difficult (particularly through the injection of entropy by randomly adding topics). I think that the first two are of the most concern, since:
* The selection of enumerated topics to avoid dangerous or sensitive topics may not consider the risks to specific marginalized groups, and
* Topic collections may still contain enough information to imply dangerous personal properties, even if they may not themselves intrinsically identify an individual. A strong correlation between a topic bag and an at-risk category in conjunction with additional uniquely identifying information would be dangerous.
It's not clear to me exactly how dangerous these concerns are. Getting a wide range of opinions on the topic list may help with the former, but the latter is hard to quantify without a broader statistical sense of properties-of-concern and correlated topics.
@hipsterelectron The current API spec makes implementing the latter approach difficult: the browser picks 5 topics per week and will only ever provide those 5 topics within that week. Pages are also only ever supposed to get 3 topics at a time (at most one per weekly epoch), though this seems like it could be worked around via different domains and similar.
As a result, building a large topic profile for a given user (assuming that additional information was available to uniquely identify them) would require observation over a large period of time or for the sudden appearance of a topic as a top topic to be relevant in and of itself (e.g. the job hunting example).
I think then that the API is potentially dangerous but it's hard to generally exploit (needs to be a site you visit a lot and for the risky topic's frequency to be relevant).
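How long "observation over a large period of time" actually takes can be estimated with a small simulation. A sketch, assuming (as a simplification of the draft spec) that a single frequently-visited site learns one uniformly chosen topic from the user's top 5 each weekly epoch:

```python
import random

def weeks_to_full_profile(num_top_topics: int = 5, trials: int = 20_000,
                          seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of weekly epochs before a
    single caller has observed every one of a user's top topics."""
    rng = random.Random(seed)
    total_weeks = 0
    for _ in range(trials):
        seen: set[int] = set()
        weeks = 0
        while len(seen) < num_top_topics:
            seen.add(rng.randrange(num_top_topics))  # one revealed topic per epoch
            weeks += 1
        total_weeks += weeks
    return total_weeks / trials

# Classic coupon-collector result: 5 * (1 + 1/2 + 1/3 + 1/4 + 1/5) ≈ 11.4 weeks,
# i.e. roughly three months of regular visits to see even one epoch's full profile.
```

This ignores the 5% noise topics and the fact that top topics shift week to week, both of which slow an observer down further.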
@hipsterelectron Interestingly, the spec identifies a number of these concerns https://github.com/patcg-individual-drafts/topics?search=1#privacy-and-security-considerations and actually notes that colluding hosts (or one host with a bunch o'domains) could get up to 15 topics. This makes the correlation case much more potentially risky, and they do absolutely nothing to mitigate that.
I think then that the potentially poor selection of topics and the risk of correlation of topics/tracking of topics over time is the riskiest part of the API, made more concerning by its standardization. In some senses, tracking cookies are de-risked by their information being scattered across many parties, but if you know that almost all users expose this standardized thingie then it's easy to target and exploit even if you're not an advertising house.
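To put numbers on the collusion case: counting unordered topic sets gives a rough upper bound on identifying entropy. A sketch, assuming a taxonomy of roughly 350 topics (hypothetical round figure) and, generously, 15 distinct topics pooled across 3 epochs by colluding callers:

```python
import math

def fingerprint_bits(taxonomy_size: int, distinct_topics: int) -> float:
    """Upper-bound entropy, in bits, of an unordered set of distinct topics."""
    return math.log2(math.comb(taxonomy_size, distinct_topics))

# One honest caller, one page load: at most 3 topics.
print(round(fingerprint_bits(350, 3)))   # ~23 bits
# Colluding callers pooling 5 topics across each of 3 epochs: up to 15 topics.
print(round(fingerprint_bits(350, 15)))  # ~86 bits
```

In practice top topics repeat across epochs and the noise topics dilute this, so the real figure is lower, but the gap between a single read and a colluding read is still enormous.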
@MisuseCase @publictorsten They don't need to ask or know - they can use zero-knowledge proofs or deanonymization tactics to get the information that way instead.
And it's not like the 100K places that have our information will just give up the access they already have just because Google made a new setup; it would take regulation to require them to drop the information they currently have.
@publictorsten @AT1ST It’s nice that the EU requires affirmative consent for stuff like this, but one of the problems with the GDPR (IMO) is that tech companies and advertisers can and do overwhelm users with pop-ups asking their permission for things, often in an unclear way, to the point where the pop-ups become essentially meaningless and people just click through them.
Also from what I’ve seen on here people aren’t necessarily clear on what they’re saying yes or no to when it comes to Topics.
@davet @publictorsten @AT1ST “It’s a big improvement over the status quo that invasively tracks people including sensitive personal information about them like their health status and sexual orientation” is not “dystopian.” Words mean things!
And Google is coming up with this because they see the writing on the wall and expect increasingly robust privacy legislation even in the U.S. This is their compromise. It’s a fairly decent compromise.
@davet @publictorsten @AT1ST What I am doing here, and the *only* thing I am doing here, is saying how Thing B actually works, compared to Thing A which is currently in place (and very bad), because it looks like nobody around here has looked at how Thing B actually works.
I would also like Thing C but it’s not on the table. Thing B is the compromise between Thing A and Thing C.
@mathw @publictorsten There might be a flaw in it, but there is a flaw in most protocols or software upon release TBH. Part of being responsible is assuming that you will have to maintain and upgrade it as you find weaknesses and vulnerabilities.
The biggest weaknesses with this, at the moment, are not technical ones, but social ones. Other players in the industry may not want to adopt this and of course people are freaking out about it.
I might have believed old Google to treat the secrecy of its customers seriously, many, many years ago, when I was at Google, working to take the secrecy of its customers seriously.
But over the last several years, they have quite clearly taken a turn towards the evil, and I would now definitely advocate against trusting that Google treats the secrecy of its customers seriously.
@publictorsten I believe the change likely has to do with a significant change in the upper middle management at Google. It's probably not that any single person made a deliberate decision to now do evil shit just for the giggles; it's that the management recruitment policy shifted over time, and the new bunch has a different, more maleficent idea of what is a normal thing for a megacorporation to do.
I hate all search engines! They know more about me than my mother does.
@Urban_Hermit The justification for using things that you have searched before is that you might be on a multi-search spree, trying to find out something tricky that your previous searches help put into proper context. For example, if you're searching for tulips and your previous search was for bubbles, you might be interested in articles about the history of the tulip-mania financial bubble, but if your previous search was for chocolate, you might instead be interested in places that deliver chocolate and tulips.
Search engines like to over-stretch that justification, though.
@riley @Urban_Hermit @semioticstandard
How bizarre. This is my first time hearing this about search engines.
Thanks so much 🙏
@semioticstandard Good point. And for those who are low risk, your web activity would be helping to train ML for hard-to-detect discrimination against others.
IMHO people who can switch browsers or turn off tracking are in a position to help with cooperative protection even if unlikely to be targeted themselves (https://blog.zgp.org/prejudiced-landlord/)
@semioticstandard @akosma To be fair: such a visit would be tracked under the broad topic "health and lifestyle" and only offered to ad providers running ads on other sites that fall under the "health and lifestyle" topic.
I don’t think Topics offers more user profiling than third-party cookies do.