r/degoogle 2d ago

Question Privacy concerns for the data copies already on those ugly data centres

So I know that Google keeps copies of everything that's done through its websites, plus other info, but what happens when you edit the content? Does it overwrite the previous copy, or does it make a new copy for each edit, effectively storing the whole edit history? I mean content like images, notes, docs, etc.

And uh, another question: what are the odds that the personal info of a random Jake is actually being read? Like Jake's emails, his photos, etc.? The population of Google users is kinda huge, so I'd assume not everyone's data is being looked at?

u/xeptoh 2d ago

First of all, only someone working at Google can really answer those questions.

From my experience (a data-related job in a highly regulated industry), I can imagine that they store the full history. I have seen huge tables with full daily snapshots even when only a few rows changed (and there are better ways to do this). Storage is cheap and data is really precious, so companies are reluctant to delete anything. You are far more likely to find duplicated data than deleted data. I would assume they are storing all changes.
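To make the "store every change" idea concrete, here is a minimal Python sketch of an append-only store. It says nothing about how Google actually stores data; the `VersionedDoc` class and its methods are made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDoc:
    """Hypothetical append-only store: every edit adds a version, nothing is overwritten."""
    versions: list[str] = field(default_factory=list)

    def edit(self, new_content: str) -> None:
        # Append instead of overwriting, so the full edit history survives.
        self.versions.append(new_content)

    def current(self) -> str:
        # What the user sees is only the latest version.
        return self.versions[-1]


doc = VersionedDoc()
doc.edit("first draft")
doc.edit("second draft")

print(doc.current())      # the user only sees the latest edit
print(len(doc.versions))  # but every earlier version is still stored
```

From the user's side the old content looks gone, yet in a design like this it is still sitting in `versions`.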

Regarding your second question: usually each employee only has access to a specific subset of the data (the need-to-know principle). Access is mainly granted on two levels: source and rows. For example, one employee could have access to emails and another to ads, and the one with access to emails might only see clients from Europe. This pattern holds for most employees, so only a fraction of the employees with data access could potentially see yours.

There are exceptions, though. Employees who create global reports or work on ML models may have access to many sources and data points. IT can usually get unrestricted access to data, but only for a limited period, because it is needed for maintenance or deployment activities. (A data breach can happen this way: IT staff doing backup or recovery work have unrestricted access and also need to be able to "transfer" data.)

After this long explanation, the real question is whether any of those people are interested in your data. The answer is probably not: you are just one out of millions of clients, and you are not special. Unless someone with access knows you and really wants to check your data, I would not worry too much. I have seen emails containing clients' personal details used as "examples" of strange findings in the data (e.g. a field with a wrong value), but that is completely normal in a company, and your data is just being used in a functional way.

In my opinion, the threat is not that a Google employee can read your data (probably no one cares about you specifically unless you are a VIP). The threat is that when a data breach occurs, anyone can read your data, and someone close to you will.
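The two access levels mentioned above (source and rows) can be sketched roughly like this in Python. This is only an illustration of the general pattern, not any company's real system; `Grant`, `can_read`, and the region labels are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical access grant: one data source, limited to some regions."""
    source: str              # source-level access, e.g. "emails" or "ads"
    regions: frozenset[str]  # row-level restriction, e.g. {"EU"}


def can_read(grants: list[Grant], source: str, row_region: str) -> bool:
    # Access requires a grant matching both the source and the row's region.
    return any(g.source == source and row_region in g.regions for g in grants)


# An employee who may only read emails of European clients.
analyst = [Grant(source="emails", regions=frozenset({"EU"}))]

print(can_read(analyst, "emails", "EU"))  # allowed: source and region match
print(can_read(analyst, "emails", "US"))  # denied: wrong region
print(can_read(analyst, "ads", "EU"))     # denied: no grant for this source
```

Anything not explicitly granted is denied by default, which is why most employees with some data access still cannot see any given person's data.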

u/Anon-Trans 18h ago

It's a security concept called the "Principle of Least Privilege": you only access the data you need to do your job and nothing more.

u/xcbsmith 2d ago

If you are a resident of California, you are eligible for CCPA protections. You can request that they delete all the info tied to your identity.