Unpredictable Patterns #119: Future privacies on the way to AGI
Predictions, scenarios and the long view of privacy and data protection in an AI-enabled world
Dear reader,
I am just back from a wonderful couple of days at the TUM Think Tank in Munich, with inspiring discussions about technology, law, philosophy and sociology. So, it is with a hat tip to the exceptional human beings there that I want to dig into the future of data protection and a few predictions that I think are worth exploring in more detail.
This newsletter is also the first with a tl;dr. I am writing these to learn, explore and research topics, and not everyone is reading them to follow along on those often winding excursions - to make sure that you get value directly, each email will now start with a tl;dr. But I will judge you silently if you only read that, of course. :)
TL;DR
Why data protection matters now: AI turns personal data from an ad-tech input into a strategic resource for health, security, and science, so yesterday’s GDPR trade-offs no longer fit today’s stakes.
Three predictions to watch:
GDPR-2.0 within 3 years – Germany’s BfDI and scholars (Wendehorst) are already floating a risk-tiered “Erlaubnisnorm” that would legalise AI training on personal data under clear safeguards, softening the current ban-by-default.
Tech-differentiated privacy rules – Edge processing, true anonymisation and federated learning will earn lighter regimes, mirroring the AI Act’s risk ladder.
Data as dual-use export commodity – Following the new U.S. Data Security Program and China’s 2022 rules, bulk personal data will fall under national-security export controls, limiting cross-border flows irrespective of privacy adequacy.
Signals: German regulator rejects the “infection thesis” (illegally trained ≠ illegal to use), sets up a ReguLab to test AI under supervision; U.S. DOJ already treats genomic & location data like chips; EU dual-use law review is pending.
Bigger lens: Privacy ultimately mediates autonomy, identity, and power. A toy simulation shows decision power shifting among individuals, corporations, and the state, modulated by regulation strength, security pressure, innovation dividend, and PET efficiency.
Take-home: Expect simultaneous liberalisation (risk-based GDPR update) and securitisation (export walls). The long arc is toward a new equilibrium that balances autonomy with collective learning—and the window for shaping it is now.
Predicting future data protection regimes
Privacy is a key concept in all technology policy, and naturally it is important also for AI-policy. We know that privacy and data protection is changing as the technology develops, and we get to choose how - at least to some degree, so we should explore future privacies and models in depth to figure out what we can get, and what we want. And we should try to predict where we are going.

Now, predictions are hard, as Yogi Berra reputedly noted, especially about the future, but here are three questions whose answers I think are worth thinking through in detail.
Will we see significant GDPR-reform in the coming 3 years? By significant I mean a shift in the balance between data protection and data use to the advantage of the latter, such that the individual’s control over data is weakened relative to the ability of corporations and the state to use that data.
Will we, in the next 5 years, see differentiated data protection regimes for data sets that are anonymized, processed on-device or at the edge, or otherwise subject to technical processing that lowers the risk of harm? That is, will we see a risk-based approach to data protection that mirrors the one we have in the AI Act?
Will personal data, in the next 5 years, be embedded in export regimes and placed under export controls, much like hardware of different kinds, with personal data treated as a key strategic asset in geopolitical frameworks? That is, will we see less focus on the equivalence doctrine and a move to a capability regime, where what you can do with the data, and which countries get to do it, is key?
In order to make these into good predictive questions we would need to specify them further and decide how the predictions would be resolved, but let’s take them as a starting point. We will go through them one by one.
GDPR-reform
When we predict the future of data protection, we also need to look at the past. Data protection as a legal regime finds its beginnings in Germany in the 1970s, informed by a worry that the state would collate information about citizens and use it to control them. It then quickly becomes a legislative trend: we get the 1980 OECD principles, and from there we move to pan-European legislation in the first data protection directive (95/46/EC). That directive is then re-framed, in the context of US Internet dominance, consumer technology and advertising, into the GDPR. As we trace this development we see different driving concerns - from state control to advertising and manipulation - and one reason it is reasonable to predict GDPR-reform is that the driving forces have changed again: the emergence of artificial intelligence is forcing us to re-think the balance.
Why is that? It is worth not just settling for “uh, AI, so different”, but instead trying for a more detailed understanding of the causal mechanisms at work here and what they look like.
The first thing we then notice is that artificial intelligence generalizes the predictive value of data from advertising to learning. Where early use of personal data focused on advertising and personalization, we can now use data to find new cures for diseases, understand social problems, make progress in science, teach and educate in entirely new ways, and search for solutions to climate change. The list is impressive and should make us revisit the balance here. It is easy to see that there is a difference between:
(i) How should we balance the individual’s fundamental right to privacy with the commercial interests of advertising companies?
and
(ii) How should we balance the individual’s fundamental right to privacy with our collective capability to learn, make scientific progress and solve social problems?
The GDPR does a decent job of balancing the equities in the first question, but never aspired to answer the second.
Now, we could answer the second question by simply stating that since privacy is a fundamental right, all of those other values - learning, scientific progress and the solution of social problems - are secondary, and that we should make no changes. That position could rest either on the view that the value of AI is grossly overstated and that big tech companies are using AI as a wedge to re-balance data protection law, or on the view that none of those things are worth anything without privacy. My prediction, though, is that this position will weaken over the coming decade, and rapidly so.
Question (ii) will exert pressure on the GDPR to the point where we see significant reform, and I think it may come much faster than we expect.
A helpful barometer in assessing this is Germany. Data protection as a regime originated in Germany, and the German political thinking about data protection is perhaps the most rigorous in Europe, at least. German academics will point out that both history and a general German temperament make the questions around data protection core to the political discourse.
If we want to predict the future of data protection, at least in Europe, we need to go to Germany, then.
On the 13th of May, the federal data protection commissioner gave a talk on the future of data protection that provides a key signal for anyone thinking about where the field is heading. In this talk she identified three key areas for the coming years: first health, second AI and third security. But she also laid out a vision that shows how far we have actually moved towards question (ii).
First, Prof. Dr. Louisa Specht-Riemenschneider at the BfDI notes that even data protection needs to be viewed in light of the geopolitical situation, and that is, in itself, a shift: whereas the EU has prided itself on the Brussels effect spreading the GDPR, the geopolitics of data are now forcing a shift at home, where data is emerging as a key competitive asset. We will come back to this, because it feeds into our third prediction.
Second, it is interesting to note that she spends some time discussing the so-called infection thesis: the idea that a model trained illegally on personal data is itself illegal, and hence that any use of it is illegal as well. On the one hand this is a fairly technical question; on the other, it has enormous signal value that she brings it up at all - and she does so by rejecting the thesis and lifting it up to be one of the core messages of the talk, featured at the very beginning of the written version:
Im Bereich KI liegt der Schwerpunkt auf der Rechtmäßigkeit von KI-Training und einer Beantwortung der sog. Infektionsthese. Aus Sicht der BfDI ist auch bei rechtswidrigem KI-Training eine rechtmäßige KI-Nutzung möglich.
Freely translated: in the AI space, the center of gravity lies in the lawfulness of AI training and in answering the so-called infection thesis. From the viewpoint of the BfDI, lawful use of an AI system is possible even where its training was unlawful.
The position advanced here by the BfDI—that AI systems trained in violation of the GDPR can nevertheless be used lawfully—is a significant and, for traditionalist interpreters, arguably untenable departure from the strict regulatory logic of the GDPR. At its core, the General Data Protection Regulation insists on the principle of lawfulness, fairness, and transparency for all forms of personal data processing, including use. This is not a narrow procedural requirement at the point of data collection—it applies throughout the data lifecycle. The GDPR’s definition of “processing” (Article 4(2)) explicitly includes “use,” which means that even if a model was unlawfully trained, each subsequent use of that model potentially retriggers data subject rights, including access, objection, restriction, and erasure. The notion that unlawful origins can be bracketed off misunderstands both the structure of the Regulation and the nature of ongoing processing in dynamic systems like AI.
Moreover, the idea that a downstream user can simply disclaim responsibility by pointing to a third party's unlawful training contradicts both the accountability principle (Article 5(2)) and the purpose limitation principle (Article 5(1)(b)). GDPR creates a chain of accountability: users of a model built on unlawfully processed personal data cannot insulate themselves from liability by disavowing knowledge of the training context. This is especially true in systems that retain memory or are probabilistically influenced by their training data. In such cases, user-triggered processing may still surface personal information or derivative inferences. The rights of the data subject are therefore not extinguished by transfer of use—they persist and are triggered anew by every act of model deployment that processes personal data, directly or indirectly.
This interpretation would probably, then, be rejected by the European Court of Justice, which has consistently emphasized in cases such as Schrems II (C-311/18) and Digital Rights Ireland (C-293/12) that the rights of data subjects must be effectively and continuously protected. A model that benefits from or embeds unlawfully obtained data constitutes not just a static breach but a continuing violation of GDPR if used without remediation. The CJEU has shown little tolerance for formalistic loopholes that undermine the substantive guarantees of the Charter of Fundamental Rights (Article 8). In this light, a system that profits from unlawful data while avoiding downstream obligations would likely be struck down for hollowing out the architecture of rights the GDPR was built to uphold.
Finally, the BfDI’s position lacks support from the European Data Protection Board (EDPB) and rests on an uncertain reading of “legitimate interest” under Article 6(1)(f). The EDPB has consistently required a rigorous balancing test and the presence of appropriate safeguards for this basis to be applicable. It has never endorsed a framework where one party’s unlawful processing could be laundered into legitimacy by another’s subsequent use. In the absence of robust anonymisation or demonstrable technical and legal severance of personal identifiers, continued use of such models violates the rights regime of the GDPR.
And this is why I think the signal being sent here implies rapid GDPR-reform: if the BfDI’s position is hard to square with the GDPR as it stands, then committing to it publicly only makes sense if the underlying law is expected to change. The BfDI’s position is clearly not just pragmatic, but realistic - especially in the wider frame of technological change and geopolitical tensions.
Another clear signal in the talk is the focus on evidence and empirical investigation. The BfDI is even setting up a lab:
Wir ergänzen sie noch in diesem Jahr durch die Einrichtung unseres KI-Reallabors mit dem Namen „ReguLab“, damit KI-Systeme unter unserer aktiven Begleitung erprobt und anschließend datenschutzkonform in die reale Welt entlassen werden können.
Freely translated: we will complement them later this year by setting up our AI real-world laboratory, called “ReguLab”, so that AI systems can be tested under our active guidance and then released into the real world in a manner compliant with data protection law.
This lab also represents a shift in balance and approach: from a purely regulatory position, to an enabling one: from datenschutz to datennutz.
What will GDPR-reform look like, then? One answer to that is found in Christiane Wendehorst’s draft ideas.
Wendehorst suggests that we think about data protection through the lens of a risk-based approach, essentially mirroring the way the AI Act is set up (she explicitly references the AI Act). Wendehorst is also directly referenced in the talk:
Dieses Thema wird uns in der digitalen Ära noch lange begleiten. Christiane Wendehorst hat hierzu in ihrem Entwurf einer KI-DatenschutzVO anregende Ideen präsentiert, wie z. B. eine Erlaubnisnorm, die Rechtmäßigkeitskriterien für KI-Training gesetzlich statuiert.
Freely translated: this topic will stay with us for a long time in the digital era. Christiane Wendehorst has presented stimulating ideas on it in her draft of an AI Data Protection Regulation (KI-DatenschutzVO), such as an Erlaubnisnorm that lays down statutory lawfulness criteria for AI training.
In this context, an Erlaubnisnorm für KI-Training would be a statutory rule that explicitly permits the training of AI systems on personal data, provided certain legal conditions are met (e.g., purpose limitation, proportionality, data minimization, safeguards). It contrasts with the current approach under the GDPR, where such permission must be inferred indirectly via general legal bases (like legitimate interest under Art. 6(1)(f)) and balancing tests.
Thus, the sentence implies a move toward clear, ex-ante legal certainty for AI developers—removing ambiguity by legislating when and how personal data may be used for training AI systems.
And all this from Germany. This, to me, is a strong signal that we will see significant reform and soon.
Technical differentiation
The other prediction here is more specific: it is about the differentiation of data protection regimes and rules for different kinds of technical processing. The early signs that we could be moving towards a regime that allows this may not seem very strong, and given the somewhat awkward attitude to anonymization that we find in, for example, the European Data Protection Board’s opinion on AI models, the bar seems high:
In sum, the EDPB considers that, for an AI model to be considered anonymous, using reasonable means, both (i) the likelihood of direct (including probabilistic) extraction of personal data regarding individuals whose personal data were used to train the model; as well as (ii) the likelihood of obtaining, intentionally or not, such personal data from queries, should be insignificant for any data subject. By default, SAs should consider that AI models are likely to require a thorough evaluation of the likelihood of identification to reach a conclusion on their possible anonymous nature. This likelihood should be assessed taking into account ‘all the means reasonably likely to be used’ by the controller or another person, and should also consider unintended (re)use or disclosure of the model.
But this at least allows for the argument that a model is truly anonymized. More promising are Wendehorst’s exceptions in Article 9, which in its entirety reads:
Regulation (EU) 2016/679 shall not apply to the processing of personal data, including sensitive personal data, where
(a) the processing is merely transitory in nature [or for a limited period of time not exceeding …]; and
(b) the processing is for a purpose that is unrelated to the data subject as an identified or identifiable natural person; and
(c) appropriate technical and/or organisational safeguards are in place to prevent any use of the data for a purpose related to the data subject as an identified or identifiable natural person up to the point when the data are irreversibly anonymised or erased.
Data shall not be considered as personal data within the meaning of Regulation (EU) 2016/679 and other Union and national law referring to or relying on the notion of personal data within the meaning of Regulation (EU) 2016/679 insofar as
(a) the data relate primarily to an entity other than a natural person, such as an enterprise or an object, and the data subject is associated with that entity exclusively as owner, employee or in a similar function; and
(b) processing of the data is for a purpose that is not specifically related to the data subject as an identified or identifiable natural person; and
(c) appropriate technical and/or organisational safeguards are in place to prevent any use of the data for a purpose specifically related to the data subject as an identified or identifiable natural person.
This paragraph is intriguing and could be radically improved if (c) in both cases added the words “…by anyone other than the data subject”. This would allow anyone to process data about themselves, in a way that seems obvious when we discuss it but is rarely reflected in the law. There is no privacy risk associated with such self-processing, and it is increasingly architecturally possible. Now, the storage of the data might have to be encrypted, but if we think this through we should be able to set out rules such that software developers are encouraged to build such safeguards.
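To make the architectural point concrete, here is a minimal sketch of what such a self-processing safeguard could look like, assuming a design in which personal data is encrypted client-side with a key that only the data subject holds (the Fernet recipe from the Python cryptography library is used purely as an illustration, not as a reference implementation):

```python
from cryptography.fernet import Fernet

# The data subject generates and keeps the key; it never leaves their device.
key = Fernet.generate_key()
cipher = Fernet(key)

# Personal data is encrypted before it is stored or synced anywhere,
# so no third party can process it for purposes related to the data subject.
record = b'{"heart_rate": 72, "sleep_hours": 6.5}'
stored_blob = cipher.encrypt(record)

# Only the data subject, holding the key, can turn the blob back into
# processable personal data, e.g. for their own on-device analytics.
assert cipher.decrypt(stored_blob) == record
```

In a design like this, everyone else only ever handles ciphertext, which is exactly the kind of architectural safeguard a revised (c) could reward.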

The trick here will be to formulate regulatory frameworks so that they encourage architectures that maximize benefits and control for risks. Very little such incentive was built into the GDPR - and this, I think, is one of its key flaws. Anonymization, edge processing and self-processing are all categories we should explore more in depth, and I suspect we will.
Techniques like local differential privacy and federated learning deserve real, technical attention from legislators.
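As an illustration of why they deserve that attention, here is a minimal sketch of local differential privacy via the classic randomized-response mechanism, showing how an aggregator can estimate a population statistic without ever receiving raw individual values (the parameter names and the epsilon value are illustrative assumptions, not drawn from any regulatory text):

```python
import math
import random

def randomized_response(true_value: bool, epsilon: float) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it.
    This gives epsilon-local differential privacy for a single binary attribute."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_value if random.random() < p_truth else not true_value

def estimate_proportion(reports: list[bool], epsilon: float) -> float:
    """Debias the noisy reports to estimate the true population proportion."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # observed = p_truth * true + (1 - p_truth) * (1 - true)  ->  solve for true
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)

# Example: 10,000 users, 30% have the sensitive attribute, epsilon = 1.0
population = [random.random() < 0.3 for _ in range(10_000)]
reports = [randomized_response(v, 1.0) for v in population]
print(round(estimate_proportion(reports, 1.0), 3))  # close to 0.3, no raw values shared
```

The regulatory point is that the individual reports are already noisy when they leave the device, so what counts as personal data arguably changes shape before any controller ever touches it.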
Personal data, trade and export controls
So, to the last prediction, then: will personal data come under export control rules? This one may seem odd, but I think it is worth considering, especially as it counteracts some of the other predictions. The first two predictions point towards a different, perhaps more permissive, regulatory evolution; this one suggests that one of the most restrictive regulatory frameworks we know, export controls, may increasingly be applied to personal data.
Now, this is not a prediction that we will see another case where the EU courts decide, for privacy reasons, that data cannot be shared with US companies. That may well happen, but what I am more interested in here is the idea that aggregated personal data about a population is actually a dual-use asset.
The strongest evidence of this that we have seen recently is the US Department of Justice issuing a limited enforcement policy and advice on this issue, on the 11th of April this year. They write:
To address this urgent threat, the Data Security Program establishes what are effectively export controls that prevent foreign adversaries, and those subject to their control, jurisdiction, ownership, and direction, from accessing U.S. government-related data and bulk genomic, geolocation, biometric, health, financial, and other sensitive personal data. To assist the public in coming into compliance with the Data Security Program, NSD has issued a Compliance Guide, an initial list of over 100 Frequently Asked Questions (FAQs), and an Implementation and Enforcement Policy for the first 90 days. NSD will be taking additional steps over the coming weeks and months to implement the Data Security Program, including publishing an initial Covered Persons List that identifies and designates persons subject to the control and direction of foreign adversaries. The Data Security Program went into effect on April 8, 2025.
If I am right, this is just the beginning, and we will see Europe and other countries taking the same kind of measures soon. This means that data protection regimes will add a layer of trade and export restrictions on top.
China has, of course, been there since 2022. The UK explored such rules last year as well — understanding that there are real risks here. India has them.
It is worth noting that this is done not to protect data subjects but for national security reasons, and that is the key to the prediction here as well: while we may see a liberalization of protections for data subjects, we will see it balanced out by a national security angle.
Scenarios
Let’s take the predictions and sketch out three short scenarios, just to make this more concrete.
1. Precision-Learning Pact – risk-tiered GDPR 2.0
Core move. The EU passes an AI Data-Protection Regulation that sits next to—rather than inside—the GDPR, modelled on the Wendehorst draft. It creates a clear statutory Erlaubnisnorm for AI training, a tiered risk test for downstream uses and fast-track approvals for provably anonymised or edge-processed data.
Why plausible.
Germany’s new BfDI has openly called for “handfeste Kriterien” (tangible criteria) that let AI training happen lawfully, and dismisses an absolutist infection thesis (heise online).
The academic text circulating in Brussels already sketches the needed legal language (zivilrecht.univie.ac.at).
The EDPB’s 2024 opinion accepts that some models can be demonstrably anonymous, hinting at a measurable threshold (European Data Protection Board).
Early possible indicators to watch. A Commission white paper on “GDPR-Lite for AI”, EDPB pilot audits using ReguLab, a Franco-German non-paper proposing a risk-tiered Article 6(1)(f) rewrite, and the question of who actually gets AI oversight in Germany.
2. Sovereign Data Walls – export controls eat privacy law
Core move. Personal data above a “bulk sensitive” threshold is classed as dual-use technology. GDPR stays largely intact, but outbound transfers now need export licences akin to those for advanced chips.
Why plausible.
The U.S. DOJ Data Security Program—already in force—treats genomic, biometric and geolocation data exactly like a controlled commodity (Department of Justice).
China’s 2022 Data Export Security Assessment Measures require CAC sign-off for any set deemed a national-security risk (China Law Translate).
EU dual-use law (Reg. 2021/821) was widened to cover “cyber-surveillance items”; a 2023–25 review explores extending the list (Trade and Economic Security).
India’s DPDP Act lets the government blacklist destinations for “sovereignty or security” reasons (Leegality).
Early possible indicators to watch. Draft annex to 2021/821 adding “population-scale personal data”; CJEU ruling that model-weight export constitutes an intangible transfer.
3. Edge-Sovereignty – technical differentiation without heavy law-making
Core move. Regulators decide that architecture can substitute for paperwork. GDPR gets only surgical tweaks, but data kept on-device, in trusted islands or under local differential privacy is treated as out of scope. Federated learning consortia become standard for health and mobility AI.
Why plausible.
The EDPS TechSonar report already frames federated learning as a tool for data minimisation (European Data Protection Supervisor).
Research shows edge-DP frameworks hitting utility targets under GDPR (ResearchGate).
The EDPB opinion’s case-by-case anonymity test implicitly rewards designs that make re-identification “insignificant” (European Data Protection Board).
Early possible indicators to watch. ISO spec for “Edge-Privacy Assurance”, insurers offering lower cyber-premiums for on-device analytics, venture funding spikes for TEEs and FL toolchains.
The long view - simulating future privacy
Our predictions here have mostly been about the short-term equilibrium after the GDPR - but there is also another interesting question here. Over the last couple of decades we have seen the constant evolution, re-calibration and re-construction of data protection law alongside changing technologies, social pressures and economic realities. But there is another way to approach all of this, and that is to try to imagine what the long-term equilibrium could look like, and whether there is such a thing as a stable state for privacy in an advanced, AGI-enabled society. If we can imagine this equilibrium, we could also explore whether our current regulatory moves bring us closer to, or further away from, that imagined state of stable data protection.
While highly speculative, such an exercise forces us to build some kind of basic model of data protection and privacy, and it forces us to ask what functions data protection protects and selects for. The evolution of data protection selects for something. The question is what.
One possible model is that there are core values that we want to protect, and that as we imagine future equilibria we should focus on those to anchor ourselves.
Let’s explore what those values could be.
One core value that I think we find when we explore the motivations behind privacy law is autonomy. Protecting individual autonomy from the state (first) and commercial interests (second) seems to be a key function of data protection law, and one that has impacted which rules we select for and work with. The selection here is fuzzy, and there is plenty of drift, but data protection undeniably functions as a social guarantee for autonomy.
This suggests - then - that what we should think about when we think about the future of privacy is the mechanisms of autonomy. At the heart of autonomy lies decision making - and so maybe there is something like an equilibrium here?
The other thing we have to factor into any such model is that privacy is just a secondary concept, and the primary one is identity. Just as Wittgenstein notes that you can have no doubt without belief first, you can have no privacy unless you have an identity, or at least the potential to form one.
The careful, social and collective construction of identity is key here. Identity is designed through autonomous decisions - but autonomy is not atomic: we cannot build an identity on our own, we need networks to do this. Our identity is, to a large degree, strewn in the eyes of others.
Autonomy, identity, decision making - all of these seem to be at the heart of any equilibrium. And not just decisions made by us, but also decisions made about us that impact our autonomy and agency. This realization is nascent in the way that data protection law’s longer trajectory seems to bend towards “justified and justifiable” decisions, with a detour over the ill-thought-through idea of explainability (as explored by Sandra Wachter et al and here). How do we combine them, then?
Maybe: privacy is the networked negotiation of autonomy and identity, towards a key equilibrium between individual autonomy and collective welfare. In a simple toy model / simulation:
You can play around with the model here. How would you model it, and what would you do differently? The virtue of simulation and modelling is that we force ourselves to make our ideas clearer, and so we end up with various alternative futures. Here are some of the candidate equilibria in this toy model.
There is another value here, of course, that we do not discuss enough, and that is power. Another modelling approach would be to say that what we model is corporate, state and individual decision power, and that the negotiation around autonomy and privacy is really just a negotiation of power in a networked society. Such models are also interesting to explore, and we will come back to this. Here is a first example of a possible model at play, tracking power across time in the negotiation. Play around with it here.
This simulation treats society as a fluid bargaining game in which decision-making power continuously reallocates among three actors—individuals (whose autonomy and identity hinge on privacy), corporations (which harvest data for profit and design choices), and the state (which wields coercion for security and macro-steering)—while a fourth variable, the share of privacy-enhancing technologies (PETs), rises or falls. Power flows each step according to four dial-up forces: stronger regulation shifts power from corporations to both individuals and the state; mounting security pressure pulls power toward the state and away from PET uptake; a higher innovation dividend rewards corporations for raw-data access unless PETs are efficient enough to preserve learning value; and PET efficiency itself tempers the trade-off by letting data stay useful without eroding privacy. Users set initial power shares and these force levels, then watch the system iterate as PET adoption and power proportions co-evolve toward new equilibria.
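For readers who want to poke at the mechanics directly, here is a minimal sketch, in Python, of how such a bargaining game could be written down. The specific update rules, the rate constant and the normalisation step are simplifying assumptions made for illustration; the exact equations behind the linked simulation may differ:

```python
from dataclasses import dataclass

@dataclass
class World:
    individuals: float   # power share of individuals
    corporations: float  # power share of corporations
    state: float         # power share of the state
    pets: float          # adoption share of privacy-enhancing technologies (0..1)

def step(w: World, regulation: float, security: float,
         innovation: float, pet_efficiency: float, rate: float = 0.05) -> World:
    """One iteration of the toy bargaining game; all force parameters lie in [0, 1]."""
    # Regulation moves power from corporations to individuals and the state.
    reg_flow = rate * regulation * w.corporations
    # Security pressure pulls power toward the state and suppresses PET uptake.
    sec_flow = rate * security * w.individuals
    # The innovation dividend rewards corporations for raw-data access,
    # unless PETs are efficient enough to preserve the learning value of data.
    innov_flow = rate * innovation * (1.0 - pet_efficiency * w.pets) * w.individuals

    individuals = w.individuals + 0.5 * reg_flow - sec_flow - innov_flow
    corporations = w.corporations - reg_flow + innov_flow
    state = w.state + 0.5 * reg_flow + sec_flow

    # PET adoption grows with regulation and efficiency, shrinks under security pressure.
    pets = min(max(w.pets + rate * (regulation * pet_efficiency - security) * (1.0 - w.pets), 0.0), 1.0)

    total = individuals + corporations + state
    return World(individuals / total, corporations / total, state / total, pets)

# Example run: strong regulation, moderate security pressure, high innovation dividend.
w = World(0.3, 0.5, 0.2, 0.1)
for _ in range(200):
    w = step(w, regulation=0.7, security=0.4, innovation=0.8, pet_efficiency=0.6)
print(w)  # an approximate equilibrium of power shares and PET adoption
```

Changing the four force parameters and re-running the loop is one quick way to see which combinations push the system toward individual, corporate or state-dominated equilibria.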
Summing up, then: data protection’s long arc is likely bending towards some kind of equilibrium of power or autonomy, and using that as a reference case when discussing reform is a way to orient ourselves better. Asking where this all ends is both intriguing and helpful.
Thanks for reading.
Nicklas
The discussion on the reform of GDPR is an interesting one that is long overdue. But at the same time, I believe that a lot of the problems with GDPR could be addressed even without re-opening the text of the law.
Take the German regulator's ReguLab: Nothing prevents DPAs from setting up those labs today. Under GDPR, advising businesses on data processing is as much a responsibility of DPAs as is enforcement, but in reality, regulators rarely had the resources for this. So it's great that DPAs are now embracing their role as advisors - but they could have done this six years ago already!
Or take Christiane Wendehorst's proposal of a risk-based approach to privacy. Again, GDPR already follows a risk-based approach, at least in theory. But in reality, regulators have tended to err on the side of the strictest interpretation of GDPR whenever they were discussing a case. So it is already in the hands of regulators to take a more nuanced approach to data processing, e.g. to allow for learning from personal data.
Which brings me to the last part of my comment, the distinction between data processing for advertising and data processing for learning. It's fair to say, in my view, that all the protections we erected to shield users from advertising may now hold us back from pursuing groundbreaking developments in health and science.
But GDPR sees the protection of personal data first and foremost as a fundamental right and only then considers use cases for data processing. And the same personal data that is used to display advertising could also lead to a new discovery.
So one question for a reform of GDPR could be whether the use case of data processing should be taken into account when processing personal data. That would certainly be an interesting angle but also a fundamental shift from where we stand currently.
Another interesting route is to think more about exceptions for self-processing and exceptions under Art. 9 as suggested by Wendehorst (although I would question why this data could not simply be anonymised, in which case we would just need simple guidelines for anonymisation).
So in short, it's great to fix a broken law, but it seems more important to me that we fix the bias in GDPR's interpretation and the mindset around privacy and data protection!