What if it’s *not* the enforcement? Reflections post #EDPSConf2022

What worries me is not that Meta or Google continuously choose to infringe the GDPR. What I fear is that, despite all their data collection and usage, such companies are actually in compliance with the Regulation. If so, no amount of enforcement will change the status quo.

What happened in Brussels on June 16-17, 2022?

Europeans think that the problem with the GDPR is insufficient enforcement.

On June 16-17, 2022, several hundred privacy professionals gathered in Brussels at a conference titled “The future of data protection: effective enforcement in the digital world,” convened by the European Data Protection Supervisor, Wojciech Wiewiórowski. Don’t let the generic title fool you; this was a major political event and attracted the top players in the GDPR game. Among the one hundred speakers, one could find prominent policymakers (Margrethe Vestager, Věra Jourová, Birgit Sippel, Karen Melchior, Axel Voss, Juan Fernando López Aguilar), regulators (Marie-Laure Denis, Ulrich Kelber), activists (Ursula Pachl, Max Schrems), industry representatives (Julie Brill, Microsoft; Jane Horvath, Apple; William Malcolm, Google), and academics (Orla Lynskey, Paul De Hert, Michael Veale). All the people whose job is to shape the narrative and trajectory of data protection law in Europe were present.

What motivated the conference was a shared disappointment with the GDPR’s influence on Big Tech. Four years after the law became applicable, we still live in a world of commercial surveillance. In 2022, just like in 2012, when work on the GDPR began, essentially everything we do gets recorded as data and used to advance some corporate interests.

The participants seemed to agree about the reason behind the GDPR’s suboptimal performance: insufficient enforcement in cross-border cases. The GDPR applies directly throughout the Union but is enforced locally by the national Data Protection Authorities. What works well for local matters fails when transnational corporations are concerned. Under the so-called “One Stop Shop” mechanism, corporations like Meta and Google can choose one DPA to be overseen by (and they usually select Ireland). Meaning: one member state is overburdened with enforcement costs and overendowed with enforcement power vis-à-vis Big Tech coming from overseas.

The disagreement concerned the way forward: what shall we do about the suboptimal enforcement? On the one hand, the defenders of the status quo called for more time, funding, and mutual dialogue. The law is fine; no need to reform it – they seemed to say – just give the DPAs and the NGOs more money and let us do our work. Until about a year ago, this was the orthodox view in European mainstream circles: the GDPR is perfect on paper; if any intervention is needed, it concerns the factual capacities of the DPAs.

On the other hand, an increasingly loud chorus of calls for legal reform could be heard. Among several ideas, including harmonization of procedural rules, the most straightforward has been the proposal to centralize enforcement in cross-border cases. Let’s leave 99% of enforcement as is, with the national DPAs – the reformers seemed to suggest – but carve out the truly expensive, complex, and politically loaded cases for a supranational body, like the EDPB, the EDPS, or a newly created entity. From a purely academic standpoint, this proposal seems commonsensical. Indeed, this is the way the EU enforces many of its regulations, including competition law. As personal data protection has been enshrined as a fundamental right in the Charter, one is left puzzled seeing how the EU, on the ground, treats the market economy as more important than human dignity. Do as I say, not as I do, I suppose. Politically, though, centralization would require opening a new battle that few have an appetite for right now.

So, what happens now? What will be the results of the conference? In the short term, probably nothing. The EU is busy finishing the DSA and DMA business (both of which feature centralized enforcement mechanisms, btw) and is in the trenches fighting over the AI Act, the Data Act, and the whole range of digital policy pieces already on the agenda. In the mid-term, however, centralization of enforcement in cross-border cases seems hard to avoid. When the new Commission and the new European Parliament begin their terms in two years, given how often scandals regarding data processing hit nowadays, the idea that today seems practically impossible might turn into a politically inevitable one. In this sense, the reformists at the EDPS Conference succeeded. An idea has been planted, moved from the academic outskirts into the political mainstream, and sooner or later it will become a reality.

So let us imagine that, five years from now, GDPR enforcement concerning Big Tech has been centralized in the hands of a powerful, well-financed, and well-staffed regulatory body. Will we then, finally, move from the world of commercial surveillance into a world of perfect privacy and data autonomy?

I have my doubts.

Unpopular view: the problem with the GDPR is its substance

As we are in a moment when unpopular opinions get uttered, let me express a view considered an absolute heresy in Europe: the GDPR is just a bad law, on substance, when it comes to taming the excesses of data collection and usage by Big Tech. It could not have stopped commercial surveillance, and it will not save us, regardless of how beefed up the enforcers become.

The GDPR is – to its core – a neoliberal regulation delegating political choices to the market. Europeans hate to hear this; in good faith, they do not believe it, but it’s true. Under all the rhetoric about fundamental rights, and under the substantive principles, procedural requirements, and data subjects’ rights, the GDPR is exactly the same as the American “notice and choice” model, just with extra steps.

Don’t close this blog post; let me elaborate.

At the foundation of the GDPR model lie several principles that seem to distinguish it from its American, in-your-face-neoliberal counterpart: the purpose limitation principle (data can only be processed for the purposes for which it was gathered), the data minimization principle (one cannot collect more data than necessary for a particular purpose), and the legality principle (one must secure a legal basis, like consent or the necessity to perform a contract, for processing to be lawful). In addition, one faces robust transparency and accountability obligations, paired with people’s rights to know what data is processed, to correct it, or to object to further processing. This sounds promising, doesn’t it?

The problem with the GDPR system is that it says nothing about the legality of particular purposes of processing or the lawfulness of specific contracts or business models. On substance, corporations are essentially unconstrained when it comes to specifying what purposes they want to process the data for, or what place this processing has in the overall commercial transaction. Following the adoption of the Digital Content Directive 2019/770, which (in a supposedly pro-consumer attempt to extend legal protection to “free” services for which consumers “pay” with personal data) effectively legalized B2C contracts treating personal data as “payment,” corporations are free to specify what processing they consider necessary to perform the contract.

Consequently, if Facebook or Google hire stellar lawyers to draft their terms of service and privacy policies (which they do), they are absolutely free to decide what purposes to process the data for, or how to construct their contracts. Sure, a lot of legal engineering needs to happen around this – accountability procedures need to be established, and lengthy documents that no one will read must be drafted – but this is a monetary cost, not a substantive constraint, on data collection and usage.

Do you want to collect data to addict users to your platform? Do you want to use data to influence your users’ opinions or behaviors? As long as you openly admit to it in the privacy policy, as long as you can demonstrate that such use forms part of the contract the consumer has concluded with you, and as long as you play by the GDPR’s procedural rules, the GDPR gives you the green light. That is why nothing has changed since 2018.

What I fear, as a citizen concerned about Big Tech’s power over individuals’ lives and its impact on autonomy and mental health, is not that Facebook or Google choose to continuously infringe the GDPR, for whatever reasons (lax enforcement among them). What I fear, as an academic who empirically studies their terms of service and privacy policies in the light of the binding law, is that what Facebook and Google do is perfectly compliant with the GDPR. Sure, they might be infringing the law on the margins – personalized advertising systems need to be improved, disclosures could be clearer, etc. – but the very core of their business models is not only outside of the GDPR’s policing power; the GDPR legalizes these practices.

Meta and Google do what they did before; only now they have hundreds of pages of documents explaining how, under the GDPR, these practices are legal. If this is the case, no amount of enforcement will help us.

So, what can be done?

The GDPR, in its name and ambitions, is a general law, applying to both private and public bodies, in the same way, throughout the Union. And, in many cases, it works well. For example, regarding public administration, which does not come up with purposes of processing on its own but is endowed with competencies by legislation, the GDPR is a perfect tool to safeguard individuals’ privacy. And in many private sector contexts, like the paradigmatic “a pizzeria does not need more than your address and phone number to deliver pizza, and should not use your data to send you further ads,” it curtails unwanted commercial communications.

However, the GDPR was not designed for Big Tech, which bases its entire business model on data collection and advertising.

Of course, we need more enforcement, and centralization in cross-border cases is a no-brainer for anyone who thinks about it seriously. But centralization alone won’t help. We need substantive regulation of purposes of processing.

Put simply: the EU, or the Member States, should take some purposes of processing, some contract types, and some business models outside of the realm of market choices and regulate them directly. Maybe there are some data practices that we want to forbid across the board, like using addictive design in apps used by minors, or directly promoting self-harm and eating disorders, as Instagram did. Or maybe we want to create specific conditions for other practices, like mental-health protection measures for social media, or specifying the kinds of products that we do not want advertised on the basis of certain data, at certain hours, or to certain social groups. These are political, not technocratic, decisions to be taken.

Regulation of purposes of processing needs to be done case-by-case and sector-by-sector, something the Europeans don’t like. And yet, as the problems are very specific (different normative considerations, and different solutions, come into play when speaking about data leading to discrimination in hiring versus data contributing to depression in teens), the responses need to be tailor-made as well.

In a world in which almost everything is data-driven, the activities of Big Tech are no longer a personal data protection problem (only). They are consumer law problems, employment law problems, discrimination law problems, mental health law problems, etc. And they need to be addressed as such, by these laws, with a deep understanding of the technology and business models beneath them.

So, is the GDPR a bad law, as I provocatively wrote a couple of paragraphs earlier? Today, in action, it is. It does not have to be, if it is accompanied by substantive regulation of specific purposes of processing, business models, and types of contracts. If one looks at the history of the idea that ultimately became the GDPR, this was the plan back in the 1970s. But then, you know, Ronald Reagan and Margaret Thatcher happened, followed by Bill Clinton, Gerhard Schröder, and Tony Blair, and we all kind of fell in love with neoliberalism and delegated these choices entirely to the market. It’s time to wake up.

Concluding: the reformists’ call for the centralization of the GDPR’s enforcement in cross-border cases – against companies like Meta or Google – is a step in the right direction but will solve much less than participants in the EDPS Conference have been assuming. It is a necessary move but, by far, an insufficient one. Or, put differently: it is a second-order problem, discussed widely, while the first-order problem remains unaddressed. The good thing is that regulation of purposes of processing might actually be easier than re-opening the GDPR. The bad thing is that no one is thinking about doing it.

Shall we try to plant this idea now?

Zuboff v. Hwang, or: are targeted ads a bubble?

The Internet runs on ads. Ads pay for the operations of Google and Facebook, and a lot of other stuff, including journalism. You might dislike them, but they’re really important. However, what if they’re just one huge bubble: a scam waiting to fall apart like the subprime mortgage derivatives did back in 2008?

tl;dr: Read Tim Hwang’s Subprime Attention Crisis: Advertising and the Time Bomb at the Heart of the Internet, or at least listen to this podcast with him.

Advertising is the prime source of revenue for big tech companies like Google or Facebook. It is also the cornerstone of the “Grand Bargain” — you get access to services and content for free, but we get to collect data about you and use it to personalize the ads you see. Even though everyone’s (correctly) upset about all this data collection and the threats to privacy, one must admit: the consumption of the Internet’s perks is still extremely egalitarian. One might be unable to afford a dentist appointment or a daily healthy dinner, but with a smartphone and internet access, everyone can “afford” to use Instagram, Google Maps, Gmail, WhatsApp, YouTube, and everything else. Ads subsidize all this.

Now, there are two narratives about online ads that seldom meet. On the one hand, academics and privacy/digital rights advocates tell the story of how personalized ads influence our minds and behavior, stripping us of autonomy. Because ads are based on data about us and millions of others, their timing, content, and context can be so well tuned as to influence purchasing behavior to a degree threatening human freedom. This also provides an incentive to keep collecting all this data.

The most well-known elaboration of this critique has been Shoshana Zuboff’s 2019 “The Age of Surveillance Capitalism.” Zuboff not only described the phenomenon of data-driven marketing; she also provided a conceptual framework to talk about it, and a theory explaining it. In her view (admittedly criticized by some academics), the mechanisms behind online ads are so reliable that corporations now trade in so-called “behavioral futures.” The idea is this: if I’m a marketer, I am so good and sophisticated that I can guarantee that if you spend X on my services, I will increase your sales by Y in the Z period of time. Of course, we don’t know who exactly will buy your product – this is just statistical certainty – but we know that someone will. Because of this certainty, you can already sell this future profit now, or use it as collateral in some other transaction. A complex web of financial products surrounds online ads.

Scary, isn’t it? Or exciting, if you want to make money.

The second narrative about online ads is somewhat contradictory to the first: they suck. How many times has it happened to you that you have already bought something, yet keep receiving ads for the same or a similar product? How many times have you seen an ad and thought, “how can they be so dumb?” Recently, a colleague of mine, a law professor at an American law school, got an ad suggesting a part-time law degree program at that very same law school. A Google ad, the best on the market! This is just an anecdote, I know, but I’m sure you have your own.

A tremendous book I just read (well, listened to on Audible) is Tim Hwang’s “Subprime Attention Crisis.” Hwang analyzes the considerable data available on the efficacy of online ads and makes the case that they’re just one huge bubble. Many corporations think ads are valuable and actually work, but it might soon turn out that they don’t. Once this happens, the whole financial ecosystem funding the operation of the internet will collapse. How could that happen?

One option is that companies will simply realize they’re overpaying and limit their spending on programmatic ads. This could lead to some sort of “Internet recession” but not necessarily a crisis. The other option, however – and here we get back to Zuboff’s claim that “behavioral futures” already serve as collateral – is that at some point we’ll realize that all this promised value, value already reinvested, does not exist. That’s when the bubble bursts.

Now, whether this is actually the case – that behavioral futures are packaged together and sold to a degree threatening the stability of the internet ecosystem – or who is betting on this future value, is beyond my ability to know. But the idea is so intriguing that it got me back to blogging after a pause of a couple of years.

All this to say: a “shock” enabling policymakers to radically remake the Internet as we know it might be around the corner. And to follow Naomi Klein’s reading of Milton Friedman: our job is to keep alive ideas about what a better world could look like.

CLAUDETTE: Automating Legal Evaluation of Terms of Service and Privacy Policies using Machine Learning

It is possible to teach machines to read and evaluate terms of service and privacy policies for you.

Have you ever actually read the privacy policies and terms of service you accept? If so, you’re an exception. Consumers do not read these documents. They are too long, too complex, and there are too many of them. And even if consumers did read the documents, they would have no way to change them.

Regulators around the world, acknowledging this problem, put in place rules on what these documents must and must not contain. For example, the EU enacted regulations on unfair contractual terms and, recently, the General Data Protection Regulation. The latter, applicable since 25 May 2018, makes clear what information must be presented in privacy policies, and in what form. And yet, our research has shown that, despite the substantive and procedural rules in place, online platforms largely do not abide by the norms concerning terms of service and privacy policies. Why? Among other reasons, there is just too much for the enforcers to check. With virtually thousands of platforms and services out there, the task is overwhelming. NGOs and public agencies might have the competence to verify the ToS and PPs, but lack the actual capability to do so. Consumers have rights, civil society has its mandate, but no one has the time and resources to bring them into application. Battle lost? Not necessarily. We can use AI for this good cause.

The ambition of the CLAUDETTE Project, hosted at the Law Department of the European University Institute in Florence and supported by engineers from the University of Bologna and the University of Modena and Reggio Emilia, is to automate the legal evaluation of terms of service and privacy policies of online platforms using machine learning. The project’s philosophy is to empower consumers and civil society using artificial intelligence. Currently, artificial intelligence tools are used mostly by large corporations and the state. However, we believe that, with the efforts of academia and civil society, AI-powered tools for consumers and NGOs can and should be created. Our most technically advanced tool, described in our recent paper, CLAUDETTE: an Automated Detector of Potentially Unfair Clauses in Online Terms of Service, can detect potentially unfair contractual clauses with 80%-90% accuracy. Such tools can be used both to increase consumers’ autonomy (by telling them what they are accepting) and to increase the efficiency and effectiveness of civil society’s work, by automating big parts of their job.
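For readers curious what such a detector looks like under the hood, here is a minimal, purely illustrative sketch of a sentence-level classifier in Python. It is not the CLAUDETTE architecture itself: the tiny inline dataset stands in for the large, hand-annotated corpus of terms of service the real system learns from, and the model choice (TF-IDF features feeding a linear SVM) is just one plausible baseline among many.

```python
# Illustrative sketch only: a sentence-level "potentially unfair clause" detector.
# The real CLAUDETTE models and annotated corpus differ; the inline data below
# is a placeholder for a large, expert-labeled training set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Placeholder training examples: 1 = potentially unfair, 0 = unproblematic.
sentences = [
    "We may terminate your account at any time without notice.",
    "We may change these terms at our sole discretion at any time.",
    "You can delete your account in the settings menu.",
    "These terms are governed by the laws of your country of residence.",
]
labels = [1, 1, 0, 0]

# TF-IDF word n-gram features feeding a linear SVM classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), lowercase=True)),
    ("clf", LinearSVC()),
])
model.fit(sentences, labels)

# New clauses get flagged for human review, not as final legal judgments.
new_clause = "We reserve the right to suspend the service without prior notice."
print(model.predict([new_clause]))  # e.g. [1] -> potentially unfair
```

The point of the sketch is the workflow, not the model: a classifier trained on expert-annotated clauses flags suspicious sentences so that lawyers, consumer organizations, or regulators can focus their limited review time where it matters.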

Our most recent work has been an attempt to automate the analysis of privacy policies under the GDPR. This project, funded and supported by the European Consumer Organization, has led to the publication of the report: Claudette Meets GDPR: Automating the Evaluation of Privacy Policies Using Artificial Intelligence. Our findings indicate that the task can indeed be automated once a significantly larger learning dataset is created. The learning process was interrupted by major changes in privacy policies undertaken by the majority of online platforms around 25 May 2018, the date when the GDPR became applicable. Nevertheless, the project led us to interesting conclusions.

Doctrinally, we have outlined what requirements a GDPR-compliant privacy policy should meet (comprehensive information, clear language, fair processing), as well as the ways in which these documents can be unlawful (if the required information is insufficient, the language unclear, or potentially unfair processing indicated). Anyone – researchers, policy drafters, journalists – can use these “golden standards” to help them assess existing policies, or draft new ones, compliant with the GDPR.

Empirically, we have analyzed the contents of the privacy policies of Google, Facebook (and Instagram), Amazon, Apple, Microsoft, WhatsApp, Twitter, Uber, AirBnB, Booking.com, Skyscanner, Netflix, Steam and Epic Games. Our normative study indicates that none of the analyzed privacy policies meets the requirements of the GDPR. The evaluated corpus, comprising 3,658 sentences (80,398 words), contains 401 sentences (11.0%) which we marked as containing unclear language and 1,240 sentences (33.9%) that we marked as potentially unlawful clauses, i.e. either a “problematic processing” clause or an “insufficient information” clause (under Articles 13 and 14 of the GDPR). Hence, there is significant room for improvement on the side of business, as well as for action on the side of consumer organizations and supervisory authorities.
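As a quick sanity check, the reported shares follow directly from the raw sentence counts (a trivial computation, shown only to make the percentages explicit):

```python
# Reproducing the reported shares from the raw sentence counts in the corpus.
total = 3658      # sentences in the evaluated corpus
unclear = 401     # sentences marked as unclear language
unlawful = 1240   # sentences marked as potentially unlawful clauses

print(f"unclear language:      {unclear / total:.1%}")   # ~11.0%
print(f"potentially unlawful:  {unlawful / total:.1%}")  # ~33.9%
```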

The post originally appeared at the Machine Lawyering blog of the Centre for Financial Regulation and Economic Development at the Chinese University of Hong Kong