Passenger Name Records, data mining & data protection: the need for strong safeguards


by Douwe KORFF and Marie GEORGES (FREE-Group Members)


Much has been said and written about Passenger Name Records (PNR) in the last decade and a half. When we were asked to write a short report for the Consultative Committee about PNR, “in the wider contexts”, we therefore thought we could confine ourselves to a relatively straightforward overview of the literature and arguments.

However, the task turned out to be more complex than anticipated. In particular, the context has changed as a result of the Snowden revelations. Much of what was said and written about PNR before his exposés had looked at the issues narrowly, as only related to the “identification” of “known or [clearly ‘identified’] suspected terrorists” (and perhaps other major international criminals). However, the most recent details of what US and European authorities are doing, or plan to do, with PNR data show that they are part of the global surveillance operations we now know about.

More specifically, it became clear to us that there is a (partly deliberate?) semantic confusion about this “identification”: the surveillance schemes as a whole are not only about finding previously-identified individuals, but also (and perhaps even mainly) about “mining” the vast amounts of disparate data to create “profiles”, which are used to single out from the vast data stores people “identified” as statistically more likely to be (or even to become?) a terrorist (or other serious criminal), or to be “involved” in some way in terrorism or major crime. That is a different kind of “identification” from the first, as we discuss in this report.

We show this relatively recent (although predicted) development with reference to the most recent developments in the USA, which we believe provide the model for what is being planned (or perhaps already being implemented) also in Europe. In the USA, PNR data are now expressly permitted to be added to and combined with other data, to create the kinds of profiles just mentioned – and our analysis of Article 4 of the proposed EU PNR Directive shows that, on a close reading, exactly the same will be allowed in the EU if the proposal is adopted.

Snowden has revealed much. But it is clear that his knowledge about what the “intelligence” agencies of the USA and the UK (and their allies) are really up to was and is still limited. He clearly had an astonishing amount of access to the data collection side of their operations, especially in relation to Internet and e-communications data (much more than any sensible secret service should ever have allowed a relatively junior contractor, although we must all be grateful for that “error”). However, it would appear that he had and has very little knowledge of what was and is being done with the vast data collections he exposed.

Yet it is obvious (indeed, even from the information about PNR use that we describe) that these data are used not only to “identify” known terrorists or people identified as suspects in the traditional sense: these data mountains are also being “mined” to label people as “suspected terrorists” on the basis of profiles and algorithms. We believe that this is in fact the more insidious aspect of the operations.

This is why this report has become much longer than we had planned, and why it focusses on this wider issue rather than on the narrower concerns about PNR data expressed in most previous reports and studies.

The report is structured as follows. After preliminary remarks about the main topic of the report, PNR data (and related data) (further specified in the Attachment), Part I discusses the wider contexts within which we have analyzed the use of PNR data. We look at both the widest context: the change, over the last fifteen years or so, from reactive to “proactive” and “preventive” law enforcement, and the blurring of the lines between law enforcement and “national security” activities (and between the agencies involved), in particular in relation to terrorism (section I.i); and at the historical (immediately post-“9/11”) and more recent developments relating to the use of PNR data in data mining/profiling operations in the USA, in the “CAPPS” and (now) the “Secure Flight” programmes (section I.ii).

In section I.iii, we discuss the limitations and dangers inherent in such data mining and “profiling”.

Only then do we turn to PNR and Europe by describing, in Part II, both the links between the EU and the US systems (section II.i), and then the question of “strategic surveillance” in Europe (II.ii).

In Part III, we discuss the law, i.e., the general ECHR standards (I); the ECHR standards applied to surveillance in practice (II, with a chart with an overview of the ECtHR considerations); other summaries of the law by the Venice Commission and the FRA (III); and further relevant case-law (IV).

In Part IV, we first apply the standards to EU-third country PNR agreements (IV.i), with reference to the by-passing of the existing agreements by the USA (IV.ii) and to the spreading of demands for PNR to other countries (IV.iii). We then look at the human rights and data protection issues raised by the proposal for an EU PNR scheme. We conclude that part with a summary of the four core issues identified: purpose specification and limitation; the problem with remedies; “respect for human identity”; and the question of whether the processing we identify as our main concern – “dynamic”-algorithm-based data mining and profiling – actually works.

Part V contains a Summary of our findings; our Conclusions (with our overall conclusions set out in a box on p. 109); and tentative, draft Recommendations. (…)


As noted above, we have drawn important conclusions on the use of bulk PNR data in respect of four fundamental issues:

The compulsory suspicionless provision of PNR data in bulk does not serve a legitimate aim:

As already noted, we found that bulk PNR data are not needed for any normal, legitimate law enforcement or border control purpose (API data suffice for those). Rather, we concluded that the demand for bulk PNR data in reality serves either of the two following purposes:

  • pro-active “identification” of “possible suspects”, i.e., the marking of people as “probable criminals” or “possible criminals”, without those people yet being formally categorised as suspects in the criminal law/criminal procedure law sense (i.e., in the absence of any evidence against them that would suffice to properly designate them as formal suspects, in accordance with criminal procedure law); and
  • pro-active “identification” of people for “preventive targeting” on national security grounds, in cases in which no action can (yet) be taken against them under the criminal law,

in both cases on the basis of “dynamic”-algorithm-based data mining and profiling.

In other words, the demands for PNR data are part of an attempt at “predictive policing” or “predictive protection of national security”: the Vorverlegen or “bringing forward” of state intrusion, to “deal” with people who are not (yet) breaking the law, but who are either labelled as “probably” or “possibly” being a terrorist or other criminal, or “predicted” to “probably” (or even “possibly”) become one in future.

In our opinion, it cannot be acceptable in a society under the rule of law that intrusive measures are used to “target” people who have done no wrong: not on the basis that “the computer says” they are at some dubiously-calculated “risk” of doing some wrong in the future, and not on a similarly dubious calculation that they have “possibly” or indeed “probably” been involved in some wrong, in the absence of the kind of evidence (even preliminary evidence) that states under the rule of law require for the imposition of repressive measures.

As the case of Maher Arar shows, being thus labelled on a list is not without consequences – indeed possible extreme consequences.

In other words: dynamic-algorithm-based data mining and profiling with the aim of such predictive or preventive labelling of people on a risk scale is not a legitimate aim in a democratic society, and is therefore fundamentally incompatible with the European Convention on Human Rights and the EU Charter of Fundamental Rights.

This ought to suffice to reject any plans to allow PNR data, or any bulk data on general populations, to be used for large-scale data mining and profiling.

However, we will still also consider the other three fundamental objections mentioned.

There are no effective remedies against the outcomes of dynamic-algorithm-based data mining and profiling:

We have concluded that there are simply no remedies currently available, let alone operational, against the dangers of people being mis-labelled as “high risk” on an anti-terrorist list as a result of deficiencies in the algorithms used, or against discrimination-by-computer caused by the algorithms.

Crucially, you simply cannot remedy such wrongs by “improving” the algorithm, or by adding more data: the dangers are inherent in the processes and can only be countered, if at all, by deep analyses and auditing of the results of the data mining.

There is no indication whatsoever that such deep analyses and audits are actually carried out with the aim of protecting innocent people from being wrongly labelled.

Until such analysis- and audit systems are in place, and are made transparent – with involvement of critical scientists and human rights and data protection advocates – “dynamic” algorithm-based profiling should not be permitted in a state under the rule of law.
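The kind of result-auditing described above can be made concrete with a minimal sketch. All the figures and group names below are assumptions invented purely for illustration; the point is only to show what an audit of a flagging system's outcomes would measure: per-group false-positive rates and the disparity between them.

```python
# Hypothetical flagging outcomes per demographic group (invented numbers,
# purely to illustrate the audit idea; no real data is implied).
outcomes = {
    # group: (innocent people screened, innocents wrongly flagged)
    "group_A": (500_000, 2_500),
    "group_B": (100_000, 1_500),
}

# False-positive rate per group: share of innocents wrongly flagged.
rates = {g: flagged / screened for g, (screened, flagged) in outcomes.items()}
for g, r in rates.items():
    print(f"{g}: false-positive rate {r:.2%}")

# A simple disparate-impact check: ratio of the highest to the lowest rate.
# A ratio well above 1 would indicate the algorithm burdens one group more.
ratio = max(rates.values()) / min(rates.values())
print(f"disparity ratio: {ratio:.1f}x")
```

With these invented figures, group_B's rate is three times group_A's, which is exactly the kind of (possibly unintentional) discriminatory outcome that only an audit of results, not an inspection of the algorithm itself, would reveal.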

In simple human rights and data protection terms: there are no effective remedies available against anti-terrorist/national security dynamic-algorithm-based data mining and profiling; and without such remedies, such operations are simply not compatible with the European Convention on Human Rights, the EU Charter of Fundamental Rights, or the Council of Europe Data Protection Convention.

Or to put it at its absolute mildest:
The conclusion must be that either dynamically-improved algorithms should be regarded as intrinsically contrary to the ECHR, because they cannot be properly controlled; or that actually effective means of controlling them must be found, e.g., checks on how reliable the application of the algorithms is: how many false positives and how many false negatives did they generate? And were the results (unintentionally) discriminatory?
As noted in the report, that is a much bigger challenge than is acknowledged by the proponents of those systems.

Dynamic-algorithm-based data mining and profiling, in particular if aimed at rating people on a risk scale on an anti-terrorist list, violates the most fundamental duty of the State and the EU to respect human identity:

We believe that “preventive” or “predictive” profiling of individuals on the basis of essentially unverifiable and unchallengeable “dynamic”-algorithm-based bulk data, unrelated to any specific indications of wrongdoing and without any targeting on the basis of such indications, touches on the “essence”, the untouchable core, of the right to privacy – and indeed violates the even more fundamental principle underpinning the right to privacy (and other rights): that states must respect “human identity”.

In our opinion, the PNR instruments allowing for such data mining and profiling are thus, on this basis too, incompatible with European legal principles of the most fundamental kind.

Trying to identify possible or probable terrorists by means of dynamic-algorithm-based data mining and profiling does not work:

Profiling and mining large datasets with the aim of “identifying” rare phenomena, such as the small number of terrorists in the general population (or even in more specific populations), inevitably suffer from the “base rate fallacy”, leading to unacceptably high numbers of “false positives” (people wrongly labelled a “possible” or “probable” terrorist, or generally as “high risk”), or “false negatives” (actual terrorists not being identified), or both.
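The base rate fallacy can be shown with a short worked example. The figures below are assumptions chosen only to illustrate the arithmetic: even a screening algorithm that is 99% accurate in both directions, applied to a population in which terrorists are very rare, flags overwhelmingly more innocent people than actual terrorists.

```python
# Illustrative assumptions (not real figures): 1,000,000 passengers,
# of whom 100 are actual terrorists, screened by an algorithm that is
# 99% accurate both ways.
population = 1_000_000
actual_terrorists = 100
sensitivity = 0.99          # P(flagged | terrorist)
false_positive_rate = 0.01  # P(flagged | innocent)

true_positives = actual_terrorists * sensitivity
false_positives = (population - actual_terrorists) * false_positive_rate

# Precision: of all the people flagged, what fraction are real terrorists?
precision = true_positives / (true_positives + false_positives)

print(f"innocents wrongly flagged: {false_positives:,.0f}")
print(f"terrorists correctly flagged: {true_positives:,.0f}")
print(f"chance a flagged person is actually a terrorist: {precision:.2%}")
```

Under these assumptions, roughly ten thousand innocent passengers are flagged for every hundred real terrorists: fewer than one in a hundred “hits” is a true positive, however impressive the algorithm's headline accuracy sounds.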

It has been acknowledged by the US National Research Council and others that the US data mining operations have not stopped any terrorist attack.

The EU Member States and the European Commission have failed to provide any serious, scientifically verifiable data in support of their claims that bulk PNR data do work in identifying terrorists, or indeed that other bulk datasets, specifically compulsorily retained communications data, have had any impact on law enforcement clear-up rates.

The largest and most serious study into possible efficacy of bulk data retention, by the Max Planck Institute at the request of the European Commission, discussed in Part xxx of the report, found that:

there are no indications that compulsory suspicionless [e-communications] data retention has in the last years led to the prevention of any terrorist attack.

There is still no serious effort on the part of those who clamour, not just for continuing communications data retention, but also for further bulk “just-in-case” collections, such as the compulsory provision of full PNR data, to actually provide any serious, meaningful, scientifically valid evidence to show the efficacy of the measures in fighting serious crime or terrorism.

Yet under the ECHR and the EU Charter, the onus is on them to show convincing evidence of the effectiveness of bulk data collection and analyses. This duty is all the more onerous in view of the very serious interferences with human rights inherent in such collection and analyses (as noted above).

The fact that they have not provided any such evidence, in our opinion, simply underlines the scientific doubts about the efficacy of data mining in these regards: the proponents of bulk data collection, mining and profiling do not provide any real evidence of the efficacy of their “dynamic”-algorithm-based systems, because those systems simply DO NOT WORK.

This ought to suffice in simple practical terms to abandon these highly-intrusive and dangerous efforts. But in more legal terms, it means dynamic-algorithm-based data mining and profiling are simply not appropriate, not suited to the proclaimed aim of identifying terrorists from large datasets and thus also not necessary or proportionate in relation to any legitimate law enforcement or anti-terrorist actions.

In other words, our overall conclusions are that:

  • The compulsory suspicionless provision of PNR data in bulk does not serve a legitimate aim;
  • There are no effective remedies against the outcomes of dynamic-algorithm-based data mining and profiling;
  • Dynamic-algorithm-based data mining and profiling, in particular if aimed at rating people on a risk scale on an anti-terrorist list, violates the most fundamental duty of the State and the EU to respect human identity;

and on top of that:

  • Trying to identify possible or probable terrorists by means of dynamic-algorithm-based data mining and profiling does not work.


NB: We have been asked by the Consultative Committee to draft recommendations that the Committee itself might wish to adopt. We provide a number of those below. However, it is of course entirely up to the Committee to decide whether to make any of these draft, tentative recommendations its own.

The Consultative Committee recalls that European human rights and data protection law requires, inter alia, that:

  • All requirements that personal data be provided to law enforcement, border control or national security agencies “in bulk” should be set out in clear and precise statute law; and all subsidiary rules that are necessary to enable individuals to foresee the application of the statutory rules should be equally clear, and made public. Only the lowest, operational guidance-type rules might be kept secret, and even then only as long as they do not contradict or obscure the application of the published rules. This also applies to any requirements that PNR data be handed over to state (or international) authorities in bulk;
  • The application of all those rules in practice should be subject to serious, meaningful transparency and accountability; and that
  • There should be full and effective remedies against the use of bulk data, including bulk PNR data, in “general surveillance”.

In that regard, the Consultative Committee notes that the Secretary-General of the Council of Europe has been urged, inter alia, by the Parliamentary Assembly of the Council of Europe, to use his power under Article 52 of the European Convention to demand that all CoE Member States provide a full account of any “general surveillance” of the kind exposed by Edward Snowden that they may be involved in, with clarification of how this accords with their obligations under the ECHR.

The Consultative Committee supports this call, and recommends that when the Secretary-General does issue such a demand, he specifically also asks the Member States:

  • whether they use any bulk data they acquire for any data mining and profiling in order to “identify” “possible” (or “probable”) terrorists – with full clarification of what exactly this “identification” entails (i.e., whether it merely involves matching PNR data against lists of “known” people, or whether it involves rating people on “risk scales” that are reflected in anti-terrorist databases);
  • what safeguards are in place against straightforward mis-identifications on such lists;

but also especially:

  • how they guard against erroneous risk ratings of this kind; and why they believe any such redress and remedial action is effective.

Pending the provision of information that might lead to another conclusion, the Consultative Committee believes that the use of “dynamic”-algorithm-based data mining and profiling with the aim of “predictive” or “preventive” labelling of people on a “risk scale” is not a legitimate aim in a democratic society; touches on the essence, the untouchable core, of the right to private life and the right to data protection; and would appear to be unsuited to the aim of actually identifying real terrorists – and thus neither necessary nor proportionate to that aim. It is therefore fundamentally incompatible with the European Convention on Human Rights, the EU Charter of Fundamental Rights, and the Council of Europe Data Protection Convention, of which the Committee is a guardian;

And therefore recommends:

  • That “dynamic”-algorithm-based data mining and profiling for the purpose of “identifying” “possible” (or “probable”) terrorists on the basis of a computer assessment by any State Party to the Data Protection Convention be stopped immediately; and
  • That the passing on of PNR data to any non-State Party for the purpose of such “dynamic”-algorithm-based profiling, or in circumstances that may result in the use of the data in such processing by the non-State Party, also be stopped; and
  • That serious scientific studies be commissioned as a matter of urgency from appropriate independent scientists, with the involvement of human rights and data protection advocates and civil society, to evaluate the effectiveness or ineffectiveness of such processes for such purposes, in particular also in terms of “false positives” and “false negatives”, and in relation to the question of whether such data mining and profiling can lead, or has led, to discriminatory outcomes; and to examine whether effective, scientifically sound means can be developed to counter such negative outcomes (or whether this is impossible).
