Thursday, 11 August 2016

How secondary data got its name

We are transported back in time to early 2016. The draft Investigatory Powers Bill has been through its pre-legislative Parliamentary scrutiny. Somewhere in Whitehall a committee is discussing drafting changes.

- “Next item please.

- ‘Related communications data’. A bit ticklish, this one.

- What seems to be the problem?

- We carried over ‘Related communications data’ from RIPA, but built on it.

- By built on, you mean expanded?

- Yes.

- How much did you build on it?

- Quite a lot. In RIPA related communications data was a subset of communications data.

- So we would expect. A qualifier such as ‘related’ should limit the scope of the main defined term.

- Yes. But in the draft Bill we ended up with a superset, not a subset.

- You mean broader than communications data?

- Yes.

- So related communications data now includes data that is not communications data?

- Yes.

- This could be serious. We had enough trouble with ‘Data includes any information that is not data’.

- Sorry about that.

- We had better find a new name.

- We’ve had an idea. We have focused on promoting the new culture of clarity, openness and transparency.

- I see. What have you come up with?

- Well, this data is revealing about people’s daily lives. ‘Lifestyle information’ would sum it up nicely.

- This data is collected by GCHQ isn’t it?

- Yes indeed. It can be more useful than content. And the Bill imposes fewer controls than for content on how they use it.

- I see. I wonder if ‘Lifestyle information’ is quite what we are looking for.

- But clarity, openness, transparency….

- Of course. But this is legislation, not a press release. Doesn’t this data include machine communications? Not much life there.

- Something a bit more neutral, then?

- Perhaps. Suggestions anyone?

- The Intelligence and Security Committee said that information associated with communications was the primary value of bulk interception. How about ‘primary data’?

- Pleasingly abstract, has a certain logic. But not quite there, I fear.

- Well, if not primary it must be secondary. Ha! Ha!

- ‘Secondary data’. Perfect. Next item please.”


 

Tuesday, 19 July 2016

Data retention - the Advocate General opines

The long-running court battles over compelling internet service providers to retain data about their users’ internet communications for the benefit of law enforcement took another turn today, with the publication of the Advocate General’s Opinion in the Watson/Tele2 references to the EU Court of Justice.

The litigation has implications both for existing data retention laws in the UK, Sweden and elsewhere in the EU and for the UK Investigatory Powers Bill currently going through Parliament, which significantly expands the government’s data retention powers.

The Advocate General's view is that generalised data retention may be permissible, but only subject to a series of conditions:

  • The general obligation to retain data and the accompanying guarantees must be laid down by legislative or regulatory measures possessing the characteristics of accessibility, foreseeability and adequate protection against arbitrary interference. (This articulates the well-known rule of law requirement of legality.)
  • The obligation must respect the essence of the rights to private life and data protection under the Charter. (The reference to the 'essence' of the right is significant. If the essence of a right is violated then the interference is unlawful, regardless of necessity or proportionality.)
  • The only general interest that is capable of justifying a general obligation to retain data is the fight against serious crime. Ordinary offences and the smooth conduct of proceedings other than criminal proceedings would not be capable of justifying a general retention obligation. (This potentially has implications for the Investigatory Powers Bill, which specifies 11 different purposes for which communications data, including mandatorily retained data, can be accessed, as well as access via bulk acquisition warrants.)
  • The general obligation must be strictly necessary to the fight against serious crime. The conditions set out in the CJEU case of Digital Rights Ireland regarding access to the data, the retention period and protection and security of the data must all be respected.
  • The general retention obligation must be proportionate, so that the serious risks engendered by the obligation must not be disproportionate to the advantages offered in the fight against serious crime.

Implications for the Investigatory Powers Bill

What may the case mean for the Investigatory Powers Bill? This stage is only an Advocate General's Opinion, which does not bind the Court. There is no guarantee that the Court of Justice will come to the same conclusion, although more often than not it does so. 

The most obvious potential issues for the Bill would be:

(1) The restriction of a general data retention obligation to combating serious crime (which according to the Advocate General would apply both to the purpose for which the data is required to be retained and to access to the data). The Bill would allow access to mandatorily retained communications data for a variety of purposes, which are not limited to serious crime.

(2) A requirement for independent prior review of access to mandatorily retained communications data. For most access to communications data the Bill would not require prior independent review.

(3) The emphasis on binding legislative measures. This could put in doubt the extent to which the Bill relies on Codes of Practice. Although the government contends that Codes of Practice have statutory force, they do so only to the extent of the status conferred by Schedule 7 para 6 of the Bill. Codes of Practice do not have the same force as statute.

The Bill also extends mandatory data retention into site-level web browsing histories, so-called internet connection records. These are not specifically addressed in the current litigation. The government acknowledges that these are more intrusive than ordinary communications data. This expansion may provide further grounds of challenge, whatever the final decision of the Court in Watson/Tele2.

Further grounds could include not only privacy and data protection issues, but also intrusion into freedom of expression. It might be argued that to the extent that internet connection records mandate the logging of reading habits (the equivalent of lists of book titles) the Bill strays from communications data into content; that doing so interferes with the very essence of freedom of expression and is thus per se unlawful with no need to consider necessity or proportionality.

Background

In the Watson case former claimant David Davis MP (now withdrawn from the case on account of becoming Brexit Minister) and Tom Watson MP (now Deputy Leader of the Labour Party), with co-claimants Peter Brice and Geoffrey Lewis, sued the Home Secretary, challenging the Data Retention and Investigatory Powers Act (DRIPA) which the government pushed through Parliament in four days in July 2014.

DRIPA sought to re-enact in primary legislation the 2009 Data Retention Regulations. These implemented the EU Data Retention Directive and were vulnerable to challenge following the April 2014 CJEU decision in Digital Rights Ireland, invalidating the Directive as contrary to Articles 7 (privacy) and 8 (data protection) of the EU Charter of Fundamental Rights.

The Tele2 case challenges existing Swedish data retention legislation following the DRI decision.

The Swedish and UK cases have been joined and heard together. The Tele2 reference asks broadly whether generalised traffic data retention laws are compatible with EU law and follows up with questions about the specifics of the Swedish legislation. The Watson reference asks two specific questions: first whether the DRI judgment laid down requirements applicable to a national regime for retention of and access to communications data; and second whether Articles 7 and 8 of the Charter lay down stricter requirements than Article 8 of the European Convention on Human Rights.

The Advocate General's Opinion in detail

In the Advocate General's view the second Watson question should be rejected as inadmissible: "The fact that [the DRI] judgment may possibly have extended the scope of Articles 7 and/or Article 8 of the Charter beyond that of Article 8 of the ECHR is not in itself relevant to the resolution of those disputes… EU law does not preclude Articles 7 and 8 of the Charter from providing more extensive protection than that provided for in the ECHR."

As to compatibility of the Swedish and UK regimes with EU law, the Advocate General's view is:

- Data retention obligations are within scope of the Privacy and Electronic Communications Directive (PECR), so must be considered within the regime established by that Directive, in particular the exception provided by Article 15(1). [97]

- The EU Charter of Fundamental Rights is applicable to general data retention obligations since they implement the Article 15(1) exception, even though national provisions governing access to retained data do not fall within the Charter. [122], [123]. The AG goes on:
"124. Admittedly, to the extent that they concern ‘activities of the State in areas of criminal law’, national provisions governing the access of police and judicial authorities to retained data for the purpose of fighting serious crime fall, in my opinion, within the scope of the exclusion laid down in Article 1(3) of Directive 2002/58. Consequently, national provisions of that kind do not implement EU law and the Charter therefore does not apply to them.
125. Nevertheless, the raison d’être of a data retention obligation is to enable law enforcement authorities to access the data retained, and so the issue of the retention of data cannot be entirely separated from the issue of access to that data. As the Commission has rightly emphasised, provisions governing access are of decisive importance when assessing the compatibility with the Charter of provisions introducing a general data retention obligation in implementation of Article 15(1) of Directive 2002/58. More precisely, provisions governing access must be taken into account in the assessment of the necessity and proportionality of such an obligation."
- General data retention obligations are consistent with the PECR regime, but only if compliant with strict requirements which flow from Article 15(1) and from the Charter read in the light of DRI. [116] The Article 15(1) exception permits restrictive "legislative measures" that constitute:
"a necessary, appropriate and proportionate measure within a democratic society to safeguard national security (i.e. State security), defence, public security, and the prevention, investigation, detection and prosecution of criminal offences or of unauthorised use of the electronic communication system… To this end, Member States may, inter alia, adopt legislative measures providing for the retention of data for a limited period justified on the grounds laid down in this paragraph. All the measures referred to in this paragraph shall be in accordance with the general principles of Community law…".
As to the requirements flowing from Article 15(1) and the Charter read in the light of DRI:

- The requirements of PECR Article 15(1) and the Charter are cumulative: "Compliance with the requirements laid down in Article 15(1) of Directive 2002/58 does not in itself mean that the requirements laid down in Article 52(1) of the Charter are also satisfied, and vice versa." [131]

- 'Legislative measures' in Article 15(1) must have the characteristics of accessibility, foreseeability and providing adequate protection against arbitrary interference. The measures must therefore be binding on the national authorities upon which the power to access the retained data is conferred [150]:
"It would not be sufficient, for example, if the safeguards surrounding access to data were provided for in codes of practice or internal guidelines having no binding effect, as the Law Society of England and Wales has rightly pointed out. [150]
Moreover, the words ‘Member States may adopt … measures’, which are common to all the language versions of the first sentence of Article 15(1) of Directive 2002/58, seem to me to exclude the possibility of national caselaw, even settled caselaw, providing a sufficient legal basis for the implementation of that provision. I would emphasise that, in this respect, the provision is more stringent than the requirements arising from the caselaw of the European Court of Human Rights. [151]"
- General data retention obligations are capable of being justified by the objective of fighting serious crime, but not combating ordinary crime or the smooth conduct of non-criminal proceedings. [164], [173] Appropriateness, necessity and proportionality of such obligations have to be assessed with reference to that objective [174].

- General data retention obligations do not of themselves go beyond what is strictly necessary for the purposes of fighting serious crime. Necessity is to be assessed in conjunction with the safeguards concerning access to the data, period of retention and the protection and security of the data. [194], [205]

- It is imperative that national courts, when assessing necessity, do not "simply verify the mere utility of general data retention obligations, but rigorously verify that no other measure or combination of measures, such as a targeted data retention obligation accompanied by other investigatory tools, can be as effective in the fight against serious crime." [209] National courts should also determine whether an effective alternative measure would interfere with fundamental rights to a lesser extent than a general data retention obligation [210]; and should consider whether the substantive scope of a retention obligation can be limited while preserving its effectiveness in the fight against serious crime. [211]

- All the safeguards described by the CJEU in paras [60] to [68] of DRI are mandatory. They are not merely illustrative. [221], [226]

In particular:

- "access to and the subsequent use of the retained data must be strictly restricted to the purpose of preventing and detecting precisely defined serious offences or of conducting criminal prosecutions relating thereto." [229] (This is much more limited than the access permitted under either DRIPA or the Investigatory Powers Bill.)

- Access to the retained data should, other than in cases of extreme urgency, be made dependent on a prior independent review by a court or independent administrative body. [232] et seq. The current (RIPA) and proposed (IP Bill) regimes for access to retained communications data for the most part do not comply with this.

When discussing proportionality (a matter to be assessed by the national court) the Advocate General emphasised that:

"the risks associated with access to communications data (or ‘metadata’) may be as great or even greater than those arising from access to the content of communications, as has been pointed out by Open Rights Group, Privacy International and the Law Society of England and Wales, as well as in a recent report by the United Nations High Commissioner for Human Rights. In particular, as the examples I have given demonstrate, ‘metadata’ facilitate the almost instantaneous cataloguing of entire populations, something which the content of communications does not." [259]
He also emphasised that compliance with the mandatory DRI safeguards does not guarantee proportionality:
"I would emphasise, in this connection, that the mandatory safeguards described by the Court in paragraphs 60 to 68 of Digital Rights Ireland are no more than minimum safeguards aimed at limiting the interference with the rights enshrined in Directive 2002/58 and Articles 7 and 8 of the Charter to what is strictly necessary. Consequently, a national regime which includes all of those safeguards may nevertheless be considered disproportionate, within a democratic society, as a result of a lack of proportion between the serious risks engendered by such an obligation, in a democratic society, and the advantages it offers in the fight against serious crime."

Sunday, 12 June 2016

The List

"Excuse me, Madam.

"Yes?

"I see you’re reading a book.

"Not a crime is it, officer?

"Not usually.

"So…

"Have you put it on the List?

"What list would that be?

"At your local library. You have to register a list of all your reading material at least once a month.

"Is it 1st April today?

"This is no laughing matter, Madam. The List is a vital tool in the fight against terrorists and paedophiles. Haven’t you seen the posters: “The List will keep you safe.”?

"No, I’ve been abroad. Please tell me what’s been going on.

"You’ve heard of the Investigatory Powers Act?

"Yes.

"Then you’ll know that the Home Secretary can require internet service providers to keep records of all the websites that we visit for up to 12 months.

"Yes.

"Somebody pointed out that this wasn’t technology neutral. It records your online but not your paper reading habits.

"I can see how that would offend a tidy mind.

"What’s more there was growing evidence that people who wanted to do us harm were getting around the legislation by going low tech. They were reading paper instead of using the internet.

"Crafty devils.

"Quite. Everyone agreed that a law that applied online should apply offline. The Investigatory Powers (Amendment) Act plugged the gap. So now we have the List.

"Seems pretty pointless. Your average terrorist is hardly going to write down what they have been reading, are they?

"Parliament thought of that, Madam. You see this badge?

"“Book Registration Unit”.

"Our mission is to take all necessary and proportionate steps to ensure compliance with measures that are vital to keeping you and your family and loved ones safe.

"What does that mean?

"It means we can carry out random spot checks on anyone we see reading a book or newspaper in public. And we conduct carefully targeted book raids on homes and businesses. We have the power to confiscate unregistered reading material.

"How do you know whether what you find has been read?

"If they can prove that a book hasn’t been read, then of course we would take no further action.

"I thought book licensing went out with John Milton.

"Madam, I think you have misunderstood. We do not license books. Anyone can publish a book. This is only about reading – no more than an administrative notification procedure. Libraries already keep a record of books that people borrow, so it was natural that they should administer this scheme.

"Who can look at this list?

"The same categories of authorities that can look at website records.

"That’s nearly 50 isn’t it?

"More or less.

"And for the same purposes?

"Yes. So no one has any reason to be concerned about registering their books on the list. Access by the authorities is carefully regulated for limited purposes and subject to stringent safeguards.

"But there is no need for a court order or independent judicial approval?

"No.

"I see. Has no-one objected?

"There were a few protests, but of course it already applied to the internet so they got nowhere.

"ISP lists of websites are automatically generated. Hardly surprising that no-one cared about that. Out of sight, out of mind.

"Possibly. Returning to the matter at hand Madam, have you put that book on the List?

"Of course not.

"I’m sorry then, you’ll have to come with me.


Thursday, 26 May 2016

The content v metadata contest at the heart of the Investigatory Powers Bill

After more than 30 hours of Commons Committee debate and 1,000 or so proposed Opposition amendments, the Investigatory Powers Bill is moving on to its Report stage. Now is a good time to revisit one of the most fundamental points in the Bill: the dividing line between content and metadata.

This is especially topical in view of reports that David Anderson Q.C. is to undertake an independent review of the operational case for bulk powers. As will become apparent the dividing line is not the same for every power. This has particular relevance for bulk interception and equipment interference. 



Sensitivity of content and metadata

Why does the distinction between content and metadata matter? 

The government’s position, which finds support in human rights law, is that intercepting, acquiring, processing and examining the content of a communication is more intrusive than for the “who, when, where, how” contextual data wrapped around it.

Others argue that as metadata has become ever richer and more revealing, thanks to the mobile internet and smartphone apps, so the difference in intrusiveness has become less marked. We know from the Report of the Intelligence and Security Committee of Parliament in March 2015 that the intelligence agencies value metadata as much as, if not more than, content.

Be that as it may, under the Bill fewer safeguards and constraints apply to selection and examination of metadata than to content.


Content and metadata separation

Where to draw a line between content and metadata is not necessarily obvious. There is no assurance, come the inevitable human rights scrutiny, that courts applying ECHR Articles 8 and 10 or the EU Charter will draw a dividing line in the same place as domestic legislation.

In fact the Bill creates different dividing lines between content and metadata for different purposes: one version for mandatory retention and acquisition of communications data from service providers and another for communications interception and equipment interference. The latter designates more information as metadata and less as content.

This is perhaps not wholly surprising, since the Anderson Review (10.28) was sympathetic to the usefulness of content-derived metadata. Whether the possible extent of the change wrought by the Bill is generally appreciated is another matter.


Consequences of the demarcation

The demarcation between content and metadata has significant practical consequences. The Snowden disclosures suggest that GCHQ has bulk intercepted and stored metadata by the tens of billions of records. Even where such 'related communications data' (the term used in the Regulation of Investigatory Powers Act (RIPA)) is gathered as the by-product of an overseas-focused bulk interception campaign, the agency is able to look in the resulting metadata pool for information about people known to be in the British Islands.

Under RIPA it cannot do that for content without obtaining special Ministerial authorisation. Under the Bill that would need a targeted examination warrant. In Committee the government resisted an amendment that would have extended the requirement for a targeted examination warrant to include metadata as well as content. 

Does Parliament have enough information to know where the line is drawn?

The Commons Science and Technology Committee and a Joint Parliamentary Committee scrutinised the draft Bill. Neither seemed confident that it (or anyone else) understood where the legislation drew the line between content and metadata. 

The Joint Committee identified the definitions of communications data and content as one of the most common concerns among witnesses. Its Recommendation 1 said:

“Parliament will need to look again at this issue when the Bill is introduced. We urge the Government to undertake further consultation with communications service providers, oversight bodies and others to ascertain whether the definitions are sufficiently clear to those who will have to use them.”

For bulk interception the Committee noted the concerns of witnesses about the distinction between ‘related communications data’ and content. It recorded my own suggestion that:

“The Home Office could usefully produce a comprehensive list of datatype examples, where appropriate with explanations of context, categorised as to whether the Home Office believes that each would be entity data, events data, contents of a communication, data capable of being related communications data when extracted from the contents of a communication and so on.”

The Science and Technology Committee had previously noted that the government, in seeking to future-proof the legislation, had produced definitions that had led to significant confusion on the part of communications service providers and others. It said that definitions such as ‘communications content’ needed to be clarified as a matter of urgency.

The closest that the Home Office has come to producing a systematic analysis is in Annex A to evidence submitted to the Joint Committee, categorising a selection of datatypes. This fell too late to be considered by most witnesses and was light on analysis of why particular items fell on one side of the line or the other.

Since then, the Bill as introduced into Parliament in March 2016 has revised some of the definitions. Most significantly it replaces 'related communications data' with ‘secondary data’. This, explained the government:

“[makes] clear that it is broader than communications data. This clarifies the distinction between this type of data and the narrower class of data available under a communications data authorisation.” (emphasis added)

The government published draft Codes of Practice alongside the Bill. In principle the wealth of explanation in these and other sources – Explanatory Notes, Home Office evidence, fact sheets, operational cases, Ministerial statements in Committee, Home Office letters to the Committee and so on – should help us understand where the dividing line lies.

How does the Bill draw the line?

Any attempt to draw a line between content and metadata has to avoid circularity: “Why is this information not content?” “Because it is less sensitive.” “Why is this information less sensitive?” “Because it is not content.”

The Bill's new definition of content (there is no existing definition in RIPA) turns on whether data reveals anything of what might reasonably be considered to be the meaning (if any) of a communication. The Joint Committee commented on the draft Bill:



The impression of having to perform metaphysical gymnastics is bolstered when we are introduced to the concept of ‘inferred meaning’. Paragraph 2.14 of the draft Interception Code of Practice says:

“There are two exceptions to the definition of content set out in section 223(6). The first is there to address inferred meaning. When a communication is sent, the simple fact of the communication conveys some meaning, e.g. it can provide a link between persons or between a person and a service. This exception makes clear that any communications data associated with the communication remains communications data and the fact that some meaning can be inferred from it does not make it content.”

If anything this confirms Paul Bernal’s concern that since meaning can be derived from almost any data, a dividing line based on the existence of meaning is problematic.

What is the practical result of the Bill’s definitions? 

Since the Bill draws the line in different places for different purposes the practical result depends on which set of definitions is used. One set applies to interception and equipment interference, the other to retention and acquisition of communications data. 

The interception variety of metadata is ‘secondary data’. For equipment interference it is the similar ‘equipment data’. Both consist of either ‘systems data’ or ‘identifying data’. Systems data is a critical definition, since S.223(6) lays down that if something is systems data it cannot be content.

The overriding nature of the systems data definition relieves the intercepting or interfering agency of the need to grapple with questions of the ‘meaning’ of the communication. The draft Interception Code of Practice notes that in practice the agency will only have to decide whether information fits within the definition of systems data. If so, it cannot be content even if it reveals some of the meaning of the communication.

The Bill will also enable ‘identifying data’ to be extracted from the contents of a communication and treated as secondary data. Under RIPA, information such as an e-mail address embedded in a web page is treated as content. Under the Bill, intercepting and interfering agencies would be able to scrape such data from the body of a communication and treat it as metadata.

For retention and acquisition of communications data, metadata is either ‘entity data’ or ‘events data’. Here the position is reversed: content takes precedence. If information reveals anything of the meaning of the communication (beyond the mere fact or transmission of the communication) then for these purposes it is content, even if for interception or equipment interference purposes it would be systems data. The ‘identifying data’ scraping exception does not apply.

The result is that some types of information may be treated as metadata for the purposes of interception and equipment interference, but as content for the purposes of communications data retention and acquisition.

This overlap of content and metadata is not merely theoretical. The draft Communications Data Code of Practice suggests that some communications may consist entirely of systems data (and thus be deemed to contain no content). The draft Equipment Interference Code of Practice gives the example of machine to machine messages between items of network infrastructure to enable the system to manage the flow of communications. 
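The two orders of precedence described above can be illustrated with a toy decision procedure. This is a deliberately crude sketch, not the statutory test: the `Item` class, its two yes/no attributes and both function names are my own assumptions for the purpose of illustration, and the ‘identifying data’ extraction exception is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A piece of information reduced to two hypothetical yes/no attributes."""
    is_systems_data: bool  # assumed: it enables or facilitates a system or service to function
    reveals_meaning: bool  # assumed: it reveals something of the meaning of the communication


def is_content_for_interception(item: Item) -> bool:
    # Interception and equipment interference: systems data takes precedence,
    # so systems data is never content even if it reveals some meaning.
    if item.is_systems_data:
        return False
    return item.reveals_meaning


def is_content_for_communications_data(item: Item) -> bool:
    # Retention and acquisition: content takes precedence, so anything that
    # reveals meaning is content whether or not it is also systems data.
    return item.reveals_meaning


overlap = Item(is_systems_data=True, reveals_meaning=True)
print(is_content_for_interception(overlap))         # False
print(is_content_for_communications_data(overlap))  # True
```

The same item comes out as metadata under one set of definitions and as content under the other, which is the overlap at issue.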

Testing the content/metadata dividing line

The most comprehensive way of testing the dividing line between content and metadata is to take a large number of examples of different types of information and assess on which side of the line they would fall.

I have adopted a different approach: take a short e-mail and evaluate which of its components might count as content and which as metadata.

For this exercise I have used the version of the dividing line that contrasts content with ‘secondary data’. This applies to targeted, thematic and bulk interception warrants. It replaces ‘related communications data’ under RIPA. As we have seen, ‘secondary data’ is generally broader than the ‘communications data’ definition used for mandatory retention and acquisition.

Here is my sample e-mail.



An initial impression is probably that the From/To and Sent fields are metadata and everything else is content. Indeed that is the current position under RIPA. When we turn to the Bill however, things seem to be rather different. It appears that most of the e-mail may be either systems data, or identifying data that can be extracted and treated as metadata.

Of course only the visible parts of the e-mail are shown. More datatypes will be lurking in the header. Depending on exactly what they contain, those are likely to be secondary data.

To understand how what looks like e-mail content can become metadata, we need to delve more deeply into the definition of 'secondary data'.

What is secondary data?

S.120 of the Bill provides that secondary data, in relation to any communication transmitted by means of a telecommunication system, means any data falling within either of two subsections:

Subsection (4) is systems data which is comprised in, included as part of, attached to or logically associated with the communication (whether by the sender or otherwise). In general terms systems data is data that enables or facilitates a telecommunication system or service, a system holding a communication, or a service provided by such a system, to function. It is not limited to the system that is conveying the communication in question. For a graphical representation of the full definition of systems data, see here.

Subsection (5) concerns identifying data. Like systems data it must be comprised in, included as part of, attached to or logically associated with the communication (whether by the sender or otherwise). Unlike systems data it must also be capable of being logically separated from the remainder of the communication; and, if it were separated, must not “reveal anything of what might reasonably be considered to be the meaning (if any) of the communication, disregarding any meaning arising from the fact of the communication or from any data relating to the transmission of the communication.”
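The structure of the two limbs may be easier to see when written out as a predicate. What follows is a rough sketch under stated assumptions: each boolean field of the hypothetical `Datum` class stands in for a statutory phrase that in reality calls for legal judgment, and none of the names comes from the Bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    # All five flags are hypothetical simplifications of the statutory wording.
    associated_with_communication: bool  # comprised in, part of, attached to or logically associated with it
    is_systems_data: bool
    is_identifying_data: bool
    logically_separable: bool            # capable of being separated from the remainder
    reveals_meaning_if_separated: bool   # ignoring meaning arising from the mere fact or transmission


def is_secondary_data(d: Datum) -> bool:
    # Both limbs require the data to be associated with the communication.
    if not d.associated_with_communication:
        return False
    # Subsection (4): systems data qualifies without any further condition.
    systems_limb = d.is_systems_data
    # Subsection (5): identifying data must also be separable and, once
    # separated, must not reveal anything of the meaning of the communication.
    identifying_limb = (
        d.is_identifying_data
        and d.logically_separable
        and not d.reveals_meaning_if_separated
    )
    return systems_limb or identifying_limb
```

On this reading the conditions in subsection (5) are cumulative, whereas subsection (4) has no equivalent 'meaning' condition.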

This last condition mirrors the Bill’s general definition of content. It raises the perplexing question of what (and how much) information can be extracted from the content of a communication without revealing anything of the meaning of the communication. Examples given in the Explanatory Notes include:

  • the location of a meeting in a calendar appointment; 
  • photograph information - such as the time/date and location it was taken; and 
  • contact 'mailto' addresses within a webpage

The first two of these examples reveal a possibly surprising feature of identifying data. The data can, it seems, relate to matters such as a real world meeting or the taking of a photograph that are not an aspect of a communication.

This conclusion follows from the definition of ‘identifying data’, which includes data which may be used to identify any person, apparatus, system or service, any event, or the location of any person, event or thing. Events are, apparently, not limited to events forming part of the use of a communications system. Data may relate to the fact of the event, the type, method or pattern of event, or the time or duration of the event. For a graphical representation of the full definition of identifying data, see here.

The Home Office in its evidence to the Joint Committee said: “It is also possible for certain structured data types to be extracted from the content of a communication”. In the Bill neither the systems data nor identifying data definitions appear to be restricted to structured data (and the definition of ‘data’ is certainly not limited in that way).

Identifying data must be capable of being logically separated from the content of the communication. Does that imply some element of structure in the extractable data? It may just mean that physical separation is unnecessary. In the Bill Committee on 12 April 2016 the Minister said: “For example, if there are email addresses embedded in a webpage, those could be extracted as identifying data.”

Another conundrum is whether each item of identifying data has to be evaluated separately in determining whether it reveals anything of the meaning of the communication, or whether extracted items of identifying data should be considered cumulatively.

For the purposes of analysing my sample e-mail I have assumed that, as far as the Bill is concerned, unstructured information can be “logically separated” from the rest of the communication (whether that is technically possible is another matter); and that extracted elements of identifying data are not considered cumulatively. These are points on which further elucidation would be desirable.

Analysis of sample e-mail

Below is a marked up version of the e-mail. All the highlighted text could, it seems, be either systems data (yellow) or identifying data (orange). 



The “From”, “To” and “Sent” fields fit the definition of systems data, as data facilitating the functioning of a telecommunications service. This is unsurprising and corresponds to the existing position under RIPA.

An e-mail 'Subject' line is content. However, as the draft Equipment Interference Code of Practice explains in relation to equipment data, elements of the subject line may be capable of being extracted and treated as metadata: “the text in the subject line would not be equipment data (unless separated as identifying data).”

So consider “last night’s call”. “call” appears to be identifying data, since it identifies both the fact and type of an event (s.225(2)(b), (3)(a) and (b)). “last night’s” relates to the time of the event (s.225(3)(c)).

“Bill” and “Graham” both identify, or may assist in identifying, persons (s.225(2)(a)).

“Meet”, “Wednesday” and “Red Lion” all appear to be identifying data. “Meet” relates to the type of event (s.225(2)(b), (3)(b)), “Wednesday” to its time (s.225(3)(c)) and “Red Lion” to the location of the event (s.225(2)(c)). The fact that this is a real world event rather than a communications event does not appear to prevent it being identifying data. The Explanatory Note gives an example of the location of a meeting in a calendar appointment. It would be odd if information sent in a calendar appointment was treated differently from the same information sent in an e-mail.

“DM”. It is possible that this is systems data, describing something connected with enabling or facilitating the functioning of a telecommunications service. If not, it appears to be identifying data as assisting in identifying a service (225(2)(a)).

“@cyberleagle” is probably systems data (there is no apparent requirement that the data should relate to the means used to send the intercepted communication itself). If not, this is identifying data.

If this tentative analysis is correct, the secondary data (and equipment data) provisions of the Bill would represent a significant change to the existing content/metadata boundary under RIPA. 

Despite all the supporting Bill materials these provisions still present a challenge to understand. If Parliament is to have a properly informed debate on these matters a fully detailed and reasoned Home Office explanation of what data falls within each category and why would be helpful.


Friday, 15 April 2016

Future-proofing the Investigatory Powers Bill

[Based on a presentation to BILETA 2016 on 11 April 2016]

If we know one thing about the Investigatory Powers Bill, it is that it must be future-proof. Legislation should, self-evidently, stand the test of time in the face of rapid technological change and not become out of date overnight.

However the task is not a simple matter of spraying a coat of future-proof paint on to the Bill. Future-proofing can give rise to serious difficulties when the legislation furnishes the state with intrusive powers over its citizens. An attempt to future-proof blighted the current Regulation of Investigatory Powers Act (RIPA). The signs are that some of the mistakes of RIPA are about to be repeated in the Investigatory Powers Bill.

How should we set about future-proofing legislation? In the communications surveillance field two techniques have been tried.

One is a broad, flexible, order-making power. The statute would empower the Secretary of State to make and revise regulations from time to time, subject to less Parliamentary scrutiny than for primary legislation. However when considering the primary legislation Parliament has only the mistiest outline of what it is being asked to approve. The features of the landscape do not appear until it is too late.

That was the approach adopted in the draft Communications Data Bill (CDB), which in 2012 was stopped in its tracks by a Joint Parliamentary Committee. Clause 1 of the draft Bill was a general order-making power that could be used to mandate collection, generation and retention of communications data. Home Office official Charles Farr said in evidence to the Committee:

"Future-proofing and flexibility are at the heart of the language we have used in clause 1."
The Committee noted the "wide anxiety raised by the breadth of clause 1". It concluded:
"We do not think that Parliament should grant powers that are required only on the precautionary principle. There should be a current and pressing need for them."
Remnants of the CDB approach survive in parts of the Investigatory Powers Bill. 

The power to serve technical capability notices on telecommunications operators sets out a list of obligations that can be imposed, including the obligation to remove electronic protection applied by or on behalf of the operator. Although the list is fairly specific, the power itself is open-ended. The obligations that may be specified in regulations merely "include, among other things" the items in the list.

The direct descendant of Clause 1 of the CDB is Clause 78 of the IP Bill. Clause 78, unlike the CDB, sets out a list of ‘Relevant Communications Data’ that can be the subject of data retention notices issued by the Secretary of State. The items on the list are still described in quite general terms, including for instance “data which may be used to identify, or assist in identifying, … the type, method or pattern, or fact, of communication”.

Clause 78 also retains a strong bias towards the ‘precautionary principle’ deprecated by the 2012 Joint Committee. At present notices under DRIPA can require retention of a few specific types of data in respect of limited categories of communication such as internet e-mail, SMS messages and internet telephony. The Counter-Terrorism and Security Act 2015 added IP address resolution data. The financial projections in the Home Office’s IP Bill Impact Assessment allow only for the addition of so-called internet connection records. Yet Clause 78 is much broader than that, encompassing for instance the machine to machine communications that will underpin the internet of things. There has been no attempt to explain or justify this broad scope.

Another method of future-proofing is technological neutrality. This approach contrasts with technology-specific legislation. The objective is to draft at a sufficiently abstract level to allow for future changes in technology.

IT and technology lawyers have been brought up to think of technologically neutral legislation as a Good Thing. Professor Chris Reed observed in 2007 that technological neutrality had become part of the general wisdom: 'motherhood and apple pie'. And so it was, when we were trying to avoid problems such as statutory writing requirements that assumed paper. However technological neutrality runs into trouble when applied to intrusive state powers.

The first problem is that abstract drafting has a tendency to be unintelligible. The obvious example is RIPA. Sir David Omand, the Permanent Secretary in the Home Office at the time RIPA was prepared, told the Commons Home Affairs Select Committee in February 2014:

“The instructions to parliamentary draftsmen were to make it technology-neutral, because everyone could see that the technology was moving very fast. Parliamentary draftsmen did an excellent job in doing that, but as a result I do not think the ordinary person or Member of Parliament would be able to follow the Act without a lawyer to explain how these different sections interact.”
RIPA is notoriously impenetrable, even to lawyers. It has been criticised almost from birth:
"We have found RIPA to be a particularly puzzling statute" (R v W, Court of Appeal, 2003)
"longer and even more perplexing" than the "short but difficult" Interception of Communications Act 1985. (Lord Bingham, A-G’s Ref (No 5 of 2002), 2004) 
"this impenetrable statute … one of the most complex and unsatisfactory statutes currently in force." Professor David Ormerod (2005) 
"a complex and difficult piece of legislation" Mummery LJ (then President of the Investigatory Powers Tribunal, 2006) 
"RIPA 2000 is a difficult statute to understand" (Sir Anthony May, IOCC Report for 2013) 
"RIPA, obscure since its inception, has been patched up so many times as to make it incomprehensible to all but a tiny band of initiates" (David Anderson Q.C., A Question of Trust, 2015.)
Unintelligibility is a direct consequence of the attempt to future-proof by technologically neutral, abstract drafting.

Intelligibility is not just a lawyer's nice to have. Where intrusive powers are concerned it is a rule of law principle that the public should be able to know with reasonable certainty the kind of circumstances in which the powers may be used against them. Unintelligible legislation fails that test. A Question of Trust said:

“The desire for legislative clarity is more than just tidy-mindedness. Obscure laws – and there are few more impenetrable than RIPA and its satellites – corrode democracy itself, because neither the public to whom they apply, nor even the legislators who debate and amend them, fully understand what they mean.”
A Question of Trust challenged the government to produce legislation that is both comprehensive and comprehensible.

A second problem with applying technological neutrality to intrusive powers arises from the fact that where the technology goes, so the powers automatically follow.

This of course is what the technique is intended to achieve. As people use technology in ways that were unknown at the time of the legislation, the powers will apply to the new behaviour. However the result is that the balance between privacy and intrusion that Parliament contemplated at the time it passed the legislation is liable to shift due to mere accidents of technology.

Again, RIPA is a prime example. Mobile phones existed in 2000, as did the internet. But they were not yet combined. When they merged on the smartphone all kinds of human activity that were previously untouched by RIPA suddenly fell into its scope.

It will be said that that is how it should be: conspirators who used to communicate by telephone and now use over-the-top messaging should be subject to equivalent powers. That may be so. But entirely personal behaviour that does not involve any kind of messaging between two or more human beings has also been swept up. We never used to read books or newspapers over the telephone. Now we read websites remotely. RIPA counts this activity, equivalent to sitting at home reading a book, as a communication - as if it were the same as e-mailing or text messaging a contact.

The mobile internet was not contemplated by the legislators in 2000. The result of this accident of technology is a major shift in the privacy/intrusion balance, without Parliament ever having had the opportunity to consider it. Now that Parliament is considering it, it is doing so against the background of a sense of entitlement to the bounty of data that adventitiously fell into the laps of intelligence agencies and law enforcement bodies.

What should we do? The key is to ask what we should be seeking to future-proof: the powers themselves or the privacy/intrusion balance settled upon by Parliament when it enacts legislation of this kind. My own view is that we should learn the lesson of RIPA and seek to future-proof the privacy/intrusion balance, not the powers.

That would require a fundamentally different approach: concrete, technology-specific drafting, sunsetting of powers, frequent review by Parliament and continued openness by the government about how the powers have been used. The latter is critical if Parliament is to engage in an informed debate when powers come back for renewal.

Regrettably, the IP Bill has gone down a similar track to RIPA. It has tried to future-proof the powers and, as with RIPA, the predictable result is unintelligibility. The House of Commons Science and Technology Committee said in its report on the draft Bill:

“The Home Secretary told us subsequently that the definitions for ‘communication data’ and ICRs were intended to be “technology neutral and flexible in order that, should user behaviour and technology change, they will still apply”. The definitions were to be applied “to the full range of powers and obligations under the draft Bill” which had subsumed provisions from several current statutes. As a result, “the definitions as they are formulated are necessarily abstract”.” (emphasis added)
The Committee concluded:
“The government, in seeking to future-proof the proposed legislation, has produced definitions of internet connection records and other terms which have led to significant confusion on the part of communications service providers and others.”
The "others" include the general public, whose communications form the subject of the Bill and who should, as a matter of the rule of law, be able to understand the scope of the powers.

The government, responding to a recommendation by the Joint Committee, has included provision for review of the Bill after five and a half years. However that is insufficient without addressing the problems of over-abstract drafting. Nor is hiving off detail to Codes of Practice a good approach. It is not the function of Codes of Practice to compensate for obscure legislation. 


Further reading on technology neutrality

Alberto Escudero Pascual and Ian Hosein, The hazards of technology-neutral policy: questioning lawful access to traffic data, Communications of the Association for Computing Machinery (CACM), published 29 Feb 2004

Chris Reed, Taking Sides on Technology Neutrality, (2007) 4:3 SCRIPTed 263

Graham Smith, Are Techlaw principles in the Ascendency? Intellectual Property Forum: journal of the Intellectual and Industrial Property Society of Australia and New Zealand, Issue 96 (Mar 2014)


[Amended 15 April 2016 to make specific reference to mobile internet.]


Friday, 1 April 2016

An official announcement

The following official statement was issued this morning.
“A temporary ceasefire has been agreed among combatants in the Semantic Wars. 
A list of banned words and phrases has been drawn up including ‘Itemised Phone Bill’, ‘The Outside of an Envelope’ and ‘We only want to do what [named Silicon Valley company] does’.

Any permutation of (indiscriminate, blanket, mass, dragnet, random, uncontrolled, at will) and (surveillance, trawling, snooping, browsing, monitoring) is also prohibited, whether accusations or denials thereof.
 
Use of the term 'Snoopers Charter' will be regarded as grounds for immediate termination of the accord.”

Early indications are that the truce is unlikely to hold.

[BREAKING NEWS, 10.45 am. Unconfirmed reports suggest that teams of inspectors are in the process of being deployed to eliminate stockpiles of unused non-denial denials.]


Tuesday, 29 March 2016

Woe unto you, cryptographers!

A hitherto unknown translation of the Bible has been found in a Cheltenham safe deposit. So far it has been possible to decipher only a few verses:


Matthew 7:16: "Ye shall know them by their metadata".

Job 31:4: "Does not GCHQ see my ways, and count all my hops?" 

Revelation 20:13: "And they were judged all according to the pattern of their communications."

Revelation 3:8: "Behold, I have set before thee an open door, and no man can shut it: for thou hast a little strength, and hast kept my word, and hast not installed end-to-end encryption".

Psalm 139:1-2 (To Wearable Tech): "Thou knowest my sitting down and my rising up, thou understandest my thought afar off."

Luke 11:52: "Woe unto you, cryptographers! for ye have taken away the key of knowledge: ye entered not in yourselves, and them that were entering in ye hindered."