Thursday 14 June 2018

Can artificial intelligence solve the criminal disclosure problem?



Here is the problem: digital evidence is of increasing importance in a very wide range of criminal investigations because so much of our lives is recorded on smartphones, tablets, personal computers and the large systems owned by financial institutions, transport companies and the like. Digital evidence can indicate our location (whether or not we were at a specific place at a specific time), our Internet activity, photographs we have taken or have had taken of us, who our friends are and how often we contact them, our financial transactions, even our thoughts and attitudes.

That’s why law enforcement officers are keen to seize digital devices from perpetrators, victims and third parties. In order for there to be a fair trial most countries have rules about disclosure, also referred to as discovery. The principle is that a defendant should have an opportunity to review not only the evidence that is adduced against him (or her) but anything else that might have been collected during the investigative process and which might influence the outcome of a trial. In most countries the test is “relevancy”, and if necessary defence lawyers will apply to the court for appropriate orders. In the UK the position is rather different: the prosecution has a duty to review any material gathered during an investigation and to disclose it to the defence if it undermines the prosecution case or might assist the defence case. The law dates from 1996 – the Criminal Procedure and Investigations Act (CPIA).

The law was introduced because there had been a number of trials in which crucial material was withheld and miscarriages of justice had occurred. But the law is still not working perfectly and a select committee of the House of Commons is currently reviewing it. (https://www.parliament.uk/business/committees/committees-a-z/commons-select/justice-committee/inquiries/parliament-2017/disclosure-criminal-cases-17-19/) This blog is stimulated by some of the things that are being said in front of that committee.

As soon as anyone starts to examine digital evidence from modern devices they will discover the vast number of files, messages, photos and so on that exist even on the most modestly used smartphone or personal computer – tens, even hundreds, of thousands of them. In a typical home there may be seven or eight digital devices that are likely to hold material which ought to be examined. It is difficult enough for a law enforcement investigator to go through all these devices simply to find evidence to support a complaint or suspicion. But the current law of disclosure additionally requires them to look for material which might undermine their case or support a defendant’s.

Some people hope that “artificial intelligence” will either solve the problem or at least address it. See, for example, the 2017 “State of Policing” report by Her Majesty’s Chief Inspector of Constabulary. How far are these expectations likely to be fulfilled?

Digital investigators certainly use significant computer aids, but very few of these can really be labelled “artificial intelligence”. The analysis suites they use are typically able to make a safe forensic copy of the contents of a computer or smartphone and extract obvious potential sources of evidence such as emails, text messages, social media postings, histories of Internet browsing, lists of file downloads and substantive files. Graphics, photo and video files can be viewed in a gallery. The entire contents can be indexed – not only the substantive files but also the associated time and date stamps and other metadata (additional embedded data associated with Microsoft Office and photo files, for example). Once indexed, the investigator can search for files by combinations of keywords and time and date. The keywords may be specific to a particular case or may be generic to types of case – for example, in child sex cases, words such as “Lolita”, “teen”, “7yo” and its variants, and “asparagus”. More advanced software allows the investigator to examine files at the bits-and-bytes level, to analyse hidden operating system features such as the Windows registry and to interrogate a hard disk directly – procedures which may be necessary when some new product hits the IT market and becomes widely used. The most advanced software even allows the well-trained investigator to create their own procedures, for example to look for things which might be bank account details, credit card credentials, or username and password combinations. Increasingly, too, the software allows examinations to span several different digital devices, so that an integrated view of the actions of a person of interest can be built even where conversations took place across email, text messages and social media postings.
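
The indexed keyword-and-date search these suites provide can be sketched in miniature. The following Python fragment is purely illustrative – the file names, contents, keywords and dates are all invented – but it shows the basic mechanism of an inverted index filtered by a date range:

```python
from datetime import datetime

# Toy "extracted" corpus: file name -> (modification date, recovered text).
# All names and contents here are invented for illustration.
corpus = {
    "report.docx": (datetime(2017, 3, 1), "quarterly transfer to offshore account"),
    "chat.txt":    (datetime(2017, 6, 12), "meet at the station friday"),
    "notes.txt":   (datetime(2018, 1, 5), "transfer the funds before friday"),
}

def build_index(corpus):
    """Map each lower-cased word to the set of files containing it."""
    index = {}
    for name, (_, text) in corpus.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index

def search(index, corpus, keywords, start, end):
    """Files containing ALL the keywords, last modified within [start, end]."""
    hits = set(corpus)
    for kw in keywords:
        hits &= index.get(kw.lower(), set())
    return sorted(n for n in hits if start <= corpus[n][0] <= end)

index = build_index(corpus)
print(search(index, corpus, ["transfer"], datetime(2017, 1, 1), datetime(2018, 12, 31)))
# -> ['notes.txt', 'report.docx']
```

A real analysis suite does far more – stemming, variant spellings, metadata fields, cross-device correlation – but the principle is the same: index once, then query cheaply.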
Separate software can be used to scan an entire hard disk or storage medium for files which have previously been identified as “bad” – child pornography, terrorist material, pirated intellectual property and so on. It does this by using file hashes, aka digital fingerprints: databases of known hashes exist, and every time a file is encountered on a hard disk its hash is computed and compared against the database.
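
The hash-matching step itself is simple, which is why it scales to whole disks. A minimal sketch, assuming a hypothetical database of SHA-256 digests (the file names and contents below are invented):

```python
import hashlib

# Hypothetical database of hex SHA-256 digests of known "bad" files.
# In practice these come from curated national and international databases.
known_bad = {
    hashlib.sha256(b"example contraband content").hexdigest(),
}

def file_hash(data: bytes) -> str:
    """SHA-256 digest of a file's contents - its 'digital fingerprint'."""
    return hashlib.sha256(data).hexdigest()

def scan(files: dict) -> list:
    """Return the names of files whose hash appears in the database."""
    return [name for name, data in files.items() if file_hash(data) in known_bad]

files = {
    "holiday.jpg": b"innocuous photo bytes",
    "cache042.bin": b"example contraband content",
}
print(scan(files))  # -> ['cache042.bin']
```

Note that this only finds exact copies: change a single byte and the fingerprint changes completely, which is why hash scanning complements rather than replaces human review.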

But none of this involves artificial intelligence, although that phrase is rather vague and covers a number of different techniques. More properly we are talking about “machine learning”. In machine learning a quantity of unsorted data – files, statistics, graphics – is offered to a program which is capable of deriving rules about that data. Once the rules have been discovered, a feat which may be beyond most humans, they can be applied to further similar unsorted data in order to make predictions or draw conclusions. In the health field, given enough medical data, it may be possible to identify commonalities in diagnosis or treatment. In one form of predictive policing, data can be collected about callouts of police vehicles to respond to incidents; a machine learning program can find rules which in turn point to where and when incidents are more likely to happen, so that response teams can get to them more quickly. An airline can monitor over the course of a year the types of meal passengers ask for and thereafter predict with greater accuracy how many meals of each type to load onto each flight, so that every passenger gets what they want – meat, fish, vegetarian – and there is less wastage.
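
The airline-meal example can be reduced to a very simple “learn, then apply” loop: estimate meal-type proportions from past flights, then apply those proportions to a new flight. The numbers below are invented, and real demand forecasting would account for route, season and so on, but the shape of the process is this:

```python
from collections import Counter

# Invented historical meal requests aggregated across past flights.
history = ["meat"] * 120 + ["fish"] * 50 + ["vegetarian"] * 30

def learn_proportions(requests):
    """The 'rule' derived from the data: each meal type's share of demand."""
    counts = Counter(requests)
    total = sum(counts.values())
    return {meal: n / total for meal, n in counts.items()}

def predict_loading(proportions, passengers):
    """Apply the learned rule to a new flight of a given size."""
    return {meal: round(share * passengers) for meal, share in proportions.items()}

proportions = learn_proportions(history)
print(predict_loading(proportions, 180))
# -> {'meat': 108, 'fish': 45, 'vegetarian': 27}
```

Trivial as it is, this illustrates the two phases that matter for the disclosure argument: a training phase that derives rules from past data, and a prediction phase that applies them to new data. Everything that follows about training material and bad outcomes attaches to one of those two phases.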

There are, however, weaknesses which should not be underestimated. The first is the quality and quantity of the training material offered to the program. If the training material is not representative of what you hope to predict, results will be poorer; the larger the quantity of material, the greater the chance that accurate rules will be derived. Secondly, some material is more difficult to parse than others – in the example above of police deployments the data will be unambiguous and clear, but reliance on informal conversations would be quite another matter. Another form of predictive policing – trying to spot which individuals will turn “bad” – will depend on the observations and judgements of observers, which will inevitably contain inconsistencies. Third, anyone wishing to deploy machine learning has to allow for the possibility of bad outcomes – false positives and false negatives – where a prediction gives a misleading result. A bad outcome in terms of an airline not having the right food on board is one thing; the arrest of a person who turns out to be innocent is quite another.

The main relevant instance of machine learning in disclosure occurs in civil, as opposed to criminal, disclosure. In the civil procedure claimants and defendants are expected to disclose to each other material which might undermine their own case or support that of their opponent. (Civil Procedure Rules Part 31). This is the same test as is applied in the criminal procedure but of course the circumstances are different; in a civil case a dispute exists between parties of roughly equal status (at least in theory) whereas in a criminal case it is the state which charges an accused with a criminal offence and with the possible outcome of loss of liberty and reputation.

In a typical civil case between companies the amount of material that needs to be considered for disclosure can often be enormous – all the emails and substantive documents created by several key individuals over a lengthy period, for example.  Originally the assumption was that lawyers on both sides would carry out a disclosure review manually. But much material will of course be in electronic format and over the years a standard questionnaire has evolved – the ESI Questionnaire. It comes in Practice Direction 31B which is part of Civil Procedure Rule 31. Overall it covers such topics as “reasonable search”, agreements on the format in which files are to be delivered and keyword and other automated searches. The courts may force the parties into an agreement – on the basis that they both have a duty to control costs. But even this type of ESI questionnaire has proved insufficient for the larger cases and resort is now made to the form of artificial intelligence known as machine learning.

Applying this to disclosure/discovery, the parties to a civil dispute agree to provide a selection of the types of document which they believe are likely to meet a disclosure requirement. The machine learning program produces rules defining those documents, and the rules are then applied to the much larger archives of documents the parties hold. The parties agree that they will accept the outcome of this machine-learning-enabled activity; they do so because any more exhaustive form of review is likely to incur crippling expense. Lawyers refer to this as Technology Assisted Review (TAR) or predictive coding. More detail on how this should work, and the judgments a court might make, appears in Triumph Controls UK Ltd & others v Primus International Holding Co & another [2018] EWHC 176 (TCC). A number of companies offer supporting products. The important thing to recognise is that the parties consent to the process.
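
To make the predictive-coding idea concrete, here is a deliberately tiny sketch of the training-then-scoring loop. Commercial TAR products use far more sophisticated models; this stdlib-only version uses smoothed word-frequency log-odds, and every document and label below is invented:

```python
import math
from collections import Counter

# Invented seed documents, labelled by reviewers as disclosable or not.
training = [
    ("contract breach penalty clause dispute", True),
    ("supplier failed delivery breach terms", True),
    ("office party friday cake", False),
    ("lunch menu meeting room booking", False),
]

def train(labelled):
    """Count word frequencies separately for each class."""
    pos, neg = Counter(), Counter()
    for text, relevant in labelled:
        (pos if relevant else neg).update(text.lower().split())
    return pos, neg

def score(pos, neg, text):
    """Log-odds that a document resembles the 'disclosable' seed set.

    Add-one smoothing stops unseen words from dominating the score.
    """
    return sum(math.log((pos[w] + 1) / (neg[w] + 1)) for w in text.lower().split())

pos, neg = train(training)
archive = [
    "breach of contract delivery dispute",
    "book the meeting room for lunch",
]
for doc in archive:
    print(doc, "->", "review" if score(pos, neg, doc) > 0 else "skip")
# -> breach of contract delivery dispute -> review
# -> book the meeting room for lunch -> skip
```

Even this toy makes the consent point visible: everything hinges on who chooses the seed documents and where the review threshold is set, which is exactly what the civil parties negotiate and agree.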

But will this work for criminal disclosure? The first thing to note is that there is no court-mandated requirement to keep costs down: it is up to the prosecution to decide how much to invest in order to support the charges they wish to bring. Secondly, as we saw above, the situation is not dispute resolution but an accused’s potential loss of liberty; there is no mutual consent. Thirdly, we need to consider how machine-learning-supported criminal disclosure might work in practice. Who is to provide the documents from which the AI program is to learn, or be trained? At the moment a defendant is required to produce a Defence Case Statement under ss 5 and 6 CPIA 1996, but all that is required is to set out the general nature of the defence, the matters of fact on which there is an issue, the identity of any witness who might be able to provide an alibi and any information in the accused’s possession which might be of material assistance in identifying further witnesses. Defendants do not have to produce sample documents, and, given the disparity in resources between the police/CPS and most defence solicitors, it is not at all clear how easily most criminal defence solicitors could facilitate the process. The solicitor may indeed require the support of an expert, but it is also not clear whether legal aid for this activity would be forthcoming.

Or is it the hope that one can produce a generic set of rules to cover a wide range of disclosure situations? That seems perilously close to the joke widely shared by digital forensic technicians when confronted with an item of analytic software – where is the “find evidence” button? (One vendor went as far as producing a stick-on key for a keyboard imprinted with the words “find evidence”).

One can have nothing but sympathy for police and prosecutors in seeking aids to reduce the burden of criminal disclosure. But a combination of desperation to reduce costs and the exaggerated claims of software salesmen can lead to wasted money and disappointed expectations. We have seen this with image recognition: it may work well in the limited circumstances of providing access control to a smartphone or entry to corporate premises, but it produces poor results in challenging public-order environments such as carnivals.

Almost certainly the remedy to criminal disclosure of digital material is the provision at an early stage of entire forensic images to defence solicitors who wish to employ their own experts. Defence experts, informed by defendants, can then use keyword search and similar software both to verify the work of prosecution experts and to produce, always supposing that it is there to be found, exculpatory material. I have explored this approach both in my evidence to the recent inquiry by the House of Commons Justice Select Committee (https://goo.gl/Qkhxf3) and in another blog (https://goo.gl/rDMwK5 - you may need to scroll down).