Thursday, 14 June 2018
Can artificial intelligence solve the criminal disclosure problem?
Here is the problem: digital evidence is of increasing importance in a very wide range of criminal investigations because so much of our lives is recorded on smart phones, tablets, personal computers and the large systems owned by financial institutions, transport companies and the like. Digital evidence can indicate our location (whether we were at a specific place at a specific time or whether we were not), our Internet activity, photographs we have taken or had taken of us, who our friends are and how often we contact them, our financial transactions, even our thoughts and attitudes.
That’s why law enforcement officers are keen to seize digital devices from perpetrators, victims and third parties. For there to be a fair trial most countries have rules about disclosure, also referred to as discovery. The principle is that a defendant should have an opportunity to review not only the evidence adduced against him (or her) but anything else collected during the investigative process which might influence the outcome of a trial. In most countries the test is “relevancy”, and if necessary defence lawyers will apply to the court for appropriate orders. In the UK the position is rather different: the prosecution has a duty to review any material gathered during an investigation and to disclose it to the defence if it undermines the prosecution case or might assist a defence case. The law dates from 1996 – the Criminal Procedure and Investigations Act (CPIA).
The law was introduced because there had been a number of trials in which crucial material was withheld and miscarriages of justice had occurred. But the law is still not working perfectly and a select committee of the House of Commons is currently reviewing it. (https://www.parliament.uk/business/committees/committees-a-z/commons-select/justice-committee/inquiries/parliament-2017/disclosure-criminal-cases-17-19/) This blog is stimulated by some of the things that are being said in front of that committee.
As soon as anyone starts to examine digital evidence from modern devices they will discover the vast number of files, messages, photos and so on that exist even on the most modestly used smart phone or personal computer – tens and hundreds of thousands. In a typical home there may be seven or eight digital devices likely to hold material which ought to be examined. It is difficult enough for a law enforcement investigator to go through all these devices simply to find evidence to support a complaint or suspicion. But the current law of disclosure additionally requires them to look for material which might undermine their case or support a defendant’s.
Some people hope that “artificial intelligence” will either solve the problem or at least address it. See, for example, the 2017 “State of Policing” report by Her Majesty’s Chief Inspector of Constabulary. How far are these expectations likely to be fulfilled?
Digital investigators certainly use significant computer aids but very few of these can really be labelled “artificial intelligence”. The analysis suites they use are typically able to make a safe forensic copy of the contents of a computer or smartphone and extract obvious potential sources of evidence such as emails, text messages, social media postings, histories of Internet browsing, lists of file downloads and substantive files. Graphics, photo and video files can be viewed in a gallery. The entire contents can be indexed – not only the substantive files but the associated time and date stamps and other metadata (additional embedded data associated with Microsoft Office and photo files, for example). Once indexed, the investigator can then search for files by combinations of keywords and time and date. The keywords may be specific to a particular case or may be generic to types of case – for example in child sex cases words such as “Lolita”, “teen”, “7yo” and its variants and “asparagus”. More advanced software allows the investigator to examine files at the bits and bytes level, to analyse hidden operating system features such as the Windows registry and also to interrogate a hard disk directly – these procedures may be necessary when some new product hits the IT market and becomes widely used. The most advanced software even allows the well-trained investigator to create their own procedures, for example to look for things which might be bank account details, credit card credentials, username and password combinations and so on. Increasingly too the software allows examinations to span several different digital devices so that an integrated view of the actions of a person of interest can be examined even if conversations took place using, for example, email, text messages and social media postings. Separate software can be used to scan an entire hard disk or storage medium for files which have previously been identified as “bad” – child pornography, terrorist material, pirated intellectual property and so on. It does this by using file hashes, aka digital fingerprints – there are databases of file hashes, and every time a file is encountered on a hard disk a hash is created and compared against the database.
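By way of illustration only, the fragment below sketches the sort of keyword-and-date filtering such suites perform once messages and files have been extracted and indexed; the record layout and the keywords are invented for the example and are not taken from any particular product.

```python
# A minimal sketch (not any vendor's tool) of keyword-and-date filtering over
# records already extracted from a forensic image. Data and keywords are hypothetical.
from datetime import datetime

records = [
    {"source": "sms", "timestamp": datetime(2018, 3, 2, 14, 5), "text": "Meet at the usual place"},
    {"source": "email", "timestamp": datetime(2018, 3, 9, 9, 30), "text": "Invoice attached for the transfer"},
]

KEYWORDS = {"transfer", "invoice"}                     # case-specific terms chosen by the investigator
WINDOW = (datetime(2018, 3, 1), datetime(2018, 3, 31))  # date range of interest

def matches(record):
    """True if the record falls inside the date window and contains any keyword."""
    in_window = WINDOW[0] <= record["timestamp"] <= WINDOW[1]
    has_keyword = any(k in record["text"].lower() for k in KEYWORDS)
    return in_window and has_keyword

for r in (r for r in records if matches(r)):
    print(r["source"], r["timestamp"], r["text"])
```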
But none of this involves artificial intelligence, although this phrase is rather vague and covers a number of different techniques. More properly we are talking about “machine learning”. In machine learning a quantity of unsorted data – files, statistics, graphics – is offered to a program which is capable of deriving rules about that data. Once the rules have been discovered, a feat which may be beyond most humans, they can be applied to further similar unsorted data in order to make predictions or draw conclusions. In the health field, given enough medical data, it may be possible to identify commonalities in diagnosis or treatment. In one form of predictive policing, data can be collected about callouts for police vehicles to respond to incidents. A machine learning program can find rules which in turn can be used to point to situations where and when incidents are more likely to happen so that response teams can get to them more quickly. An airline can monitor over the course of a year the types of meal passengers ask for and thereafter predict with greater accuracy how many meals of each type should be loaded onto each flight, so that every passenger gets what they want – meat, fish, vegetarian – and there is less wastage.
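As a very small, hedged illustration of what “deriving rules from data” means in practice, the sketch below trains a decision tree on some invented incident-callout records and then asks it about a new time slot; a real deployment would of course need far more data and careful validation.

```python
# A toy sketch of machine learning in the sense used above: a program derives
# rules from past data and applies them to a new case. The data is invented.
from sklearn.tree import DecisionTreeClassifier

# Features: [hour_of_day, day_of_week (0 = Monday), is_weekend]
X = [
    [23, 5, 1], [22, 6, 1], [2, 6, 1], [14, 2, 0],
    [10, 1, 0], [21, 4, 0], [3, 5, 1], [9, 3, 0],
]
# Label: 1 = an incident requiring rapid response occurred, 0 = a quiet slot
y = [1, 1, 1, 0, 0, 1, 1, 0]

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Ask whether Saturday at 22:00 looks like a high-demand slot
print(model.predict([[22, 5, 1]]))   # prints the predicted class for the new slot
```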
There are, however, weaknesses which should not be underestimated. The first of these is the quality and quantity of the training material offered to the program. If the training material is not representative of what you hope to predict, results will be poorer. The larger the quantity of material, the greater the chance that accurate rules will be derived. Secondly, some material is more difficult to parse than other material – in the example above of police deployments the data will be unambiguous and clear, but reliance on informal conversations will be quite another matter. Another form of predictive policing – trying to spot which individuals will turn “bad” – will depend on the observations and judgements of observers, which will inevitably have inconsistencies. Third, anyone wishing to deploy machine learning has to allow for the possibility of bad outcomes – false positives and false negatives – where a prediction from machine learning gives a misleading result. A bad outcome in terms of an airline not having the right food on board is one thing, but the arrest of a person who turns out to be innocent is quite another.
The main relevant instance of machine learning in disclosure occurs in civil, as opposed to criminal, disclosure. In the civil procedure claimants and defendants are expected to disclose to each other material which might undermine their own case or support that of their opponent. (Civil Procedure Rules Part 31). This is the same test as is applied in the criminal procedure but of course the circumstances are different; in a civil case a dispute exists between parties of roughly equal status (at least in theory) whereas in a criminal case it is the state which charges an accused with a criminal offence and with the possible outcome of loss of liberty and reputation.
In a typical civil case between companies the amount of material that needs to be considered for disclosure can often be enormous – all the emails and substantive documents created by several key individuals over a lengthy period, for example. Originally the assumption was that lawyers on both sides would carry out a disclosure review manually. But much material will of course be in electronic format and over the years a standard questionnaire has evolved – the ESI Questionnaire. It comes in Practice Direction 31B which is part of Civil Procedure Rule 31. Overall it covers such topics as “reasonable search”, agreements on the format in which files are to be delivered and keyword and other automated searches. The courts may force the parties into an agreement – on the basis that they both have a duty to control costs. But even this type of ESI questionnaire has proved insufficient for the larger cases and resort is now made to the form of artificial intelligence known as machine learning.
Applying this to disclosure/discovery, the parties to a civil dispute agree to provide a selection of types of document which they believe are likely to meet a disclosure requirement. The machine learning program produces rules defining those documents and the rules are then applied to the much larger archives of documents the parties hold. The parties agree that they will accept the outcome of this machine learning enabled activity. They do this because any more exhaustive form of review is likely to incur crippling expense. Lawyers refer to this as Technology Assisted Review or predictive coding. More detail on how this should work and the judgments a court might make appear in Triumph Controls UK Ltd & others v Primus International Holding Co & another [2018] EWHC 176 (TCC). A number of companies offer supporting products. The important thing to recognise is that the parties consent to the process.
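Purely as an illustration of the idea, and not of any vendor’s workflow, the sketch below shows how a seed set of lawyer-labelled documents might be used to rank a larger unreviewed archive so that reviewers work down the list from the most likely disclosable material; the documents and labels are invented.

```python
# A hedged sketch of the idea behind technology assisted review ("predictive
# coding"): learn from a lawyer-labelled seed set, then rank the unreviewed archive.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = [
    "board minutes discussing the disputed supply contract",
    "email chain on late delivery penalties under the contract",
    "canteen menu for the staff Christmas party",
    "newsletter about the office move",
]
seed_labels = [1, 1, 0, 0]   # 1 = likely disclosable, 0 = not relevant

vectoriser = TfidfVectorizer()
model = LogisticRegression().fit(vectoriser.fit_transform(seed_docs), seed_labels)

# Rank the unreviewed archive by predicted relevance
archive = [
    "memo on penalty clause negotiations",
    "invitation to the summer barbecue",
]
scores = model.predict_proba(vectoriser.transform(archive))[:, 1]
for doc, score in sorted(zip(archive, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```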
But will this work for criminal disclosure? The first thing to note is that there is no court-mandated requirement to keep costs down; it is up to the prosecution to decide how much to invest in order to support the charges they wish to bring. Secondly, as we saw above, the situation is not dispute resolution but an accused’s potential loss of liberty. There is no mutual consent. Thirdly, we need to consider how machine learning-supported criminal disclosure might work in practice. Who is to provide the documents from which the AI programme is to learn, or be trained? At the moment a defendant is required to produce a Defence Case Statement under ss 5 and 6 CPIA 1996, but all that is required is to set out the general nature of the defence, matters of fact on which there is an issue, the identity of any witness who might be able to provide an alibi and any information in an accused’s possession which might be of material assistance in identifying further witnesses. But they don’t have to produce sample documents, and given the disparity in resources between the police/CPS and most defence solicitors it is not at all clear how easily most criminal defence solicitors could facilitate the process. The solicitor may indeed require the support of an expert, but it is also not clear whether legal aid for this activity would be forthcoming.
Or is it the hope that one can produce a generic set of rules to cover a wide range of disclosure situations? That seems perilously close to the joke widely shared by digital forensic technicians when confronted with an item of analytic software – where is the “find evidence” button? (One vendor went as far as producing a stick-on key for a keyboard imprinted with the words “find evidence”).
One can have nothing but sympathy for police and prosecutors in seeking aids to reduce the burden of criminal disclosure. But a combination of desperation to reduce costs and the exaggerated claims of software salesmen can lead to wasted money and disappointed expectations. We have seen this with image recognition, which may work well in the limited circumstances of providing access control to a smartphone or entry to corporate premises but produces poor results in the challenging environments of carnivals and other public order situations.
Almost certainly the remedy to criminal disclosure of digital material is the provision at an early stage of entire forensic images to defence solicitors who wish to employ their own experts. Defence experts, informed by defendants, can then use keyword search and similar software both to verify the work of prosecution experts and to produce, always supposing that it is there to be found, exculpatory material. I have explored this approach both in my evidence to the recent enquiry by the House of Commons Justice Select Committee (Https://Goo.Gl/Qkhxf3) and in another blog (https://goo.gl/rDMwK5 - you may need to scroll down).
Saturday, 19 May 2018
Disclosure of Digital Evidence in Rape Trials
This note arises from a hearing by the Commons Justice Select Committee on Disclosure of Evidence in Criminal Trials on 15 May 2018. A transcript is available at: http://data.parliament.uk/writtenevidence/committeeevidence.svc/evidencedocument/justice-committee/disclosure-of-evidence-in-criminal-cases/oral/83096.pdf and a video at https://www.parliamentlive.tv/Event/Index/13d15d6a-8aa9-40ce-bdf2-3d19777b3af8
Digital forensics practice requires that the entire contents of a personal computer or smart phone be forensically copied and then analysed; the concern is that if all of this material is provided to the defence it will be used for aggressive cross examination about a complainant’s previous sexual history.
For a quarter of a century it has been the practice when dealing with evidence from digital devices such as personal computers that a “forensic copy” is made of the device at as early an opportunity as possible. (The procedures have been updated to deal with smart phones and acquisition from cloud-based services). This is done for several reasons. First, it provides an explicit physical link between a device and the person responsible for it so that there can be attribution of its contents. Second, direct examination of a device is highly undesirable because in the course of it data will get altered; the procedures for making a forensic copy avoid this, and in fact all examinations take place on the copy, not the original. Third, it is all too easy for individual emails, social media postings, webpages, photographs, et cetera to be subject to forgery, but it is extremely difficult to forge an entire hard disk or the memory of a phone: the operating system creates file date and time stamps and many other traces all the time, which makes tampering easy to spot. The forensic image thus provides essential provenance, authentication and continuity.
This procedure is for the benefit of all types of evidence that might be adduced from these sources and for the benefit of both prosecution and defence. In a rape trial, as in any other case, the prosecution may wish to rely on digital evidence as well. In case you are asking yourself – can’t they redact the forensic image? – the answer is: not really, given the technical complexity of the task (the existence of temporary back-up files, caches, registry entries etc). The issue has been examined extensively in the context of legal professional privilege; there the solution is that an independent lawyer is appointed to identify material which should be redacted.
Turning now to the defence position: the availability of a forensic image makes it very difficult for the prosecution to cherry-pick evidence. The cherry-picking may be deliberate, the result of poor training, or simply “confirmation bias”. The role of the defence is to see whether this has taken place. It was these concerns that triggered the current enquiry. The enquiry by the Justice Select Committee is about, among other things, the mechanics of disclosure. Because of the quantity of data to be examined it is unrealistic to expect a prosecution expert or technician to carry out an exhaustive examination of all the devices that might have been seized. This plainly creates a problem for the disclosure regime as it is normally understood, where there is a responsibility to identify material which may undermine the prosecution case or support the defence case. In my evidence to the committee I said the solution is to make available to the defence copies of all the forensic images that have been created by the prosecution. It is then open to a defence expert to use tools very similar or identical to those used by the prosecution to carry out the instructions of a defence lawyer. This surely satisfies the aims of disclosure in every practical respect.
There are protections against abuse of disclosed material, specifically sections 17 and 18 of the Criminal Procedure and Investigations Act 1996. There is a criminal offence involved and even if there were not there is still the possibility of contempt of court. (Yes, in the course of examining digital devices I do see information which the owners would regard as private and highly personal but which is also wholly irrelevant to the subject matter of charges. I don’t even share these with instructing lawyers).
Let us now look at the position of what happens in rape trials, an issue extensively canvassed by subsequent witnesses. The main protection is discussed in the CPS Manual: https://www.cps.gov.uk/legal-guidance/rape-and-sexual-offences-chapter-4-section-41-youth-justice-and-criminal-evidence. Reference is also made to Criminal Procedure Rule 22 (https://www.justice.gov.uk/courts/procedure-rules/criminal/docs/2015/crim-proc-rules-2015-part-22.pdf). (I am fully aware of and sympathetic to concerns that defence lawyers from time to time abuse rape victims in the witness box by asking aggressively about previous sexual history. But it seems to me that if the procedures laid down under s 41 Youth Justice and Criminal Evidence Act 1999 and CPR 22 are inadequate, the remedy is to reform that part of the law and the linked judicial guidance rather than to take steps which would make digital evidence significantly less reliable. It may also be the case that inadequate funding for the police and CPS means that the right applications are not made to the court in a timely fashion.)
Saturday, 3 February 2018
How to manage Internet Content Blocking: some practicalities
“The Internet contains some deeply troublesome and harmful material. The main commercial players are both immensely rich and immensely clever – they must be able to do more to find solutions. If they don’t we politicians will prosecute/fine/tax them until they behave responsibly.” So goes the refrain, but what can we reasonably expect of the available technologies? Here is a guide for campaigners.
Issue 1: what criteria are you applying for blocking “undesirable” material?
To those who haven’t thought about the issue it seems obvious what needs to be blocked. Almost anyone other than the most extreme libertarian will point to material which they find distressing or harmful – and be able to produce justifying arguments. But if you are asking a computer program or a human being to make decisions there has to be greater clarity. In almost all circumstances there will be countervailing arguments about freedom of speech, freedom of expression and censorship.
The easiest policy to implement is where one can point to existing legislation defining specifically illegal content. For example, in the United Kingdom possession of indecent images of children is a strict liability offence[i]. Published guidelines from the Sentencing Council describe in detail three levels of offence in terms of age and specific activities[ii]. Similarly “extreme pornography” is clearly defined – essentially animals, dead people and absence of consent[iii]. But outside that particular context there is no definition of “extremism”, still less of “harmful”.[iv] Successive would-be legislators have struggled because so often the appearance of a particular document or file depends not only on its content but on its context.
A simple example: let’s take two statements: “the state of Israel is a theft from Palestinians” and “the state of Israel is entitled to occupy all the territories mentioned in the Bible”. Are these statements, which many people would label “extreme”, simply expressions of history and religious belief? Do we have a different view of them if they are accompanied by a call to action – push all the Jews out, push out all the Arabs? The boundaries are unclear, and it seems unreasonable that, if legislators are unwilling to provide assistance, Internet companies should somehow be forced to make those decisions. There is a separate further issue for the biggest of the global companies in that judgements about extremism and harmfulness vary across jurisdictions and cultures.
It gets more difficult with “grooming”, whether for a sexual purpose or to incite terrorist acts. The whole point of grooming is that it starts low key and then builds. It is easy enough to identify grooming after a successful exercise[v], but how do you distinguish the early stages from ordinary conversation? And how do you do so via a computer program or a human monitor?
Finally, it is even more difficult to think what the evidence would look like where the enforceable law simply says: social media sites should keep children safe.
Issue 2: what is the legal framework within which material gets uploaded?
Material gets uploaded to the Internet via a variety of legal frameworks, and this has an impact on where potential legal enforcement can be directed. An individual might buy web space from an Internet service provider and create their own website. That same individual may provide facilities for third parties to post comments which will then automatically and instantly be seen by all visitors. A social media service will almost certainly require a specific sign-up from its subscribers/members and at that time inform them of an “acceptable use” or “community standards” policy, but will thereafter allow postings without prior approval or initial restraint.
The position currently taken by most Internet service companies, bolstered by various directives and laws, is that they are not publishers in the same sense as traditional media such as newspapers, magazines and broadcast television stations. They say that they are providing facilities but are not editors, or that they are “data processors” as opposed to “data controllers”[vi]. The claim is that they are “intermediaries” for the purpose of the E-Commerce Directive and Regulations. These arguments are currently being hotly debated. But even under their interpretation there is a significant impact on what one can reasonably expect them to do in terms of attempting to block before publication.
The main business of Google is to index world wide web content which has been originated by others with whom it has no contractual relationship. It has a series of “crawler” programs which scavenge the open part of the World Wide Web; the findings are then indexed and that is what visitors to Google’s main pages see. The contractual relationship that is most important in the basic Google framework is with those who use the indexes – essentially the service is paid for by allowing Google to harvest information about individuals which can be turned into targeted advertising. But Google is not under any compulsion or contractual obligation to index anything; it can block at will. The main policy reason for refusing to block is that it has decided it favours completeness and freedom of speech and expression; it blocks only when there is an overwhelming reason to do so.
By contrast, for Facebook, Twitter and many similar services the contractual relationship is with their customers/subscribers/members. It consists of saying “we will let you see what others have posted and we will let you post provided you will allow us to harvest information about you and send you targeted advertisements”. As part of the contract there is usually an Acceptable Use or Community Standards provision which is the basis for blocking. But here again, as companies headquartered in the United States, they are concerned about observing First Amendment rights[vii].
There are important differences in terms of what one can expect if some of this material is to be blocked. In the case of Google they have no opportunity to prevent material from being uploaded; the earliest point at which they could intervene is when their crawler comes across material which has already been published, and their choice is to refuse to index it. But for the social media sites, where the acceptable use policy is part of the customer agreement, the earliest opportunity for blocking is when the customer uploads material.
Issue 3: technical means for blocking material (a) that has already been identified as “undesirable”.
We must now look at the various blocking technologies and see how far they are practical to implement. There is a significant difference between situations where material has already been identified by some method or other as requiring blocking, and material which no one has so far seen and passed judgement on.
Blocking of known “undesirable” material (I am using the word “undesirable” to avoid the problems raised in Issue 1 above) is relatively straightforward, though there are questions of how to do so at the speed and quantity of uploads. For example, on Facebook it is said that every 60 seconds 510,000 comments are posted, 293,000 statuses are updated and 136,000 photos are uploaded[viii].
It is trivially easy to block an entire website. The block is on the URL – www.nastysite.com – and this is the method traditionally used by such bodies as the Internet Watch Foundation and the National Center for Missing and Exploited Children. It is also possible, again by URL, to block part of a website – www.harmlesssite.com/nastymaterial – though here the blocking will fail if the folder containing the undesirable material is given a different name or location in the file structure of the website as a whole. One can extend this method to specific pages and pictures on the website – www.harmlesssite.com/harmless/nastyfile.jpg – but here too simple name changes will render the blocking ineffective.
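A minimal sketch of this kind of URL- and path-based blocking, and of why a simple rename defeats it, might look like the following (the site names are the invented ones used above):

```python
# Toy illustration of URL/path blocklists and their fragility to renaming.
from urllib.parse import urlparse

BLOCKED_SITES = {"nastysite.com"}
BLOCKED_PATHS = {("harmlesssite.com", "/nastymaterial")}

def normalise(host):
    """Drop a leading "www." so www.example.com and example.com match the same entry."""
    return host[4:] if host.startswith("www.") else host

def is_blocked(url):
    parsed = urlparse(url)
    host = normalise(parsed.hostname or "")
    if host in BLOCKED_SITES:
        return True
    return any(host == h and parsed.path.startswith(p) for h, p in BLOCKED_PATHS)

print(is_blocked("http://www.nastysite.com/anything.html"))           # True: whole site blocked
print(is_blocked("http://www.harmlesssite.com/nastymaterial/a.jpg"))  # True: blocked folder
print(is_blocked("http://www.harmlesssite.com/renamedfolder/a.jpg"))  # False: a rename evades the block
```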
Blocking on the basis of keyword is impossibly crude. “Sex” eliminates the counties of Sussex, Essex, Middlesex etc as well as much useful material on health, education, law enforcement and more.
In order to overcome these problems one must turn to a different technology – file hashing. A file hash or fingerprint of a file is created using a simple program[ix] which is applied to the totality of a file – photo, picture, document, software program – to produce a unique short sequence of numbers and letters. The program is clever enough that for most purposes no two dissimilar files will ever produce the same hash or signature. A database of these hashes is built up, and when a file is presented for examination a hash is created and compared with the database. If there is a match the newly uploaded file is then blocked. File hashing is used elsewhere throughout computing in order, for example, to demonstrate that a file has not been altered – or that it has.
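As an illustration, a hash-based check at upload time might look something like the sketch below; the “database” is just an in-memory set and the hash value in it is hypothetical, standing in for curated lists of already-identified material.

```python
# Minimal sketch of hash-based blocking: hash the uploaded file and look it up
# in a set of hashes of previously identified material.
import hashlib

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

KNOWN_BAD_HASHES = {
    # hypothetical entry; a real list holds hashes of already-identified files
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def should_block(path):
    """True only if the file's hash is already in the database of known material."""
    return sha256_of(path) in KNOWN_BAD_HASHES
```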
This method only works to identify absolutely identical files, so that if an “undesirable” file has been slightly altered there will be a different hash and blocking will not take place. To a limited extent there is a further technology which deals with slightly dissimilar files. For photo images the most popular of these is called PhotoDNA[x], which is promoted by Microsoft and given away to Internet service providers, social media services and law enforcement. There are two typical situations where it is effective – when a file has been subject to a degree of compression to reduce its size, and where there is a series of adjacent clips taken from a video.
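PhotoDNA itself is proprietary, so purely to illustrate the idea the sketch below uses the open-source ImageHash library as a stand-in. It applies the same broad principle: visually similar images produce hashes that differ in only a few bits, so a small Hamming distance flags a probable near-duplicate even after recompression or minor edits. The file names and the threshold are assumptions.

```python
# Illustrative only: perceptual hashing with the open-source ImageHash library
# (pip install ImageHash Pillow), not PhotoDNA, which is proprietary.
import imagehash
from PIL import Image

known_bad = imagehash.phash(Image.open("already_identified.jpg"))
candidate = imagehash.phash(Image.open("newly_uploaded.jpg"))

distance = known_bad - candidate      # Hamming distance between the two perceptual hashes
if distance <= 8:                     # threshold chosen by the operator
    print("Probable match with known material - refer for review")
```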
Issue 4: technical means for blocking material (b) that is new and hasn’t been seen before.
This leaves the situation where wholly new material, never seen before, is uploaded, or where previously seen material has been substantially altered, for example by cropping or selection. Here many claims are made for “artificial intelligence” techniques.
But most computer scientists, as opposed to marketing droids, no longer use the phrase “artificial intelligence” or its contraction “AI”, because concepts of what it is keep changing in the light of developments in computer science and investigations by biological scientists into how the human brain actually works. Moreover AI consists of a number of separate techniques, all with their own value but also limitations. It can include pattern recognition in images, the identification of rules in what initially appears to be random data, data mining, neural networks, and machine learning in which a program follows the behaviour of an individual or event and identifies patterns and linkages. And there are more, and there are also many overlaps in definitions and concepts.
Much depends on what sort of results are hoped for. A scientist, whether operating in the physical or social sciences, possessed of large volumes of data may wish to have drawn to their attention possible patterns from which rules can be derived. They may want to extend this into making predictions. A social media company or retailer may wish to scan the activity of a customer in order to make suggestions for future purchases – but here high levels of accuracy are not particularly required. If an intelligence agency or law enforcement agency uses similar techniques to scan the activities of an individual, the level of inaccuracy may have unfortunate consequences – the decision to prevent that person from boarding an aeroplane, or whether they secure future employment, or whether they are arrested.
If one is scrutinising uploaded files, limitations become apparent. In the first place the context in which a file is being uploaded may be critical. Field Manuals from the United States Army[xi] were produced as part of the training mechanism for that organisation, but they are also found on the computers of people suspected of terrorism. Terrorist manuals may be reproduced on research and academic websites on the basis that experts need to be able to refer to and analyse them. The same photo may appear on a site promoted by a terrorist group and by a news organisation. Some sexually explicit photos may be justified in the context of medical and educational research – or law enforcement.
Beyond that, as we have already discussed, telling the difference between a document which merely advances an argument and one which incites may be beyond what is currently possible via AI. My favourite example of linguistic ambiguity is “I could murder an Indian”, which might mean no more than that one person is inviting another to a meal in an Indian restaurant. In terms of photos, how does one tell the difference between the depiction of a murderous terrorist act and a clip from a movie or computer game? AI can readily identify a swastika in an image – but is the photo historic, of Germany in the 1930s and during World War II, or a still from a more modern war movie, or is it on a website devoted to neo-Nazi anti-semitism? How do you reliably distinguish a 16-year-old from an 18-year-old, and for all ethnicities? How does an AI system distinguish the artistic from the exploitative, or tell when in a sexual situation there is an absence of consent? What exactly is “fake news” and where are the generally accepted guidelines for recognising it?
The role of AI techniques therefore is less that they can make fully automated decisions of their own and more that they can provide alerts on which human monitors will make a final arbitration. Even here there is a problem because, as with most alert systems, it is usually necessary to set a threshold before something is brought to attention. A balance has to be struck between too many false positives – alerts which identify harmless events – and false negatives – failures to identify harmful activity.
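A toy example of that trade-off: the same set of invented scores, judged against three different alert thresholds, produces different mixes of false positives and false negatives.

```python
# Illustration of the threshold problem: scores and labels are invented.
scores = [0.10, 0.35, 0.40, 0.55, 0.62, 0.70, 0.81, 0.93]   # model's "harmfulness" score per item
labels = [0,    0,    1,    0,    1,    1,    0,    1]       # 1 = genuinely harmful

for threshold in (0.3, 0.5, 0.8):
    flagged = [s >= threshold for s in scores]
    false_pos = sum(f and not l for f, l in zip(flagged, labels))   # harmless items alerted on
    false_neg = sum((not f) and l for f, l in zip(flagged, labels)) # harmful items missed
    print(f"threshold {threshold}: {false_pos} false positives, {false_neg} false negatives")
```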
Issue 5: the role and training of human monitors.
This takes us back to Issue 1. A human monitor has to make judgements based on criteria laid down by the organisation exercising blocking. That human monitor needs clear and consistent instructions and, associated with them, appropriate training. Among other things the blocking organisation will want to be able to demonstrate consistency in decisions. As we have seen, monitoring for illegality is easier than making judgements about “extremism” and “harm”. But even here the structure of many laws is that it is for a court to determine whether a crime has been committed. Where the test is purely of a factual nature – for example the age of a person in a sexual situation – the decision might be relatively simple. But for whether somebody is to be convicted of disseminating terrorist material, context may be critical – the academic researcher versus someone against whom there is also evidence of having sent funds or of having begun to accumulate the material necessary to build a bomb.
As a result the human monitor can probably only block where they are absolutely sure that a court would convict – leaving a number of potential situations in which a court might possibly convict but the monitor decides that there is insufficient reason to block. At the Internet Watch Foundation which operates on a relatively limited remit confined to illegal sexual material, decisions about marginal photos and files are usually taken by more than one person and may be referred upwards for special review.
One policy problem in the counter-terrorism domain is that material which by itself is not illegal may nevertheless play a part in the radicalisation of an individual. A striking recent example was a BBC drama based on events involving child abuse in the northern town of Rochdale, which was said to have inspired a man to murder a Muslim man and attack others in Finsbury Park, London.
Where are we to obtain appropriate human monitors? Facebook and similar organisations have announced that they plan to recruit 10,000 or more such persons. But there is no obvious source – this is not a role which exists in employment exchanges or in the universities. Almost inevitably a monitor will spend most of their day looking at deeply unpleasant and distressing material – even if you can persuade people to assume such a role, it is plainly important to establish that they have the intellectual ability and psychological make-up to be able to cope and perform. Current indications are that monitors are recruited in countries that possess a population of graduates but where regular employment for them is very limited and hourly rates are low. It also looks as though the monitors are not directly employed by the social media sites but by third-party out-sourcing companies such as Accenture.[xii] If true, this could be aimed at limiting the liability of the major social media sites. Moreover, and again one looks at the experience of the Internet Watch Foundation, employers have a duty of care, as damage to the monitor, as well as to their effectiveness, may develop over time. One must also ask what sort of career progression such a monitor can expect.
Observations
Too often those who dislike what they see “on the Internet” spend all their energy in drawing attention to the various harms and neglect to consider in sufficient detail which remedies might have a practical impact.
As this article has tried to show, criteria for blocking have to be clear and unambiguous whether the blocking is carried out by human monitors, computer programs or a combination thereof. There will always be a substantial territory at the margins where there are disputes.
Fully automated computer-mediated blocking is high risk because AI is nowhere near sufficiently sophisticated to achieve results which most people will accept. There is a useful mantra: blocking is good and censorship is bad.
So given that obvious harms exist on the Internet: what practical routes are available now?
One of them, popular with campaigners, is to emulate Germany and its Netzwerkdurchsetzungsgesetz – NetzDG for short. This requires the biggest social networks – those with more than two million German users – to take down "blatantly illegal" material within 24 hours of it being reported. For less obvious material, seven days’ consideration is allowed. Fines for violation could be up to 50 million euros. At the time of writing there have been no cases. But this law seems to be limited to situations where there is existing law describing illegality, not to further instances of extremism and harm.
There are a number of existing UK laws which address situations which are less than full-on sexual and terrorism offences, for example the sending by an adult of a sexually explicit picture to a child, and the various preparatory terrorist activities in the Terrorism Act 2006 – “encouragement”, dissemination of materials, raising funds, arranging and attending training events.
The NSPCC proposes a Code of Practice which it says should be mandatory[xiii], but many of its detailed proposals lack the specificity which is required if there is to be legal enforcement – “safeguarding children effectively – including preventative measures to protect children from abuse” is simply the articulation of a desirable policy aim. However there is much to be said for campaigning for a voluntary code, violation of which would be an opportunity for public shaming.
This takes us to a proposal which is in some respects contentious but which merits further examination: much higher personal identity verification standards before admitting people to accounts on social media. This would involve processes similar to those required in opening an online bank account – birth certificates, passports, possibly signatures from trusted individuals to vouch for someone’s identity. Such an approach would do much to prevent under-age individuals from joining unsuitable services and stop others from seeking to post anonymously or via a fake identity. Just as gun laws do not wholly stop the circulation of illegal firearms, such measures would reduce though not eliminate grooming, hate speech and fake news. At the least, higher personal identity verification standards would make it much easier to identify fake identities and identities which are bots as opposed to real people. But there will be opposition from privacy advocates, who will argue that in some countries dissent is difficult to publish unless there is anonymity.
But higher personal identity verification standards would have to be imposed globally and not just in the UK in order to close off obvious evasion routes – and both the public and the major social media sites would need to be persuaded that the advantages outweigh the loss of convenience and privacy.
[i] S 160 Criminal Justice Act 1988
[ii] https://www.sentencingcouncil.org.uk/offences/item/possession-of-indecent-photograph-of-child-indecent-photographs-of-children/
[iv] https://www.theguardian.com/uk-news/2017/sep/17/paralysis-at-the-heart-of-uk-counter-extremism-policy
[v] Indeed under s 67 Serious Crime Act 2015 it is an offence for an adult to send a sexually explicit message to a child
[vi] See for example: https://inforrm.org/2017/11/12/cjeu-advocate-general-opines-on-the-definition-of-a-data-controller-applicable-national-law-and-jurisdiction-under-data-protection-law-henry-pearce/
[vii] http://constitutionus.com/; https://www.law.cornell.edu/constitution/first_amendment
[viii] Cited by https://zephoria.com/top-15-valuable-facebook-statistics/ though there are other statistics and it is difficult to know which to credit.
[ix] Such as MD5 or from the SHA family
[x] https://www.microsoft.com/en-us/photodna; https://en.wikipedia.org/wiki/PhotoDNA
[xi] https://www.loc.gov/rr/frd/Military_Law/pamphlets_manuals.html
[xii] https://www.thetimes.co.uk/article/facebook-fails-to-delete-hate-speech-and-racism-hwrzw0qzn; https://www.thetimes.co.uk/article/meet-the-internet-moderators-b86t2lrlv; https://www.washingtonpost.com/news/the-intersect/wp/2017/05/04/the-work-of-monitoring-violence-online-can-cause-real-trauma-and-facebook-is-hiring/?utm_term=.4d0a47b56d12; https://www.wsj.com/articles/the-worst-job-in-technology-staring-at-human-depravity-to-keep-it-off-facebook-1514398398; http://www.dailymail.co.uk/news/article-4548898/Facebook-young-Filipino-terror-related-material-Manchester.html
[xiii] https://www.nspcc.org.uk/what-we-do/news-opinion/more-than-1300-cases-sexual-communication-with-child-recorded-after-change-law/