In the first of a two-part series on the challenges of voice ID, Dr Jeremy Robson and Dr Harriet Smith advise practitioners to treat evidence from a witness claiming to recognise someone by their voice with extreme caution
When Charles I was beheaded, his executioner wore a visor and false beard. A Captain Hulet was subsequently tried for regicide. The key witness against him was a soldier called Gittens who, asked how he was certain of Hulet’s identity, replied: ‘By your voice.’ Hulet had a compelling alibi for the offence, being in prison when it occurred. After retiring for ‘longer than the normal time’ the jury convicted. The judge who presided had sufficient doubt about the verdict to commute the death sentence to one of imprisonment.
Voice identification has long been recognised as having the potential to be the determinative evidence in a criminal trial. With it comes the risk that it might appear persuasive in circumstances when in fact it is not. While the risks of eyewitness evidence are well known to criminal practitioners, earwitness evidence is encountered much less frequently. As a result, the challenges of dealing with earwitness evidence are not as fully recognised by either investigators or advocates. There is case law which has reminded courts of the need to be as cautious with earwitness evidence as with eyewitness evidence but, unlike eyewitness evidence, there is a lack of clear guidance in the Codes of Practice to the Police and Criminal Evidence Act 1984 as to how such evidence should be recorded and tested.
The circumstances in which a witness might hear rather than see a perpetrator are numerous: the perpetrator may be wearing a mask, they may be on the end of a telephone or be covertly recorded. There may be a recording of the voice which can be subjected to expert analysis; in other cases it will be necessary to assess the accuracy of the witness identification. The identity verification task an earwitness is required to perform could be matching the person they heard during a crime with someone they know, and who is retained in their memory (a recognition case), or with someone they had not encountered prior to the crime event and are subsequently asked to identify (an identification case). This may appear similar to the exercise eyewitnesses perform. However, recent psychological research highlights that although face and voice processing are in some ways similar, there are also significant differences. Memories for faces and voices are processed in distinct specialised areas of the brain. Further, it has been shown that memory for voices is more error-prone than memory for faces, and that auditory memories are more likely to be subject to interference and corruption from supervening events. Overall, evidence from the psychological literature points to faces being more reliable indicators of identity than voices. This is the case regardless of whether the person in question is familiar to the witness or not. One feature that earwitness evidence does share with eyewitness evidence is that the confidence a witness has in their identification does not necessarily indicate accuracy.
One practical difficulty with voice identification is the difficulty witnesses have in articulating a description of a voice. Consider a description given by an eyewitness – ‘the suspect was male, quite tall with dark hair.’ It is unlikely anyone would consider such a description helpful or satisfactory, and a defence advocate would have little difficulty in establishing the distinctive features which were not mentioned. With a voice identification, however, a description of a voice which runs along the lines of ‘male, quite deep, a northern accent of some sort’ sounds more convincing. It is much more difficult to identify inconsistencies between the description which was originally given at the time of the offence and the voice of the speaker. The lower expectations of voice descriptions create an impression of consistency when in fact the points of similarity are few.
Another important consideration for practitioners dealing with voice cases is the need to focus on the duration of speech exposure rather than the extent of an observation. In one case we encountered, the witness had seen the perpetrator for 10 minutes without recognising him. It was only when he spoke six words that he was recognised. The case was summed up on the basis of a ‘10 minute view’ rather than a two-second burst of speech.
The Court of Appeal has recognised the challenges of voice identification in a number of authorities. In Hersey it was recognised that a modified Turnbull direction was needed in cases regarding voice identification (although the court did not specify what these modifications should be). Clear guidance has now been incorporated in the judicial directions in the Crown Court Compendium.
When an eyewitness has seen a crime, the procedures for testing the ability of the witness to repeat that task are well documented in Code of Practice D to PACE. The video identification procedure is something all criminal practitioners will encounter on an almost daily basis. Similar procedures have been used in voice identification cases, with a recording of a voice being placed in a ‘line up’ of other voices, and such procedures have been approved by the Court of Appeal in Hersey. Guidance, prepared by Professor Francis Nolan and John McFarlane of the Metropolitan Police, has existed on the best practice in conducting such parades since 2003. In order to ensure that voice parades do not unduly draw attention to the suspect, they are often time-consuming and expensive to produce. Officers need a sufficient sample of speech (usually taken from the interview) where the defendant is not discussing the crime. Similar ‘foil’ samples have to be extracted from other records of interviews which are of a similar audio quality to ensure the suspect sample does not receive undue prominence. The compilation needs to be overseen by a forensic phonetician to ensure consistency. As a result of these hurdles, voice parades are seldom used by the police to support an identification and some forces have made the policy decision never to conduct them. Our research showed that 74% of criminal lawyers surveyed had never encountered a voice identification parade.
The ‘Improving Voice Identification Procedures’ project, funded by the Economic and Social Research Council, has been examining how to improve understanding of earwitness behaviour in order to modify parade procedures and maximise earwitness accuracy. We have addressed various aspects of the procedure which had not been previously tested. For example, the current 2003 Guidelines recommend that earwitnesses should listen to 60-second samples of each of the 9 voices in a parade before making a decision. Constructing such long samples is time-consuming and makes it harder to find suitable foils, which risks delaying the identification. The sooner the witness’ memory is tested, the more accurate it is likely to be. Our findings reveal that the samples can be reduced to 15 seconds without a performance cost. We have also tested the effect of different types of pre-parade warnings. The 2003 Guidelines state that witnesses should be warned that the perpetrator may or may not be present. While they are not prescriptive about the wording, our results show that strongly worded versions of the warning can negatively affect earwitness performance, and risk inhibiting correct identifications.
Hopefully these findings will result in a streamlined procedure which can be deployed more easily and at an earlier point in the investigation, enabling suspects to be identified or, more importantly, excluded more effectively. Developments in technology may enable the processes to be streamlined further. Until they do, practitioners faced with a witness who claims to have confidently recognised someone by their voice should treat the evidence with extreme caution.
The authors are co-investigators on the ‘Improving Voice Identification Procedures’ project funded by the ESRC (Ref. ES/S015965/1) which is a collaboration between the University of Cambridge, De Montfort University, Nottingham Trent University and the University of Oxford.
Identification by voice (2) by Dr Jeremy Robson and Dr Kirsty McDougall will deal with expert analysis of voice.