Forensic DNA phenotyping (FDP) based on massive parallel sequencing (MPS) is an emerging technique within forensic genetics that enables the prediction of an individual's externally visible characteristics (EVCs) from DNA. Because of its achievements, FDP has become one of the most useful additional tools for aiding police investigations to narrow down the investigative pool in different types of forensic cases. Eye, hair and skin colour can now be predicted reliably and with practically useful accuracy. However, FDP has not yet been routinely implemented in the forensic science field due to, principally, the lack of complete genetic knowledge of pigmentation and facial traits and the lower predictability of intermediate phenotypes. Furthermore, in some countries its application has given rise to a number of ethical, social and legal issues, the latter being the most restrictive barrier to the implementation of FDP.
Forensic DNA phenotyping using massive sequencing techniques is being introduced in the field of forensic genetics for two reasons. On the one hand, it responds to the limitations of conventional methods of human identification,1 and on the other hand, it is clearly useful as a police investigation tool. In police investigations of criminal acts, witness testimony is one of the tools used to narrow the list of suspects. However, witnesses are not always available, and their statements may not be completely trustworthy, as factors such as the amount of light, prejudices and visual acuity influence memory and testimony.1 Cases in which genetic analysis contributes no relevant information to the investigation, and in which there are no witnesses or their statements are irrelevant or lead nowhere, normally stagnate and are left unresolved, becoming cold cases.
The DNA phenotype makes it possible to predict phenotypic traits based on DNA. This leads to a reduction in the number of potential suspects in cases where there is no clue regarding the identity of an individual.2 The forensic phenotype may therefore be used as a “biological witness”, offering data that help to find the criminal when it is impossible to identify them by conventional means.1
Interest in the prediction of phenotypic traits has increased in the forensic community due to the growing number of genes found to be associated with different physical traits.2 More specifically, single nucleotide polymorphisms (SNP) are the markers which regulate the phenotype of each individual.3 These traits are considered to be complex, i.e., they are controlled by many genes as well as by environmental factors,3 and their variation is continuous. Of all the human phenotypic traits included in this review, the ones that have been studied the most are eye, hair and skin pigmentation, together with facial features, as these are the most informative in criminal investigations when seeking and identifying suspects.
In spite of the progress that has been made over recent years, this tool is not free of limitations, and it is these that constitute the reason why it has yet to be used in the field of forensic science. Moreover, its practical application has led to debates in the scientific community about the ethical, social and legal implications that would have to be taken into account before using DNA phenotyping as a routine tool.
Genetic aspects
Pigmentation genetics
Study of human pigmentation genetics is more advanced than the study of facial traits because the former phenotypic traits are less complex. This is because they follow a semi-Mendelian heredity in which a reasonable number of genes provide the majority of the phenotypic information.4
Genome-wide association studies (GWAS) are the approach most commonly used to identify genes associated with these phenotypic traits. Two developments have increased the number of genes found to be involved in pigmentation.2 On the one hand, there are technological advances such as SNP microarrays (chips) and massive parallel sequencing (MPS) techniques, which simultaneously sequence millions of short fragments of DNA. On the other hand, there are advances in the development of statistical tests.
The genes involved in human pigmentation are mainly those associated with the process of melanin production, as the eumelanin/pheomelanin ratio determines pigmentation type. The genes that predominate in eye colour are HERC2 and OCA2,4 while MC1R dominates hair colour.5 Skin colour genetics is more complicated, given that different genes contribute to changes in pigmentation as a result of migrations and environmental adaptation during evolution.6 Additionally, biogeographical gene markers, which reflect the genetic variation deriving from having lived in or occupied a specific geographical location over time, are especially influential in this trait.7 As a result, skin colour is directly associated with population group, i.e., the genes involved and their variants differ from one population to another.2 For example, the SNP rs1426654 in the gene SLC24A5 partially explains the variation in skin colour in South Asian populations.8 By contrast, this SNP is practically fixed in East Asia and Europe, so that it does not contribute to skin colour variation in these populations.8 Finally, the genetic complexity of pigmentation is increased by epistasis,9,10 in which several SNP interact to give rise to a single phenotype, and by pleiotropy, in which a single SNP may affect more than one phenotypic trait.6
Facial features genetics
Studies of identical twins conclude that their facial features are almost completely genetically determined, and that environmental factors have a limited influence on them.7 Currently, the search for genes that control facial structure is very active,11–13 and to date, more than 50 loci have been found that are associated with facial features.7 For example, gene PAX3 is associated with the position of the cleft between the eyes, just above the bridge of the nose,14 while gene DCHS2 is associated with nasal prominence.12 Additionally, the genes associated with the facial phenotype are also involved in craniofacial development processes and the disorders that arise with facial anomalies.7 Likewise, some genetic factors which determine biogeographic origin are associated with facial morphology.7,15 Finally, as occurs with pigmentation, a single SNP may control several facial traits and vice versa.7 Epistasis is also present as a regulatory genetic mechanism.7
The number of genes identified to date that influence facial shape is negligible in comparison with the hundreds or thousands of genes suggested by studies.16 Notwithstanding this, progress is being made in the search for genes and SNP connected with craniofacial traits, thanks to technological advances that make it possible to develop an optimised facial phenotype7 and to the use of new approaches. One example is the study by Claes et al., which found 1,932 SNP in 38 loci that are significantly associated with certain facial traits.11
Phenotype prediction
Pigmentation prediction
Once the genes involved in the traits in question have been discovered, predictive modelling is applied. This is based on a statistical model that infers the probability that a given phenotype will arise among all possible phenotypes, together with an associated degree of uncertainty, i.e., a prediction.17 Different parameters are then used to estimate the level of predictive accuracy.1 The parameter used most often for categorical predictions is the area under the receiver operating characteristic curve (AUC).1 The AUC may vary from 0.5 (random prediction) to 1 (perfect prediction).1
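The meaning of the area under the curve can be illustrated with a minimal sketch. It uses the standard interpretation of this parameter as the probability that a randomly chosen individual with the phenotype receives a higher predicted score than a randomly chosen individual without it. The function name, scores and labels are hypothetical and are not taken from any of the studies cited here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of "blue eyes" for six individuals
# and their observed phenotype (1 = blue, 0 = not blue).
scores = [0.92, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]
print(round(auc(scores, labels), 3))
```

A model that ranks every individual with the phenotype above every individual without it gives 1, while a model whose scores carry no information gives a value close to 0.5.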
The HIrisPlex-S predictive system is the most recent test that makes it possible to simultaneously predict eye, hair and skin colour.18 HIrisPlex-S is the result of 2 tests developed beforehand by the same group of researchers: IrisPlex, for eye colour prediction,19 and the HIrisPlex test for predicting eye and hair colour.20 HIrisPlex contains the SNP of IrisPlex and other markers in more than 11 genes identified as major contributors to hair colour prediction.20 Finally, the HIrisPlex-S system includes the aforesaid markers as well as 17 SNP skin colour predictors.18 It is therefore composed of 2 multiplex genotyping assays of 41 SNP and 3 statistical models.18
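How a statistical model of this kind converts a genotype into category probabilities can be sketched as follows. The text above does not specify the form of the 3 statistical models, so the sketch assumes a multinomial logistic regression, a common choice for categorical traits. The SNP names, categories and coefficients are all hypothetical and are not the published parameters of any of these tests.

```python
import math

# Hypothetical coefficients for illustration only (NOT published values).
# "brown" is the reference category, whose logit is fixed at 0.
REFERENCE = "brown"
MODEL = {
    "blue": {"intercept": -1.0, "weights": {"snp_a": 2.0, "snp_b": 0.5}},
    "intermediate": {"intercept": -1.5, "weights": {"snp_a": 1.0, "snp_b": 0.3}},
}


def predict(genotype):
    """Return a probability for each eye-colour category.

    genotype maps a SNP name to the number of copies of the minor
    allele carried by the individual (0, 1 or 2).
    """
    logits = {REFERENCE: 0.0}
    for category, params in MODEL.items():
        logits[category] = params["intercept"] + sum(
            weight * genotype.get(snp, 0)
            for snp, weight in params["weights"].items()
        )
    # Softmax, shifted by the maximum logit for numerical stability.
    top = max(logits.values())
    exps = {c: math.exp(v - top) for c, v in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


probs = predict({"snp_a": 2, "snp_b": 1})
for category, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{category}: {p:.3f}")
```

The output is a set of probabilities that sum to 1 rather than a single answer, which is why predictions of this type are reported with an associated level of uncertainty.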
The 3 tests are validated for application in the forensic field.18,21,22 This is particularly relevant in this discipline, given that validation is a process that supplies objective evidence of the quality, reliability and reproducibility of a DNA test.23 The tests are also designed to work with small amounts of DNA as well as with degraded DNA, such as is often found at crime scenes and during investigations. HIrisPlex and HIrisPlex-S produce complete genetic profiles with an input of 63 pg,18,20 and IrisPlex does so with 31 pg.19
Facial trait prediction
Predicting a complete face based on a DNA sample is the most ambitious objective of DNA phenotyping.24 Different attempts have been made to create a predictive facial model.11,15,24–26 Some studies have managed to create simple models that predict facial features such as eyebrow width or the distance between the eyes, achieving a certain degree of reliability.26 Other studies have found that sex or biogeographical markers determine facial features to a greater extent,15,24 and they are therefore based on creating a face that starts with these traits before adding the other SNP.11,15 Nevertheless, all of the models created to date are simplified and exploratory, and they are a long way from being sufficiently reliable to be used in practice.
However, a North American company, Parabon Nanolabs, Inc., has developed a phenotyping system called the Snapshot Forensic DNA Phenotyping System which is able to predict facial shape.27 It also predicts eye, hair and skin colour, together with freckles and biogeographical origin.27 Although the company has not published its predictive algorithm or validation studies,16 some police departments have already used its services to resolve cases.28
Limitations of DNA phenotyping
Four major limitations affect genetic studies as well as those involving prediction. Firstly, independent samples are not always used for the different types of study (gene discovery, model development and validation). This may mean that the sample size is too small and unrepresentative to generate reliable results.2 Secondly, there is no standard phenotyping method for either genetic studies or predictive ones. Nor is there a single predictive statistical model, and this may give rise to problems of reproducibility and difficulties in sharing information. Thirdly, there is a clear need to construct a standard database that contains phenotypes and the genotypes associated with phenotypic traits, given that current databases are based on microsatellite genetic markers and identification SNP.1 Such a database would supply more comparable prediction estimators and more robust model parameters. Lastly, funding strategies to date have centred on supporting disease-based research, leaving research on normal phenotypic variation to one side.1
Limitations in pigmentation genetics
The main limitation of genetic studies of pigmentation is the quantitative nature of these traits. Genetic association, as well as prediction, depends on the quality of the phenotypic characterisation of the trait.29 Conventionally, pigmentation phenotypes have been defined using simple categories,4 such as blue, brown or intermediate for eye colour. This categorisation greatly simplifies what is in fact a quantitative trait.4 Different quantitative phenotyping methods have been developed for eye29–31 and skin32 colour. These studies conclude that applying these methods in future GWAS may help to find new genes, by covering part of the continuous variation in colour that is not taken into account by classical categorical classification. Lastly, the phenotype classification system not only influences the capacity to find gene variants, but also poses a reproducibility problem due to the subjective nature of categorical classifications.33 Additionally, although quantitative methods are objective, there are many different methods, so that their results may not be comparable.29
A crucial limitation when elucidating skin pigmentation genetics is that it is heterogeneous. The more heterogeneous a trait is, the greater the risk of finding false positives in GWAS.2 Attempts are being made to resolve this problem by using algorithm-based functional prioritisation methods for genes. The result of these consists of genes that have a higher probability of functional association with the phenotype.34
Finally, the influence of environmental factors is another limitation when attempting to understand the genetics of pigmentation, as GWAS, due to their inherent nature, do not take these into account.35 Moreover, sample characteristics (composition and size) influence the ability to find genes, and they may constitute a limitation. For example, an association study in Poland revealed that 3 SNP in the MC1R gene had 100% penetrance for red hair colour.36 Nevertheless, they were not statistically significant in this sample.36 The authors argued that the low frequency of these alleles, together with the small size of the sample, explains this result.36
Limitations of genetics in facial features
The most important limitation in connection with facial features is the enormous complexity of the genotype-phenotype map, which limits the results of classical GWAS. Variation in morphological traits arises from developmental processes that follow a certain order and occur at exact time intervals.37 These modular processes and their interactions are the causes of phenotypic facial variation.37 Changes in a developmental process will therefore have a global effect and, because their genetic variance is not additive, epistatic processes and their context-dependent interactions will determine variations in the shape of the face.37
Another major limitation is the size and genetic richness of samples. Classical studies that were able to find genes were undertaken with large, genetically rich samples.38 Nevertheless, Claes et al. recently found 38 significant loci with a relatively small sample.11 The authors suggest that the problem is that previous phenotyping methods are not effective, as they reduce the facial phenotype to a series of distances, angles and qualitative characteristics.11 Additionally, the statistical methods usually used to characterise the phenotype (principal component analysis) treat each feature separately and as if it were invariant.24 Finally, the method used by Claes et al. takes a phenotypic approach driven by the data generated by the SNP analysis, rather than one driven by the phenotype (facial images) as conventional methods do.11
Other limiting factors are the lack of standardisation in protocols as well as in imaging technologies, and this may lead to reproducibility problems.39
Lastly, it would be necessary to undertake studies of large-scale populations to find new variants.7 However, this would involve increased complexity in elucidating the facial phenotype, given that it is necessary to identify and explain all of the processes and interactions involved in producing a specific facial phenotype.7 Furthermore, the volume of data to be processed would generate enormous and complex databases, making the phenotyping process notably difficult.7
Limitations in pigmentation prediction
The main limitation here is the precision of prediction. Forensic genetics is a very exact science which works with minimum margins of error. Nevertheless, when these tests are applied, margins of error amounting to 10–20% are found. This may be due to the phenotype categorisation system, which simplifies features. Notwithstanding this, the low predictability of some phenotypes, especially intermediate ones, is largely due to the lack of complete genetic knowledge about them. In some cases it is necessary to find new biomarkers that are not included in the predictive tests.18 Firstly, markers are needed for intermediate phenotypes (green eye colour or blond hair).18–20 Secondly, the changes in pigmentation associated with age are not taken into account by the tests. The lower predictability of hair colour in brown-haired individuals has been attributed to those who were blond during childhood, for whom the earlier phenotype is predicted rather than the current one.20,22 The molecular mechanism of these changes is unknown.22 This phenomenon is currently being studied in adolescents and children,40 as study samples have conventionally been composed of adults. The authors recommend that tests should be developed which take different ages into account, thereby achieving greater phenotype coverage.40 Other changes, such as grey hair or the appearance of skin marks, are also due to unknown molecular mechanisms.2,41 Thirdly, sex is another factor that may modify eye colour. A study in Spain found an association between sex and eye colour prediction.42 The results suggest that women have a tendency to have darker eyes when they have the CC genotype in the HERC2 gene.42 Another study, which applied the IrisPlex model to a sample of the Italian population, found that sex was the second most important predictor of eye colour.43 However, it is unclear how sex affects eye colour, and none of the genes on the sex chromosomes are known to be associated with pigmentation.44 Lastly, epistasis or gene-gene interaction should also be taken into account. Pośpiech et al. found that certain specific DNA variants were significant in the prediction of green eyes only when it was assumed that they interacted.10 The same research group later found more interactions associated with pigmentation, which slightly improved the predictive precision for all 3 traits.9
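The idea that variants may matter for prediction only when their interaction is modelled can be sketched with a logistic model that includes an interaction term. This is an illustration of the general concept of epistasis in a predictive model, not the model used in the studies cited above. The function name, the two SNP and every coefficient are hypothetical.

```python
import math


def green_eye_probability(snp1, snp2, with_interaction=True):
    """Probability of green eyes from two SNP genotypes (0, 1 or 2 copies).

    Coefficients are hypothetical: each SNP alone has a very small effect,
    while the product term (the epistatic interaction) has a large one.
    """
    intercept, b1, b2, b12 = -2.0, 0.1, 0.1, 1.5
    logit = intercept + b1 * snp1 + b2 * snp2
    if with_interaction:
        logit += b12 * snp1 * snp2
    return 1.0 / (1.0 + math.exp(-logit))


for genotype in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    additive = green_eye_probability(*genotype, with_interaction=False)
    epistatic = green_eye_probability(*genotype, with_interaction=True)
    print(genotype, f"additive={additive:.3f}", f"with interaction={epistatic:.3f}")
```

In this sketch the two models agree whenever either SNP has zero copies of the allele, and they diverge only for carriers of both, which is the pattern an additive model cannot capture.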
Limitations in the prediction of facial features
Given that facial features have an extremely high heritability, it would be expected that a model based on the genetic factors that determine the phenotype would give a correct prediction.25 Nevertheless, knowledge of the genetic basis of human face structure is not sufficiently advanced to make it possible to predict the face based on DNA.16 To date, the phenotyping methods used as well as the statistical models created have simplified the facial phenotype, and the number of SNP used is insignificant in comparison with the whole genetic network that controls facial shape. There is also a lack of validation of the statistical methods used as well as of those for phenotyping.45
Ethical aspects
Ethical aspects are not a barrier to the implementation of phenotyping. Rather, they are pillars which support this practice so that it can be used to benefit society as well as legal professionals.
The ethical aspect that has been discussed the most is the possible invasion of privacy.1,46 Several authors state that phenotypic traits, as they are considered to be “public”, are not subject to privacy rights.1,46,47 However, other invisible inferences that may be drawn based on DNA would involve a violation of privacy.1,46,47 One aspect that has not been debated so much, but which Kayser emphasises, is the prediction of phenotypic traits associated with disease.1 Although some genes control the normal variation of a trait as well as its pathological forms, normally variants are different, so that tests developed for phenotyping offer no information on the pathological aspects of a trait.1
Another fundamental ethical question under debate is whether the DNA phenotype should be used as evidence at trial, or whether it should only be used as a tool in investigations. The value of DNA at trial has been questioned, given that the right to a fair trial is a recognised human right, and the so-called CSI effect may distort this.46 This effect may cause prejudice in a trial, as it triggers an exaggerated degree of trust in the capacities and reliability of forensic technologies.46 It may be aggravated by the expectations generated by certain private companies when they market phenotyping as a tool that is able to create a face based on DNA, using methods that have yet to be published or validated.48 Additionally, when a suspect is arrested, whether or not they have all of the phenotypic traits predicted on the basis of crime-scene samples has no legal validity at all.47 Finally, the predictions made are probabilities and are therefore not conclusive, so that they can only give rise to reasonable doubt in a trial.47 For these reasons it is recommended that phenotyping be used solely as an investigative tool, and that it should not be used as judicial evidence.46,47
The relatively high margin of error with this new tool has led some authors to reflect on the root of the problem. For example, MacLean and Lamparello believe that the important thing is whether or not the process is reliable, and not whether the result of the process may give rise to doubts.47 They also conclude that the phenotyping tool is not intrinsically problematic, but rather that the problem lies in how it is interpreted and applied.47 It is therefore crucially important for legal professionals to be suitably trained and to have clear guidelines, so that they can interpret and use this tool correctly.46
Recently in Spain, the National Commission for the forensic use of DNA issued recommendations on the use of new genetic analysis technologies and markers of this type.49 Among other aspects, it expresses the need for legal regulation, so that phenotype and biogeographical origin markers can only be used as investigative leads when no match has been found in the database and all other lines of investigation have been exhausted. Their use should also be restricted to serious crimes and be subject to express authorisation by a judge or prosecutor, always taking into account that phenotyping is an investigative tool and must not be used as conclusive proof of identification.49
A final ethical aspect that has to be taken into account would consist of the storage of data generated by phenotyping, and who should have access to this.46 In the wrong hands genetic information could be used to discriminate against individuals according to their medical predispositions or specific physical traits.46
Social aspects
If the analysis and interpretation of DNA phenotyping results are not undertaken correctly, this may have a negative social impact. The majority of authors refer to the potential for stigmatisation and discrimination of specific populations which are the most vulnerable to the action of the criminal justice system.50 Authors such as Wienroth argue that predicting a face based on DNA may lead to a situation in which certain groups of people are connected with a crime despite having only a remote link to it, or even being innocent, as DNA-based prediction does not exactly determine what a person looks like.16 Phenotyping provides group information, i.e., it predicts traits that may be shared by a group of people.51 Due to this, some authors express the need for training to include cultural awareness raising to minimise such possible consequences.51 Traits associated with ethnic group which are influenced by biogeographical markers, such as skin colour, may be interpreted on the basis of racial stereotyping.16 Slabbert and Heathfield propose eliminating the skin colour categorisation system and instead using a continuous scale for inferring this trait based on DNA.46
Despite these potential negative impacts, phenotyping has social advantages, as predicting certain phenotypic traits makes it possible to reduce the pool of suspects and exclude individuals who are innocent, thereby saving time and resources in criminal investigations.46
Legal aspects
There is no international agreement between countries on the laws which regulate forensic DNA phenotyping.46 In many European countries current law on the forensic use of DNA does not cover phenotyping because the laws have not changed since the introduction of techniques such as DNA fingerprinting or DNA profiling.48 Some countries prohibit the inclusion of any physical information except for sex, such as South Africa,46 Austria48 and Belgium.8 In Spain the police database is controlled by Organic Law 10/2007.52 This law states that it is only possible to include “identifiers obtained based on DNA, in the context of a criminal investigation, which supply, exclusively, genetic information which reveals the identity of the individual and their sex”.52 Although physical markers are not explicitly prohibited, only non-coding markers are permitted, on condition that they do not reveal information about physical features.52 Spain is not the only country which solely permits the analysis of non-coding sequences, as the same criterion is applied in Australia, South Africa and Germany, among others.46 Nevertheless, intronic and intergenic markers are now known, i.e., non-coding markers whose genetic variation may influence the phenotype of a person, as is the case with SNP rs12913832 in the HERC2 gene in determining eye colour.53 As a result, the distinction between coding and non-coding markers is now obsolete.1 By contrast, use of DNA phenotyping is permitted, by law or otherwise, in other countries such as the Netherlands, France, the United Kingdom, Canada and several states within the United States of America.54 In the last year the German state of Bavaria passed a new law on the forensic use of DNA.54 According to this new law, the police are authorised to predict, based on DNA, physical traits such as hair, eye and skin colour, although only in cases of imminent danger and the suspicion of a serious crime.54
Discussion and conclusions
Forensic DNA phenotyping is currently a tool with great potential for helping forensic investigations by predicting physical traits on the basis of DNA, and it has already been used in unresolved cases.28 Nevertheless, it has neither been implemented nor completely developed within the field of forensic genetics.
From now on investigations should centre on improving genetic knowledge and the prediction of traits which have already been studied. This would overcome current limitations and advance the implementation of this tool. The study and implementation of the different genetic and non-genetic factors mentioned in this review in predictive tests would be a useful means of increasing the precision of prediction in general and when it involves intermediate phenotypes.
The most important obstacle in the development of phenotyping is the lack of knowledge about the highly complicated genetic network that controls traits of this type.1,16 More and better markers have to be found for the intermediate phenotypes, age-related changes in pigmentation and other processes such as epistasis. Although it is true that GWAS have intrinsic limitations, such as their sensitivity to genetic heterogeneity and sample size,35 among others, the most important factor preventing the progress of genetic knowledge is the limited amount of economic investment in studies of this type.1 Moreover, Kayser argues that it will only be possible to identify the majority of the genetic information by means of international cooperation and very large representative samples of populations.1 This will make it possible to achieve sufficient genetic variation to attain the statistical significance that is lost in small samples. Lastly, the use of large samples will make it possible to carry out more replication and validation studies, which are necessary in forensic science, especially in this new application.
For genetic studies as well as for phenotyping a standard quantitative method of phenotyping is necessary, depending on the characteristics of each phenotypic trait. This would make it possible to find new genes for pigmentation as well as facial features. Furthermore, for some traits such as skin colour, methods of this type will reduce their racial value.46 Finally, the generalised implementation of MPS systems would permit optimisation of time and resources use, making it possible to simultaneously perform phenotype analysis, microsatellite marker profiling and the determination of biogeographical origin. In fact, projects with this objective are now starting to be developed, such as the European VISAGE project.55 The core aim of this project is to develop massive sequencing systems for the study of biogeographical markers and phenotype traits for use in the forensic field. It is planned to implement this in the future as a routine tool in the member states of the European Union.55
The forensic application of the DNA phenotype should be regulated by a clearly defined legislative framework. It should also be governed by establishing control of data management and interpretation, not only by forensic scientists, but also by legal professionals, to guarantee the confidentiality of any personal information which a genetic profile may supply, and to prevent discrimination and physical simplifications. Although it is true that laws are the most restrictive obstacles to the implementation of phenotyping, they are necessary to prevent unethical practices and immoral uses. Ethically questionable practices are starting to emerge, as in the arrest of the Golden State Killer last year.56 This case had remained unsolved until, in April 2018, the DNA in the samples from the crime scenes matched the samples of DNA found in the suspect's home.56 However, identification of the suspect was made in a controversial way.56 The investigators uploaded the DNA profile of the suspect to GEDMatch, an open genetic database. This allowed them to find distant relatives of the Golden State Killer, and through these, they identified the man who became the chief suspect in the case.56 After his arrest, ethical concerns centred on the privacy of genetic data, as well as how they can be used and who has access to them, especially in public genetic databases.56
To avoid these rather unethical situations phenotyping must only be applied to unknown individuals and, given that its predictions are less than 100% precise, it must never be used alone, but rather as a useful tool that complements the context of other forms of evidence.46,57 Finally, only the physical characteristics which are perceptible by eye witnesses should be predicted, using markers that cannot give rise to ethical disputes.46,57
During the coming years DNA phenotyping will continue to grow. The information that can be extracted from DNA will increase and it will probably be used in more investigations. Nonetheless, the lack of scientific awareness among investigators and legal professionals is the main hindrance to ensuring the correct use of this tool. Police forces and legal professionals must receive appropriate training that allows them to include phenotyping in their work. For this purpose, scientific consortiums such as EUROFORGEN are of crucial importance. Their aims include integrating existing cooperation in the field of forensic genetics, creating a uniform curricular framework and setting training standards in data interpretation for biological evidence.58 Such initiatives improve communication between scientists and non-scientists, so that the limits of phenotyping technology are understood and expectations of it are realistic.16
Conflict of interests
The authors have no conflict of interests to declare.
Please cite this article as: Canales Serrano A. El fenotipado de ADN como potencial herramienta investigativa en el campo de la genética forense. Estado actual. Rev Esp Med Legal. 2020. https://doi.org/10.1016/j.reml.2020.01.003