Type 1 diabetes (T1D) is a complex trait caused by T-cell-mediated autoimmune destruction of the islet beta cells of the pancreas, resulting from the interaction between genetic and environmental factors. Despite enormous advances in the study of T1D, the etiologic mechanisms of the disease and the genetic and environmental factors involved are still not fully determined.
Research on T1D genetics spans more than thirty years, and up to 40 chromosomal regions associated with T1D susceptibility have been reported. Some of these, such as the HLA region and the insulin gene, are clearly established as risk factors, while others need further study to confirm preliminary results. In this text we review some of the main susceptibility genes currently accepted for T1D: the MHC class II alleles, the insulin gene, CTLA4, PTPN22, IL2 and its receptor subunit IL2RA, the helicase IFIH1/MDA5, the CAPSL-IL7R block, the lectin CLEC16A, the Th1 transcription factor STAT4, and the tyrosine phosphatase PTPN2.
Type 1 diabetes mellitus (T1D) is a multifactorial autoimmune disorder resulting from selective destruction of the insulin-producing beta cells in the pancreatic islets, a process mediated by dendritic cells, macrophages and auto-aggressive T cells(1). Autoantibodies are present in 85% of patients who develop T1D and can be detected years before the first clinical manifestations appear. The classical β-cell autoantigens are insulin, glutamic acid decarboxylase (GAD) and the insulinoma-associated antigen 2 (IA-2); more recently, a zinc cation transporter (ZnT8, also known as SLC30A8), expressed only in β-cells and implicated in the insulin-secretion pathway, has been added to this list(2). The presence of multiple islet autoantibodies is highly predictive of future T1D(3).
The risk of developing T1D is determined by a complex interaction between multiple genes and environmental factors. Evidence for a genetic component comes from the higher risk in siblings of T1D patients (7%) compared with the general population (0.4%). Environmental factors also play an important role, since the concordance rate in identical twins is incomplete, ranging from 21 to 70%(4). This review focuses on some of the main genetic factors involved in T1D predisposition.
Over the last thirty years, both linkage and association studies have identified many genes related to T1D susceptibility. The first markers found to be associated were the HLA alleles, in the 1970s. Subsequently, researchers have approached the subject from two perspectives: first, the study of candidate genes and, second, the hypothesis-free genome-wide association studies performed in the last four years. These studies have greatly expanded knowledge of the genetics of the disease, from well-known genes to chromosomal regions of unknown function, and have proposed over forty chromosomal regions implicated in T1D pathology. Some of them have been firmly established as risk variants through replication in different populations, while others, recently discovered, need further study(5). Here we review some of these genes (Figure 1), from the early MHC studies to the recent genome-wide discoveries.
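The familial clustering described above is often summarized as a single ratio, the sibling relative risk (commonly written λs): the risk in siblings of patients divided by the risk in the general population. The sketch below simply applies that ratio to the two figures quoted in the text; the function name is hypothetical and chosen only for illustration.

```python
def sibling_relative_risk(sibling_risk: float, population_risk: float) -> float:
    """Return the ratio of sibling risk to general-population risk.

    Both arguments are proportions (0.07 means 7%).
    """
    if population_risk <= 0:
        raise ValueError("population_risk must be positive")
    return sibling_risk / population_risk


# Figures quoted in the text: 7% in siblings, 0.4% in the general population.
lambda_s = sibling_relative_risk(0.07, 0.004)
print(round(lambda_s, 1))  # 17.5
```

Under these figures, a sibling of a T1D patient carries roughly seventeen times the population risk. The ratio reflects shared environment as well as shared genes, so it should not be read as a purely genetic measure.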
MHC CLASS II
The MHC is probably one of the most complex and fascinating genetic regions in the whole human genome. It is also one of the most difficult to study: the multiple genes, the numerous alleles of these genes and the extensive linkage disequilibrium (LD) throughout the region are the main difficulties that researchers face when trying to establish causal variants for the many diseases (autoimmune, infectious and inflammatory) associated with the HLA region.
The major genetic susceptibility to T1D arises from the MHC(6,7), which accounts for almost 50% of the total genetic contribution to disease(8,9). The main genes involved in T1D susceptibility are the class II loci HLA-DRB1, -DQA1 and -DQB1. Studies in Caucasian patients revealed that 90% of all T1D patients carried either the DRB1*03-DQ2 or the DRB1*04-DQ8 haplotype, combinations that do not exceed 10% in the general population. But these are not the only genes in the region implicated in the disease. Several studies have reported that not all DRB1*03-DQ2 haplotypes predispose equally to disease(10-14). Two conserved extended DRB1*03-DQ2 haplotypes, or ancestral haplotypes (AH), were associated with diabetes susceptibility: AH8.1 (also known as COX) and AH18.2 (also known as QBL)(15). Both haplotypes carry the same DRB1 and DQ alleles, but according to these studies the AH18.2 haplotype confers a significantly higher risk of T1D. This increased susceptibility suggests the presence on AH18.2 of an additional gene, distinct from the classical MHC class II loci, but its characterization is difficult because of the numerous genes, their complex allelic and genetic structure, and the high LD of this region, as mentioned above.
Haplotypes of the MHC class II loci also confer the strongest protection from T1D. In Caucasian and Japanese populations the protective haplotype is DRB1*1501-DQ6. Such protection dominates even in the presence of the high-risk susceptibility MHC class II alleles, although it is not absolute(16).
INSULIN (INS)
The insulin gene, in region 11p15.5, was considered early on as a candidate risk factor and is, together with the HLA haplotypes, one of the most consistently replicated regions associated with T1D(17,18). The marker most strongly associated with T1D is a variable number of tandem repeats (VNTR) polymorphism located 596 base pairs upstream of the INS locus. Alleles are grouped into three classes: class I or short alleles (26 to 63 repeats of the consensus sequence), class II or intermediate alleles (64 to 139 repeats) and class III or long alleles (140 to 210 repeats). Class I and class III alleles have been found associated with T1D in opposite ways. The intermediate class II alleles are very rare and show no clear relation with the disease(19).
Class I alleles are associated with lower expression of insulin in the thymus and confer susceptibility to T1D. Class III alleles are protective, even in the presence of a susceptibility allele; they cause a slight decrease of insulin mRNA expression in the pancreas but a strong increase of expression in the thymus(19). These data point to the regulation of self-tolerance in the thymus as a key step in the development of T1D pathology. By inducing higher levels of insulin expression in the thymus, class III alleles would allow stronger negative selection and deletion of insulin-reactive T lymphocyte clones, preventing them from escaping the thymus and triggering the autoimmune reaction(19-21).
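The class boundaries and reported effects described above can be written down as a small lookup. This is only an illustrative sketch: the repeat ranges and effect labels are taken from the text, while the names `VNTR_CLASSES`, `T1D_EFFECT` and `classify_vntr` are hypothetical and do not correspond to any genotyping tool.

```python
from typing import Optional

# Repeat-count ranges for the INS VNTR allele classes, as given in the text.
VNTR_CLASSES = {
    "I": range(26, 64),      # short alleles: 26 to 63 repeats
    "II": range(64, 140),    # intermediate alleles: 64 to 139 repeats
    "III": range(140, 211),  # long alleles: 140 to 210 repeats
}

# Reported relation of each class to T1D, as summarized in the text.
T1D_EFFECT = {
    "I": "susceptibility",
    "II": "no clear relation",
    "III": "protection",
}


def classify_vntr(repeats: int) -> Optional[str]:
    """Return the VNTR class for a repeat count, or None if out of range."""
    for vntr_class, repeat_range in VNTR_CLASSES.items():
        if repeats in repeat_range:
            return vntr_class
    return None


for n in (40, 100, 160):
    vntr_class = classify_vntr(n)
    print(n, vntr_class, T1D_EFFECT[vntr_class])
```

Repeat counts outside the three ranges return `None` rather than being forced into a class, since the text defines no category for them.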
Reactivity to insulin alone is not enough to develop T1D, but it has been observed that individuals with the susceptibility polymorphisms have a higher rate of insulin autoantibodies(22).
The single nucleotide polymorphism (SNP) -23HphI, located in the INS promoter, is in strong linkage disequilibrium with the class I and class III alleles of the VNTR and is commonly genotyped as a proxy for the more complex VNTR(22).
CYTOTOXIC T-LYMPHOCYTE ASSOCIATED 4 (CTLA4)
CTLA4 has attracted interest for many years, and multiple studies have established association or linkage between this chromosomal region and autoimmune diseases, particularly T1D(23-28). The gene is located in chromosomal region 2q33, along with two other genes involved in the immune response: the CTLA-4 antagonist CD28 and the co-stimulator ICOS. The LD patterns in this region define two blocks, one comprising the CD28 gene and another including CTLA4 and the 5' end of ICOS. The first studies narrowed the association signals to the CTLA4-ICOS block, and subsequent research determined that the associated SNPs had functional effects on the CTLA-4 protein, while the expression and function of ICOS were unchanged.
The CTLA4 gene has four exons and three introns. Exon 1 codes for the leader peptide of the protein, exon 2 for the ligand-binding domain, exon 3 for the transmembrane domain and exon 4 for the cytoplasmic tail. Two of the most studied and replicated polymorphisms in CTLA4 are rs231775 (+A49G), located in exon 1, and rs3087243 (+6230G>A, also known as CT60) in the 3' region.
The A allele of rs231775 codes for a threonine in position 17 of CTLA-4, forming a threonine-X-asparagine glycosylation site. The mutant G allele (alanine) causes aberrant glycosylation of the resulting protein and lower levels of membrane-bound CTLA-4 in in vitro experiments(26).
The CT60 change, a transition from guanine to adenine in position +6230 of the gene, is correlated with higher levels of a soluble isoform of the CTLA-4 protein.
The CTLA-4 receptor has two variants derived from alternative splicing: the membrane-bound form and the soluble form (sCTLA-4), which lacks the transmembrane domain. It is believed that the soluble isoform helps downregulate T cell activation by binding to the CD80 and CD86 receptors on antigen-presenting cells and preventing the stimulation of CD28. Ueda et al. found a correlation between high levels of sCTLA-4 in serum and the protective A allele of the CT60 polymorphism(26). By mechanisms as yet unknown, the protective allele increases the levels of sCTLA-4 mRNA, and patients who carry this allele have higher levels of free sCTLA-4 in serum, which most likely helps control the activation of the immune system.
PROTEIN TYROSINE PHOSPHATASE, NON-RECEPTOR TYPE 22 (PTPN22)
The PTPN22 gene, located on chromosome 1 (region 1p13), encodes a lymphoid-specific phosphatase, LYP, which is an important downregulator of T cell activation. It is mainly expressed in T cells, but it is also found in B cells, NK cells, macrophages, monocytes, and dendritic cells.
The first PTPN22 polymorphism found associated with an autoimmune disease, C1858T, was originally described in T1D patients(29). This result was consistently replicated in independent populations(30-37), and the same polymorphism was subsequently associated with several other autoimmune disorders, such as rheumatoid arthritis(38-42), systemic lupus erythematosus(40), Wegener's granulomatosis(43) and myasthenia gravis(44).
The C1858T polymorphism is a non-synonymous SNP that replaces arginine with tryptophan at position 620 of the encoded protein (R620W). Functional studies revealed that LYP has increased phosphatase activity when the 1858T allele is present(45). This gain-of-function mutant suppresses T cell signaling more efficiently and leads to a failure in apoptosis of autoreactive T cells and to insufficient activity of regulatory T cells(46). Since deregulated auto-aggressive T cells have been described as mainly responsible for beta cell destruction in T1D, this gene has emerged as one of the main known susceptibility factors for T1D.
INTERLEUKIN-2 ALPHA CHAIN RECEPTOR (IL2RA)
The imbalance between Th1 and Th2 cytokines plays a crucial role in the regulation of the immune response and in the pathogenesis of autoimmune diseases(47-49). Thus, the genes encoding Th1 and Th2 cytokines and their receptors are good candidates to modify the risk of these diseases. Initially, the main function described for interleukin-2 (IL-2), a Th1 cytokine, was to promote proliferation and activation of CD4+ and CD8+ T cells(50). However, the only non-redundant role specific to IL-2 is to support the growth, survival and function of CD4+CD25+FoxP3+ regulatory T cells, a subset of immune cells involved in the suppression of autoimmunity.
Of the three genes involved in IL-2 signaling, IL2RA, in region 10p15, was the first to be found associated with T1D and is the most consistently replicated(23,51,52). This gene codes for the α subunit of the IL-2 receptor, also known as CD25, which is one of the markers that define regulatory T cells.
The IL2 gene, located in region 4q27, has also been found associated with T1D(23,51,53) and with other autoimmune diseases, such as rheumatoid arthritis (RA)(53), celiac disease(54) and multiple sclerosis (MS)(55).
INTERFERON-INDUCED HELICASE 1 (IFIH1/MDA5)
Epidemiological studies have suggested the involvement of viral infections as T1D risk factors in genetically susceptible individuals(56-58). The environmental factors would operate as a trigger in subjects with a background of genetic susceptibility, and the growing incidence of T1D in many countries over the past decades seems to indicate an increased environmental pressure on susceptibility genotypes.
The IFIH1/MDA5 gene encodes the interferon beta-inducible RNA helicase MDA-5, also known as helicard or IFIH1, which is implicated in the innate immune response to microbial pathogens. This protein participates in the apoptosis of virus-infected cells by recognizing the dsRNA of picornaviruses(59). This cytoplasmic viral sensor transmits a signal through a caspase recruitment domain and activates intracellular pathways that induce proinflammatory cytokines and eventually lead to the activation of adaptive immunity(59). In addition, several studies have reported that picornavirus infections are associated with a higher risk of T1D(60-62) and MS(63,64). The IFIH1/MDA5 gene is therefore a good candidate for studying the role of environmental factors in the development of the autoimmune process.
The IFIH1/MDA5 gene is located in chromosomal region 2q24. A polymorphism in exon 15, rs1990760 (A946T), was first reported to be associated with T1D by Smyth et al.(65), and the association was then replicated in a genome-wide study that validated the case-control and family studies(66). The minor allele of this SNP showed a protective effect for both T1D and MS in an independent population(67), in agreement with the effect reported by Smyth et al. Interestingly, Nejentsev et al.(68) have described four rare coding variants of the IFIH1/MDA5 gene. All these polymorphisms have strong protective effects against T1D, with ORs ranging from 0.51 to 0.74, and all of them modify the protein structure, either by coding for a truncated protein, by affecting splice sites or by changing a highly conserved amino acid. These observations suggest that the protective effect is achieved through lower levels of functional IFIH1/MDA5.
CALCYPHOSINE-LIKE (CAPSL) AND INTERLEUKIN 7 RECEPTOR (IL7R)
In the aforementioned study, Smyth et al. also reported the protective effect of a polymorphism located in the CAPSL gene(65). The same effect was replicated in a genome-wide study that found two new polymorphisms in the adjacent IL7R gene. These genes, located in chromosomal region 5p13, are in the same LD block(66). However, in another genome-wide study, carried out by the Wellcome Trust Case Control Consortium (WTCCC), the analysis of the diabetic population did not show association with the CAPSL gene. This may be because the most strongly associated SNP of the CAPSL gene was not included in the latter study. Notably, the IL7R polymorphisms previously found associated also showed a protective effect there, although it did not reach the significance threshold for genome-wide analyses.
The functional effect of the protein encoded by the CAPSL gene still remains unknown; IL7R is the specific subunit of the interleukin 7 receptor, and both the cytokine and the receptor are indispensable for thymic maturation and proliferation of lymphocytes(69,70). Mutations in IL7R in humans cause a severe combined immunodeficiency with major deficiencies in T cell development, whereas B and NK cells are found in relatively normal levels(71,72).
C-TYPE LECTIN DOMAIN FAMILY 16 GENE A (CLEC16A)
Two recent genome-wide studies(23,73) have pointed to region 16p13 as a T1D-associated locus. This region contains a large gene (237 kb) known as CLEC16A, because its product contains a predicted C-type lectin domain. Little is known about the function of this protein, but it is almost exclusively expressed in cells of the immune system, particularly in antigen-presenting cells and in NK cells. CLEC16A is a good example of a gene that had to wait for the hypothesis-free genome-wide studies to be discovered and catalogued as an interesting candidate for autoimmune disease susceptibility.
C-type lectins are calcium-dependent polysaccharide-binding proteins widely involved in several aspects of the immune response, from adhesion (selectins) to endocytic receptors or membrane-bound lymphocyte lectins, the group to which CLEC16A probably belongs. Functional studies will be required to explain how the detected polymorphisms influence the immune response. The first studies have not shown differences in mRNA expression levels between normal and mutant alleles of the implicated polymorphisms(73), in contrast to what has been observed, for example, for the CTLA4 +A49G polymorphism.
SIGNAL TRANSDUCER AND ACTIVATOR OF TRANSCRIPTION 4 (STAT4)
The autoimmune response found in T1D patients has been considered a Th1 response. STAT4 is a member of a group of transcription factors that participate in pathways related to the polarization of the immune response towards Th1. When phosphorylated, it dimerizes and travels to the nucleus, triggering the transcription of proinflammatory molecules such as IFN-γ. This cycle helps maintain the Th1 response.
Several studies have reported that polymorphisms in the STAT4 gene are related to T1D susceptibility(74-76). Associations have also been found with other autoimmune diseases, such as systemic lupus erythematosus, rheumatoid arthritis and Sjögren's syndrome(77,78). Experiments in Stat4 null mice have been promising in uncovering the role of this transcription factor in the mechanisms underlying autoimmune diseases. These mice have a lower rate of severe arthritis, hardly ever develop T1D and are resistant to experimental allergic encephalomyelitis, a mouse model of human multiple sclerosis(79). Other experiments in non-obese diabetic (NOD) mice, the mouse model specific for T1D, showed that blocking Stat4 prevented these mice from developing spontaneous diabetes(80).
STAT4 maps to chromosomal region 2q33, the same region as CTLA4, which was reviewed earlier in this text and is consistently associated with T1D and with other autoimmune diseases such as autoimmune thyroiditis. Thus, region 2q33 constitutes a hot spot for T1D susceptibility.
PHOSPHOTYROSINE-PROTEIN PHOSPHATASE, NON-RECEPTOR 2 (PTPN2)
Like the aforementioned PTPN22, the PTPN2 gene belongs to the protein tyrosine phosphatase (PTP) superfamily. Its product, also known as TCPTP, is expressed in cells of the immune system but also in islet β-cells, and recent studies point to a possible role of this tyrosine phosphatase in preventing β-cell apoptosis(81).
PTPN2 acts by inactivating the STAT1 transcription factor in the nucleus. STAT1 is activated by IFN-γ signalling, then travels to the nucleus and activates T-bet, which promotes the expression of more IFN-γ in a positive feedback that directs the immune response to Th1. That way, PTPN2 acts as a Th1 downregulator.
PTPN2 is the only gene contained in region 18p11. In the WTCCC study, this region was found associated with T1D, rheumatoid arthritis and Crohn's disease(23). The T1D association was subsequently replicated in an independent cohort(66), and the region has recently also been found associated with celiac disease(82).
NEW SOURCES OF GENETIC VARIABILITY
There are also two new fields to explore in T1D genetics: the role of the recently discovered microRNAs and copy number variations (CNVs). MicroRNAs are a kind of epigenetic mediator that can affect the expression of other genes. Polymorphisms that alter the functioning of these microRNAs would therefore influence the expression of the regulated genes, but there is still little information in this field concerning T1D susceptibility.
Copy number variations are segments of DNA (from kilobases to megabases long) that are present in different numbers of copies in different individuals as a result of rearrangements of the genetic material (inversions, deletions, duplications), and they may influence the expression of surrounding genes. In the last few years knowledge of this source of genetic variation has expanded, and the first CNVs associated with autoimmune diseases, including T1D, have recently been described(83).
CONCLUSION
The genetic model proposed for T1D postulates that the genetic background of the disease consists of a small number of genes with large effects, such as the HLA region and the INS gene, and a large number of genes with small effects (OR ≤ 1.3), such as IL2 and IL7R. New techniques and advances in the study of human genetic variation have allowed the identification of many of these genes with high population frequency and low contribution to disease. It is believed that the remainder of the genetic load of the disease will comprise common variants with a low contribution to disease risk (OR = 1.2) and rare variants (population frequency lower than 3%) with a high contribution, which are particularly interesting for explaining the family aggregation and, in particular, the high sibling recurrence and twin concordance.
This genetic research opens the possibility of developing a protocol to predict the risk of T1D so that preventive treatments can be applied. Currently, there is no established course of action for preventive treatment, and efforts and protocols are centered on the patient with overt disease. However, many clinical trials are trying to find effective preventive treatments and, once they are ready for use, it will be necessary to define the risk groups to which the treatment will be directed(84). One of the best tools for this purpose is knowledge of genetic susceptibility. It is well known that certain warning signs, such as the presence of more than one autoantibody or an abnormal response to glucose tests, can be found before the clinical onset of the disease, but it is impractical to implement these tests as a screening tool in the general population. Antibodies would require periodic testing, given that they can become positive at any time, even ten years before the onset of the disease. Furthermore, positivity for a single autoantibody is not predictive of disease risk, but when positivity extends to two or more autoantibodies within a short time, such as one year, the risk of developing T1D increases considerably. Combining serologic and metabolic tests with an accurate genetic stratification would reduce the size of the groups under study and make it more feasible to follow their evolution, to detect the onset of T1D earlier, and to apply preventive treatments when they become available.
CONFLICT OF INTEREST
The authors declare no financial conflict of interest.