The Patient Health Questionnaire-9 (PHQ-9) is one of the most widely used self-report instruments in primary care. No criterion validity study of the PHQ-9 has been conducted in Colombia. The objective was to validate the PHQ-9 as a screening tool in primary care. A cross-sectional criterion validity study of the scale was performed, using the Mini-International Neuropsychiatric Interview (MINI) as the reference criterion, in male and female adult users of primary care centres. We calculated the internal consistency and the convergent and criterion validity of the PHQ-9, the latter by analysing the receiver operating characteristics (ROC) and the area under the curve (AUC). We analysed 243 participants; 184 (75.7%) were female. The average age was 34.05 years (median 31, SD=12.47). Cronbach's α was 0.80 and McDonald's ω was 0.81. Spearman's Rho was 0.64 for HADS-D (P<0.010) and 0.70 for PHQ-2 (P<0.010). The AUC was 0.92 (95% CI 0.880–0.963). The optimal cut-off point of the PHQ-9 was ≥7: sensitivity of 90.38% (95% CI: 81.41–99.36); specificity of 81.68% (95% CI: 75.93–87.42); PPV 57.32% (95% CI: 46.00–68.63); NPV 96.89% (95% CI: 93.90–99.88); Youden index 0.72 (95% CI: 0.62–0.82); LR+ 4.93 (95% CI: 3.61–6.74); LR− 0.12 (95% CI: 0.005–0.270). In sum, the Colombian version of the PHQ-9 is a valid and reliable instrument for depression screening in primary care in Bucaramanga, with a cut-off point ≥7.
Depression is a major public health problem worldwide1 and has a significant impact on quality of life,2 high morbidity levels,3 reduced life expectancy4 and excess mortality.5 The lifetime prevalence of major depressive episode (MDE) is 11.2%.6 Prevalences tend to be higher in low and middle income countries such as Pakistan, where depression prevalences of 45.9% have been reported.7 In primary care (PC), the prevalence of MDE varies widely, from 4.5% to 47.8%.8
In the 2015 Colombian National Mental Health Survey, the prevalence of major depression in the general population was 5.4 (95% CI: 4.6–6.4), 2.3 (95% CI: 1.8–2.9) and 0.8 (95% CI: 0.5–1.3) for lifetime, the last year and the last month, respectively.9 In Bucaramanga, the prevalence of clinically significant depressive symptoms (CSDS) was 22.3% (95% CI: 20.0–24.6) and 11.2% for major depressive disorder (MDD) (95% CI: 9.7–12.9%).10 A later population study in adults living in Bucaramanga (N=266), using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I), reported a prevalence of 16.5% (95% CI: 12.3–21.6),11 which confirms the high prevalence of depression in this region.
In spite of its high burden, chronicity and recurrent nature, depression is underdiagnosed in PC: approximately 50% of patients who present with depression are not detected.12 This diagnostic gap could be explained by the fact that more than 75% of patients with depression initially consult a family or PC physician with little training in the identification of depressive disorders,13 by time constraints in busy PC environments14 and by the lack of validated screening tools in low and middle income countries.15
Because of the above, programmes have been developed to recognise depression,16,17 which recommend standardised tools. A number of tools exist to identify cases of depression; however, their benefits have not been fully determined and the literature reveals contradictory results.18 A recent systematic review suggests that, of the screening tools, only the Patient Health Questionnaire-9 (PHQ-9) attains the optimum accuracy level for depression.19 The PHQ-9 is an adjectival scale derived from the Primary Care Evaluation of Mental Disorders (PRIME-MD) to assess depressive symptoms using the DSM-IV criteria.20 The PHQ-9 is shorter than most depression screening scales21,22 and is considered the best screening tool for depression in PC because it is accurate, brief, multipurpose, in the public domain, and easy to administer, score and interpret.19,23 The PHQ-9 has been translated into more than 20 languages and used in many countries and contexts.24 In PC, the sensitivity of the PHQ-9 was between 0.71 and 0.84 (mean 0.77) and its specificity was between 0.90 and 0.97 (mean 0.94),25 confirming its adequate psychometric performance in PC, albeit with some variations in the cut-off point (COP) and psychometric parameters that can be explained by the influence of cultural aspects on the response pattern.23 Its broad use is also supported by the findings of Williams et al., who concluded, in an analysis of more than 38 studies with more than 32,000 PC patients, that the PHQ-9 was equal or superior to other measures of depression.22 In addition, the DSM-5 MDD working group and the NICE guidelines consider the PHQ-9 the preferred measure to assess the presence of depression and quantify its severity.21,22,26
The PHQ-9 has been evaluated in Colombia in university students;27 however, it was not compared to a gold standard. The criterion validity of the PHQ-9 therefore needs to be established in PC in Colombia against a gold standard, in particular because of the opportunity PC services represent for the early detection of depression.28 Accordingly, this study's objective was to assess the criterion validity of the PHQ-9, comparing it with the Mini-International Neuropsychiatric Interview (MINI), for screening for depressive symptoms in adult PC users in the Bucaramanga metropolitan area.
Materials and methods
Design
This study was designed and analysed based on the recommendations of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) declaration.29 An analytical observational study of the criterion validity of a scale was conducted against a reference criterion.
Participants
Local PC users of both genders, aged 18–65 years, were included. The PC centres belong to the Instituto de Salud de Bucaramanga [Bucaramanga Health Institute] (ISABU), a State social enterprise that coordinates primary health care services in the Bucaramanga metropolitan area.
Subjects with psychotic symptoms, cognitive decline, delirium or an intellectual disability that might prevent them from responding to the tools, those under the effects of psychoactive substances, those with functional changes in vision or hearing that might prevent them from understanding the content of the survey, and those who did not understand Spanish were excluded. The sample size was calculated to evaluate the hypothesis regarding the characteristics of a diagnostic test30:
where π1 is the sensitivity of the standard (0.96) and π2 is the anticipated sensitivity of the PHQ-9 (0.88); Z1−α/2 was set at 1.96 and Z1−β at 1.28; δ was set at 0.08 (π1−π2). The result was 214. Participants were selected consecutively as they attended the health centres until more than 214 subjects had been surveyed.
Procedures
The study was approved by the ethics committees of ISABU and the Universidad de Santander [University of Santander], taking into account the current international31 and national norms32 for research with human subjects.
The PHQ-9 was translated following the recommendations for adaptation of self-reporting tests.33 A direct translation from the original scale was performed by two independent certified bilingual translators; discrepancies between the two translations were discussed, then a backtranslation was done into English, which was reviewed by the research team to assess its closeness to the original scale. The translated scale was then reviewed by 10 psychiatrists with research expertise or clinical experience to verify whether the items were consistent with the construct of depression, who also commented on the comprehension and wording of the items. Ten people from the general population with a history of depression also gave their opinions on the comprehension of the questions. The research group analysed and incorporated the patients’ and experts’ observations to obtain the new Colombian version (Fig. 1). With the new version of the scale, a pilot test was conducted with 21 subjects with characteristics similar to the study subjects but in other centres. They answered the questions without difficulties and no adjustments to the grammatical structure were needed.
The research team was trained in the structured psychiatric interview (MINI) and hetero-administration of the PHQ-9. The people responsible for administering the scales and structured interviews were professionals with clinical experience (four psychologists, two general practice residents and one psychiatrist) who received eight hours of training from the lead author, with theoretical and practical sessions, role playing, and observation of pilot interviews with feedback. The study participants were contacted in the waiting room as they arrived for outpatient appointments with a general practitioner for any reason. One member of the research group explained the nature of the study and gave them the informed consent form. The screening scales were read aloud by trained members of the research team. After completing the PHQ-9, each participant was assessed on the same day in another consulting room by a different trained member of the team (psychologist or psychiatrist), who did not know their PHQ-9 result and who administered the MINI depression module. The questionnaires were reviewed by two independent reviewers and saved in a form generated in Excel.
Tools
PHQ-9
The PHQ-9 is a screening scale that measures the presence and severity of depressive symptoms.34 The PHQ-935 is made up of the nine symptoms from DSM-IV MDE criterion A.20 These nine items are arranged in the form of an adjectival scale that assesses the presence of the symptom in the last two weeks (“not at all”, “several days”, “more than half the days” and “nearly every day”), scored from 0 to 3 to give a score between 0 and 27.36
It can be self- or hetero-administered and is used either algorithmically to make a probable diagnosis of MDE or as a continuous measure with scores from 0 to 27, with cut-off points (COP) at 5, 10, 15 and 20 representing mild, moderate, moderately severe and severe levels of depressive symptoms, respectively.34 The scores can also be used dichotomously based on a COP to classify subjects with or without CSDS.37 According to Kroenke et al., the PHQ-9 has a sensitivity of 88% and a specificity of 88%, adequate internal consistency (Cronbach's α of 0.86–0.89), a test–retest score of 0.84, a concordance between self-administered and evaluator-administered tests of 84% and an area under the curve (AUC) of 0.95.34 In this study a COP of 8 or more was used to identify cases of CSDS, based on the meta-analysis by Manea et al.23 and the study by Rancans et al. in PC.38
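The scoring rules described above can be illustrated with a minimal Python sketch. The function names and the label used for totals below 5 are assumptions made for illustration and do not come from the study; the thresholds (5, 10, 15, 20) and the pre-established COP of 8 are those quoted in the text.

```python
from typing import Sequence

# Severity bands quoted in the text: cut-off points at 5, 10, 15 and 20.
SEVERITY_BANDS = [
    (20, "severe"),
    (15, "moderately severe"),
    (10, "moderate"),
    (5, "mild"),
]


def phq9_total(items: Sequence[int]) -> int:
    """Sum the nine item scores (each 0-3) into a total between 0 and 27."""
    if len(items) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item must be scored 0, 1, 2 or 3")
    return sum(items)


def phq9_severity(total: int) -> str:
    """Map a total score to the severity band whose threshold it reaches."""
    for threshold, label in SEVERITY_BANDS:
        if total >= threshold:
            return label
    # Hypothetical label: the text does not name the band below 5.
    return "below mild threshold"


def screens_positive(total: int, cut_off: int = 8) -> bool:
    """Dichotomous use: positive when the total reaches the cut-off point."""
    return total >= cut_off
```

For example, `screens_positive(phq9_total([1, 1, 1, 1, 1, 1, 1, 0, 0]))` evaluates a total of 7 against the default COP of 8; passing `cut_off=7` applies the COP that this study later identifies as optimal.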
Mini-International Neuropsychiatric Interview
The MINI is a brief structured diagnostic interview that explores the diagnostic categories of the DSM-IV and the ICD-10.39 Its original version was developed by Sheehan et al.39 and Lecrubier et al.40 in the United States and France. It contains 130 questions organised into modules that assess 16 disorders from Axis I of the DSM-IV and one personality disorder. The original version in English has a sensitivity between 0.46 and 0.94 and a specificity between 0.72 and 0.97,39,40 excellent inter-rater (kappa 0.70) and test–retest reliability, and moderate criterion validity compared to the Composite International Diagnostic Interview (CIDI) and the SCID-P.39,40 The MINI quickly gained international acceptance,41–43 has been translated into 43 languages39 and its reliability and validity have been explored in its Italian,44 Japanese,45 Norwegian,46 Moroccan47 and Portuguese48 versions. The average administration time is 18.7±11.6 min, with a median of 15 min.39 Together with the CIDI and the SCID-I, the MINI is considered a globally accepted gold standard for the diagnosis of mental disorders in clinical and research settings.49
Hospital Anxiety and Depression Scale
The Hospital Anxiety and Depression Scale (HADS) was designed by Zigmond and Snaith in 198350 to detect mood disorders, especially those associated with somatic symptoms. It consists of 14 items, with an anxiety subscale (odd items) and a depression subscale (even items). Each item is graded on a four-point frequency scale from 0 to 3. The HADS has been translated into most European languages, Arabic, Hebrew, Urdu, Japanese and Chinese51 and its reliability and validity have been demonstrated in numerous studies.52 In Colombia, it was validated in cancer patients, finding adequate internal consistency (Cronbach's α of 0.85), a COP of 8 for the anxiety subscale and 9 for the depression subscale.53 These psychometric properties were confirmed in a population sample (n=1500) in several cities in Colombia.54 In this work, the version adapted by Rico et al. was used.53
Patient Health Questionnaire-2
The Patient Health Questionnaire-2 (PHQ-2) consists of the first two items of the PHQ-9, which are necessary to suspect the presence of depression according to the DSM-IV criteria.55 The scoring system is the same as for the PHQ-9 and scores range from 0 to 6. A COP of 3 is optimal for screening, but a recent meta-analysis suggests that a COP of 2 may increase its sensitivity.56 Patients who score positive for CSDS should be evaluated with the PHQ-9 to determine whether they meet the MDE criteria.57 Its clinical utility stems from the fact that it reduces the time taken in normal PC consultations, which are usually busy.58 The PHQ-2 has been found to have psychometric performance comparable to the PHQ-9, with good reliability, validity and sensitivity to change.56 In this work, a COP of 2 or above was used to identify patients with CSDS.59
Statistical analysis
The data were analysed in SPSS version 20.0,60 carefully verified and reviewed twice. A descriptive analysis of the qualitative and quantitative variables was performed. Cronbach's α and McDonald's ω coefficients were calculated to assess internal consistency; for concurrent validity, Spearman or Pearson correlations were estimated depending on the distribution of the variables. To assess the accuracy of the PHQ-9 as a screening tool compared to the MINI, the receiver operating characteristics (ROC) and AUC were analysed. The optimum COP for the PHQ-9 was determined taking into account validity indices: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive and negative likelihood ratios (LR), Youden's index and ROC curve/AUC analysis.
Results
Participant characteristics
Three hundred and eighty-four users were contacted, of whom 95 did not agree to participate. Of the surveys conducted, 46 were discarded due to missing and inconsistent data; the analysis therefore included 243 participants, of whom 184 (75.7%) were female. The average age was 34.05 years with a standard deviation (SD) of 12.47 years. For male participants the average age was 33.59 years with a SD of 12.89 years, while for female participants the average was 34.20 years with a SD of 12.37 years. The sociodemographic characteristics of the sample can be seen in Table 1.
Table 1. Description of the sociodemographic characteristics of patients with or without minor depressive symptoms receiving healthcare at Primary Care centres.
Variables | No. | % |
---|---|---|
Gender | ||
Male | 59 | 24.28 |
Female | 184 | 75.72 |
Area of origin | ||
Urban | 210 | 86.42 |
Rural | 33 | 13.58 |
Marital status | ||
Single | 98 | 40.33 |
Married | 53 | 21.81 |
Cohabiting | 83 | 34.16 |
Divorced | 5 | 2.06 |
Widowed | 4 | 1.65 |
Education | ||
Primary, not completed | 14 | 5.76 |
Primary, completed | 60 | 24.69 |
Secondary, not completed | 36 | 14.81 |
Secondary, completed | 88 | 36.21 |
Vocational, not completed | 18 | 7.41 |
Vocational, completed | 1 | 0.41 |
Technological, completed | 3 | 1.23 |
University, not completed | 3 | 1.23 |
University, completed | 20 | 8.23 |
Socio-economic stratum | ||
Stratum 1 | 103 | 42.39 |
Stratum 2 | 84 | 34.57 |
Stratum 3 | 47 | 19.34 |
Stratum 4 | 6 | 2.47 |
Stratum 5 | 3 | 1.23 |
The prevalence of CSDS was 27.2% according to the results of the PHQ-9 and 21.8% according to the MINI structured interview.
Internal consistency
A Cronbach's α coefficient of 0.80 and an ω coefficient of 0.81 were obtained. The overall internal consistency of the scale if each item is eliminated is shown in Table 2.
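As an illustration of how Cronbach's α is obtained from item-level data, the following minimal Python sketch applies the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). It is not the SPSS procedure used in the study; the function name and the input layout (one row of item scores per respondent) are assumptions made for this example.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; responses[r][i] is respondent r's score on item i."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

Items that move together perfectly give α = 1, while items that are uncorrelated give α close to 0; the value of 0.80 reported here lies above the 0.70 threshold discussed later in the article.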
Convergent validity
The Kolmogorov–Smirnov test was used to establish the normality of the variables, for the purpose of deciding the type of test to use for the concurrent validity analysis of the PHQ-9 against the PHQ-2 and the HADS depression subscale (HADS-D). These variables were not found to have a normal distribution, so Spearman's Rho was used. Spearman's Rho was 0.646 for the HADS-D (P<0.010) and 0.701 for the PHQ-2 (P<0.010).
Criterion validity
The ROC curve (Fig. 2) and accuracy indices for the PHQ-9 produced the results shown in Table 3. The AUC was 0.92 (95% CI: 0.880–0.963).
Table 3. Description of the various cut-off points for the PHQ-9 Colombian version and validity coefficients.
Cut-off point | Sensitivity | Specificity | Youden's index | % correctly classified | PPV | NPV | LR+ | LR− |
---|---|---|---|---|---|---|---|---|
≥3 | 0.98 | 0.43 | 0.42 | 55.14 | 0.25 | 0.99 | 1.73 | 0.04 |
≥4 | 0.96 | 0.58 | 0.54 | 65.84 | 0.30 | 0.99 | 2.27 | 0.07 |
≥5 | 0.96 | 0.71 | 0.67 | 76.54 | 0.39 | 0.99 | 3.34 | 0.05 |
≥6 | 0.94 | 0.77 | 0.71 | 80.66 | 0.44 | 0.99 | 4.09 | 0.07 |
≥7a | 0.90a | 0.82a | 0.72a | 83.54a | 0.48a | 0.98a | 4.93a | 0.12a |
≥8 | 0.83 | 0.88 | 0.71 | 86.83 | 0.57 | 0.96 | 6.87 | 0.20 |
≥9 | 0.75 | 0.91 | 0.66 | 87.24 | 0.60 | 0.95 | 7.96 | 0.28 |
≥10 | 0.67 | 0.93 | 0.60 | 87.24 | 0.64 | 0.94 | 9.18 | 0.35 |
≥11 | 0.60 | 0.94 | 0.54 | 86.83 | 0.66 | 0.92 | 10.35 | 0.43 |
≥12 | 0.56 | 0.97 | 0.53 | 88.07 | 0.77 | 0.92 | 17.75 | 0.46 |
≥13 | 0.46 | 0.97 | 0.43 | 86.01 | 0.74 | 0.90 | 14.69 | 0.56 |
≥14 | 0.31 | 0.99 | 0.30 | 84.36 | 0.85 | 0.88 | 29.38 | 0.70 |
≥15 | 0.27 | 0.99 | 0.26 | 83.95 | 0.91 | 0.88 | 51.42 | 0.73 |
The optimum COP was a PHQ-9 score ≥7 (sensitivity 90.38 [95% CI: 81.41–99.36]; specificity 81.68 [95% CI: 75.93–87.42]; PPV 57.32 [95% CI: 46.00–68.63]; NPV 96.89 [95% CI: 93.90–99.88]; Youden's index 0.72 [95% CI: 0.62–0.82]; LR+ 4.93 [95% CI: 3.61–6.74]; LR− 0.12 [95% CI: 0.005–0.270]).
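The indices reported for the COP of ≥7 all derive from a single 2×2 table of PHQ-9 result against MINI diagnosis. The Python sketch below shows the arithmetic. The article does not report the cell counts; the counts used here (47 true positives, 5 false negatives, 35 false positives, 156 true negatives) are an assumption, reconstructed so that they sum to 243 and reproduce the reported percentages. The function name is likewise hypothetical.

```python
def accuracy_indices(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Diagnostic accuracy indices from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "youden": sensitivity + specificity - 1,
        "correctly_classified": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts, reconstructed from the reported percentages (not from the article).
indices = accuracy_indices(tp=47, fn=5, fp=35, tn=156)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the sketch returns the point estimates quoted in the text for the COP of ≥7; the confidence intervals are not computed here.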
Discussion
To the best of our knowledge, this is the first study on PHQ-9 criterion validity in PC in Colombia. The prevalence of MDE in this study was 21.8%. The Colombian version of the PHQ-9 demonstrated excellent diagnostic performance as a depression screening tool, as can be seen from the ROC curve and AUC. The PHQ-9 also demonstrated an adequate balance of sensitivity and specificity at the COP of ≥7 when compared with the MINI as a reference standard, establishing the PHQ-9's adequate criterion validity. The comparison of PHQ-9 scores against those from the HADS-D and the α and ω coefficients demonstrated good convergent validity and adequate internal consistency.
The percentage of subjects classified as having CSDS based on the PHQ-9 with the pre-established COP was 27.2% (95% CI: 26.3–28.9), higher than the prevalence of 22.3% (95% CI: 20.0–24.6) found in Bucaramanga using the Zung Self-rating Depression Scale.10 This difference can be explained by the poor diagnostic performance of the Zung scale in the Colombian population.61 With regard to the prevalence of MDE based on the MINI, in this sample it was 21.8% (95% CI: 20.8–23.5), which is within the expected range based on a meta-analysis of 41 studies in PC with an adjusted global prevalence of 19.5% (95% CI: 15.7–23.7).62 However, the prevalence of MDE in this study is a little higher than the 16.5% (95% CI: 12.3–21.6) reported in previous studies in the general population in Bucaramanga,11 which can be explained by the fact that this study was conducted in people attending PC centres, where the prevalence of depression is higher than in the general population63 and by the large proportion of women.64
Cronbach's α coefficient was 0.80 and McDonald's ω coefficient was 0.81, indicating good internal consistency.65,66 For a self-reporting tool to be reliable, Cronbach's α and McDonald's ω need to be at least 0.70.67 The internal consistency found in this study is in keeping with a previous study in Colombia37 and with others conducted in different languages, whose coefficients ranged from 0.79 to 0.89.68,70,71
Previous studies have demonstrated that the PHQ-9 has adequate concurrent validity with various measures, including the Hamilton Depression Rating Scale (HAM-D), short health assessment forms and even the PHQ-2.72 In our study, total PHQ-9 scores showed a statistically significant positive correlation with HADS-D and PHQ-2 scores (Spearman's Rho of 0.64 [P<0.01] for HADS-D and 0.70 [P<0.01] for PHQ-2), in keeping with previous studies in which the Pearson's coefficients for the PHQ-9 with the HAM-D and the Beck Depression Inventory (BDI) were 0.52 and 0.76, respectively.68,73–75 Meanwhile, a study of patients with Parkinson's disease showed that the PHQ-9 correlated positively with the Self-rating Depression Scale and the 15-item Geriatric Depression Scale, with a Spearman's coefficient of 0.63 in both cases.76 The correlation coefficients found in this study confirm the convergent validity of the PHQ-9, with Spearman's correlation coefficients between 0.60 and 0.80 indicating a good or considerable positive correlation.77
With regard to the COP, various studies have recommended a COP of 10 in the PHQ-9 for the identification of MDE.34 For example, in a study of PC users in China, an optimum COP of 10 produced a sensitivity of 0.87 and a specificity of 0.8.69 However, a recent meta-analysis of 18 studies demonstrated that the optimum COP of the PHQ-9 could range from 8 to 11, depending on the population studied; nevertheless, the balance of sensitivity and specificity is maintained for a COP of 7 (5 of the 18 studies included).23 In our study, the COP of 7 appears to have given the optimum balance between sensitivity and specificity, which was confirmed with an additional measure of accuracy: Youden's index,78 defined as the maximum vertical distance between the ROC curve and the 45 degree line, as an indicator of how far the curve is from an uninformative test.79 Youden's index is a function of sensitivity (Se) and specificity (Sp); it is calculated as (Se+Sp-1)80 and should be considered alongside the ROC curve as they are usually related.81 The range is 0–100 when converted into a percentage. Values >50% are generally considered acceptable for diagnostic accuracy.82
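The selection of a COP by Youden's index can be sketched in a few lines of Python. The sensitivity and specificity pairs below are the rounded values from Table 3; the variable and function names are assumptions for illustration. Because the inputs are rounded to two decimals, a recomputed index can differ from the tabulated one in the last digit.

```python
# Cut-off point -> (sensitivity, specificity), rounded values from Table 3.
TABLE_3 = {
    3: (0.98, 0.43), 4: (0.96, 0.58), 5: (0.96, 0.71), 6: (0.94, 0.77),
    7: (0.90, 0.82), 8: (0.83, 0.88), 9: (0.75, 0.91), 10: (0.67, 0.93),
    11: (0.60, 0.94), 12: (0.56, 0.97), 13: (0.46, 0.97), 14: (0.31, 0.99),
    15: (0.27, 0.99),
}


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's index J = Se + Sp - 1."""
    return sensitivity + specificity - 1


def best_cut_off(table: dict) -> int:
    """Return the cut-off point whose (Se, Sp) pair maximises Youden's index."""
    return max(table, key=lambda cut_off: youden_index(*table[cut_off]))


print(best_cut_off(TABLE_3))
```

With the Table 3 values, the maximum falls at the COP of ≥7 (J ≈ 0.72), with ≥6 and ≥8 close behind at about 0.71, which shows how flat the index is around the optimum.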
The values associated with the COP in our study are consistent with a study in older adults in PC, in whom PHQ-9 criterion validity was assessed by administering the MINI, in which an optimum COP≥7 (sensitivity 0.92; specificity 0.78) demonstrated the best psychometric characteristics.83 Nevertheless, this COP of 7 is lower than that found in most studies with the PHQ-9 in other populations. The cultural and demographic characteristics of the samples may be the reason for this difference.84 Stigma is an important aspect that can also influence people's response pattern to depression screening scales in our population, causing shame in people with mental illnesses, which limits the identification of psychopathological phenomena.85,86 It is worth noting that PHQ-9 COPs tend to be lower in middle and low income countries87–89 compared with high income countries.75,90,91 However, there are no studies looking at this phenomenon. This difference in optimum COP highlights the importance of validating screening tools in different social and cultural contexts.92
For a COP of ≥7, the sensitivity and specificity of the PHQ-9 in this sample were 90% and 82%, respectively. These findings are consistent with the study by Wang et al., in which the COP of ≥7 allowed an adequate balance between sensitivity and specificity (sensitivity 85%; specificity 86%).93 The accuracy indices in our study are therefore considered appropriate, as a screening tool is considered good when its sensitivity is between 79% and 97% and its specificity is between 63% and 86%.94 Wittkampf et al. systematically reviewed the psychometric properties of the PHQ-9 and found a sensitivity of 77% (71–84%) and specificity of 94% (90–97%), including studies in subgroups with a high prevalence of depression, such as PC users.25
The LR+ and LR− of the PHQ-9 in our sample, for a COP of ≥7, were 4.93 and 0.12, respectively. This means that, in a similar clinical context, a positive result in the PHQ-9 (COP≥7) is five times more common in a patient with depression than in one without depression, while a subject with a negative result would have a likelihood of having depression of less than 2%.95 These results are comparable with those obtained in the Chinese version of the PHQ-9, which, with a COP of ≥7, had an LR+ and LR− of 5.99 and 0.17, respectively.93
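Likelihood ratios translate into a post-test probability only once a pre-test probability is chosen. The Python sketch below shows the standard odds conversion (post-test odds = pre-test odds × LR). The function name is an assumption, and the pre-test probability passed in the example is the prevalence of MDE by MINI in this sample, used purely as an illustrative input.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability and a likelihood ratio into a post-test probability."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Illustrative inputs: sample prevalence by MINI and the LRs reported for COP >= 7.
after_positive = post_test_probability(0.218, 4.93)
after_negative = post_test_probability(0.218, 0.12)
print(f"after a positive PHQ-9: {after_positive:.3f}")
print(f"after a negative PHQ-9: {after_negative:.3f}")
```

The output depends strongly on the pre-test probability supplied, so any figure quoted for a negative or positive result applies only to settings with a similar prevalence of depression.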
The AUC of this Colombian version of the PHQ-9 for PC was 0.92, which indicates a high degree of accuracy96 and is consistent with previous studies in PC and other populations.69,71,93
This study's main strengths include the use of a clinical reference criterion to assess the criterion validity of the PHQ-9, the adequate participant response rate (75.3%), adequate training of interviewers, adherence to the QUADAS-2 guidelines29 and the execution of a rigorous analysis plan. In addition, the PHQ-9 was translated in accordance with the standardised guidelines for transcultural adaptation of scales. The linguistic adaptation was supported by a group of experts, which supports appropriate content validity.
This study has several limitations. Firstly, our study was conducted in a PC context, therefore the results cannot be generalised to the general population, whose characteristics would produce a different response pattern.84 Secondly, the study was limited to adults. There is growing evidence that adolescents are particularly affected by depressive disorders,97 so future studies in Colombia would need to assess the psychometric performance of the PHQ-9 in this population. Thirdly, this was a cross-sectional study; longitudinal studies will therefore be needed to establish the sensitivity to change of the PHQ-9 in the Colombian population, as some works have used it to assess response to treatment of depression.98 Fourthly, the fact that the sample was predominantly female (75%) could affect the estimates of accuracy indices, as the prevalence of depression is higher in women than in men, giving a higher number of positive cases of depression.99 Fifthly, the sample size is a relative weakness: it was calculated following the recommendations of Sanchez et al. for comparing the sensitivity of a screening test with a reference standard,30 whereas other authors, such as Buderer100 and Obuchowski,101 demand larger samples. On the other hand, following Bean's criteria for comparing the sensitivity or specificity of two diagnostic tests, sample sizes similar to ours are obtained.102
In keeping with global results, the Colombian version of the PHQ-9 for PC has excellent psychometric performance as a screening test, which supports its use in contexts with few resources and with weaknesses in the healthcare system, where the availability of psychiatrists is limited.103 Among the strategies to limit the burden of mental health disorders in low and middle income countries is the integration of mental health in PC.104 One of the major barriers to achieving this goal is the lack of easy-to-administer and validated screening tools to detect depression. The validation of instruments such as the PHQ-9 in these contexts can help to solve this problem.105 It is known that just screening for depression is insufficient to mitigate the growing care needs for mental health disorders in low and middle income countries; nevertheless, given that depression contributes significantly to the burden of disease, having validated screening tools is the first step towards solving this problem.106 In some low income countries, there are cost-effective depression intervention programmes, in which screening tools can be used to identify appropriate participants.107 One of the main components of effective mental health interventions in PC is monitoring depressive symptoms using simple, brief and easy-to-administer questionnaires such as the PHQ-9.108
With the validation of this version of the PHQ-9, researchers in Colombia now have valid and reliable psychometric information about depression screening in PC, which will enable the PHQ-9 to be used in studies where it is necessary to identify depressive symptoms with an appropriate COP.
In conclusion, the results of this study indicate that the Colombian version of the PHQ-9 is a valid and reliable tool for screening for depression in a PC context in Bucaramanga, with a COP of 7 or above. The psychometric properties of this version of the PHQ-9 will need to be evaluated in different populations and other regions of the country. Future studies in Colombia should assess the PHQ-9's sensitivity to change.
Funding
This work was funded by the faculty of medicine of the Universidad de Santander (UDES) and the Instituto de Salud de Bucaramanga (ISABU). Project code: PIFE0118020041816EJ.
Authors’ contribution
Carlos Arturo Cassiani-Miranda: design, scale adjustment, training of survey-takers and interviewer, collection of information, statistical analysis, digitisation, drafting and revision of the article.
Angy Karina Cuadros-Cruz: design, scale adjustment, collection of information and drafting of the article.
Harold Torres Pinzón: design, scale adjustment, statistical analysis, digitisation, drafting and revision of the article.
Orlando Scoppetta: scale adjustment, statistical analysis, drafting and revision of the article.
Jhon Henrry Pinzón-Tarrazona: collection of information, digitisation and drafting of the article.
Wendy Yulieth López-Fuentes: collection of information, digitisation and drafting of the article.
Andrea Paez: collection of information and drafting of the article.
Diego Fernando Cabanzo-Arenas: digitisation, drafting and revision of the article.
Sergio Ribero-Marulanda: collection of information, drafting and revision of the article.
Elkin René Llanes-Amaya: collection of information and drafting of the article.
Conflicts of interest
The authors have no conflicts of interest to declare.
Acknowledgements
To the expert panel for their contributions to the validation of appearance and content: Astrid I. Arrieta, Jaider A. Barros, Adalberto Campo-Arias, Mauricio Castaño, Jenny García, Luis A. Montenegro, Jorge A. Niño, Heidi C. Oviedo, Andrés M. Rangel, Jorge J. Téllez-Vargas.
Please cite this article as: Cassiani-Miranda CA, Cuadros-Cruz AK, Torres-Pinzón H, Scoppetta O, Pinzón-Tarrazona JH, López-Fuentes WY, et al. Validez del Cuestionario de salud del paciente-9 (PHQ-9) para cribado de depresión en adultos usuarios de Atención Primaria en Bucaramanga, Colombia. Rev Colomb Psiquiat. 2021;50:11–21.