Screening for depression in patients with cancer can be difficult due to overlap between symptoms of depression and cancer. We assessed validity of the Beck Depression Inventory (BDI-II) in this population.
Method
Data were obtained in an outpatient neuropsychiatry unit treating patients with and without cancer. Psychometric properties of the BDI-II Portuguese version were assessed separately in 202 patients with cancer, and 376 outpatients with mental health complaints but without cancer.
Results
Confirmatory factor analysis suggested a three-factor structure model (cognitive, affective and somatic) provided best fit to data in both samples. Criterion validity was good for detecting depression in oncological patients, with an area under the ROC curve (AUC) of 0.85 (95% confidence interval [CI], 0.76–0.91). A cut-off score of 14 had sensitivity of 87% and specificity of 73%. Excluding somatic items did not significantly change the ROC curve for BDI-II (difference AUCs = 0.002, p=0.9). A good criterion validity for BDI-II was also obtained in the non-oncological population (AUC = 0.87; 95% CI 0.81–0.91), with a cut-off of 18 (sensitivity=84%; specificity=73%).
Conclusions
The BDI-II demonstrated good psychometric properties in patients with cancer, comparable to a population without cancer. Exclusion of somatic items did not affect screening accuracy.
Patients with cancer frequently experience symptoms of depression, which can negatively affect long-term quality of life, treatment compliance, health service use, and mortality (Andersen et al., 2014; Chida et al., 2008). The reported prevalence of depression in patients with cancer varies according to the type and clinical characteristics of cancer, the conceptualization of depression, and the criteria and methods that are used for diagnosis (Massie, 2004). While prevalence over the first five years following a cancer diagnosis (Mitchell et al., 2011; Pitman et al., 2018) may range from 4% to 20%, depression remains under-diagnosed and is often left untreated in patients with cancer (Walker et al., 2014), calling for an urgent identification of appropriate screening and assessment tools for use in routine clinical practice in this field. To address this need, the National Comprehensive Cancer Network (NCCN) (National Institute for Clinical Excellence et al., 2004) and the American Society of Clinical Oncology (ASCO) (Andersen et al., 2014) have published guidelines emphasizing the importance of formally assessing depressive symptoms regularly across the trajectory of care. These recommendations highlight the use of standardized measures, validated for oncological populations, with several depression assessment tools proving to be effective in this context. The validation of self-reported measures of depression is an important contribution to this field. When used appropriately, such instruments are a cost-effective and equitable means of identifying depressive symptoms, much less time and resource consuming than structured interviews (Vodermaier et al., 2009; Wakefield et al., 2015). Additionally, the selection of self-reported measures should be based on existing validation data in the population of interest (Ziegler et al., 2011). 
The most often used and recommended questionnaires for the oncological setting are the Hospital Anxiety and Depression Scale (HADS) (Zigmond & Snaith, 1983), the Patient Health Questionnaire (PHQ-9) (Kroenke et al., 2001), the Beck Depression Inventory (BDI-II) (Beck et al., 1996) and the Center for Epidemiologic Studies Depression Scale (CES-D) (Radloff, 1977). However, diagnosing depression in patients with cancer can be particularly challenging, as many symptoms of depression overlap with cancer-related symptoms and/or treatment side-effects. Furthermore, some symptoms may actually represent a normative response when the patient is confronted with threats to life or physical integrity, bad news, aggressive treatments, and/or pain (Ha et al., 2019; Massie, 2004).
The BDI-II is one of the most widely used self-report measures of depressive symptom severity. Validation studies have shown good to excellent psychometric properties across populations (Wang & Gorenstein, 2013). Severity cut-off scores were originally provided by Beck (Beck et al., 1996), allowing distinction between minimal (0 to 13), mild (14 to 19), moderate (20 to 28) and severe (29 and greater) depression. Importantly, the BDI-II was developed in accordance with the depression diagnostic criteria defined in DSM-IV, which recognizes the wide-ranging nature of depressive symptoms, generally categorized as cognitive, affective and somatic (American Psychiatric Association, 2000). However, and possibly due to the original intent of the BDI-II to measure depression globally, findings concerning its factor structure have been somewhat inconsistent. Several studies aimed to examine the dimensionality of the BDI-II in a variety of samples, trying to replicate the structure proposed by the authors of the scale, or proposing other novel structures (Huang & Chen, 2015). While Beck et al. (1996) originally suggested a two-factor correlated model comprising cognitive and somatic-affective factors, at least two studies identified a single BDI-II factor, in accordance with scoring instructions for the scale (Kim et al., 2002; Segal et al., 2008). Although Beck et al. (1996) reported an alternative two-factor model consisting of cognitive-affective and somatic factors, it was developed only because the first model was not suitable for a student sample. On the other hand, the original two-factor model proposed by Beck et al. (1996) in a clinical outpatient sample, with cognitive and somatic-affective factors, has received support from other studies conducted with patients with physical illness (e.g. Arnau et al., 2001; Brown et al., 2012; Kojima et al., 2002; Viljoen et al., 2003).
A three-factor model has also been suggested, including cognitive, affective and somatic factors (Beck et al., 2002).
Clearly much uncertainty remains regarding the latent structure of the BDI-II, which can be partially explained by the fact that items’ organization may vary according to the characteristics of the sample (Beck et al., 1996). This is particularly true when it comes to specific clinical populations, such as cancer patients and other vulnerable groups. In fact, even though the BDI-II was originally developed for use in the psychiatric setting, its use rapidly expanded to other contexts, including oncology. Specific concerns have been raised in the literature about the performance characteristics of the BDI-II in patients with cancer, since almost half of its items assess somatic symptoms. For instance, a study using a sample of hospitalized oncological patients showed that the BDI-II is highly saturated with items describing somatic complaints, suggesting that, in this particular population, scores on these items may be reflecting the intensity of cancer-related somatic symptoms, rather than depression symptoms (Wedding et al., 2007). On the other hand, several studies demonstrated that the BDI-II is able to accurately identify depression in a variety of samples of patients with cancer (Mitchell et al., 2012), with excellent internal consistency, good temporal validity and convergent validity with HADS-Depression (Mystakidou et al., 2007; Tobias et al., 2017). Studies that assessed criterion validity of the BDI-II in oncological populations (Hopko et al., 2007; Katz et al., 2004; Warmenhoven et al., 2012) all found good to excellent sensitivity and specificity values for the BDI-II total score, but proposed different cut-off scores for diagnosis of depressive disorders depending on the sample type. 
For instance, a cut-off score of 13 was proposed for patients with head and neck cancer (n=60) (Katz et al., 2004); 16 for patients with advanced metastatic cancer (n=46) (Warmenhoven et al., 2012); and 14 in a study with a heterogeneous, but smaller, sample of cancer types (n=33) (Hopko et al., 2007).
Data regarding construct and criterion validity of the BDI-II in oncological populations, in comparison with samples without a cancer diagnosis, are thus lacking. Such a study would allow for a more specific and detailed investigation of the differential contribution of somatic items to the validity of the BDI-II in patients with cancer. In the present study we validated the BDI-II for oncologic and psychiatric populations, assessing the latent structure of the BDI-II, and how somatic items influence its screening accuracy in identifying depression in the oncological setting.
Methods
Procedures and participants
Study procedures were reviewed and approved by the Champalimaud Foundation Ethics Committee, in Lisbon, Portugal. Data were collected between April 2013 and December 2019 during clinical routine visits to the outpatient neuropsychiatry clinic of the Champalimaud Clinical Center. Written informed consent was obtained from participants in accordance with the Declaration of Helsinki. The routine clinical protocol at admission was composed of a battery of self-reported, pen-and-paper assessment instruments, completed by participants while waiting for a Psychology or Psychiatry appointment. Screening of affective symptoms was followed by clinical assessment with a psychiatrist or a clinical interview with a psychologist. Patients were eligible for participation if they were at least 18 years of age, while exclusion criteria for both samples included: dementia; illiteracy or inability to understand the study instructions; clinically significant focal structural lesion of the central nervous system; history or clinical evidence of chronic psychosis; acute episode of neuropsychiatric disease requiring hospitalization; and current abuse of or dependence on drugs or alcohol. Participants were then categorized into two groups: 1) confirmed diagnosis of cancer and active disease and/or under any oncological treatment; and 2) no past or current diagnosis of cancer.
Measures
Sociodemographic information and exclusion criteria were assessed with structured questionnaires. Details on medical data, including cancer diagnosis and cancer characteristics, were retrieved from electronic clinical records.
Depressive symptoms were evaluated with the Portuguese version of the BDI-II (Campos & Gonçalves, 2011), which is a 21-item self-report questionnaire that assesses severity of symptoms of depression occurring in the previous 15 days. Each item inquires about a symptom and provides four response statements, graded from 0 to 3 according to the severity of the symptom. The total score ranges from 0 to 63 and reflects the sum of the scores of all items. The BDI-II was validated for the Portuguese population in 2011 with two non-clinical samples: a community sample and a college student sample. The validation studies have shown good internal consistency values (0.90<α<0.91), adequate convergent validity with the Center of Epidemiologic Studies Depression Scale (CES-D) (Radloff, 1977), and a two-factor structure consisting of Cognitive-affective and Somatic factors (Campos & Gonçalves, 2011). The BDI-II was administered in a paper-and-pen format and filled in by patients directly on the protocol sheet, without the intervention of the clinician.
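The scoring rule described above can be illustrated with a minimal sketch. It assumes the 21 item responses are already coded 0 to 3, and it uses the severity bands originally reported by Beck et al. (1996); the function names `bdi_total` and `severity_band` are hypothetical and are not part of the study's procedures or software.

```python
from typing import Sequence

# Severity bands reported by Beck et al. (1996), as (lower bound, label),
# ordered from the highest band to the lowest.
SEVERITY_BANDS = [(29, "severe"), (20, "moderate"), (14, "mild"), (0, "minimal")]


def bdi_total(item_scores: Sequence[int]) -> int:
    """Sum the 21 item scores (each graded 0-3) into a total of 0-63."""
    if len(item_scores) != 21:
        raise ValueError("the BDI-II has 21 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is graded from 0 to 3")
    return sum(item_scores)


def severity_band(total: int) -> str:
    """Map a total score onto the original severity bands."""
    if not 0 <= total <= 63:
        raise ValueError("total score must lie between 0 and 63")
    for lower_bound, label in SEVERITY_BANDS:
        if total >= lower_bound:
            return label
    raise AssertionError("unreachable: the lowest band starts at 0")


# Hypothetical respondent who endorses the "1" statement on every item.
example_total = bdi_total([1] * 21)
example_band = severity_band(example_total)
```

These bands describe symptom severity only; the diagnostic cut-offs examined in this study are derived separately from the ROC analyses.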
In the subset of patients that had a Psychology appointment we also applied the MINI (Sheehan et al., 1998), a structured psychiatric interview, based on DSM-IV diagnostic criteria, comprising modules for 15 psychiatric diagnoses or conditions, namely: major depressive disorder; dysthymia; suicidality; hypomanic or manic episode; panic disorder; agoraphobia; social phobia; obsessive-compulsive disorder; post-traumatic stress disorder; alcohol abuse or dependence; substance abuse or dependence; psychotic disorders; anorexia nervosa; bulimia nervosa; generalized anxiety disorder. For this study we used a European Portuguese adaptation of the Brazilian Portuguese version of the MINI 5.0.0 (Amorim, 2000) to discriminate between participants with or without depression, with the purpose of assessing criterion validity. MINI was chosen as the diagnostic standard for the validation process in order to avoid burdening the patients with time-consuming measures, as previous studies demonstrated a shorter time of application when compared with other interviews, while maintaining good and similar psychometric properties (Amorim, 2000). MINI was only applied to patients who had been referred to a first-time clinical psychology session, where a brief psychological assessment with the psychologist is routinely performed. The remaining sample was referred to a first-time psychiatric consultation. In both cases (psychology or psychiatry appointments) patients filled in the BDI-II before starting the consultation.
Statistical analysis
Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS Version 26.0; IBM SPSS, Inc., Chicago, IL). All analyses were two-tailed with p<0.05 considered significant. Descriptive statistics were used to characterize the sample and psychometric data, including means and standard deviations, minimum and maximum absolute values and percentage (for categorical data). Independent samples t-tests were performed to compare age, education and the scores for BDI-II across groups, and Chi-square (χ2) analysis for comparisons of gender.
Several psychometric properties of the BDI-II were assessed. A Confirmatory Factor Analysis was conducted using structural equation modelling statistics package AMOS 26.0 (SPSS AMOS, Version 26; IBM SPSS, Inc., Chicago, IL, USA) to verify whether the three theory-driven factor models (Table 1) presented an adequate fit for the study sample data. Model 1, a one-factor model, is based on a global construct of depression supporting the use of the BDI-II total score. Model 2, a two-factor model (Beck et al., 1996), divides symptoms in two factors: Cognitive2 and Somatic-affective. Finally, Model 3 represents a three-factor structure, with Cognitive3, Affective and Somatic items as independent factors (Beck et al., 2002). To evaluate the goodness of fit of the tested factorial structures, we considered the following indices: χ2/df (ratio of chi-square to degrees of freedom), the CFI (comparative fit index), the TLI (Tucker–Lewis Index), and RMSEA (Root Mean Square Error of Approximation). The fit of the model was considered good for χ2/df <3 (Arbuckle, J.L., 2009; Wheaton, 1987), CFI and TLI values above 0.95 (Bentler, 1990; Bentler & Bonett, 1980) and RMSEA values below 0.06 (Hu & Bentler, 1999; Marôco, J., 2014). Cronbach's alpha was used to measure internal consistency of the BDI-II total scale and of each BDI-II subscale, depending on the theoretical model.
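For readers less familiar with the internal consistency statistic mentioned above, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k−1) × (1 − Σ item variances / variance of the total score). It is illustrative only: the analyses in this study were run in SPSS, the function name `cronbach_alpha` is hypothetical, and the toy data are invented for the example.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    Each inner sequence holds one respondent's item scores. Sample
    variances (n - 1 denominator) are used for items and totals alike.
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Variance of the summed scale score across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Invented toy data: three items that rise and fall together perfectly,
# which is the limiting case of maximal internal consistency.
toy_alpha = cronbach_alpha([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Applied to a subscale, the same function would simply receive the columns for that subscale's items.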
Theoretical models of the Beck Depression Inventory-II (BDI-II).
To assess criterion validity, receiver operating characteristics (ROC) curves were calculated for the BDI-II total score and for the subscales of Models 2 and 3 (Table 1). Such curves plot the sensitivity and specificity of the scales for every possible cut-off point against the reference criterion, which for this study is the diagnosis of depression disorder according to the MINI. The area under the curve (AUC) of the ROC curve is a global assessment of diagnostic accuracy, with larger AUC indicating better accuracy. To guide interpretation, we considered AUC values of ≥0.9 as very good, ≥0.8 as good and ≥0.7 as fair (Rice & Harris, 2005). Optimal diagnostic cut-off scores were calculated and selected based on the highest Youden Index (sensitivity + specificity-1) (Hughes, 2015), indicating maximization of sensitivity (the probability for individuals with depression to be correctly identified by the scale) and specificity (the probability for individuals without depression to be correctly excluded by the scale). Based on the same method, positive predictive value (PPV), negative predictive value (NPV) and accuracy were also calculated to examine BDI-II's predictive value regarding the diagnosis of depression (Trevethan, 2017). These analyses were performed using MedCalc (Version 19.0; MedCalc Software, Ostend, Belgium).
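The cut-off selection rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the MedCalc procedure itself: a score strictly above the cut-off is treated as a positive screen, ties in the Youden Index are resolved in favour of the lowest cut-off, and the function name `youden_cutoff` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def youden_cutoff(
    scores: Sequence[float], has_depression: Sequence[bool]
) -> Tuple[float, float, float, float]:
    """Return (cut-off, Youden Index, sensitivity, specificity).

    A score strictly greater than the cut-off counts as a positive screen.
    Every observed score is tried as a candidate cut-off.
    """
    positives: List[float] = [s for s, d in zip(scores, has_depression) if d]
    negatives: List[float] = [s for s, d in zip(scores, has_depression) if not d]
    if not positives or not negatives:
        raise ValueError("both diagnostic groups must be present")

    best = None
    for cutoff in sorted(set(scores)):
        sensitivity = sum(s > cutoff for s in positives) / len(positives)
        specificity = sum(s <= cutoff for s in negatives) / len(negatives)
        youden = sensitivity + specificity - 1
        # Strict comparison keeps the lowest cut-off when indices tie.
        if best is None or youden > best[1]:
            best = (cutoff, youden, sensitivity, specificity)
    return best


# Invented toy data: eight total scores with their reference diagnoses.
toy_scores = [5, 8, 10, 12, 15, 18, 20, 25]
toy_labels = [False, False, False, True, False, True, True, True]
toy_cutoff, toy_youden, toy_sens, toy_spec = youden_cutoff(toy_scores, toy_labels)
```

In the toy data a cut-off of 10 flags every participant with the disorder while misclassifying one of the four without it, so the index is 1.00 + 0.75 − 1 = 0.75.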
Results
Sample characteristics
Sociodemographic and clinical characteristics of the samples are shown in Table 2. A total of 210 patients with a cancer diagnosis (PC), referred for a Psychology or a Psychiatry appointment, were included. BDI-II scores were also collected from 376 community-dwelling patients with no current or previous cancer diagnosis (non-PC), at their first Psychiatry or Psychology appointment. Comparisons between groups revealed that the PC sample was older and comprised more females than the non-PC group. No statistically significant differences were found for education.
Sociodemographic and clinical data from each sample. Mean and standard deviation for all variables, except for gender (presented as percentage of males). Differences were tested using chi-square for gender and independent samples t-test for the other variables (p-values).
Note. Tumor site summarized as “not specified” and patients whose tumor stage is “unknown” were those who did not have that information available on their clinical files. Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis.
Comparisons between groups regarding the BDI-II total score and the scores of each dimension from our tested models are presented in Table 3. When we compared BDI-II subscales from the two- and three-factor models across groups, we found that the Somatic dimension of Model 3 did not differ significantly between groups, in contrast to the remaining subscales and the BDI-II total score. The Cognitive dimension of both Model 2 and Model 3, the Somatic-Affective dimension of Model 2 as well as the Affective dimension of Model 3, had lower scores in the PC sample.
Mean, standard deviation and reliability values for the BDI-II total scale and for Model 2 and Model 3 scales in both PC and non-PC samples. Differences between populations were tested using independent samples t-test (p-values). Reliability values were calculated using Cronbach's alpha (α).
Scale | PC (n = 210): Range | PC: Mean (SD) | PC: α | Non-PC (n = 376): Range | Non-PC: Mean (SD) | Non-PC: α | p-value
---|---|---|---|---|---|---|---
Model 1 | | | | | | |
BDI total score | 0–56 | 20.2 (11.2) | 0.91 | 1–55 | 24.4 (12.2) | 0.91 | <0.001
Model 2 (Beck et al., 1996) | | | | | | |
Cognitive2 | 0–23 | 5.8 (5.0) | 0.84 | 0–24 | 8.3 (5.8) | 0.86 | <0.001
Somatic-Affective | 0–34 | 14.4 (7.1) | 0.84 | 0–34 | 16.2 (7.4) | 0.84 | 0.01
Model 3 (Beck et al., 2002) | | | | | | |
Cognitive3 | 0–19 | 4.7 (4.4) | 0.84 | 0–20 | 6.9 (4.9) | 0.84 | <0.001
Affective | 0–14 | 4.4 (3.3) | 0.83 | 0–15 | 5.6 (3.5) | 0.83 | <0.001
Somatic | 0–25 | 11.0 (5.2) | 0.79 | 0–26 | 11.9 (5.4) | 0.79 | 0.1
Note. Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis.
Based on previous research on the BDI-II, we performed confirmatory factor analyses (CFA) to assess fit indices for the one-factor, two-factor and three-factor solutions for both the PC and non-PC samples, as shown in Table 4. The CFA results suggested that Model 3 (Cognitive3, Affective and Somatic factors) fit the PC sample data well, and better than the two other models (χ2/df=1.81, p<0.001; CFI = 0.91; TLI = 0.89; RMSEA = 0.05). As shown in Fig. 1, the loadings for the items included in the Cognitive3 subscale ranged from 0.50 (Failure) to 0.77 (Disconformity with oneself), while for the Affective subscale they ranged from 0.47 (Suicidal thoughts) to 0.84 (Loss of interest), and for the Somatic subscale they ranged from 0.37 (Fatigue) to 0.74 (Loss of energy). As for the non-PC sample, Model 3 was also an adequate fit to the data (χ2/df=1.81, p<0.001; CFI = 0.91; TLI = 0.89; RMSEA = 0.04), although the two remaining models also showed adequate fit values (Table 4). These results confirm that the latent structure of the BDI-II was similar across groups, with three specific factors (cognitive, affective and somatic) providing the best fit to data.
Fit indices of the confirmatory factor analysis models.
Sample | Factor model | X2 | df | X2/df | p | CFI | TLI | RMSEA
---|---|---|---|---|---|---|---|---
PC (n=210) | One-factor | 201.2 | 189 | 1.1 | <0.001 | 1.0 | 1.0 | 0.02
PC (n=210) | Two-factor (Beck et al., 1996) | 199.5 | 188 | 1.1 | <0.001 | 1.0 | 1.0 | 0.02
PC (n=210) | Three-factor (Beck et al., 2002) | 128.2 | 186 | 0.7 | <0.001 | 1.0 | 1.0 | 0.00
Non-PC (n=376) | One-factor | 289.0 | 189 | 1.5 | <0.001 | 1.0 | 1.0 | 0.04
Non-PC (n=376) | Two-factor (Beck et al., 1996) | 274.8 | 188 | 1.5 | <0.001 | 1.0 | 1.0 | 0.04
Non-PC (n=376) | Three-factor (Beck et al., 2002) | 200.5 | 186 | 1.1 | <0.001 | 1.0 | 1.0 | 0.01
Note. Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis;
CFI = comparative fit index; df= degrees of freedom; RMSEA = root mean square error of approximation; TLI = Tucker-Lewis index; X2 = chi-square; X2/df = relative chi-square.
Confirmatory factor analysis of the BDI three-factor model (Beck et al., 2002) in the oncological sample with standardized parameter estimates and measurement errors.
Internal consistency of BDI-II scores and sub-scores was then estimated using Cronbach's alpha for the three proposed models. A Cronbach's alpha of 0.91 was obtained for the BDI-II total score (one-factor model) in both the PC sample and the non-PC sample. This value indicates excellent internal consistency of the BDI-II total scale, with slightly lower values, as expected, for each subscale of the two-factor and three-factor models, with Cronbach's alpha ranging from 0.79 to 0.86 in the PC and non-PC samples (Table 3).
Psychometric properties - criterion validity
Ninety-four PC and 202 non-PC participants (42.9% and 54.7% of the total sample, respectively) completed the MINI psychiatric interview. Sociodemographic characteristics of this subsample are presented in Table 5. Among them, 48.6% and 59.9%, respectively, met diagnostic criteria for current major depressive episode/disorder or dysthymia, which we designate jointly as depressive disorders, and were included as such in all subsequent analyses. The two groups differed significantly in severity of depression symptoms, with higher BDI-II scores in the non-PC sample reflecting the higher prevalence of depression in this group (Table 5).
Sociodemographic and psychometric data from the subsample that completed the MINI interview.
Variable | PC (n = 94): Range | PC: Mean (SD) | Non-PC (n = 202): Range | Non-PC: Mean (SD) | p-value
---|---|---|---|---|---
Gender (% male [n]) | | 22.9% (48) | | 34.8% (131) | 0.003
Age (years) | 30–87 | 58.0 (11.7) | 18–87 | 52.3 (17.1) | <0.001
Education (years) | 2–27 | 14.7 (3.9) | 3–22 | 14.1 (4.2) | 0.1
Model 1 | | | | |
BDI total score | 0–56 | 20.2 (11.2) | 1–55 | 24.4 (12.2) | <0.001
Model 2 (Beck et al., 1996) | | | | |
Cognitive2 | 0–23 | 5.8 (5.0) | 0–24 | 8.3 (5.8) | <0.001
Somatic-Affective | 0–34 | 14.4 (7.1) | 0–34 | 16.2 (7.4) | 0.01
Model 3 (Beck et al., 2002) | | | | |
Cognitive3 | 0–19 | 4.7 (4.4) | 0–20 | 6.9 (4.9) | <0.001
Affective | 0–14 | 4.4 (3.3) | 0–15 | 5.6 (3.5) | <0.001
Somatic | 0–25 | 11.0 (5.2) | 0–26 | 11.9 (5.4) | 0.1
Note. Mean and standard deviation for all variables, except for gender (presented as percentage of males). Differences were tested using chi-square for gender and independent samples t-test for the other variables (p-values). Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis
To assess criterion validity, we created receiver operating characteristic (ROC) curves using MINI diagnoses as the discriminator between participants with and without depressive disorders, among patients with a diagnosis of cancer (nPC=53 and nPC=41 respectively). An area under the curve (AUC) of 0.85 (95% Confidence interval [95% CI]: 0.76, 0.91) was obtained for the PC sample when using the BDI-II total scale (Model 1). Further analysis of the ROC curve showed that scores above 14 points correctly identified depressive disorder with a sensitivity of 87% and a specificity of 73%. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) based on the optimal cut-off scores for maximum accuracy of each BDI-II factor structure are further described in Table 6.
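The link between sensitivity, specificity and the predictive values can be made explicit with a short sketch based on Bayes' rule. It assumes the rounded figures quoted above (sensitivity 87%, specificity 73%) and takes the proportion of patients with a depressive disorder in the PC subsample (53 of 94) as the prevalence; the function name `predictive_values` is hypothetical. Because the inputs are rounded, the output only approximates the values reported in Table 6.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")

    # Expected proportions of the sample in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values from the text; prevalence taken as 53 of 94 PC participants.
ppv_pc, npv_pc = predictive_values(0.87, 0.73, 53 / 94)
```

With these inputs both predictive values come out at roughly 0.81. Since predictive values depend on prevalence, they would be lower in an unselected oncology population where depressive disorders are less common than in this referred sample.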
Diagnostic classification accuracy of the BDI-II studied models in both PC and non-PC samples, using the MINI Structured Interview as the discriminator between participants with Depressive Disorder and others without the disorder.
Note. Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis; AUC = Area under the curve; PPV = Positive predictive value; NPV = Negative predictive value.
To compare discriminatory capacity of the BDI-II total scale in the two samples, a pairwise comparison of ROC curves between the PC and non-PC samples was performed. Although the optimal cut-off value in the non-PC sample (>18; AUC = 0.87; 95% CI: 0.81, 0.91) differed from that in the PC sample, we did not find statistically significant differences between the ROC curves of the two groups (difference between areas [DBA] = 0.02, p=0.7; Fig. 2a). These results suggest that the accuracy of the BDI-II total scale in detecting depressive disorders is similar in patients with cancer and in patients with psychiatric disorders without cancer, for whom the instrument was originally developed.
ROC curves for use of the BDI-II to identify depressive disorders. Plot of the true positive rate (sensitivity) against the false positive rate (100-specificity) for the different possible cut-offs of the BDI-II using the MINI diagnostic criteria for depression as the diagnostic instrument. (a) ROC curves of the BDI-II total scale in PC (blue line) and non-PC (green line) samples; (b) ROC curves of BDI-II total score (blue line) and Model 3 Cognitive+Affective (red line) in PC sample. ROC = Receiver operating characteristic; Non-PC=Patients without a cancer diagnosis; PC= Patients with a cancer diagnosis.
Since Model 3 had slightly better fit to our study sample data than Model 1, and in order to assess whether the somatic dimension of the BDI-II influences its criterion validity, the same analyses were repeated using partial BDI-II scores that excluded items of the Somatic dimension in Model 3. For patients with cancer, AUC of the ROC curve for the partial score was similar (0.85; 95% CI: 0.76, 0.92) to that of the BDI-II full score, with a cut-off of 4 achieving the highest combination of sensitivity (89%) and specificity (71%) for the diagnosis of depressive disorders. In the psychiatric sample, on the other hand, AUC of the ROC curve for the partial score was 0.85 with a cut-off of 11 achieving the highest combination of sensitivity (77%) and specificity (76%). Importantly, in patients with cancer the ROC curves of the partial score and the total score were similar (DBA=0.002, p=0.9), further demonstrating that somatic items do not impair criterion validity of the BDI-II in this population (Fig. 2b).
Discussion
The purpose of this study was to validate the Portuguese version of the BDI-II for patients with cancer and, in particular, to assess the contribution of somatic items to diagnostic accuracy. We demonstrate that the BDI-II is a valid measure to screen and assess depressive disorders in this population, with reliability, construct validity and criterion validity comparable to what we found in patients with psychiatric disorders but no cancer. Furthermore, we found that scores on somatic items do not decrease the diagnostic accuracy of the BDI-II in patients with cancer.
Based on the fact that several somatic symptoms of depression are also commonly reported by patients with cancer, we performed confirmatory factor analyses (CFA) of the unidimensional model of the BDI-II, as well as two-factor (Beck et al., 1996) and three-factor (Beck et al., 2002) solutions. The three-factor model consisting of cognitive, affective and somatic dimensions had the best fit to the data collected in patients with cancer. Subsequent reliability analyses of the BDI-II total score and subscale scores in both oncological and non-oncological samples demonstrated adequate to good internal consistency of the BDI-II subscales, although excellent internal consistency became evident when we used the BDI-II total score. These findings are consistent with reliability estimates reported in other studies conducted with clinical populations with other physical diseases (Wang & Gorenstein, 2013) and in particular with cancer (Mystakidou et al., 2007; Tobias et al., 2017). While internal consistency was lowest for the somatic dimension of the three-factor model, this was not specific to patients with cancer and also occurred in the control population.
We also found that patients with cancer, when compared with the control sample, had significantly lower BDI-II total scores, mainly due to lower scores on the two dimensions that do not include somatic items, i.e. cognitive and affective dimensions. Given the degree of physical burden associated with oncological disease, it may seem surprising that patients with cancer did not have substantially higher somatic symptom scores than psychiatric outpatients. In fact, other studies found that in cancer patients BDI-II scores were more saturated in somatic items when compared with non-somatic items, concluding that BDI-II may be inadequate to screen for depression in patients with cancer (Jakšić et al., 2013; Tobias et al., 2017; Wedding et al., 2007). While it is not clear what underlies the absence of differences between patients with and without cancer regarding somatic items in our study, ROC analyses showed that somatic items do not compromise the screening accuracy of the BDI-II. Analysis of the BDI-II using only the cognitive and affective dimensions yielded AUC values similar to those obtained with the BDI-II total scale, with no significant loss of sensitivity, specificity, PPV or NPV, thus showing that somatic items do not compromise BDI-II criterion validity.
Despite the inconsistencies of existing literature on BDI-II factor structure, results of our factor analyses are in line with those previously reported for patients with cancer (Jakšić et al., 2013; Tobias et al., 2017). In fact, our results showed factorial similarity across groups for all models tested, with the three factor model showing the best fit to the data in both groups. This three-dimension model of the BDI-II has important pragmatic advantages. First, specific results from each of the three BDI-II factors may help to determine the specific nature of each patient's symptom profile. Second, it may facilitate targeted interventions across time, for instance cognitive therapy for a depressive disorder predominantly characterized by cognitive symptoms. Finally, it may usefully guide the choice of optimal treatment at the individual level, since distinct symptoms of depression have been shown to respond differently to different treatments (Mallinckrodt et al., 2007; Paul et al., 2019). Nevertheless, it remains appropriate to use a global BDI-II score not only for screening depressive disorders, but also to assess severity and monitor response to treatment. This is consistent with the original development of the BDI-II as a measure of the global construct of depression (Brouwer et al., 2013), and also with our results of enhanced reliability of the global score.
A critical finding of this study was the confirmation that the BDI-II total scale accurately identifies depressive spectrum disorders in patients with cancer. To the best of our knowledge, this is the first study to suggest a specific BDI-II cut-off score for identifying depression in patients with cancer in general, including patients with diverse types of cancer in various stages. According to our findings, based on a structured psychiatric interview as the gold standard, an optimal cut-off value of 14 has good sensitivity and adequate specificity, with an 80.7% probability of patients with scores above that cut-off having a depressive spectrum disorder. Sensitivity and positive predictive value (Trevethan, 2017) further support the use of BDI-II for screening in routine oncological practice. While lower than the cut-off of 18 that we found to be ideal for psychiatric out-patients in our sample, this cut-off value of 14 is in line with the study conducted by Warmenhoven et al. in a population with advanced cancer diagnosis (Warmenhoven et al., 2012), but not with two other studies exploring criterion-related validity of the BDI-II in patients with cancer. Katz et al. (2004) suggested a slightly lower cut-off score of 13 (sensitivity= 92%; specificity = 90%) based on a sample of 60 patients with head and neck cancer, while Hopko et al. (2007), also assessing a group of patients with a variety of cancer types, suggested a cut-off value of 22. However, the latter suggestion was based on a limited sample of only 33 patients, 9 of them with no depression. This divergence in results shows that criterion validity studies conducted in specific cancer types are likely to be valid only for that very particular subpopulation, and that finding a cut-off that is more universally valid in the oncologic setting requires much larger samples comprising various types of cancer in diverse stages, such as what has been described here.
Furthermore, a detailed analysis of the BDI-II with only the cognitive and affective dimensions showed AUC values similar to those of the BDI-II total scale. Differences in sensitivity, specificity, PPV and NPV were also not significant. Importantly, these results show that somatic items do not compromise BDI-II criterion validity. This suggests that valuing all reported symptoms and tailoring patient-oriented interventions is more important than classifying physical symptoms as cancer-related or depression-related. The BDI-II proved to be as accurate in the oncological population as in the psychiatric population, as long as the appropriate cut-off value is used.
The strengths of our study include the use of a structured clinical interview to assess DSM-IV criteria, applied by certified psychologists in a routine clinical setting, a study sample representative of diverse cancer types and stages, and the inclusion of a comparison group of psychiatric outpatients without a diagnosis of cancer. Nonetheless, our study is not free of limitations. As we analyzed retrospective data collected in routine care, it was not possible to match the samples regarding age, education and gender. Although differences were found for age and gender between the two groups, we did not find differences regarding level of education. It is important to consider that patients with cancer are expected to be older than non-oncological samples, and that our sample has considerably more patients with breast cancer, which contributes to an over-representation of the female gender. Notwithstanding, previous studies have reported no significant differences in BDI-II scores between different age or gender groups (de Sá Junior et al., 2019). A further limitation is our sample size for criterion validity analysis, which is smaller in our cancer sample than in the psychiatric sample. Yet, our study still has an adequate sample size in the oncological group, considerably larger than those of previous studies that assessed criterion validity of the BDI-II. Finally, the use of the adapted Portuguese version of MINI 5.0.0 can also be a limitation, since this was a non-published version based on the Brazilian Portuguese version developed by Amorim (2000) and based on DSM-IV. In fact, structured clinical interviews of reference validated for the Portuguese population are not currently available. However, language and culture are very close between the two countries, the criteria for depressive disorders are the same in both countries, and they are very close in DSM-IV and DSM-5.
Therefore, clinicians involved in oncological and/or mental health practice can use the BDI-II in patients with cancer to monitor symptoms of depression during the course of the disease, independently of the cancer type or stage. Nevertheless, it is important to use the appropriate cut-off to interpret patients’ scores. Moreover, the cut-off value proposed here should not be used if the BDI-II is to be applied to patients with dementia or any other condition that compromises patients’ ability to understand the scale. Such patients were excluded from our study and the psychometric properties of the scale thus remain unknown in that specific population. Clinicians should also be aware that the BDI-II is not intended to be a diagnostic tool for depression, but rather a measure of depression symptom severity (Nejati et al., 2020) that can be used as a screening measure.
In conclusion, this study demonstrated that the BDI-II is a valid measure to assess depression in the oncological population, with psychometric properties comparable to those of a psychiatric sample. Our results suggest that a BDI-II total score cut-off of 14 has good sensitivity, PPV and NPV, and fair specificity in identifying depression in patients with cancer. Moreover, we showed that accuracy did not change with the omission of somatic items. Finally, our findings supported the use of a three-factor structure with cognitive, affective and somatic dimensions contributing to a general depression score. We believe that our findings, particularly the information about the latent structure of the BDI-II and the cut-off points adjusted to this population, can facilitate the screening and identification of depressive disorders in the oncological setting, prompting an earlier referral of individuals in need of specialized treatment to proper psychological or psychiatric care.
Funding support
JO is supported by the NARSAD 2018 Young Investigator Award from the Brain & Behavior Research Foundation (Grant ID: 27595). RL is supported by the 2018 Scientific Employment Stimulus from Fundação para a Ciência e Tecnologia, Portugal (CEECIND/04157/2018). DF, BS and AJO-M are supported by the BOUNCE project (grant agreement number 777167), and RL and AJO-M are supported by the FAITH project (grant agreement number 875358), both funded by the European Union's Horizon 2020 research and innovation programme. JBB-C and AJO-M are supported by grant FCT-PTDC/MEC-PSQ/30302/2017-IC&DT-LISBOA-01-0145-FEDER, funded by national funds from FCT/MCTES and co-funded by FEDER, under the Partnership Agreement Lisboa 2020 - Programa Operacional Regional de Lisboa. AJO-M is supported by grant FCT-PTDC/MED-NEU/31331/2017, funded by FCT/MCTES. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
The authors would like to thank the administrative team of the Champalimaud Clinical Center Neuropsychiatry Unit for support in this project.