INTRODUCTION
Crohn's disease (CD) and ulcerative colitis (UC) are chronic diseases that have significant physical, social, and psychological repercussions1. The traditional clinical approach, focused on assessing disease activity, does not capture many of the difficulties patients experience because of their disease2. Therefore, a more comprehensive clinical evaluation must also include a tool to assess the individual's health-related quality of life. Such an approach may result in a better understanding of the patient's global situation and should contribute to better patient care.
The Inflammatory Bowel Disease Questionnaire (IBDQ) is the most widely used health-related quality of life instrument for patients with inflammatory bowel disease (IBD). It was developed in 1989 by Guyatt and co-workers3 in Canada. The aim of the IBDQ was to provide clinicians with a reliable, valid, and disease-specific questionnaire for identifying and quantifying the subjective health status of patients with IBD. Until then, a variety of generic questionnaires that were standard measures of health status were available4,5, but these measures may not focus adequately on the specific problems of IBD patients. The IBDQ has been translated into and validated in several languages, including Dutch, UK English, Korean, Swedish, Greek, Norwegian, Chinese, German, Portuguese, and Japanese5-15. These versions have confirmed the soundness of the psychometric properties of the IBDQ previously reported by the Canadian group3. A validated Spanish version of the IBDQ already exists16: its authors translated the modified version of the IBDQ by Love et al17 and determined its validity, reliability, and responsiveness. However, few studies have used the 36-item version by Love et al, whereas the original 32-item version by Guyatt et al3 is the most widely used in studies and in clinical settings.
Therefore, the aim of this study was to assess the psychometric properties of a Spanish version of the 32-item IBDQ, in order to provide a comparable version of this questionnaire for future gastroenterological studies. Exploratory and confirmatory factor analyses were used to examine the instrument's underlying factor structure. The study also aimed to investigate whether this version of the IBDQ is valid and reliable for UC and CD patients.
METHODS
Patients
One hundred and eighty-six IBD patients were invited to take part in the study. Patients were recruited through the IBD Unit of Hospital Clinic of Barcelona, Spain. All patients had previously been diagnosed with CD or UC on the basis of the Lennard-Jones criteria18. To be included, patients had to be suffering a relapse or be in clinical remission with at least one exacerbation in the last 24 months. UC patients who had undergone a colectomy were excluded from the study. Patients were given a set of questionnaires and asked to fill them out at home and mail them back within two weeks in the pre-addressed, postage-paid envelopes provided.
Questionnaires
The Inflammatory Bowel Disease Questionnaire (IBDQ)
The IBDQ to be validated in this study is a translation of the original 32-item questionnaire, which evaluates four dimensions: bowel symptoms (e.g. loose stools, abdominal pain), systemic symptoms (e.g. fatigue, altered sleep patterns), emotional functioning (e.g. anger, irritability, depression), and social functioning (e.g. work attendance, need to cancel events). Responses are graded on a 7-point Likert scale (7 = not a problem; 1 = a very severe problem). Each dimension score is the sum of the scores of the items in that dimension. The sum score is the sum of all individual item scores, giving a possible range of 32 to 224. Higher scores indicate better quality of life.
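The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `score_ibdq` and, in particular, the item-to-dimension mapping `example_dimensions` are hypothetical placeholders and are not the published IBDQ scoring key. Only the 32 items, the 1 to 7 response range, and the 32 to 224 total range come from the text.

```python
from typing import Dict, List

N_ITEMS = 32                  # number of IBDQ items (from the text)
MIN_RESP, MAX_RESP = 1, 7     # 7-point Likert scale (from the text)


def score_ibdq(responses: Dict[int, int],
               dimensions: Dict[str, List[int]]) -> Dict[str, int]:
    """Return one score per dimension plus the sum score under 'total'."""
    if sorted(responses) != list(range(1, N_ITEMS + 1)):
        raise ValueError("expected one response for each of items 1..32")
    if any(not MIN_RESP <= r <= MAX_RESP for r in responses.values()):
        raise ValueError("responses must lie between 1 and 7")
    # Each dimension score is the sum of the items assigned to it.
    scores = {name: sum(responses[i] for i in items)
              for name, items in dimensions.items()}
    # The sum score adds up all 32 items.
    scores["total"] = sum(responses.values())
    return scores


# Hypothetical item-to-dimension mapping, for illustration only.
# It is NOT the published IBDQ key.
example_dimensions = {
    "bowel": list(range(1, 11)),
    "systemic": list(range(11, 16)),
    "emotional": list(range(16, 28)),
    "social": list(range(28, 33)),
}

best = score_ibdq({i: 7 for i in range(1, 33)}, example_dimensions)
worst = score_ibdq({i: 1 for i in range(1, 33)}, example_dimensions)
```

With every item answered 7 the sum score reaches the stated maximum of 224, and with every item answered 1 it reaches the minimum of 32.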
Hospital Anxiety and Depression Scale (HADS)
The HADS19 is a 14-item scale that evaluates anxiety and depression in medical patients. The total score is obtained by summing the ratings on the 14 items. The Spanish version of the HADS was translated by Herrero et al20. This instrument has been used in several studies and also for the validation of other instruments.
Harm Avoidance (HA), a subscale of the Temperament and Character Inventory (TCI)
The TCI21 is an instrument used for the dimensional study of the temperament and character components of personality. It is divided into 7 independent dimensions. One of these dimensions is HA, which reflects a tendency to be shy, careful, passive, insecure, and worried in anticipation of possible danger. The Spanish version of the TCI was validated by Gutiérrez et al22 and is a reliable and valid instrument.
Assessment of Disease Activity
Disease activity in patients with CD was assessed with the Crohn's Disease Activity Index (CDAI), a disease-specific index that has demonstrated acceptable reliability and validity. CDAI scores below 150 indicate clinical remission, while higher scores indicate active disease23. Patients with UC were assessed with the Truelove-Witts Index24. This scale has been widely used in research and clinical care because it is easy to use and is believed to accurately reflect the physician's assessment of activity. Scores below 8 indicate clinical remission, while a relapse is defined by a total score of at least 8 points.
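The two cut-offs can be expressed as a small classification helper. The sketch below is illustrative only: the function name `activity_status` and the labels are hypothetical, and treating a CDAI of exactly 150 as active disease is an assumption, since the text only specifies scores below and above 150.

```python
CDAI_REMISSION_BELOW = 150        # CDAI < 150 -> clinical remission (from the text)
TRUELOVE_WITTS_RELAPSE_FROM = 8   # Truelove-Witts >= 8 -> relapse (from the text)


def activity_status(diagnosis: str, score: float) -> str:
    """Classify a patient as 'remission' or 'active' from the activity index."""
    if diagnosis == "CD":
        # Assumption: a CDAI of exactly 150 is counted as active disease.
        return "remission" if score < CDAI_REMISSION_BELOW else "active"
    if diagnosis == "UC":
        return "remission" if score < TRUELOVE_WITTS_RELAPSE_FROM else "active"
    raise ValueError("diagnosis must be 'CD' or 'UC'")
```

For example, `activity_status("CD", 120)` gives `"remission"` and `activity_status("UC", 8)` gives `"active"`.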
Statistical analysis
Descriptive data on frequency, percentage, means and standard deviations were used for the demographic and clinical characteristics. The IBDQ was evaluated regarding internal and external validity and internal consistency reliability.
Internal validity
The aim of the internal validation of the Spanish version of the IBDQ was to test the structure of the questionnaire and to ascertain whether the use of a sum score and of the four dimension scores could be supported.
Confirmatory factor analyses were performed using R (version 2.2.1)25. We evaluated the fit of our data to the original four-factor structure, using the maximum likelihood estimation method. An exploratory factor analysis of the questionnaire was carried out using a principal-component method, followed by varimax rotation with Kaiser normalization. The number of underlying factors was examined using the eigenvalues above unity together with a scree plot.
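The eigenvalue-above-unity criterion can be illustrated with a short NumPy sketch. It is not the analysis performed in the study (which was run in R and included varimax rotation and a scree plot); the function name `kaiser_factors` and the toy correlation matrix are assumptions made for the example.

```python
import numpy as np


def kaiser_factors(corr: np.ndarray):
    """Eigenvalues of a correlation matrix, count above 1, and variance shares."""
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    n_factors = int(np.sum(eigenvalues > 1.0))
    # The trace of a correlation matrix equals the number of variables,
    # so each eigenvalue divided by that number is a share of total variance.
    explained = eigenvalues / corr.shape[0]
    return eigenvalues, n_factors, explained


# Toy correlation matrix (hypothetical): two pairs of strongly related items.
toy_corr = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
])
eigenvalues, n_factors, explained = kaiser_factors(toy_corr)
```

In this toy case two eigenvalues (1.8 each) exceed unity, so two components would be retained, together accounting for 90% of the variance. A scree plot would simply plot `eigenvalues` against their rank.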
To assess whether an item measured the same area of health as the dimension to which it was supposed to belong, the following strategy was adopted. For item convergent validity, the correlation between an item and its dimension should be higher than 0.40. For item discriminant validity, the correlation of an item with its own dimension should be at least 0.1 higher than its correlation with any of the other dimensions.
To investigate whether the items measure one or several areas of health (correlation between scales), the correlations between the four dimensions were calculated. Correlations below 0.70 were taken to indicate that the dimensions measure different areas of health.
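The two item-level criteria can be sketched as follows. This is a minimal illustration under stated assumptions: the names `pearson` and `item_validity` are hypothetical, plain Pearson correlations are used, and the item is correlated with the raw dimension score without removing the item from its own dimension, which the text does not specify.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_validity(item: Sequence[float],
                  own_dimension: Sequence[float],
                  other_dimensions: List[Sequence[float]],
                  min_r: float = 0.40,
                  min_gap: float = 0.10) -> Tuple[bool, bool]:
    """Return (convergent_ok, discriminant_ok) for one item."""
    r_own = pearson(item, own_dimension)
    r_others = [pearson(item, d) for d in other_dimensions]
    convergent_ok = r_own > min_r
    # The item must correlate at least `min_gap` higher with its own
    # dimension than with every other dimension.
    discriminant_ok = all(r_own - r >= min_gap for r in r_others)
    return convergent_ok, discriminant_ok


# Hypothetical scores for five respondents.
item = [1, 2, 3, 4, 5]
own = [2, 4, 6, 8, 10]
other = [3, 1, 4, 1, 5]
good = item_validity(item, own, [other])
bad = item_validity(item, other, [own])
```

Here `good` satisfies both criteria, whereas `bad` fails both because the item correlates more strongly with another dimension than with the one it was assigned to.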
External validity
The concurrent validity, divergent validity, and discriminating power were checked to test the external validity of the Spanish version of the IBDQ.
The concurrent validity was analyzed by correlating the IBDQ dimensions with the HADS and the activity indices. We tested the following hypotheses: a) the emotional function dimension would have a high correlation with the HADS, and b) the bowel symptoms subscale would have a high correlation with the CDAI or the Truelove-Witts Index.
Validity was considered to have been confirmed when the scales measuring related concepts had a large correlation (r > 0.50), and when these correlations were greater than those measuring different concepts.
To assess its divergent validity, the total score of the IBDQ was correlated with the score of the Harm Avoidance scale of the TCI. We hypothesized a small correlation (r between 0.10 and 0.29).
To investigate the power of the IBDQ to discriminate patients with exacerbation from those in remission (discriminating power), patients were divided into two groups according to their activity index. Data were analysed using the Mann-Whitney U test for independent samples. The accepted level of statistical significance was p < 0.01.
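The group comparison can be illustrated with a self-contained Mann-Whitney U sketch. It is a simplified version given for illustration: the function name `mann_whitney_u` is hypothetical, the p-value uses the normal approximation without continuity or tie correction (statistical packages may use exact or corrected methods), and the scores below are invented, not study data.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(a), len(b)
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1      # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))          # two-sided tail probability
    return u, p


# Invented IBDQ sum scores for two groups of eight patients each.
remission = [150, 160, 170, 180, 190, 200, 210, 220]
relapse = [80, 90, 100, 110, 120, 130, 140, 145]
u_stat, p_value = mann_whitney_u(remission, relapse)
significant = p_value < 0.01            # significance level used in the study
```

Because the two invented groups do not overlap at all, U is 0 and the difference is significant at the 0.01 level.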
Reliability
Internal consistency reliability was calculated with Cronbach's alpha coefficient (the McNemar method was used to adjust the correlation coefficient) for the whole questionnaire and for each dimension. The minimum acceptable value for alpha was 0.70.
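For reference, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch follows; the function name `cronbach_alpha` and the data are hypothetical, and the McNemar adjustment mentioned above is not implemented here.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha; `items` holds one sequence of respondent scores per item."""
    k = len(items)
    # Total score of each respondent across all items.
    totals = [sum(resp) for resp in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented scores: two items answered by four respondents.
toy_items = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha = cronbach_alpha(toy_items)
acceptable = alpha >= 0.70      # minimum acceptable value used in the study
```

For the toy data alpha is 0.75, which would meet the 0.70 criterion.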
Ethics
The protocol was approved by the Research Ethics Committee of Hospital Clínic. Each participant gave written informed consent before participation.
RESULTS
A total of 186 outpatients with CD or UC were asked to participate, and all accepted. Of these, 39 did not return the questionnaire set within 15 days. Thus, the study comprised 147 patients, 76 (51.7%) with CD and 71 (48.3%) with UC. Demographic and disease-related factors are summarized in table I.
Internal validity
Factor structure of the Spanish IBDQ
The confirmatory factor analysis failed to reproduce the original 4-factor structure proposed by Guyatt et al. The model had a significant chi-square (model chi-square = 1522.4; p < 0.001), a goodness-of-fit index (GFI) of 0.59, and a root-mean-square error of approximation (RMSEA) of 0.12. According to the criteria of Hu and Bentler26 (a non-significant chi-square, a GFI of 0.90 or greater, and an RMSEA of 0.06 or lower), our results indicate a poor fit.
A principal component analysis was performed with varimax rotation and Kaiser normalization. Table II displays the factor loadings greater than 0.40 from the factor analysis of the 32 IBDQ items. Using an eigenvalue greater than 1.0 as the cut-off criterion resulted in six factors being extracted from the entire pool of items. However, the scree plot showed a flattening of the curve at factors 4, 5 and 6. The three corresponding analyses were performed, and the 4-factor solution was accepted as the best one. These four factors explained 60.8% of the variance.
Factor I: emotional function
According to the factor analysis, 10 of the items had their highest loading on this factor. Eight of these came from the emotional function dimension of the original IBDQ, and the remaining two from the systemic symptoms subscale. Questions 6 and 27 had similar loadings on Factor I and on Factors II and III, respectively. Owing to their content, we decided to include them in Factor I. This factor explained 17.9% of the variance.
Factor II: social function
Six questions had their highest loading on Factor II. Five items came from the social function subscale of the original IBDQ, and one belonged to the emotional function subscale. This factor explained 15.9% of the variance.
Factor III: bowel movement
Eight questions had their highest loading on Factor III, six of which came from the bowel symptoms dimension of the original IBDQ. Question 11 had similarly high loadings on Factor II and Factor III but, given its content, it was included in Factor III. This factor explained 14.6% of the variance.
Factor IV: bowel pain and discomfort
Seven items had their highest loading on Factor IV. Four items belonged to the bowel function subscale, two to the systemic symptom subscale, and one to the emotional function subscale. Item 9 also had a high loading on Factor III but, given its high loading on Factor IV and its content, we included it in Factor IV. Factor IV explained 12.4% of the variance.
The item «sleeping disturbance», from the systemic symptom dimension, did not load onto any of these four factors. This might be a case for excluding this item from the Spanish version of the IBDQ.
Item convergent and item discriminant validity
Convergent and discriminant validity for the four original IBDQ dimensions are presented in table III.
The correlation between the items and their hypothesized scales was > 0.40 for all items except «worried about surgery», which correlated < 0.40 with all dimensions. The percentage of items in each dimension that correlated at least 0.1 higher with their own dimension than with any other dimension was 90% for bowel symptoms, 80% for systemic symptoms, 66% for emotional function, and 100% for social function.
Correlation between scales
Inter-dimensional correlations for the four dimensions are presented in table IV. It can be observed that only the correlation between emotional function and systemic symptoms was higher than 0.70.
External validity
Concurrent validity
Table IV illustrates the correlation between the four dimensions and clinical variables (disease activity and HADS).
As expected, we found correlations higher than 0.5 between the bowel function subscale and both the Truelove-Witts Index and the CDAI. However, the correlations between the CDAI and some unrelated domains (systemic symptoms and social function) were higher than expected.
The correlation between the HADS and emotional function was high. However, the correlation between the HADS and systemic symptoms was higher than we expected. This could be explained by the high inter-dimensional correlation between emotional function and systemic symptoms subscales (r = 0.74).
Divergent validity
The correlation between the total score of the IBDQ and the Harm Avoidance scale of the TCI was higher than we hypothesized (r = 0.33). However, divergent validity could be considered acceptable.
Discriminating power
The results showed that the IBDQ had a very high power (p < 0.001) to discriminate between patients in exacerbation and patients in remission in both UC and CD (data not shown).
Reliability
Cronbach's alpha was 0.95 for the IBDQ sum score, 0.87 for the bowel symptoms subscale, 0.74 for the systemic symptoms subscale, 0.91 for the emotional function subscale, and 0.88 for the social function subscale. All values exceeded the criterion for acceptable internal consistency reliability.
DISCUSSION
The aim of the present study was to evaluate the psychometric properties of a Spanish version of the IBDQ. It is the first time that a Spanish version of the 32-item IBDQ has been validated. This study has shown that this version of the IBDQ is a valid and reliable measure of health-related quality of life for Spanish IBD patients.
Regarding the internal structure of the IBDQ, there are some concerns. The confirmatory factor analysis failed to reproduce the original four-factor structure proposed by Guyatt et al3. These results agree with the three previous studies that included a factor analysis in the IBDQ validation7,9,11, which also found factor structures other than the one proposed by Guyatt et al. Two of them7,11 found five underlying factors, and the other9 found six. One explanation might be that, in the development of the original IBDQ, items were arranged into four dimensions on the basis of clinical reasoning, and no factor analyses were performed. Consequently, an exploratory analysis to determine both the number of factors and the items in each factor in the Spanish version of the IBDQ could be of paramount importance. The factor structure found in this study preserves the four factors of the original IBDQ, but the content of each factor and its items differ slightly. Although the results do not reproduce the original structure, a four-factor structure was supported by the exploratory factor analysis, and the Spanish version of the IBDQ can therefore be considered a useful instrument for gastroenterologists.
The convergent item validity was excellent and the discriminant item validity was acceptable. The inter-dimensional correlations were moderate, which suggests that the IBDQ measures several areas of health. Therefore, the IBDQ may be better understood as a group of different dimensions than as a single sum score for the whole questionnaire, although a global IBDQ score is also interpretable.
In terms of external validity, the Spanish version of the IBDQ showed acceptable concurrent and divergent validity. These results might suggest that the IBDQ does not assess a personality trait, such as a tendency toward neuroticism, but rather different aspects of quality of life. Concerning discriminating power, the results suggested that the Spanish IBDQ was able to discriminate successfully between patients with different clinical disease activity, which lends further validity to the instrument.
Finally, in our study, the internal consistency reliability was very high for both the whole scale and each dimension. These results suggest that the Spanish version of the IBDQ shows excellent internal consistency. The coefficient found in this study is similar to those obtained in other validated versions of the IBDQ, such as 0.95 in the Swedish version9, 0.92 in the Portuguese version14, and 0.93 in the Dutch version6.
In conclusion, the Spanish IBDQ has been shown to be a valid and reliable instrument for measuring health-related quality of life in IBD patients, although our data indicate that the questionnaire is based on a different factor structure from that of the original IBDQ. There is a factor structure with four dimensions that are reasonably related to quality of life in IBD patients. Thus, this instrument has proved useful for evaluating quality-of-life aspects in clinical trials, as well as in the therapeutic management of patients with CD and UC.
ACKNOWLEDGEMENTS
This work is supported by Ministerio de Ciencia y Tecnología Grant SAF2002-02211, and grant C03/02 from the Instituto de Salud Carlos III. M. Sans is supported by a grant from Fundación Ramón Areces. MJ. Portella is supported by a Juan de la Cierva postdoctoral contract from the Ministerio de Educación y Ciencia.