(p < 0.001). In some analyses, greater quality appeared to be associated with health centers that were also teaching centers, reference in the document to health care, and women's health programs. Conclusions. Document quality varied significantly across health care areas, and certain characteristics (chronic health problems, multidisciplinary design, specific design, and reference to specific health services offered) were associated with greater document quality. Reference to acute health problems, design by only one type of professional (physicians or nurses), inclusion as part of a larger program, and lack of reference to specific health services offered at a given center were associated with a greater risk of low document quality.
Introduction
Good design and the prevention of quality-related problems through planning are two necessary components of programs for health care quality.1-3 The development of clinical practice guidelines is one technique that can be used to design quality of care. These documents can take a variety of forms, as noted in appendix B of the classic IOM report,4 which gives examples of different presentation formats for algorithms, clinical protocols, and other documents.
Among the different working techniques clinicians use, the development of clinical practice guidelines for specific medical problems is probably one of the most widely known and often used strategies, as it is an essential part of the process of clinical care improvement.5-7 Thus the development of such guidelines is one of the most strongly supported activities at the international level as part of programs aimed at managing health care,8-10 and as an important field of research.11-14
Clinical practice guidelines are tools that facilitate decision-making by physicians and help them to cope with uncertainty and reduce variability in clinical practice. However, for the guidelines to be effective they must satisfy minimum requirements for formal quality, i.e., they must be validly structured so that the results of their application can be evaluated. A bad tool may simply be ignored, or it may invalidate the clinical results obtained through its implementation.
As part of the ongoing research by our working group,15-19 and as a follow-on to our analysis of the structural quality of these documents,19 we set out in the present study to identify the variables that influenced formal quality of clinical practice guidelines.
Material and methods
According to the 1998 census, the autonomous region of Murcia in southeastern Spain has 1,115,068 inhabitants. Most primary health care in this region is provided through the public network of health care centers; the present study was based on documents from these centers.
On the basis of an earlier study at one health center,16 our research group undertook a retrospective study of all protocols developed in the region of Murcia between January 1985 and January 1994. During that period 31 of the 39 existing health centers used formal protocols for at least some of their activities.
Protocols were identified by searching the regional registers of the Consejería de Sanidad (health council of the regional government) and the «gerencias de atención primaria» (primary care management offices), and by a telephone survey of all health center coordinators. A total of 470 clinical protocols were submitted to the research group or obtained in the course of a visit to the center. Eight documents were excluded from the analysis because of missing data (authors, health center, etc.); in all, 462 documents were evaluated. The independent variables and their categories are shown in table 1.
The selection of structural quality criteria and the description of each criterion are detailed along with other characteristics of the materials and methods in an earlier publication.19 To evaluate structural quality of the protocols we used compliance with each of the 9 quality criteria established for each protocol.
To identify those characteristics that had an overall influence on structural quality of the documents we investigated the relation between the independent variables and two dependent variables: compliance rate and ratlog. Compliance rate for a given protocol is the ratio of the number of structural quality criteria the document met to the total number of structural criteria considered (9). The variable designated ratlog was a dichotomous variable coded 0 if the protocol's percentage of compliance with the criteria was higher than the mean compliance rate for all protocols, and 1 if it was lower.
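The two dependent variables can be illustrated with a minimal sketch. The function names and the sample counts below are hypothetical, and the handling of a protocol that falls exactly on the mean (coded 0 here) is an assumption, since the definition above does not cover that case.

```python
N_CRITERIA = 9  # number of structural quality criteria per protocol (from the text)


def compliance_rate(criteria_met, n_criteria=N_CRITERIA):
    """Fraction of the structural criteria that one protocol meets."""
    if not 0 <= criteria_met <= n_criteria:
        raise ValueError("criteria_met must be between 0 and n_criteria")
    return criteria_met / n_criteria


def ratlog(rates):
    """Code each protocol 0 if its rate is at or above the sample mean, 1 if below.

    Treating a tie with the mean as 0 is an assumption made for this sketch.
    """
    mean_rate = sum(rates) / len(rates)
    return [0 if r >= mean_rate else 1 for r in rates]


# Hypothetical numbers of criteria met by five protocols (not study data).
criteria_met = [2, 5, 9, 3, 1]
rates = [compliance_rate(m) for m in criteria_met]
codes = ratlog(rates)
```

Compliance rate keeps the full range of the score and suits linear regression, whereas ratlog reduces it to a binary outcome suited to logistic regression.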
Most of the independent variables were qualitative in nature. The dichotomous variables were: whether the protocol formed part of a larger health care program, accreditation of the protocol, and whether it dealt with a health service that formed part of the services normally provided by the center. Qualitative variables with more than two categories were: health care area, teaching accreditation of the center, participation of more than one type of professional in the development of the protocol, population group served by the protocol, nature of the health problem, and type of activity covered by the protocol. The variables «year the center was opened» and «year of accreditation of the protocol» were also treated as categorical variables with four possible «values»: 1986, 1987, 1992 or 1993.
Nonbinary qualitative variables with k categories were converted to k-1 binary dummy variables, such that the category not entered into the model served as the reference category.20
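As an illustration of this coding scheme, the sketch below converts one categorical variable into k-1 indicator variables. The function name and the sample values are hypothetical; the category labels follow the acute, chronic and other health problems compared in the Results.

```python
def dummy_code(values, reference):
    """Convert a categorical variable with k categories into k-1 binary indicators.

    The reference category gets no indicator, so it is represented by
    all indicators being 0.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference category not present in the data")
    kept = [c for c in categories if c != reference]
    rows = [{c: int(v == c) for c in kept} for v in values]
    return kept, rows


# Illustrative values for the "nature of the health problem" variable
# (hypothetical sample, not study data), with "chronic" as the reference.
problem_type = ["chronic", "acute", "other", "chronic", "acute"]
columns, indicators = dummy_code(problem_type, reference="chronic")
```

Each regression coefficient for an indicator is then interpreted relative to the reference category.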
Multivariate analysis was used to search for characteristics of the protocols or of their writing and development process that were significantly associated with greater quality of the protocol design. Two types of analysis were used: multiple regression and logistic regression.
Multiple regression
Multiple regression analysis was used to try to identify those characteristics that improved the protocols regardless of whether the compliance rate was below the mean for all protocols analyzed. We verified that the data fulfilled the requirements for this approach (linear relation, normal distribution, homoscedasticity and absence of collinearity) by analysis of residuals and analysis of tolerance.21 The «enter» method was used with compliance rate (mean compliance per protocol) as the dependent variable. As independent variables we used those mentioned above. Year of accreditation of the protocol was subsequently excluded because of the low number of documents (175 out of 462) for which this information was available.
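For reference, the general form of such a model and of the tolerance statistic can be written as follows. The symbols are standard notation assumed here for illustration; they do not appear in the original report.

```latex
% Multiple linear regression: y_i is the compliance rate of protocol i and
% x_{i1}, ..., x_{ip} are the (mostly dummy-coded) independent variables
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)

% Tolerance of predictor j, where R_j^2 is obtained by regressing x_j
% on all the other predictors; values near 0 indicate collinearity
\mathrm{Tolerance}_j = 1 - R_j^2
```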
Logistic regression
Logistic regression was used to try to identify the characteristics of the best protocols, i.e., to find out which features were associated with a level of structural quality higher than the mean for all protocols.
We calculated the adjusted odds ratio (OR) and its 95% confidence interval for each characteristic analyzed as an independent variable, with «ratlog» as the dependent variable.
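The conversion from a logistic regression coefficient to an adjusted OR with a Wald-type 95% confidence interval can be sketched as below. The function name, the coefficient and the standard error are hypothetical, and the original analysis may have computed its intervals differently.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    OR = exp(beta); the Wald limits are exp(beta - z*se) and exp(beta + z*se),
    with z = 1.96 for a 95% interval.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error (not values from the study).
beta, se = 1.10, 0.30
or_, lower, upper = odds_ratio_ci(beta, se)

# An association is significant at the 5% level when the interval excludes 1.
significant = lower > 1 or upper < 1
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}, significant: {significant}")
```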
All statistical analyses were done with the Statistical Package for the Social Sciences (SPSS/PC+).
Results
Of the 519 documents identified from different registries, we obtained 470 (49 were irretrievable); 462 of these underwent full evaluation.
Only a few documents had missing data for the variables «year the center was opened» (5 cases, 1.1% of the total sample) or «inclusion in services offered by the center» (45 cases, 9.7%). We considered these rates to be low and within our expectations. Many protocols did not specify the «year the protocol was written» (311 cases, 67.3%) or the «year of accreditation of the protocol» (175 cases, 37.9%), as this information was missing from the document itself and from the registry. Therefore these two variables were excluded in most of the subsequent analyses.
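The percentages above follow directly from the reported counts and the 462 evaluated documents, as this short check shows. The dictionary keys are shorthand labels introduced here for illustration.

```python
TOTAL_DOCUMENTS = 462  # documents that underwent full evaluation (from the text)

# Number of documents with missing data for each variable (from the text).
missing_counts = {
    "year_center_opened": 5,
    "inclusion_in_services": 45,
    "year_protocol_written": 311,
    "year_of_accreditation": 175,
}

# Percentage of the sample with missing data, rounded to one decimal place.
missing_pct = {
    name: round(100 * count / TOTAL_DOCUMENTS, 1)
    for name, count in missing_counts.items()
}
```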
Multiple regression with the mean number of defects per document as the dependent variable showed that better document quality was significantly associated with women's health programs (p<0.03) and inclusion in the services normally offered by the center (p<0.001). Features that were associated with worse document quality were a health care area other than those corresponding to the city of Murcia (which was used as the reference location), non-teaching health center, protocol written by nurses only or by physicians only, protocol dealing with an acute health problem, and inclusion of the protocol as an annex to a larger health program rather than being developed specifically as an independent protocol or clinical guideline (table 2).
Table 3 (showing characteristics that yielded significant results) details the results of the logistic regression analysis with the dependent variable «ratlog», which identified documents whose percent compliance rate was greater than or equal to the mean. Table 4 (showing ORs and 95% confidence intervals) summarizes the results of the overall analysis of protocol quality. The ORs showed significant associations for the Cartagena and Cieza-Molina health care areas in comparison to the Murcia area. Other significant associations were identified for protocols written by physicians alone in comparison to protocols written by a multidisciplinary team, protocols designed for acute or other health problems in comparison to chronic health problems, protocols designed for follow-up procedures rather than for acute care, protocols that formed part of a health care program, and protocols designed for a service normally offered by the health center. The odds of a given clinical guideline being of greater than average quality were nearly three times as high (OR=2.98) if the document dealt with a service normally offered by the center than if it was developed for a service not normally provided by that center. Document quality was worse for protocols included as an annex to a larger program than for those developed specifically as an independent document.
Discussion
This report details our experience in an evaluation of structural quality of protocols or clinical guidelines developed by primary care teams in the region of Murcia during a relatively long (8-year) period.
Limitations of the design
The design and application of the method used here were straightforward and have been evaluated and validated previously.1-3 The approach has been shown to be feasible and useful in primary health care, and its external validity and reproducibility in other autonomous communities have been verified in part. The quality dimension this study investigated was scientific and technical quality.1-3
We evaluated nearly all documents (462) in the universe of study (519), i.e., 89%, which supports the representativeness of our results. The process used to search for and locate relevant documents highlighted some difficulties in obtaining copies because, at the time of study, there was no central register for the entire region and no other type of institutional register in at least three of the six regional health care districts. Individual health centers did not keep central registers or files of all such documents; this fact accounts for some of the missing documents, suggests that professionals may not be able to access these instruments, and identifies an area in need of improvement. When this research project was in the design stage, we expected to find about 100 documents; the final figure was nearly five times as high. This high number may be interpreted to reflect the positive attitude of professionals toward the development of written instruments for health care, at least at the beginning of and throughout the study period.
Construct and content validity with respect to the choice of quality criteria was ensured by the fact that the study was based on previously published work, and by the process we used to guarantee reliability (interobserver agreement).
We believe some of the possible sources of bias in data collection were obviated by the detailed design of the data collection sheet, the simplicity of the data collection process, previous training, and analysis of the reliability of the observers' criteria. Because the amount of missing data was small and because the missing data were distributed in a near-random manner across all health centers, we believe these losses did not bias our results.
Comparison with earlier studies
A search of the literature found no similar studies in Spain, thus comparisons with earlier results were difficult. One earlier study16 yielded overall results similar to ours: an initial evaluation of one health center showed overall quality of the protocols developed there to be very low, but did not identify the factors that detracted from quality. Other documents developed at primary care centers within the framework of regular training activities such as the national Continuing Medical Education Program have been found to have higher structural quality, although they were also in need of substantial improvement in some areas.18
The few available studies from other countries22-24 were only partially comparable with the present report, because of differences in document collection, in the features analyzed in the documents, in the criteria used (despite methodological similarities), or in the way the results were presented. An evaluation of 855 clinical guidelines developed at 22 health centers in the Cambridge area between 1989 and 1997 showed that 75% of the guidelines referred to clinical or disease management activities,22 a finding that appears to be similar to our results.
Two earlier studies investigated many more quality criteria, which numbered 37 in one case23 and 25 in another.24 Moreover, these criteria were often much more stringent than ours. These studies also found that overall quality of the guidelines was poor: in both cases mean percentage compliance with criteria (in three groups of dimensions) was below 50% for almost all criteria. The evaluation of centers in the Cambridge area emphasized that 38% of the documents failed to state the date when they were written,22 a problem that was much more serious in our study (67% of the documents were missing this information). In another study23 the year of publication was given in 12.8% to 23.0% of the documents. The importance of this finding resides in the speed with which scientific evidence becomes outdated, and the consequent need to indicate when the contents of the guidelines were updated. This information, according to one review, was given in only 14.3% of all protocols.24
Problems with the references were found in up to 90% of all documents in the Cambridge study,22 and in 74.2% of all guidelines in the review mentioned above.24 In our material this figure (85.1%) fell between these two percentages. Thus errors in the references appear to be one of the most important and frequent defects in primary care protocols. However, none of these studies analyzed the results with the aim of identifying the factors that were related to greater structural quality of the protocols.
Practical applicability of the results
We believe that research on the quality of health care instruments is timely and valuable because of its repercussions for (among other areas) health and the costs of health care. Our finding that the date the health center opened was not significantly related to document quality can probably be taken to imply that no improvement in quality can be expected in centers where the staff members have more experience working together. The findings allowed us to identify the factors that have a positive influence on quality. Further study would allow us to determine the conditions under which these factors occur, and might make it possible to reproduce these conditions in order to favor the production of high-quality documents such as those written by multidisciplinary teams.
A number of other opportunities for improvement were also identified, e.g., documents produced at certain health centers, those developed at nonteaching centers, guidelines that dealt with acute health problems, and guidelines included as an annex to a larger health program.
Directions for future research
Our study was undertaken as part of a research program with several areas for further development. We hope to develop corrective measures to improve the structural quality of clinical guidelines in primary health care in the region of Murcia, to re-evaluate the structure of this type of document throughout the region after corrective measures have been in place, and to validate a newly-developed instrument for the evaluation of clinical guidelines with regard to design, development, function and scientific evidence. In addition, future work will be aimed at evaluating the relevance of the guidelines, evaluating variability in the recommendations and in the scientific evidence on which they are based, and quantifying the actual use of these tools by professionals. We also hope to identify the characteristics clinical guidelines should have in order to be both useful and utilized, by comparing the structural quality criteria we used with those designed by professional members of the primary care teams who will use these tools.
In conclusion, structural quality was associated with the proportion of criteria the document fulfilled and with compliance with more than the mean number of criteria. Quality varied across health care areas within the region of Murcia. Better guidelines were developed by multidisciplinary groups of authors, dealt with chronic health problems, referred to care provided as part of the center's regular services, and were developed specifically as a tool for quality design and planning.
Better quality appeared to be associated, in some multivariate analyses, with teaching centers, with acute care, and with women's health programs. Other characteristics were found to have room for improvement, e.g., guidelines developed in certain health care districts, at nonteaching centers, and those developed for acute health problems.
Acknowledgments
We thank all those persons who assisted us with this study, especially the members of the research group and members of the participating primary care teams.
Correspondence: José Saura Llamas, C/ Atenas 21, 30120 El Palmar (Murcia), Spain. E-mail: csgoya@iname.es This research was carried out with financial support from FIS project 94/1187. None of the authors has any conflict of interest. The data in this article were first reported in a doctoral dissertation supported by FIS project 94/1187. They were communicated at the II Congreso de Calidad Asistencial [Second Congress on Quality of Care] held in Murcia in March 1999. Manuscript accepted for publication 4 July 2001.