Psicología Educativa - Educational Psychology
Vol. 20, No. 2
Pages 79-87 (December 2014)
Open Access
An introduction to the use of evidence-centered design in test development
Introduction to evidence-centered design in test construction
Michael J. Zieky
Educational Testing Service, Princeton, U.S.A.

Under a Creative Commons license

The purpose of this article is to describe what Evidence-Centered Design (ECD) is and to explain why and how ECD is used in the design and development of tests. The article will be most useful for readers who have some knowledge of traditional test development practices but who are unfamiliar with ECD. The article begins by describing the major characteristics of ECD, adds a brief note on its origins, and discusses its relationship to traditional test development. Next, the article lists the important advantages of using ECD, with an emphasis on the validity of the inferences made about test takers on the basis of their scores. The article then explains the nature and purpose of the “layers” or stages of the ECD test design and development process: 1) domain analysis; 2) domain modeling; 3) conceptual assessment framework; 4) assessment implementation; and 5) assessment delivery. The article ends with some observations, drawn from my experience with the early application of ECD, for those who plan to begin using it, followed by a brief conclusion and some recommendations for further reading.

Keywords:
Evidence-centered design
Test development
Test design
Test construction
Evidentiary reasoning

The aim of this paper is to describe what Evidence-Centered Design (ECD) is and to explain why and how it is used to design and construct tests. The paper is intended especially for readers who are already somewhat familiar with traditional test construction practices but who do not know ECD. It begins with a description of the fundamental characteristics of ECD, continues with a brief note on its origin, and analyzes its relationship to traditional practice in test construction. Next, it sets out the advantages of using ECD, highlighting its impact on the validity of the inferences made about test takers on the basis of their test scores. The article explains the nature and purpose of the ‘layers’ or stages of the ECD test design and construction process: 1) domain analysis, 2) domain modeling, 3) conceptual assessment framework, 4) assessment implementation, and 5) assessment delivery. Finally, it offers some comments on the author's experience in applying ECD for those who are thinking of starting to use it, together with a brief conclusion and some recommendations for further reading on the topic.

The full text is available as a PDF.

Correspondence concerning this article should be addressed to Michael J. Zieky. ETS MS 04N. 660 Rosedale Road. Princeton, NJ, USA, 08541.

Copyright © 2014. Colegio Oficial de Psicólogos de Madrid