Kardong-Edgren Suzan, Oermann Marilyn H, Rizzolo Mary Anne, Odom-Maryon Tamara
About the Authors: Suzan Kardong-Edgren, PhD, RN, CHSE, FAAN, ANEF, is a professor and director of the RISE Center, School of Nursing and Health Sciences, Robert Morris University, Moon Township, Pennsylvania. Marilyn H. Oermann, PhD, RN, FAAN, ANEF, is Thelma M. Ingles Professor of Nursing and director of evaluation and educational research, Duke University School of Nursing, Durham, North Carolina. Mary Anne Rizzolo, EdD, RN, FAAN, ANEF, is a consultant for the National League for Nursing. Tamara Odom-Maryon, PhD, is a professor of research, Washington State University College of Nursing, Spokane. For more information, contact Dr. Kardong-Edgren at
Nurs Educ Perspect. 2017 Mar/Apr;38(2):63-68. doi: 10.1097/01.NEP.0000000000000114.
This article describes a standardized training method for establishing the inter- and intrarater reliability of a group of raters for high-stakes testing.
Simulation is increasingly used for high-stakes testing, but research on developing inter- and intrarater reliability among raters is lacking.
Eleven raters were trained using a standardized methodology. Raters scored 28 student videos over a six-week period. Raters then rescored all videos over a two-day period to establish both intra- and interrater reliability.
One rater demonstrated poor intrarater reliability; a second rater failed all students. When these two outlier raters' scores were excluded, kappa statistics improved from the moderate to the substantial agreement range.
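To make the kappa terminology concrete, the following is a minimal sketch of how agreement between one rater's two scoring passes could be quantified. It is not the authors' analysis: the abstract does not state which kappa variant was used, so Cohen's kappa is assumed here, the pass/fail scores are hypothetical, and the band labels follow the commonly cited Landis and Koch cut points (0.41 to 0.60 moderate, 0.61 to 0.80 substantial).

```python
import math
from collections import Counter


def cohen_kappa(first_pass, second_pass):
    """Cohen's kappa for two equal-length lists of categorical scores."""
    if len(first_pass) != len(second_pass) or len(first_pass) == 0:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(first_pass)
    # Proportion of videos that received the same score in both passes.
    observed = sum(a == b for a, b in zip(first_pass, second_pass)) / n
    # Agreement expected by chance, from each pass's marginal frequencies.
    counts_1 = Counter(first_pass)
    counts_2 = Counter(second_pass)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if math.isclose(expected, 1.0):
        # Both passes used one identical category, so kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Label a kappa value using the Landis and Koch cut points (assumed)."""
    if math.isnan(kappa):
        return "undefined"
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical scores from one rater for eight videos, scored twice.
first = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
second = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
kappa = cohen_kappa(first, second)
print(round(kappa, 3), agreement_band(kappa))
```

Because kappa corrects for chance agreement, a rater who gives every video the same score in both passes produces an undefined value in this sketch rather than perfect agreement. For agreement across many raters at once, a multi-rater statistic such as Fleiss' kappa would be a more natural choice than the pairwise version shown here.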
There may be faculty who, for different reasons, should not be included in high-stakes testing evaluations. All faculty are content experts, but not all are expert evaluators.