
What is the reliability of the Physiotherapy Competency Examination?

Reliability is the extent to which scores would be reproducible on repeated administrations of the exam. Since we cannot administer the same exam repeatedly to the same individuals, we use statistical methods to estimate reliability from the results of a single administration to a single group.

The bodies that develop standards for educational and psychological tests (the American Educational Research Association, the American Psychological Association and the National Council on Measurement in Education) do not set numerical thresholds for reliability, even for use in specific types of decision making. The reason for this is that a “one size fits all” approach to reliability is not consistent with the context specific nature of psychometrics. Different standards must be used for different kinds of exam results.

When an exam tests clinical competence in order to determine whether a candidate should be licensed, it is important that the test consistently classifies candidates as passing or failing relative to a standard. The most important form of reliability for licensing exams is therefore consistency of classification: would the same candidates pass or fail on a repeated administration?

For the Written Component, we use the Cronbach alpha coefficient, a measure of internal consistency, to assess the reliability of test results. The Written Component consistently achieves acceptable Cronbach alpha values.
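As an illustration of the statistic itself (not CAPR's internal procedure), Cronbach's alpha can be computed from a candidates-by-items score matrix using its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a (candidates x items) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    where k is the number of items. Uses sample variance (ddof=1).
    """
    X = np.asarray(item_scores, dtype=float)
    k = X.shape[1]                       # number of items
    item_vars = X.var(axis=0, ddof=1)    # variance of each item across candidates
    total_var = X.sum(axis=1).var(ddof=1)  # variance of candidates' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Two perfectly correlated items yield alpha = 1.0
print(cronbach_alpha([[0, 0], [1, 1], [0, 0], [1, 1]]))  # 1.0
```

Alpha rises when items covary (candidates who score well on one item tend to score well on others), which is why it is read as a measure of internal consistency.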

In 2015, CAPR changed the method for assessing reliability of classification for the Clinical Component from the Subkoviak method to the Livingston and Lewis (1995) method. With this approach, we can compute a coefficient of consistency of classification. The Clinical Component of the Physiotherapy Competency Examination (PCE) consistently achieves acceptable values for criterion-referenced consistency of classification at the passing score for both the total score criterion and the number of stations criterion.
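The Livingston and Lewis method models true scores with a beta-binomial distribution; as a rough sketch of the underlying idea only (not their procedure), one can simulate two parallel administrations per candidate and count how often both land on the same side of the cut score. The function name, parameters, and score model below are illustrative assumptions:

```python
import random

def classification_consistency(true_scores, n_items, cut, trials=2000, seed=1):
    """Monte Carlo estimate of decision consistency (illustrative only).

    Each candidate has a true probability-correct in true_scores. We simulate
    two independent binomial administrations of n_items items and report the
    proportion of pairs classified the same way (pass/pass or fail/fail)
    relative to the cut score.
    """
    rng = random.Random(seed)
    agree = 0
    total = 0
    for p in true_scores:
        for _ in range(trials):
            a = sum(rng.random() < p for _ in range(n_items)) >= cut
            b = sum(rng.random() < p for _ in range(n_items)) >= cut
            agree += (a == b)
            total += 1
    return agree / total

# Candidates far from the cut score are classified consistently...
print(classification_consistency([0.9, 0.1], n_items=50, cut=30, trials=200))
# ...while a borderline candidate is classified much less consistently.
print(classification_consistency([0.6], n_items=50, cut=30, trials=200))
```

This makes the key property visible: consistency of classification depends not only on the precision of the scores but also on how many candidates sit near the passing score.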

Finally, reliability is better judged by comparison with other examination programs than against an arbitrary external standard. In this respect, for the Clinical Component, Norcini found that the “reproducibility of the [total binary score] is not equivalent to most written exams, but it is comparable to other Objective Structured Clinical Examination/oral examination formats.”