Scaled scores and PEP


BY Oswald Leon

Wednesday, August 21, 2019

The wait is over. The Ministry of Education has rolled out its inaugural results format for the Primary Exit Profile (PEP). Students, parents, teachers, and the wider society are still trying to make sense of this new results format, which is part of an effort to ensure that standardised test results have a consistent meaning for all test takers.

PEP provides a common basis for evaluating and comparing test takers' abilities in specific content areas. Because the test is administered in varied forms, raw scores are often transformed into a set of values called scaled scores so that scores can be interpreted consistently. Scaled scores are scores that have been mathematically transformed from one set of numbers (usually raw scores) to another in order to make them comparable, for example across different editions or “forms” of the same test, and to help users interpret test results.

A raw score is the total number of points a test taker obtains by answering questions correctly on a test. A per cent-correct score is the percentage of questions a test taker answers correctly. For example, a test taker who correctly answers 20 out of 50 questions has a per cent-correct score of 40 per cent.
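The arithmetic behind a per cent-correct score can be sketched in a few lines of Python. The function name `percent_correct` is an illustrative assumption, not part of any PEP system; the numbers are the ones used in the example above.

```python
def percent_correct(num_correct: int, num_questions: int) -> float:
    """Return the percentage of questions answered correctly."""
    if num_questions <= 0:
        raise ValueError("num_questions must be positive")
    if not 0 <= num_correct <= num_questions:
        raise ValueError("num_correct must be between 0 and num_questions")
    return 100.0 * num_correct / num_questions


# 20 correct answers out of 50 questions
print(percent_correct(20, 50))  # 40.0
```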

For purposes of this discussion, it is also important to explain the meaning of the word comparability and begin with what it is not. Comparability is not an attribute of a test or test form, nor is it a yes/no decision. Instead, comparability is the degree to which scores resulting from different assessment conditions can support the same inferences about what students know and can do. Comparability becomes important when we make the claim that students and schools are being held to the same standard, particularly when those designations are used in a high-stakes accountability context (Evans & Lyons, 2017).

Why shift from percentage to scaled scores?

For fair and consistent decisions to be made on exam results, scores should be comparable. This means that scores from different forms of a test should indicate the same level of performance regardless of which exam form a test taker has received. Comparable scores must therefore account for the potential variability in difficulty between exam forms. Although strict adherence to common test specifications or blueprints allows test developers to create multiple forms, these forms are rarely, if ever, exactly equal in difficulty. Here, the difficulty of an item is measured by the proportion of students who answered it correctly: the smaller the proportion, the harder the item. This makes it hard to use percentage-based scores because they do not always represent a fair comparison of test takers' performances on different forms of the same test.
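The notion of difficulty as the proportion of students answering an item correctly can be computed directly from a table of responses. The sketch below is illustrative only: the function name `item_difficulty` and the response data are hypothetical, and a lower proportion indicates a harder item.

```python
def item_difficulty(responses: list[list[int]]) -> list[float]:
    """Proportion of students answering each item correctly.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    if not responses:
        raise ValueError("responses must contain at least one student")
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[i] for row in responses) / n_students
        for i in range(n_items)
    ]


# Hypothetical data: 4 students, 3 items
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
print(item_difficulty(responses))  # [1.0, 0.5, 0.25]
```

In this made-up data, every student answered the first item correctly, so it is the easiest; only one in four answered the third item correctly, so it is the hardest.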

To illustrate the point: getting 50 per cent correct on a hard form may mean the test taker has more knowledge and skill than another test taker getting 60 per cent correct on a relatively easier form. For the same reason, raw scores cannot be used to compare test takers' performances on different forms. When two test takers get the same raw score on two different forms, the test taker who took the more difficult form has demonstrated a higher level of performance than the test taker who took the relatively easier form.

To achieve comparability, standardised test results report scaled scores. This involves two processes known as scaling and equating. Scaling is the process by which raw scores are statistically adjusted and converted on to a common scale to account for differences in difficulty across forms. Equating measures the difficulty of each exam form and adjusts the passing score as needed, so that the passing score reflects the same level of test taker performance regardless of the difficulty of the form. In this way an equivalent passing standard is maintained for each form: test takers who happen to take a slightly more difficult exam form are not penalised, and test takers who take a slightly easier exam form are not given an unfair advantage.
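One textbook way to equate two forms is linear equating, which matches the mean and spread of raw scores on one form to those on another. The sketch below is offered only as an illustration of the idea; the column does not say which equating method PEP uses, and the function name `linear_equate`, the score lists, and the assumption that both forms were taken by groups of similar ability are all hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(x: float, scores_new: list[float], scores_ref: list[float]) -> float:
    """Map raw score x on the new form to the reference form's raw-score scale.

    Uses y = mean_ref + (sd_ref / sd_new) * (x - mean_new).
    """
    mu_new, sd_new = mean(scores_new), pstdev(scores_new)
    mu_ref, sd_ref = mean(scores_ref), pstdev(scores_ref)
    if sd_new == 0:
        raise ValueError("scores on the new form must not all be identical")
    return mu_ref + (sd_ref / sd_new) * (x - mu_new)


# Hypothetical raw scores (out of 60) from two groups of similar ability
hard_form = [20, 25, 30, 35, 40]   # mean 30
easy_form = [26, 31, 36, 41, 46]   # mean 36

# A raw score of 30 on the hard form corresponds to about 36 on the easy form
equivalent = linear_equate(30, hard_form, easy_form)
print(round(equivalent, 2))  # 36.0
```

With these made-up numbers, 30 out of 60 (50 per cent) on the harder form is treated as equivalent to 36 out of 60 (60 per cent) on the easier form, which mirrors the 50 per cent versus 60 per cent illustration given earlier.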

Usefulness of scaled scores

Table 1 shows reported scaled scores that are obtained by statistically adjusting and converting raw scores on to a common scale to account for differences in difficulty across different forms. Even though Table 1 is hypothetical, it illustrates how scaled scores function. Based on the table, for an easier form, a test taker needs to answer slightly more questions correctly to get a particular scaled score. For a more difficult form, a test taker can get the same scaled score by answering slightly fewer questions correctly. The reason for this is that scaled scores take into consideration the level of difficulty of the test items and the level of ability of the test taker. The scale score of an item is a measure of the extent of skills and knowledge a student needs to be successful on the item. A difficult item has a high scale score because answering it correctly requires more technical skills and richer knowledge than items lower on the scale. Thus, the utility of scaled scores comes into sharper focus: they allow for meaningful score interpretations while minimising misinterpretations and inappropriate inferences (Tan & Michel, 2011).

This is in keeping with the purpose of PEP: measuring students' readiness for grade seven, functioning as a means of placing students in secondary schools, generating an academic profile of each student, and providing accurate information about students' knowledge, abilities and skills across several subject areas.

Let us be true to our educational mandate: “Every child can learn, every child must learn.”

Oswald Leon is a measurement specialist. Send comments to the Observer.
