
Role of Chance in Objective Type Competitive Examinations

The format of multiple choice examinations is used to determine rankings in competitive examinations in various higher education institutions. A statistical experiment based on actual multiple choice question papers reveals the preponderance of chance in determining rankings based on performance. With so much of "chance" determining performance, it is questionable if multiple choice-based competitive examinations provide a good measure of the intellectual abilities of students.

Pramod Srivastava, Shobhit Mahajan

Pramod Srivastava (pramodk.srivastava@gmail.com) retired as professor of Physics from the University of Delhi and Shobhit Mahajan (shobhit.mahajan@gmail.com) is with the Department of Physics and Astrophysics, University of Delhi, Delhi.

One of the distinguishing characteristics of the higher education scenario in recent times has been the dominance of competitive examinations. Immediately after class XII, it is impossible to get into any professional course on the basis of the school leaving marks alone. From medicine to engineering to fashion design – every single course and indeed institution demands that one sits for an open examination to get admission. And this continues even after one graduates, for postgraduate studies in medicine, engineering, management, computer applications, etc. Finally, a whole range of government/public sector jobs are also available only after clearing competitive examinations, the coveted civil services being only the most well known.

There is of course a stated rationale behind this proliferation of competitive examinations. The logic goes as follows: given that the standards of various examinations (Central Board of Secondary Education (CBSE), state boards, universities, etc) are so different, there is no way that marks obtained in these examinations can be compared and used to assess the quality of the candidate. Hence the need to carry out another assessment. Of course, it is assumed by each agency that its examination in some sense does indeed assess the true quality of the student, something which is far from obvious.

Without going into the merits and demerits of this logic, in this piece we want to try and see if the performance in the entrance examinations does indeed measure the capabilities of the student. More specifically, we want to see if there is an element of chance in the performance.

Multiple Choice Examinations

It is a common sentiment that passing in a competition is, apart from many other factors, a matter of chance. Here we attempt to give this statement a fair measure of objective meaning with regard to one kind of method of testing, namely, multiple choice objective examinations. Multiple choice objective examinations have reduced dependence on long, essay type questions and eliminated the uncertainty associated with the multiplicity of human examiners. This format makes the scripts simple to evaluate and, in fact, capable of being evaluated by a machine.

The elimination of the human examiner has led many to believe that the result of a multiple choice test is sacrosanct. Of course, there is no doubt that the subjectivity which might creep in because of a human examiner is eliminated with this format. However, to jump from this to say that the result of a multiple choice examination is a true representation of the inherent capabilities of the examinee is where the problem arises. For instance, if the questions were replaced by a different set of questions of an equal level of difficulty, would the marks scored by the examinees remain more or less the same?
If not, then there is a hidden sampling error in the choice of questions. This error can, in principle, be analysed by making the examinees undergo many equivalent tests over a short span of time. Clearly, this is not a very feasible experiment with any reasonable sample of students. Instead, to try and understand the inherent sampling errors in multiple choice tests, we adopt another procedure which mimics the actual experiment closely.

Our experiment is based on the actual data from the entrance examination to a professional course which was taken by over 57,000 students. The actual test consisted of two papers, with 100 questions in each paper, and the total marks of each student in the two papers were used to rank the students. Of course, since the marks were only out of 200 and the number of students was many times that, there was a lot of degeneracy – a large number of people had the same rank, and there were gaps in the ranks whenever more than one person got the same mark and hence rank.

With this original data set, we prepared another fictitious data set which comprised 10 question papers with 50 questions each. These 50 questions were taken randomly from the original 100 questions in one paper. (We repeated the whole exercise with the other paper with no substantial change in the results.) In this way, we now had 11 test papers for each candidate – one original one and 10 fictitious ones.
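To make this resampling procedure concrete, here is a minimal sketch of how it could be coded. The array names, the random placeholder data, the marking scheme (one mark per correct answer, no negative marking) and the use of numpy are illustrative assumptions, not details of the actual examination or of the original analysis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical data: 57,000 candidates, 100 questions in one original paper.
# answers[i, q] == 1 if candidate i got question q right, else 0 (placeholder values).
n_students, n_questions = 57_000, 100
answers = rng.integers(0, 2, size=(n_students, n_questions))

# Marks in the original paper (assuming one mark per correct answer, no negative marking).
original_marks = answers.sum(axis=1)

# 10 fictitious papers, each a random draw of 50 of the 100 original questions,
# with every candidate re-scored on each of them.
n_papers, paper_size = 10, 50
fictitious_marks = np.empty((n_papers, n_students), dtype=int)
for p in range(n_papers):
    chosen = rng.choice(n_questions, size=paper_size, replace=False)
    fictitious_marks[p] = answers[:, chosen].sum(axis=1)
```

Each row of fictitious_marks then plays the role of one of the 10 fictitious papers described above.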

Figure 1: Ranking Noise (Paper 1) with Errors (original ranks and sampled ranks with error bars, plotted against marks)

Figure 2: Ranking Noise (Paper 1) with Errors (the same plot for the first 30 distinct ranks)

Figure 3: Histogram of Ranks in 10 Sample Papers for Students Having 68 Marks in Original Paper

Figure 4: Ranking Noise (Paper 2) (original ranks and sampled ranks with error bars, plotted against marks)

Since the fictitious papers were made from the original test paper, the marks of each candidate in each of the 10 fictitious papers were known. Thus each candidate had 11 marks and ranks – one original and 10 in the fictitious papers. In fact, there are several students associated with each original rank, and each of these students has 10 other fictitious ranks.
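The shared ranks, and the gaps in the ranks that follow them, correspond to what is usually called competition ranking. The sketch below shows one way such ranks could be assigned from a vector of marks; it continues the hypothetical arrays of the earlier snippet and is not the examination agency's actual procedure.

```python
import numpy as np

def competition_ranks(marks: np.ndarray) -> np.ndarray:
    """Rank 1 for the highest mark; tied marks share a rank and the ranks
    that follow are skipped (e.g. marks 90, 88, 88, 85 -> ranks 1, 2, 2, 4)."""
    order = np.argsort(-marks, kind="stable")   # candidate indices, best mark first
    ranks = np.empty_like(order)
    current_rank = 0
    for position, idx in enumerate(order):
        # A strictly lower mark gets a rank equal to its 1-based position in the sorted list.
        if position == 0 or marks[idx] != marks[order[position - 1]]:
            current_rank = position + 1
        ranks[idx] = current_rank
    return ranks

# The "degeneracy": tied marks produce a shared rank and a gap after it.
print(competition_ranks(np.array([90, 88, 88, 85])))   # -> [1 2 2 4]
```

Applying the same function to original_marks and to each row of fictitious_marks gives every candidate the 11 ranks referred to above.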

We plot the original data on a graph, i e, make a plot of the ranks in the original data vs the corresponding marks. This curve is shown as the unbroken line in Figure 1.

At each rank, we now take the number of students who got that rank and find their ranks and marks in the 10 fictitious papers. Typically, there are around 100 students at most ranks and, remembering that each of these students has 10 other ranks, we get a large enough sample to warrant a statistical interpretation. We take this distribution and compute its mean and standard deviation, which gives us an estimate of the spread around the mean.

This information is plotted in Figure 1 as the broken line. The data points are the mean of the distribution at each rank and the error bars are the standard deviation. To make the data more obvious, the same plot for the first 30 distinct ranks is shown in Figure 2.

To further study the samples, we choose an arbitrary rank in the original paper. We take a rank near 1,000, assuming that in a typical examination, rank 1,000 would be the qualifying rank. At rank 1,060 (68 marks in the original Paper 1), we take the 164 students who have got this rank. For these 164 students, we consider their ranks in each of the 10 fictitious papers. Thus we have in all 1,640 ranks in the fictitious papers, and we find the mean and the standard deviation of this distribution. The distribution is shown in Figure 3.

It is interesting to see the same graphs for Paper 2, which we exhibit as Figure 4. The second paper shows a trend similar to the first paper. In detail, however, the number of candidates at each rank is more than in Paper 1. The curve therefore rises more steeply and the standard errors are also a little higher.

The detailed graph of the sample in this case is shown for the rank 968 (73 marks in Paper 2) as Figure 5 (p 23). The number of candidates at this rank in this paper is 185. The histogram presented is therefore for 1,850 ranks. The mode of the distribution is near 1,000 and the quartiles are near 550 and 1,575.

We have performed this analysis with a set of 10 sample papers. To see if the trend is dependent on the sample size (number of papers), we repeated the analysis for a sample of 50 papers and we find a very similar histogram. This is shown in Figure 6 (p 23).

From the data, we calculate the number of "students" who are in a given range of marks. For Paper 1, we see that at rank 1,060 in the original paper, there were 164 students. If we take the distribution of their marks in the 10 fictitious papers, we see that roughly 50% of the students lie between rank 758 and 1,521, a relatively large spread in ranks. Similarly, for Paper 2, the rank we have chosen is 968 and in this case the spread for 50% is between ranks 646 and 1,639. It is remarkable that the two papers show a relatively similar spread.

Recall that these are students who got exactly the same marks (rank) in the original test. Furthermore, the fictitious papers that we have made are sampled from the original paper itself. But, even with this, there is indeed a spread in the ranks. What this implies is that the exact rank that a student gets in such an examination is crucially dependent on the sample of questions that are chosen from the larger set of questions. In our case, this was 50 questions from a set of 100.
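The per-rank statistics used above reduce to a few lines of array manipulation. The following sketch, again built on the hypothetical arrays of the earlier snippets, shows how the mean, standard deviation and middle-50% spread of the fictitious ranks might be computed for the students tied at one original rank; the specific figures quoted in the article (rank 1,060, 164 students, the spread from 758 to 1,521, and so on) come from the actual examination data, not from this code.

```python
import numpy as np

def spread_at_rank(original_ranks: np.ndarray,
                   fictitious_ranks: np.ndarray,
                   rank: int) -> dict:
    """Spread of the fictitious ranks of all students who are tied at `rank`
    in the original paper; fictitious_ranks has shape (n_papers, n_students)."""
    tied = np.where(original_ranks == rank)[0]      # students sharing this original rank
    pooled = fictitious_ranks[:, tied].ravel()      # all of their fictitious ranks
    q1, q3 = np.percentile(pooled, [25, 75])
    return {
        "n_students": tied.size,
        "n_fictitious_ranks": pooled.size,          # e.g. 164 students x 10 papers = 1,640
        "mean_rank": pooled.mean(),
        "std_rank": pooled.std(),
        "middle_50_percent": (q1, q3),              # the quartile spread of ranks
    }
```

Repeating this at every distinct original rank gives the means and error bars plotted in Figures 1, 2 and 4, and the pooled ranks at a single original rank are what the histograms in Figures 3, 5 and 6 display.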
