Archive for November 5th, 2010

It may come as some surprise that I have serious reservations about the various tests out there intended to measure the “IQ” of those people who fall outside the normal parameters of standardized, professional testing. You would be forgiven for assuming that I would relish the chance to show just how much beyond the boundaries of regular testing I actually fall, right?


There are a number of problems, I feel, and the possibility that some people will find ways to cheat on an unsupervised power test is the least of them.

For a start, there is the question of sample size. A professional test will typically be beta-tested on hundreds, if not thousands, of volunteers across a broad spectrum of ability. This process is used by the test developer to eliminate bad items, clarify items that are ambiguous, and iron out any other possible flaws. No publisher wants to release a test that unfairly penalizes or unfairly rewards testees for spurious reasons, and that is why tests are “normed” to ensure that they are statistically reliable.

Since the entire science of statistics is a numbers game, and random factors tend to average out when the sample size is sufficiently large, it stands to reason that when only a very limited number of specimens of a certain type exist, any random factors among them are going to skew the figures disproportionately.
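To put a rough number on that, here is a minimal sketch of how the uncertainty in a sample average shrinks as the sample grows. It uses the textbook standard error of the mean, which is only one simplified stand-in for what real norming involves, and the standard deviation of 15 points is my assumption for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)

# Assumed population standard deviation of 15 IQ points (illustrative only).
for n in (10, 100, 1000):
    print(n, round(standard_error(15.0, n), 2))
```

With ten testees the average is uncertain by several points; with a thousand, by less than half a point.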

A person with an IQ of 175 represents a rarity of about one person in a million. That means that there will only be approximately 6,500-7,000 such individuals on Planet Earth. It is probable that the majority of that number live in poverty in the Third World, never received any formal education, don’t speak any language in which the test is available, or just simply have no interest in taking tests of this type. It follows, then, that the number of individuals on which a test for that range can be normed may be insufficient to meet the usual criteria.
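The rarity arithmetic can be checked with a few lines of Python. This is only a sketch under stated assumptions: scores are treated as normally distributed with a mean of 100, the standard deviation depends on which scale is being quoted (16 and 15 are both in common use), and the world population of 6.9 billion is a rough 2010 figure. The function names are my own.

```python
from statistics import NormalDist

def rarity(iq, mean=100.0, sd=16.0):
    """Return N such that roughly one person in N scores at or above iq.

    Assumes a normal distribution; sd=16 is an assumed scale, not a given.
    """
    # By symmetry, the upper tail above iq equals the lower tail below 2*mean - iq.
    tail = NormalDist(mean, sd).cdf(2 * mean - iq)
    return 1.0 / tail

def expected_count(iq, population=6.9e9, mean=100.0, sd=16.0):
    """Expected number of people at or above iq in an assumed population."""
    return population / rarity(iq, mean, sd)

for sd in (16.0, 15.0):
    print(sd, round(rarity(175, sd=sd)), round(expected_count(175, sd=sd)))
```

On a 16-point scale the result is in the neighbourhood of one in a million; on a 15-point scale the same score is several times rarer, which only shrinks the pool available for norming.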

The second problem I have has to do with what goes into the tests themselves. Usually, when we seek to measure something, we have a rough idea what it is that we are seeking to measure. I would expect, therefore, that items on tests are chosen with the purpose in mind of testing a specific aspect of a person’s cognitive potential – vocabulary, pattern recognition, numerical aptitude, or whatever.

After seeing some of the high-range tests available online, I wonder whether the author had any idea what he was trying to test. Was there actually a purpose behind the questions: if not a theory of IQ, then at least a working set of opinions? It appears to me that many such tests include items just because the author found them attractive or because he thought that they would be “hard” to solve, not because they actually measure any particular aspect of cognitive functioning with any reliability. I seriously question the tests’ construct validity.

That leads us on to my third objection: many high-range tests, in order to make the questions “hard”, resort to requiring specialist academic knowledge, throwing in arcane bits of vocabulary, or expecting a knowledge of higher mathematics. The authors no doubt justify this by allowing the use of dictionaries, references and calculators, and an unlimited amount of time to complete the questions, but it still leaves the person who starts out with no familiarity with those subject areas at a disadvantage.

For an example of what I mean, consider the following. Imagine a problem that asks you to calculate the amount of heat loss from a building. You are given information to do with areas and dimensions, the amount of heat given out over a given time by the type of heating system in the building, and information regarding how much heat is lost via different types of building materials. It is probable that laypeople of a certain level of ability, sitting for an unlimited amount of time with a calculator, would be able to come up with a plausible answer.

Now imagine an experienced surveyor is presented with the same problem. He would probably be able to come up with an accurate solution in minutes, given that this is the type of thing he has to do frequently in his job, and he would have the exact set of algorithms at his fingertips. Furthermore, he would be able to take into account factors that he would understand from the test question that might never occur to the layperson, for reasons of familiarity with the subject area.

What has not been taken into account here is CONTEXT. If I have to teach myself some abstruse topic in geometry to be able to tackle a high-range test question, what might I miss, just because I am missing the context and familiarity that an expert lecturer in mathematics may have? This is not “IQ”, but the ability to research bits of arcane and arbitrary data for the purpose of answering a single, random question. As it happens, I’m pretty good at hunting down such trivia and working with it, but that’s not the point.

Finally, one would expect that any scale of measurement is measuring the same thing from top to bottom. A thermometer doesn’t hit one hundred degrees and then increasingly start measuring something else. It measures temperature all the way up and down its range. Yet Arthur Jensen argues that beyond a certain level, tests cease to measure “pure” IQ and increasingly measure specialised abilities.

One possible workaround would be to take the same type of questions and tasks as are included in tests like the Wechsler or Stanford-Binet, and extend the scale on each subtest by including longer memory sequences, harder vocabulary words, more complex matrix items etc. Even though this would remove the temptation to increase the difficulty of the test by including items requiring specific academic knowledge, there is still the problem of adequate sample size for norming, as mentioned earlier.

If it is not possible to measure IQ with any certainty beyond a certain level (most professional tests seem to top out at about IQ 160), then I guess I’m happy to just leave it at that. With so few people at or above that range, what useful purpose does discriminating further serve?
