Changes: Ceiling effect

Latest revision as of 23:27, 5 November 2012


The term ceiling effect has two distinct meanings, referring to the level at which an independent variable no longer has an effect on a dependent variable, or to the level above which variance in an independent variable is no longer measured or estimated. An example of the first meaning, a ceiling effect in treatment, is pain relief by some kinds of analgesic drugs, which have no further effect on pain above a particular dosage level. An example of the second meaning, a ceiling effect in data-gathering, is a survey that groups all respondents into income categories, not distinguishing incomes of respondents above the highest level asked about in the survey instrument.

Treatment

A ceiling effect in treatment, in which further changes in the level of an independent variable have no further effect on a dependent variable, is commonly observed in pharmacology and medicine.

In medicine, a ceiling effect is defined as "the phenomenon in which a drug reaches a maximum effect, so that increasing the drug dosage does not increase its effectiveness."^[1] Sometimes drugs cannot be compared across a wide range of treatment situations because one drug has a ceiling effect.^{[citation needed]}
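
The saturation described above can be illustrated with a small numerical sketch. The hyperbolic (Emax-style) curve used here is an assumption chosen for illustration and is not taken from the cited dictionary; the function name `emax_response` and the parameter values `e_max` and `ed50` are hypothetical.

```python
def emax_response(dose, e_max=100.0, ed50=10.0):
    """Hyperbolic dose-response curve: the effect approaches e_max but never exceeds it.

    e_max and ed50 are illustrative values, not properties of any real drug.
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    return e_max * dose / (ed50 + dose)


doses = [10, 100, 1000, 10000]
effects = [emax_response(d) for d in doses]

# Additional effect gained from each tenfold increase in dose.
gains = [later - earlier for earlier, later in zip(effects, effects[1:])]

for dose, effect in zip(doses, effects):
    print(f"dose={dose:>6}  effect={effect:6.2f}")
print("gain per tenfold increase:", [round(g, 2) for g in gains])
```

Under these assumptions each tenfold increase in dose adds less effect than the previous one, which is the pattern a ceiling effect in treatment describes.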

Data-gathering

A ceiling effect in data-gathering, when variance in an independent variable is not measured or estimated above a certain level, is a commonly encountered practical issue in gathering data in many scientific disciplines. Such a ceiling effect is often the result of constraints on data-gathering instruments. When a ceiling effect occurs in data-gathering, there is a bunching of scores at the upper level reported by an instrument.^[2]

Response bias constraints

A population survey about lifestyle variables influencing health outcomes might include a question about smoking habits. To guard against the possibility that a respondent who is a heavy smoker might decline to give an accurate response about smoking, the highest level of smoking asked about in the survey instrument might be "two packs a day or more." This results in a ceiling effect in that persons who smoke three packs or more a day are not distinguished from persons who smoke exactly two packs. A population survey about income similarly might have a highest response level of "$100,000 per year or more," rather than including higher income ranges, as respondents might decline to answer at all if the survey questions identify their income too specifically. This too results in a ceiling effect, not distinguishing persons who have an income of $500,000 per year or higher from those whose income is exactly $100,000 per year.
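
The income example can be sketched in a few lines. The function name `top_code` and the sample incomes are hypothetical; only the $100,000 top response level comes from the example above.

```python
TOP_CODE = 100_000  # highest response level in the hypothetical survey, in dollars per year


def top_code(income, ceiling=TOP_CODE):
    """Return the value the survey would record: incomes above the ceiling collapse to it."""
    return min(income, ceiling)


# Illustrative true incomes (made up for this sketch).
true_incomes = [42_000, 100_000, 250_000, 500_000]
recorded = [top_code(income) for income in true_incomes]

for true_value, seen in zip(true_incomes, recorded):
    print(f"true={true_value:>7}  recorded={seen:>7}")
```

After top-coding, the respondent earning exactly $100,000 and the respondent earning $500,000 have the same recorded value, so no analysis of the recorded data can tell them apart.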

Range of instrument constraints

The range of data that can be gathered by a particular instrument may be constrained by inherent limits in the instrument's design. Often design of a particular instrument involves tradeoffs between ceiling effects and floor effects. When many subjects have scores on a variable at the upper limit of what an instrument reports, data analysis is difficult because some actual variation in the data is not reflected in the scores obtained from that instrument.^[3]

A ceiling effect is said to occur when a high proportion of subjects in a study have maximum scores on the observed variable. This makes discrimination among subjects among the top end of the scale impossible. For example, an examination paper may lead to, say, 50% of the students scoring 100%. While such a paper may serve as a useful threshold test, it does not allow ranking of the top performers. For this reason, examination of test results for a possible ceiling effect, and the converse floor effect, is often built into the validation of instruments such as those used for measuring quality of life.^[4]

In such a case, the ceiling effect keeps the instrument from recording a measurement or estimate higher than some limit that is related not to the phenomenon being observed but to the design of the instrument. A crude example would be measuring the heights of trees with a ruler only 20 meters in length when other evidence makes it apparent that there are trees much taller than 20 meters. Using the 20-meter ruler as the sole means of measuring trees would impose a ceiling on gathering data about tree height. Ceiling effects and floor effects both limit the range of data reported by the instrument, reducing variability in the gathered data. Limited variability in the data gathered on one variable may reduce the power of statistical tests of the correlation between that variable and another variable.
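
The tree-measurement example can be made concrete with a short sketch. The tree heights below are invented for illustration; only the 20-meter ruler comes from the text. The sketch reports the share of measurements stuck at the ceiling and compares the variance of the true and recorded heights.

```python
from statistics import pvariance

RULER_LENGTH_M = 20  # the instrument cannot report anything taller than this

# Hypothetical true tree heights in meters.
true_heights = [5, 12, 18, 20, 25, 31, 40]

# What the 20-meter ruler would record for each tree.
measured = [min(height, RULER_LENGTH_M) for height in true_heights]

# Fraction of recorded values sitting exactly at the instrument's upper limit.
share_at_ceiling = sum(1 for m in measured if m == RULER_LENGTH_M) / len(measured)

true_var = pvariance(true_heights)
measured_var = pvariance(measured)

print("measured:", measured)
print(f"share at ceiling: {share_at_ceiling:.2f}")
print(f"variance (true) = {true_var:.1f}, variance (measured) = {measured_var:.1f}")
```

A large share of values at the upper limit, together with a recorded variance well below what other evidence suggests, is the kind of bunching that signals a ceiling imposed by the instrument rather than by the phenomenon.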

College admission tests

In the various countries that use admission tests as the main element or an important element for determining eligibility for college or university study, the data gathered relates to the differing levels of performance of applicants on the tests. When a college admission test has a maximum possible score that can be attained without perfect performance on the test's item content, the test's scoring scale has a ceiling effect. Moreover, if the test's item content is easy for many test-takers, the test may not reflect actual differences in performance (as would be detected with other instruments) among test-takers at the high end of the test performance range. Mathematics tests used for college admission in the United States and similar tests used for university admission in Britain illustrate both phenomena.

Cognitive psychology

In cognitive psychology, the measurement of the time to respond to a given stimulus is often of interest. In these measurements, a ceiling may be the lowest possible number (the fewest milliseconds to a response), rather than the highest value, as is the usual interpretation of "ceiling". In response time studies, it may appear that a ceiling has occurred in the measurements because of an apparent clustering around some minimum amount of time (such as the 250 ms needed by many people to press a key). However, this clustering could actually represent a natural physiological limit of response time, rather than an artifact of the stopwatch sensitivity (which would be a ceiling effect). Further statistical study and scientific judgment can help determine whether the observations are due to a ceiling or reflect a genuine limit.

Validity of instrument constraints

IQ testing

Some authors^{[attribution needed]} on gifted education write about ceiling effects in IQ testing having negative consequences on individuals. Those authors sometimes claim such ceilings produce systematic underestimation of the IQs of intellectually gifted people. In this case, it is necessary to distinguish carefully two different ways the term "ceiling" is used in writings about IQ testing.

IQ scores can differ to some degree for the same individual on different IQ tests (age 12–13 years). (IQ score table data and pupil pseudonyms adapted from description of KABC-II norming study cited in Kaufman 2009.^[5])
Pupil	KABC-II	WISC-III	WJ-III
Asher	90	95	111
Brianna	125	110	105
Colin	100	93	101
Danica	116	127	118
Elpha	93	105	93
Fritz	106	105	105
Georgi	95	100	90
Hector	112	113	103
Imelda	104	96	97
Jose	101	99	86
Keoku	81	78	75
Leo	116	124	102

The ceilings of IQ subtests are imposed by their ranges of progressively more difficult items. An IQ test with a wide range of progressively more difficult questions will have a higher ceiling than one with a narrow range and few difficult items. Ceiling effects result, first, in an inability to distinguish among the gifted (whether moderately gifted, profoundly gifted, etc.), and second, in the erroneous classification of some gifted people as above average, but not gifted.

Suppose that an IQ test has three subtests: vocabulary, arithmetic, and picture analogies. The scores on each of the subtests are normalized (see standard score) and then added together to produce a composite IQ score. Now suppose that Joe obtains the maximum score of 20 on the arithmetic test, but gets 10 out of 20 on the vocabulary and analogies tests. Is it fair to say that Joe's total score of 20+10+10, or 40, represents his total ability? The answer is no, because Joe achieved the maximum possible score of 20 on the arithmetic test. Had the arithmetic test included additional, more difficult items, Joe might have gotten 30 points on that subtest, producing a "true" score of 30+10+10 or 50. Compare Joe's performance with that of Jim, who scored 15+15+15 = 45, without running into any subtest ceilings. In the original formulation of the test, Jim did better than Joe (45 versus 40), whereas it is Joe who actually should have gotten the higher "total" intelligence score than Jim (score of 50 for Joe versus 45 for Jim) using a reformulated test that includes more difficult arithmetic items.
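
The arithmetic in the Joe and Jim example can be checked with a short sketch. The function name `composite` is hypothetical, and Joe's uncapped arithmetic score of 30 is the supposition made in the paragraph above, not a measured value.

```python
SUBTEST_MAX = 20  # maximum score reportable on each subtest in the original test


def composite(scores, ceiling=None):
    """Sum subtest scores, capping each one at `ceiling` when a ceiling applies."""
    if ceiling is None:
        return sum(scores)
    return sum(min(score, ceiling) for score in scores)


# Scores listed as (arithmetic, vocabulary, analogies).
joe_uncapped = [30, 10, 10]  # 30 is the supposed score on a harder arithmetic subtest
jim_uncapped = [15, 15, 15]

joe_original = composite(joe_uncapped, SUBTEST_MAX)  # original test with a ceiling of 20
jim_original = composite(jim_uncapped, SUBTEST_MAX)
joe_extended = composite(joe_uncapped)               # reformulated test without the ceiling
jim_extended = composite(jim_uncapped)

print(f"original test:     Joe={joe_original}, Jim={jim_original}")
print(f"reformulated test: Joe={joe_extended}, Jim={jim_extended}")
```

The ranking of the two test-takers reverses between the two versions of the test: Jim leads 45 to 40 when the ceiling applies, and Joe leads 50 to 45 when it does not.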

Writings on gifted education bring up two reasons for supposing that some IQ scores underestimate the intelligence of gifted test-takers:

gifted test-takers tend to perform better on all subtests than less talented people;
gifted test-takers tend to do much better on some subtests than on others, raising the inter-subtest variability and the chance that a ceiling will be encountered.

Statistical analysis

Ceiling effects on measurement compromise scientific truth and understanding through a number of related statistical aberrations.

First, ceilings impair the ability of investigators to determine the central tendency of the data. When a ceiling effect relates to data gathered on a dependent variable, failure to recognize that ceiling effect may "lead to the mistaken conclusion that the independent variable has no effect."^[2] Ceilings also reduce the variability of the recorded scores, because values above the limit are all reported at the limit. For mathematical reasons beyond the scope of this article (see analysis of variance), this reduced variability reduces the sensitivity of scientific experiments designed to determine whether the average of one group is significantly different from the average of another group. For example, a treatment given to one group may produce an effect, but the effect may escape detection because the mean of the treated group does not look different enough from the mean of the untreated group.

Thus "ceiling effects are a complex of matters and their avoidance a matter of careful evaluation of a range of issues."^[2]
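
How a ceiling can hide a treatment effect may be seen in a minimal sketch. All numbers are invented: the latent scores, the treatment gain of 10 points, and the instrument maximum of 100 are assumptions made only for illustration.

```python
from statistics import mean

INSTRUMENT_MAX = 100.0  # highest score the hypothetical instrument can report
TREATMENT_GAIN = 10.0   # assumed true effect of the treatment on every subject

# Hypothetical latent (true) scores for the untreated group.
control_latent = [85.0, 90.0, 95.0, 100.0, 105.0]
treated_latent = [score + TREATMENT_GAIN for score in control_latent]


def observe(scores, ceiling=INSTRUMENT_MAX):
    """Return the scores as the instrument would report them, capped at the ceiling."""
    return [min(score, ceiling) for score in scores]


true_difference = mean(treated_latent) - mean(control_latent)
observed_difference = mean(observe(treated_latent)) - mean(observe(control_latent))

print(f"true mean difference:     {true_difference:.1f}")
print(f"observed mean difference: {observed_difference:.1f}")
```

In this sketch the observed difference between group means is only half the true difference, because most treated subjects are reported at the instrument maximum. With a stronger ceiling the observed difference would shrink further, which is how a real effect can escape detection.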

Notes

Bibliography

Baker, Hans (2004). Illustrated Medical Dictionary, Lotus Press.
Cramer, Duncan; Howitt, Dennis Laurence (2005). The SAGE Dictionary of Statistics: A Practical Resource for Students in the Social Sciences, Third edition, SAGE.
Kaufman, Alan S. (2009). IQ Testing 101, 151–153, New York: Springer Publishing.
Po, Alain Li Wan (1998). Dictionary of Evidence-based Medicine, Radcliffe Publishing.
Vogt, W. Paul (2005). Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences, Third edition, SAGE.

This page uses Creative Commons Licensed content from Wikipedia (view authors).

1. Baker 2004, p. 40.
2. Cramer 2005, p. 21.
3. Vogt 2005, p. 40.
4. Po 1998, p. 20.
5. Kaufman 2009, pp. 151–153.

@@ Line 1: / Line 1: @@
+*{{StatsPsy}}
-In [[statistics]], the term '''ceiling effect''' refers to an effect whereby data cannot take on a value higher than some "ceiling." Ceiling effects present statistical problems similar to those of "floor effects". Specifically, the utility of a measurement strategy is compromised by a lack of variability. In the case of a ceiling effect, the majority of scores are at or near the maximum possible for the test. This presents two major problems.
+The term '''ceiling effect''' has two distinct meanings, referring to the level at which an [[independent variable]] no longer has an effect on a [[dependent variable]], or to the level above which variance in an independent variable is no longer measured or estimated. An example of the first meaning, a ceiling effect in treatment, is pain relief by some kinds of [[analgesic]] drugs, which have no further effect on pain above a particular dosage level. An example of the second meaning, a ceiling effect in data-gathering, is a survey that groups all respondents into income categories, not distinguishing incomes of respondents above the highest level asked about in the survey instrument.
+== Treatment ==
-First, the test is unable to measure phenomenon or traits above its ceiling. For example, a ceiling effect on an IQ test would be problematic because it suggests there are a substantial number of people with intelligence levels too high to be measured by the test. Thus, the test fails to distinguish between the people scoring at the top, or ceiling, of the test.
+A ceiling effect in treatment, when changes in the level of an independent variable has no further effect on a dependent variable, is a commonly encountered observation in pharmacology and medicine.
-Second, most statistical procedures rely on scores being variable and evenly distributed. Often, statistical tests assume that scores are distributed in a "normal distribution", commonly called the bell curve. With strong ceiling effects, distributions are usually distorted with little variability. This violates statistical assumptions and limits the possibility of finding effects.
+In medicine, a ceiling effect is defined as "the phenomenon in which a drug reaches a maximum effect, so that increasing the drug dosage does not increase its effectiveness."<ref name="Baker2004p40" /> Sometimes drugs cannot be compared across a wide range of treatment situations because one drug has a ceiling effect.{{Citation needed|date=December 2011}}
+== Data-gathering ==
+A ceiling effect in data-gathering, when variance in an independent variable is not measured or estimated above a certain level, is a commonly encountered practical issue in gathering data in many scientific disciplines. Such a ceiling effect is often the result of constraints on data-gathering instruments. When a ceiling effect occurs in data-gathering, there is a bunching of scores at the upper level reported by an instrument.<ref name="Cramer2005p21" />
+=== Response bias constraints ===
+A population survey about lifestyle variables influencing health outcomes might include a question about smoking habits. To guard against the possibility that a respondent who is a heavy smoker might decline to give an accurate response about smoking, the highest level of smoking asked about in the survey instrument might be "two packs a day or more." This results in a ceiling effect in that persons who smoke three packs or more a day are not distinguished from persons who smoke exactly two packs. A population survey about income similarly might have a highest response level of "$100,000 per year or more," rather than including higher income ranges, as respondents might decline to answer at all if the survey questions identify their income too specifically. This too results in a ceiling effect, not distinguishing persons who have an income of $500,000 per year or higher from those whose income is exactly $100,000 per year.
+=== Range of instrument constraints ===
+The range of data that can be gathered by a particular instrument may be constrained by inherent limits in the instrument's design. Often design of a particular instrument involves tradeoffs between ceiling effects and [[floor effect]]s. When many subjects have scores on a variable at the upper limit of what an instrument reports, data analysis is difficult because some actual variation in the data is not reflected in the scores obtained from that instrument.<ref name="Vogt2005p40" />
+<blockquote>A ceiling effect is said to occur when a high proportion of subjects in a study have maximum scores on the observed variable. This makes discrimination among subjects among the top end of the scale impossible. For example, an examination paper may lead to, say, 50% of the students scoring 100%. While such a paper may serve as a useful threshold test, it does not allow ranking of the top performers. For this reason, examination of test results for a possible ceiling effect, and the converse floor effect, is often built into the validation of instruments such as those used for measuring quality of life.<ref name="Po1998p20" /></blockquote>
+In such a case, the ceiling effect keeps the instrument from noting a measurement or estimate higher than some limit not related to the phenomenon being observed, but rather related to the design of the instrument. A crude example would be measuring the heights of trees with a ruler only 20 meters in length, if it is apparent on the basis of other evidence that there are trees much taller than 20 meters. Using the 20-meter ruler as the sole means of measuring trees would impose a ceiling on gathering data about tree height. Ceiling effects and floor effects both limit the range of data reported by the instrument, reducing variability in the gathered data. Limited variability in the data gathered on one variable may reduce the power of statistics on correlations between that variable and another variable.
+==== College admission tests ====
+In the various countries that use admission tests as the main element or an important element for determining eligibility for college or university study, the data gathered relates to the differing levels of performance of applicants on the tests. When a college admission test has a maximum possible score that can be attained without perfect performance on the test's item content, the test's scoring scale has a ceiling effect. Moreover, if the test's item content is easy for many test-takers, the test may not reflect actual differences in performance (as would be detected with other instruments) among test-takers at the high end of the test performance range. Mathematics tests used for college admission in the United States and similar tests used for university admission in Britain illustrate both phenomena.
+==== Cognitive psychology ====
+In [[cognitive psychology]], the measurement of the time to respond to a given stimulus is often of interest. In these measurements, a ceiling may be the lowest possible number (the fewest milliseconds to a response), rather than the highest value, as is the usual interpretation of "ceiling". In response time studies, it may appear that a ceiling had occurred in the measurements due to an apparent clustering around some minimum amount of time (such as the 250 ms needed by many people to press a key). However, this clustering could actually represent a natural physiological limit of response time, rather than an artifact of the stopwatch sensitivity (which of course would be a ceiling effect). Further statistical study, and scientific judgment, can resolve whether or not the observations are due to a ceiling or are the truth of the matter.
+=== Validity of instrument constraints ===
+==== IQ testing {{anchor | IQ testing}}<!-- This section is linked from [[Intelligence quotient]] and [[Intellectual giftedness]] and other articles--> ====
+Some authors{{Who|date=January 2011}} on gifted education write about ceiling effects in IQ testing having negative consequences on individuals. Those authors sometimes claim such ceilings produce systematic underestimation of the IQs of [[Intellectual giftedness|intellectually gifted]] people. In this case, it is necessary to distinguish carefully two different ways the term "ceiling" is used in writings about IQ testing.
+{| class="wikitable sortable" style="font-size:small" align="right" summary="Sortable table showing actual I.Q. scores of twelve students on three different I.Q. tests, with students identified by pseudonyms in cited data source."
+|+ IQ scores can differ to some degree for the same individual on different IQ tests (age 12–13 years). (IQ score table data and pupil pseudonyms adapted from description of KABC-II norming study cited in Kaufman 2009.<ref name="Kaufman2009pp151-153" />)
+! class="unsortable" |Pupil!!KABC-II!!WISC-III!!WJ-III
+|- align="right"
+|Asher||90||95||111
+|- align="right"
+|Brianna||125||110||105
+|- align="right"
+|Colin||100||93||101
+|- align="right"
+|Danica||116||127||118
+|- align="right"
+|Elpha||93||105||93
+|- align="right"
+|Fritz||106||105||105
+|- align="right"
+|Georgi||95||100||90
+|- align="right"
+|Hector||112||113||103
+|- align="right"
+|Imelda||104||96||97
+|- align="right"
+|Jose||101||99||86
+|- align="right"
+|Keoku||81||78||75
+|- align="right"
+|Leo||116||124||102
+|}
+The ceilings of IQ subtests are imposed by their ranges of progressively more difficult items.  An IQ test with a wide range of progressively more difficult questions will have a higher ceiling than one with a narrow range and few difficult items.  Ceiling effects result in an inability, first, to distinguish among the gifted (whether moderately gifted, profoundly gifted, etc.), and second, results in the erroneous classification of some gifted people as above average, but not gifted.
+Suppose that an IQ test has three subtests:  vocabulary, arithmetic, and picture analogies.  The scores on each of the subtests are normalized (see [[standard score]]) and then added together to produce a composite IQ score.  Now suppose that Joe obtains the maximum score of 20 on the arithmetic test, but gets 10 out of 20 on the vocabulary and analogies tests.  Is it fair to say that Joe's total score of 20+10+10, or 40, represents his total ability?  The answer is no, because Joe achieved the maximum possible score of 20 on the arithmetic test.  Had the arithmetic test included additional, more difficult items, Joe might have gotten 30 points on that subtest, producing a "true" score of 30+10+10 or 50.  Compare Joe's performance with that of Jim, who scored 15+15+15 = 45, without running into any subtest ceilings.  In the original formulation of the test, Jim did better than Joe (45 versus 40), whereas it is Joe who actually should have gotten the higher "total" intelligence score than Jim (score of 50 for Joe versus 45 for Jim) using a reformulated test that includes more difficult arithmetic items.
+Writings on gifted education bring up two reasons for supposing that some IQ scores are underestimates of a test-taker's intelligence:
+# they tend to perform all subtests better than less talented people;
+# they tend to do much better on some subtests than others, raising the inter-subtest variability and chance that a ceiling will be encountered.
+=== Statistical analysis ===
+Ceiling effects on measurement compromise scientific truth and understanding through a number of related statistical aberrations.
+First, ceilings impair the ability of investigators to determine the central tendency of the data. When a ceiling effect relates to data gathered on a dependent variable, failure to recognize that ceiling effect may "lead to the mistaken conclusion that the independent variable has no effect."<ref name="Cramer2005p21" /> For mathematical reasons beyond the scope of this article (see [[analysis of variance]]), this reduced variance reduces the sensitivity of scientific experiments designed to determine if the average of one group is significantly different from the average of another group (for example, a treatment given to one group may produce an effect, but the effect may escape detection because the mean of the treated group won't look different enough from the mean of the untreated group).
+Thus "ceiling effects are a complex of matters and their avoidance a matter of careful evaluation of a range of issues."<ref name="Cramer2005p21" />
+== See also ==
+*[[Floor effect]]
+== Notes ==
+{{reflist|2|refs=
+<ref name="Baker2004p40">{{Harvnb |Baker|2004|page=40}}</ref>
+<ref name="Cramer2005p21">{{Harvnb |Cramer|2005|page=21}}</ref>
+<ref name="Vogt2005p40">{{Harvnb |Vogt|2005|page=40}}</ref>
+<ref name="Po1998p20">{{Harvnb |Po|1998|page=20}}</ref>
+<ref name="Kaufman2009pp151-153">{{Harvnb |Kaufman|2009|pages=151–153}}</ref>
+}}
+== Bibliography ==
+* {{cite book |title=Illustrated Medical Dictionary |last=Baker |first=Hans |year=2004 |publisher=Lotus Press |ref=harv}}
+* {{cite book |title=The SAGE Dictionary of Statistics: A Practical Resource for Students in the Social Sciences |last1=Cramer |first1=Duncan |last2=Howitt |first2=Dennis Laurence |year=2005 |edition=Third |publisher=SAGE |page=21 (entry "ceiling effect") |isbn=978-0-7619-4138-5 |laysummary=http://www.sagepub.com/booksProdDesc.nav?prodId=Book225578 |laydate=1 August 2010 |ref=harv}}
+* {{cite book |title=IQ Testing 101 |last=Kaufman |first=Alan S. |authorlink=Alan S. Kaufman |year=2009 |publisher=Springer Publishing |location=New York |isbn=978-0-8261-0629-2 |pages=151–153 |ref=harv}}
+* {{cite book |title=Dictionary of Evidence-based Medicine |last=Po |first=Alain Li Wan |year=1998 |publisher=Radcliffe Publishing |page=20 |isbn=978-1-85775-305-9 |ref=harv}}
+* {{cite book |title=Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences |last=Vogt |first=W. Paul |year=2005 |edition=Third |publisher=SAGE |page=40 (entry "ceiling effect") |isbn=978-0-7619-8855-7 |laysummary=http://www.sagepub.com/booksProdDesc.nav?prodId=Book226470 |laydate=1 August 2010 |ref=harv}}
+== Further reading ==
+{{Empty section|date=December 2011}}
+== External links ==
+{{Experimental design}}
+{{Use dmy dates|date=July 2011}}
+[[Category:Medical statistics]]
+[[Category:Statistical terminology]]
+<!--
+[[de:Deckeneffekt]]
+[[eu:Sabai-efektu]]
+-->
+{{enWP|Ceiling effect}}
