
This article is about reinforcement in the context of operant conditioning. For its use in classical conditioning see that article.

In operant conditioning, reinforcement is any change in an organism's surroundings that:

  • occurs regularly when the organism behaves in a given way (that is, is contingent on a specific response), and
  • is associated with an increase in the probability that the response will be made or in another measure of its strength.

For example: you give your dog food every time it sits when you tell it to. If the dog becomes more likely to sit when told to, sitting is considered to have been reinforced by the administration of food contingent on it.

Note that it is the behavior that is reinforced, not the dog. The food serves as a reinforcer, reinforcing or strengthening that behavior, only to the extent that sitting subsequently occurs more often or more quickly because of it.

The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in the experimental analysis of behavior.

Schedules of reinforcement

Main article: Schedule of reinforcement

Figure: A chart showing the different response rates produced by the schedules of reinforcement. Each hatch mark designates a reinforcer being given.

When enough of the variations in an animal's surroundings are reduced or "controlled," its behavior patterns after reinforcement are remarkably predictable. When rates of reinforcement are adjusted in particular ways, even very complex behavior patterns can be predicted. A schedule of reinforcement is the protocol for determining which responses (i.e., which individual occurrences of a given behavior) will be reinforced. The two extremes are continuous reinforcement, in which every response results in reinforcement, and extinction, in which no response is reinforced.

Other schedules include:

  • Fixed ratio (FR), in which every nth response is reinforced.
  • Fixed interval (FI), in which reinforcement occurs after the passage of a specified length of time from the beginning of training or from the last reinforcement, provided that at least one response occurred in that time period.
  • Variable ratio (VR), in which the number of responses required between reinforcements varies, but on average equals a predetermined number.
  • Variable interval (VI), in which reinforcement occurs after the passage of a varying length of time around an average, provided that at least one response occurred in that period.
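
The four schedules listed above can be expressed as simple decision rules. The following is a minimal sketch in Python, not a standard implementation. The class names, the `respond(t)` method, and the `run` helper are hypothetical names introduced here. The uniform distributions used to vary the ratio and the interval are assumptions, since the text only requires that the values average out to a predetermined number.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0  # responses since the last reinforcer

    def respond(self, t):
        # t (time of the response) is unused; kept for a uniform interface.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR mean: the required number of responses varies around `mean`."""

    def __init__(self, mean, rng=None):
        self.mean = mean
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean - 1, whose average is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI: the first response made at least `interval` time units after the
    last reinforcement (or the start of the session) is reinforced."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of last reinforcement, or session start

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI: like FI, but the waiting time varies around `mean`."""

    def __init__(self, mean, rng=None):
        self.mean = mean
        self.rng = rng if rng is not None else random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, whose average is `mean`.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def run(schedule, response_times):
    """Feed ascending response times to a schedule; return one flag per
    response saying whether that response was reinforced."""
    return [schedule.respond(t) for t in response_times]
```

For instance, `run(FixedRatio(3), range(9))` flags the 3rd, 6th and 9th responses as reinforced. On the interval schedules, responses made before the waiting time has elapsed are simply not reinforced.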

Ratio schedules produce higher rates of responding than interval schedules, and variable schedules produce higher rates than fixed schedules. The variable ratio schedule produces both the highest rate of responding and the greatest resistance to extinction (that is, resistance to "petering out"). Gambling behavior is one notable example.

On a fixed ratio schedule, responding pauses after each reinforcer is delivered. This is called a post-reinforcement pause. Fixed interval schedules also produce post-reinforcement pauses, but the pause is followed by gradually accelerating responding, which gives the response record a scalloped shape. Responses made before the interval has elapsed are not reinforced, so the subject learns to respond slowly at first and faster as the end of the interval approaches.

If an organism is on a fixed ratio schedule and the number of responses required to obtain a reinforcer suddenly increases (say, from FR50 to FR250), the organism is observed to pause periodically before the reinforcer is delivered. This phenomenon is called ratio strain, and it contrasts with the usual FR pattern of post-reinforcement pause, ratio run, and reinforcement.

Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules. This phenomenon is called the partial reinforcement extinction effect (PREE). Ratio schedules tend to be more resistant than interval schedules, and variable schedules more resistant than fixed ones.

Types of reinforcement

Reinforcement is a change in the environment that causes the rate of the subject's responding to be maintained or to increase. There are two types of reinforcement.

  • Positive reinforcement changes the animal's surroundings by adding an appetitive stimulus: a physical object (like a food pellet or paycheck) or energy (like light from a lamp).
  • Negative reinforcement changes the surroundings by removing an aversive stimulus - such as turning off a painful electric current or removing a hated ex-spouse's picture.

              decreases rate of behavior    increases rate of behavior
presented     positive punishment           positive reinforcement
taken away    negative punishment           negative reinforcement
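
The table above is a two-by-two lookup: one axis is whether a stimulus is presented or taken away, the other is whether the rate of behavior increases or decreases. As a minimal sketch, the hypothetical helper `classify` below (a name introduced here for illustration) maps the two observations to the corresponding term.

```python
def classify(stimulus_change, behavior_change):
    """Name the operant process from the two axes of the table.

    stimulus_change: "presented" or "taken away"
    behavior_change: "increases" or "decreases" (rate of behavior)
    """
    polarity = {"presented": "positive", "taken away": "negative"}
    process = {"increases": "reinforcement", "decreases": "punishment"}
    # A KeyError is raised for any value outside the table's labels.
    return f"{polarity[stimulus_change]} {process[behavior_change]}"


# Turning off a painful electric current (stimulus taken away) makes the
# response more frequent:
print(classify("taken away", "increases"))  # negative reinforcement
```

Note that the classification depends only on the observed effect on the rate of behavior, not on whether the stimulus seems pleasant or hostile.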

Distinguishing "positive" from "negative" in these cases is largely a matter of emphasis. For example, in a very warm room, a current of external air serving as reinforcement may be positive because it is relatively cool but negative because it removes the uncomfortably hot air. Some reinforcement can simultaneously be both positive and negative. For example, a drug addict may take drugs for the added euphoria and to get rid of withdrawal symptoms. Another example is eating. Eating adds pleasurable flavors while removing feelings of hunger. The distinction seems to have no real use in research or applied psychology, although one may some day be found. Until then, many behavioral psychologists simply refer to reinforcement or punishment—without polarity—to cover all consequent environmental changes.


Punishment is any change in an animal's surroundings that occurs after a given behavior or response and reduces the frequency of that behavior occurring again in the future. As with reinforcement, it is the behavior, not the animal, that is punished. Whether a change is or is not punishing is known only by its effect on the rate of the behavior, not by any "hostile" features of the change. In positive punishment or type I punishment, an experimenter punishes a response by adding an aversive stimulus to the animal's surroundings (a brief electric shock, for example). In negative punishment or type II punishment, a positive reinforcer is removed (as in the removal of a feeding dish). As with reinforcement, it is not usually necessary to speak of positive and negative in regard to punishment.

Punishment is not a mirror effect of reinforcement. In experiments with laboratory animals and studies with children, punishment decreases the frequency of a previously reinforced response only temporarily, and it can produce other "emotional" behavior (wing-flapping in pigeons, for example) and physiological changes (increased heart rate, for example) that have no clear equivalents in reinforcement.

Punishment is considered by some behavioral psychologists to be a "primary process" – a completely independent phenomenon of learning, distinct from reinforcement. Others see it as a category of negative reinforcement, creating a situation in which any punishment-avoiding behavior (even standing still) is reinforced.

Aversive stimulus, punisher, and punishing stimulus are synonyms. Punishment may be used for (a) an aversive stimulus or (b) the occurrence of any punishing change or (c) the part of an experiment in which a particular response is punished.

Other reinforcement terms

  • An unconditioned reinforcer, sometimes called a primary reinforcer, is a stimulus or situation considered to be inherently reinforcing (such as affection, food, or opportunity for sleep).
  • A conditioned reinforcer, sometimes called a secondary reinforcer, is a stimulus or situation that has acquired reinforcing power after being paired in the animal's environment with an unconditioned reinforcer or an earlier conditioned reinforcer (such as praise).
  • A generalized reinforcer is a conditioned reinforcer that has been paired with many other reinforcers (such as money).
  • Differential reinforcement of incompatible behavior (DRI) is used in reducing an already frequent behavior without punishing it by reinforcing a specific incompatible response (like leaving a room so that fighting with someone in it is not possible).
  • In differential reinforcement of other behavior (DRO), any behavior other than some undesired behavior is reinforced.
  • Differential reinforcement of low response rate (DRL): a behavior is reinforced only if it occurred infrequently. "If you ask me for a potato chip no more than once every 10 minutes, I will give it to you. If you ask more often, I will give you none."
  • Differential reinforcement of alternative behavior (DRA): the reinforcers for the undesirable behavior are used instead for a more desirable behavior. For example, a teacher will pay attention to students who sit rather than to those who walk or talk in class.
  • In reinforcer sampling a potentially reinforcing but unfamiliar stimulus is presented to an animal without regard to any prior behavior. The stimulus may then later be used more effectively in reinforcement.
  • Social reinforcement involves various sorts of access to and interaction with others.
  • Satiation occurs when a stimulus that had reinforced some behavior no longer seems to do so.
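
The potato chip example given for DRL in the list above can be sketched as a rule on the time between responses. This is an illustrative reading rather than a definitive procedure: the class name `DRL`, the `min_gap` parameter, and the choice to reinforce the very first response are all assumptions made here.

```python
class DRL:
    """Differential reinforcement of low response rate (sketch).

    A response is reinforced only if at least `min_gap` time units have
    passed since the previous response. Every response, reinforced or
    not, restarts the clock.
    """

    def __init__(self, min_gap):
        self.min_gap = min_gap
        self.last_response = None  # no response has been made yet

    def respond(self, t):
        # Assumption: the first response of the session is reinforced.
        ok = (self.last_response is None
              or t - self.last_response >= self.min_gap)
        self.last_response = t
        return ok


# Requests for a chip at these minutes, with a 10-minute rule:
rule = DRL(min_gap=10)
print([rule.respond(t) for t in [0, 4, 15, 25, 30, 41]])
# [True, False, True, True, False, True]
```

Asking again after only 4 or 5 minutes earns nothing and also resets the wait, which is what pushes the response rate down.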

Shaping & chaining

Shaping involves reinforcing successive, increasingly accurate approximations of a response desired by a trainer. In training a rat to press a lever, for example, simply turning toward the lever will be reinforced at first. Then, only turning and stepping toward it will be reinforced. As training progresses, the response reinforced becomes progressively more like the desired behavior. Chaining is similar but involves reinforcing various simple behaviors separately and then linking them together in a more complex series.
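
The shaping procedure described above can be sketched as a criterion that moves toward the target response. The sketch below is a simplified assumption, not an established protocol: the `STAGES` list, the intermediate "touch lever" stage, and the `advance_after` parameter are hypothetical and introduced only for illustration.

```python
# Successive approximations to lever pressing, from crudest to target.
# "touch lever" is a hypothetical intermediate step added for illustration.
STAGES = ["turn toward lever", "step toward lever", "touch lever", "press lever"]


def shape(observed, advance_after=2):
    """Return, for each observed behavior, whether it was reinforced.

    A behavior is reinforced if it is at least as close to the target as
    the current criterion. After `advance_after` reinforcers at one
    criterion, the criterion moves one stage closer to the target.
    """
    stage = 0   # index of the current criterion in STAGES
    hits = 0    # reinforcers delivered at the current criterion
    result = []
    for behavior in observed:
        level = STAGES.index(behavior)
        reinforced = level >= stage
        result.append(reinforced)
        if reinforced:
            hits += 1
            if hits >= advance_after and stage < len(STAGES) - 1:
                stage += 1
                hits = 0
    return result


session = ["turn toward lever", "turn toward lever", "turn toward lever",
           "step toward lever", "step toward lever", "turn toward lever",
           "press lever"]
print(shape(session))
# [True, True, False, True, True, False, True]
```

In this run, merely turning toward the lever stops being reinforced once the criterion has moved on, while the closer approximations continue to be reinforced.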

Controversies

The standard idea of behavioral reinforcement has been criticized as circular, since it appears to argue that response strength is increased by reinforcement while defining reinforcement as something which increases response strength. Other definitions have been proposed, such as F. D. Sheffield's "consummatory behavior contingent on a response," but these are not broadly used in psychology.

History of the terms

In the 1920s Russian physiologist Ivan Pavlov may have been the first to use the word reinforcement with respect to behavior, but (according to Dinsmoor) he used its approximate Russian cognate sparingly, and even then it referred to strengthening an already-learned but weakening response. He did not use it, as it is today, for selecting and strengthening new behavior. Pavlov's introduction of the word extinction (in Russian) approximates today's psychological use.

In popular use, positive reinforcement is often used as a synonym for reward, with people (not behavior) thus being "reinforced," but this is contrary to the term's consistent technical usage. Negative reinforcement is often used by laypeople and even social scientists outside psychology as a synonym for punishment. This is contrary to modern technical use, but it was B. F. Skinner who first used it this way in his 1938 book. By 1953, however, he followed others in thus employing the word punishment, and he re-cast negative reinforcement for the removal of aversive stimuli.



This page uses Creative Commons licensed content from Wikipedia.
