Operant Conditioning: Principles, Reinforcement, and Learning
The essential feature of operant conditioning lies in the reinforcement (e.g., food) received for a specific operant behavior (e.g., pressing a lever). Operant conditioning reflects the fact that an animal operates, or acts, on its natural or laboratory environment to produce an effect, and this effect determines whether the animal repeats the response or continues to behave as before.
Operant conditioning is a learning theory that attempts to explain the acquisition of new behaviors. It proposes that such behaviors are acquired because their perceived consequences either increase or decrease their frequency. The learned behavior is new to the organism and is not programmed into its genetic code.
Reinforcement in Operant Conditioning
- Reinforcing Event: The delivery of a stimulus that satisfies a need of the organism (e.g., providing a reward).
- Reinforcing Stimulus: An environmental stimulus that, when applied, increases the frequency of a specific behavior.
Types of Operant Conditioning
- Learning by Reinforcement: Learning in which a new behavior increases in frequency because it is followed by a reinforcing stimulus.
- Avoidance Learning: Learning in which a behavior that ends or prevents an aversive (unpleasant) stimulus increases in frequency, because performing it removes the stimulus or keeps it from recurring.
- Superstitious Learning: Learning in which a reinforcing event that coincides with a behavior purely by chance increases the frequency of that behavior, even though the behavior did not produce the reinforcement.
- Punishment Learning: Learning where an organism decreases the frequency of behaviors that were followed by aversive or unpleasant consequences.
- Extinction (forgetting): Behaviors that never receive, or stop receiving, reinforcement tend to decrease in frequency and eventually disappear. A toy sketch of these frequency changes follows this list.
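Each of these learning types is, at bottom, a statement about how the frequency of a behavior shifts with its consequences. The Python sketch below is a minimal illustration of that logic, not part of operant-conditioning theory itself: the update rule and the step size of 0.1 are arbitrary illustrative assumptions. Reinforcement pushes the response probability up, punishment pushes it down, and withholding reinforcement lets it fade (extinction).

```python
def update(p, consequence, step=0.1):
    """Return the new response probability after one trial.

    `consequence` is "reinforcement", "punishment", or "none" (extinction).
    The step size is an arbitrary illustrative choice.
    """
    if consequence == "reinforcement":
        p = p + step * (1 - p)       # frequency of the behavior increases
    elif consequence == "punishment":
        p = p - step * p             # frequency decreases
    elif consequence == "none":
        p = p - (step / 2) * p       # without reinforcement, the behavior fades
    else:
        raise ValueError(consequence)
    return max(0.0, min(1.0, p))

p = 0.2
for trial in range(10):
    p = update(p, "reinforcement")
print(f"after 10 reinforced trials:   p = {p:.2f}")   # noticeably higher than 0.2

for trial in range(10):
    p = update(p, "none")
print(f"after 10 unreinforced trials: p = {p:.2f}")   # drifting back down (extinction)
```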
Outline of Operant Conditioning
- Conditioned Stimulus: Lever inside the box.
- Conditioned Response: Pressing the lever.
- Unconditioned Stimulus: Food pellet.
- Unconditioned Response: Eating.
Reinforcement Schedules
- Fixed Ratio Schedule: Reinforcement is provided after a constant number of responses. Example: payment after every 5 sales. On a cumulative response record, this produces a brief pause after each reinforcement.
- Variable Ratio Schedule: Reinforcement is provided after a variable number of responses. This produces a high response rate without significant breaks. Example: Gambling (which is why it is so addictive).
- Fixed Interval Schedule: Reinforcement is provided for the first response after a fixed time interval has elapsed since the last reinforcement. Responding stops almost entirely right after reinforcement, then gradually increases and peaks just before the next reinforcement is due.
- Variable Interval Schedule: Reinforcement is provided for the first response after a time interval that varies (e.g., from seconds to hours) since the last reinforcement. The response rate is relatively constant. A small simulation of all four rules follows this list.
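The four schedules differ only in the rule that decides when a response is reinforced. The Python sketch below makes that rule explicit for each schedule. It is an illustrative assumption throughout: the ratio of 5, the 10-second interval, and the convention of one response per second are chosen for the example, not taken from the text.

```python
import random

def simulate(schedule, n_responses=50, seed=0):
    """Return the response indices (1-based) at which reinforcement is delivered.

    `schedule` is one of "FR5", "VR5", "FI10", "VI10". Interval schedules
    assume one response per second, purely for illustration.
    """
    random.seed(seed)
    reinforced = []
    count_since = 0      # responses since last reinforcement (ratio schedules)
    time_since = 0       # seconds since last reinforcement (interval schedules)
    interval = None      # current wait for the variable-interval schedule

    for r in range(1, n_responses + 1):
        count_since += 1
        time_since += 1  # one response per second

        if schedule == "FR5":        # fixed ratio: every 5th response
            deliver = count_since == 5
        elif schedule == "VR5":      # variable ratio: on average every 5th response
            deliver = random.random() < 1 / 5
        elif schedule == "FI10":     # fixed interval: first response after 10 s
            deliver = time_since >= 10
        elif schedule == "VI10":     # variable interval: first response after a random 5-15 s wait
            if interval is None:
                interval = random.uniform(5, 15)
            deliver = time_since >= interval
        else:
            raise ValueError(schedule)

        if deliver:
            reinforced.append(r)
            count_since = 0
            time_since = 0
            interval = None

    return reinforced

for name in ("FR5", "VR5", "FI10", "VI10"):
    print(name, simulate(name))
```

Running it shows the qualitative differences described above: FR5 and FI10 reinforce at perfectly regular points (so pauses after reinforcement are predictable), while VR5 and VI10 reinforce at unpredictable points, which is what sustains a steady response rate.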
Differences: Pavlovian vs. Operant Conditioning
- Pavlovian Conditioning: The connection is between a new stimulus and a reflex response.
- Thorndike (Operant) Conditioning: The connection is between a given stimulus and a new response.