Attitude Measurement Scales / Attitudinal Scales: Thurstone, Likert and Semantic Differential Scaling

Attitude Measurement Scales assess an individual’s enduring evaluation—positive, negative, or neutral—toward an object, person, issue, or behavior. In business research, attitudes predict behaviors such as purchase intentions, employee turnover, or brand loyalty. However, attitudes are latent constructs: they cannot be directly observed, only inferred from responses. Attitudinal scales provide systematic procedures for converting subjective evaluations into quantifiable data. Major scaling techniques include Likert scales (summated ratings), Thurstone scales (equal-appearing intervals), Guttman scales (cumulative), and Semantic Differential scales (bipolar adjectives). Each method differs in construction complexity, underlying assumptions, and statistical properties. Choosing the appropriate scale depends on research objectives, resources, and required measurement level (ordinal vs. interval). Well-constructed attitudinal scales exhibit reliability (consistency) and validity (measuring what they intend). Poorly designed scales produce misleading, non-actionable results.

1. Likert Scale (Summated Rating Scale)

The Likert scale, developed by Rensis Likert (1932), is the most widely used attitudinal scale in business research. Respondents indicate their degree of agreement or disagreement with a series of statements using a symmetric scale, typically 5 or 7 points: (1) Strongly Disagree, (2) Disagree, (3) Neutral, (4) Agree, (5) Strongly Agree. Item scores are summed or averaged to produce a total attitude score. Likert scales assume that every item measures the same underlying construct and that the distances between response options are roughly equal, so that the total score approximates interval measurement. Advantages: easy to construct, administer, and understand; high reliability with multiple items. Disadvantages: response biases (acquiescence, the tendency to agree; central tendency, the avoidance of extremes); whether the data are ordinal or interval remains debated; multi-dimensionality risks when items tap more than one construct. Best practices include balancing positively and negatively worded statements and reverse-coding the negatively worded items before scores are combined.
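The summing and reverse-coding steps can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the item names (`q1` to `q4`), the choice of which items are negatively worded, and the helper names are all hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # 5-point scale: 1 = Strongly Disagree ... 5 = Strongly Agree


def reverse_code(score):
    """Flip a response so that a high value always means a favorable attitude."""
    return SCALE_MIN + SCALE_MAX - score


def likert_score(responses, reversed_items):
    """Return the summed and averaged attitude score for one respondent."""
    coded = [
        reverse_code(value) if item in reversed_items else value
        for item, value in responses.items()
    ]
    return {"sum": sum(coded), "mean": mean(coded)}


# Hypothetical respondent; q2 and q4 are assumed to be negatively worded.
respondent = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}
result = likert_score(respondent, reversed_items={"q2", "q4"})
print(result)  # {'sum': 18, 'mean': 4.5}
```

Without the reverse-coding step, the same respondent would score 12 instead of 18, which would understate a consistently favorable attitude.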

2. Thurstone Scale (Equal-Appearing Interval Scale)

The Thurstone scale, developed by Louis Thurstone (1928), uses judges to assign scale values to statements before administration. Construction process: (1) Generate 50–100 opinion statements about an attitude object. (2) Have 50+ judges rate each statement on an 11-point favorability scale (1 = extremely unfavorable, 11 = extremely favorable). (3) For each statement, compute the median rating (its scale value) and the interquartile range (its ambiguity). (4) Select 20–30 statements with evenly spaced medians and low ambiguity. (5) Respondents check all statements they agree with. (6) The respondent’s score is the median of the scale values of the statements they endorsed. Advantages: designed to approximate interval measurement; discarding high-ambiguity statements keeps only items on whose position the judges agree. Disadvantages: extremely labor-intensive; judges require training; rarely used in modern commercial research because the Likert scale is far more practical. Its historical importance remains, but contemporary researchers almost exclusively choose Likert or Semantic Differential scales.
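Steps (3) and (6) are simple order statistics, as the following Python sketch shows. The ratings are invented and use only five judges for brevity (the procedure above calls for 50 or more). The quartile convention (`method="inclusive"`) is an assumption, since other conventions give slightly different interquartile ranges.

```python
from statistics import median, quantiles


def scale_value_and_ambiguity(judge_ratings):
    """Return (median, interquartile range) of the judges' 1-11 ratings."""
    q1, _, q3 = quantiles(judge_ratings, n=4, method="inclusive")
    return median(judge_ratings), q3 - q1


def respondent_score(endorsed, scale_values):
    """Median scale value of the statements the respondent agreed with."""
    return median(scale_values[s] for s in endorsed)


# Hypothetical ratings from five judges for three candidate statements.
judged = {
    "s1": [2, 2, 3, 3, 4],     # judges agree: unfavorable
    "s2": [9, 10, 10, 11, 10], # judges agree: favorable
    "s3": [1, 6, 11, 3, 9],    # judges disagree: high ambiguity, a candidate to drop
}
for name, ratings in judged.items():
    print(name, scale_value_and_ambiguity(ratings))

scale_values = {name: scale_value_and_ambiguity(r)[0] for name, r in judged.items()}
print(respondent_score(["s1", "s2"], scale_values))
```

Statement `s3` has the widest interquartile range, so under step (4) it would be the one to discard.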

3. Guttman Scale (Cumulative Scale)

The Guttman scale, developed by Louis Guttman (1944), assumes items form a perfect hierarchical order: agreement with a higher-difficulty item implies agreement with all lower-difficulty items. For example, measuring attitude toward a brand: (1) “I have heard of Brand X.” (2) “I have tried Brand X.” (3) “I prefer Brand X.” (4) “I recommend Brand X.” A respondent agreeing with item 3 is expected to agree with items 1 and 2 as well. In a perfect scale, the full response pattern can be reproduced from the total score (the number of agreements). The coefficient of reproducibility is the proportion of responses that match this ideal pattern; a value of 0.90 or higher is conventionally taken to indicate scalability. Advantages: uni-dimensional by design; efficient (minimal items). Disadvantages: real data rarely achieve perfect scalability; constructing items with hierarchical properties is difficult; limited to developmental or knowledge-based constructs. Applications include measuring adoption stages (innovation diffusion), competency hierarchies, and attitudes with logical dependencies. Rare in routine business research.
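The coefficient of reproducibility can be illustrated with a short Python sketch. It assumes one common error-counting convention (comparing each observed pattern with the ideal pattern implied by the respondent's total score). Other conventions exist and give different error counts, and the response data are invented.

```python
def reproducibility(patterns):
    """Coefficient of reproducibility: 1 - errors / (respondents x items).

    patterns: one list of 0/1 answers per respondent, items ordered from
    easiest to hardest to endorse. An error is any answer that differs from
    the ideal cumulative pattern implied by that respondent's total score.
    """
    n_items = len(patterns[0])
    errors = 0
    for observed in patterns:
        total = sum(observed)
        ideal = [1] * total + [0] * (n_items - total)
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1 - errors / (len(patterns) * n_items)


# Hypothetical answers to the four Brand X items (heard, tried, prefer, recommend).
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],  # prefers the brand without having tried it: 2 errors
]
cr = reproducibility(answers)
print(round(cr, 2))  # 0.9
```

Here 2 of the 20 responses deviate from the ideal pattern, so the data sit exactly at the conventional 0.90 threshold.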

4. Semantic Differential Scale

The Semantic Differential scale, developed by Charles Osgood (1957), measures the connotative meaning of an object using bipolar adjective pairs. Respondents rate the object on a 5- or 7-point scale between opposites such as good–bad, modern–traditional, fast–slow, strong–weak. While originally capturing three dimensions (evaluation, potency, activity), business researchers often use it uni-dimensionally for attitude measurement using evaluation pairs (good–bad, favorable–unfavorable, pleasant–unpleasant, high quality–low quality). Advantages: captures emotional and attitudinal responses quickly; cross-culturally adaptable; reduces acquiescence bias (no agreement/disagreement phrasing). Disadvantages: adjective pairs must be clearly opposite; meaning can shift across cultures; requires literacy. Applications: brand image tracking, advertising effectiveness (pre-post campaign comparisons), product perception studies, and competitor positioning. Scores are typically averaged across pairs, treated as interval data. The semantic differential is efficient, reliable, and highly visual.
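Averaging across adjective pairs, and the pre-post comparison mentioned above, might look like the following Python sketch. The adjective pairs come from the text, but the ratings, the three-respondent sample, and the function names are hypothetical. Every pair is assumed to be scored so that 7 is the positive pole.

```python
from statistics import mean


def pair_means(ratings):
    """Average rating per adjective pair (the semantic profile)."""
    return {pair: mean(scores) for pair, scores in ratings.items()}


def overall_score(ratings):
    """Mean of the pair means: a single evaluation score for the object."""
    return mean(list(pair_means(ratings).values()))


# Hypothetical 7-point ratings from three respondents, before and after a campaign.
pre = {
    "bad-good": [4, 5, 3],
    "unfavorable-favorable": [4, 4, 4],
    "unpleasant-pleasant": [3, 5, 4],
}
post = {
    "bad-good": [6, 5, 7],
    "unfavorable-favorable": [5, 6, 7],
    "unpleasant-pleasant": [6, 6, 6],
}
shift = overall_score(post) - overall_score(pre)
print(pair_means(post))
print(shift)  # 2
```

Plotting the pair means as a line across the adjective pairs gives the familiar profile chart used for brand image comparisons.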

5. Stapel Scale

The Stapel scale, developed by Jan Stapel (1950s), is a unipolar rating scale measuring both direction and intensity of attitude toward a single object. It presents a single adjective (e.g., “Reliable”) at the center of a numerical scale running from +5 to -5 (or +3 to -3), with no neutral zero point. Respondents rate how accurately the adjective describes the object: +5 = very accurate, -5 = very inaccurate. Advantages: no need for bipolar adjective pairs (simpler than semantic differential); eliminates neutral midpoint ambiguity; easier to administer by phone. Disadvantages: less familiar to respondents; scoring interpretation can be confusing; unipolar nature may bias responses. Applications: measuring corporate image, product attribute ratings, and service quality perceptions. The Stapel scale correlates highly with semantic differential scores but remains less common. It is most useful when suitable bipolar opposites are difficult to identify.

6. Graphic Rating Scale

Graphic rating scales present a continuous line (typically 100 mm) between two extreme endpoints representing the attitude continuum. Respondents place a mark on the line; the score is the distance (mm) from the low endpoint. Endpoints are labeled (e.g., “Very Unsatisfied” to “Very Satisfied”), but intermediate positions are unlabeled. Advantages: theoretically infinite gradations (scores are often treated as interval- or ratio-level data); visually intuitive; avoids forced-choice categories; sensitive to small differences. Disadvantages: requires manual measurement with a ruler in paper versions; not suitable for telephone surveys; some respondents struggle with abstract placement; digital versions require precise touch or click input. Applications: customer effort scoring, perceived risk assessment, pain scales, and well-being research. In online surveys, slider controls implement graphic scales digitally. While offering high sensitivity, graphic scales may produce lower reliability than discrete Likert scales because respondents differ in how they interpret positions along an unlabeled line.

7. Rank Order Scales

Rank order scales require respondents to order a set of objects (brands, attributes, product features) from most preferred to least preferred. The ranks carry no magnitude; only order matters. For n objects, rank 1 denotes the most preferred and rank n the least preferred. Advantages: forces discrimination among alternatives; eliminates rating inflation (all items rated highly); simple and intuitive; useful when comparative rather than absolute judgment is natural. Disadvantages: provides only ordinal data (no distance information); becomes cumbersome for more than 7–10 objects (respondent fatigue). Applications: brand preference rankings, feature importance prioritization, job offer comparisons, and supplier selection. A ranking can be converted to paired comparison data, but because it is transitive by construction, any inconsistencies a respondent would show in direct paired comparisons are lost. Analysis uses the Friedman test, Kendall’s W (coefficient of concordance), and rank-order correlation (e.g., Spearman’s rho). Rank order scales are excellent for relative preference but poor for measuring absolute attitude strength.
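Kendall’s W, mentioned above, summarizes how closely a group of respondents agree in their rankings (0 = no agreement, 1 = identical rankings). The Python sketch below uses the standard formula for complete rankings without ties, W = 12S / (m²(n³ − n)), where m is the number of respondents, n the number of objects, and S the sum of squared deviations of the rank sums from their mean. The rankings themselves are invented.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for complete rankings without ties.

    rankings: one list per respondent; rankings[r][i] is the rank that
    respondent r gave to object i (1 = most preferred).
    """
    m = len(rankings)       # number of respondents
    n = len(rankings[0])    # number of objects ranked
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four brands by three respondents.
identical = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
mixed = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
opposed = [[1, 2, 3], [3, 2, 1]]

print(kendalls_w(identical))  # 1.0
print(kendalls_w(mixed))
print(kendalls_w(opposed))    # 0.0
```

Tied ranks require a correction term that this sketch omits.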

8. Paired Comparison Scale

Paired comparison scales present respondents with two objects at a time and ask which is preferred on a specific attribute (e.g., “Which brand has better quality, A or B?”). For n objects, the number of pairs is n(n-1)/2. For 5 brands, that is 10 comparisons; for 10 brands, 45 comparisons, which quickly becomes impractical. Advantages: simple, intuitive judgment; forces choice; high reliability for small sets; avoids context effects of rating scales. Disadvantages: time-consuming for more than 5–6 objects; no absolute intensity information; assumes transitivity (if A>B and B>C then A>C), which real respondents often violate. Applications: product testing (small sets of concepts), candidate selection, job attribute importance, and taste tests (Coke vs. Pepsi). Analysis: count the proportion of times each object is preferred; compute scale values using Thurstone’s Law of Comparative Judgment (Case V). Paired comparison is ideal for sensory discrimination but less common for general attitude measurement.
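Both the pair count and the Case V analysis can be sketched in Python. Under Case V, each preference proportion is converted to a standard normal deviate, and an object's scale value is the average of its deviates against all objects (counting zero for the comparison with itself). The preference counts below are invented. Clipping proportions to the range 0.01 to 0.99 is an assumption made here only to avoid infinite deviates when a preference is unanimous.

```python
from itertools import combinations
from statistics import NormalDist


def n_pairs(n):
    """Number of distinct pairs among n objects: n(n-1)/2."""
    return n * (n - 1) // 2


def case_v_scale_values(prefs, objects, n_respondents):
    """Thurstone Case V scale values from paired-comparison counts.

    prefs[(a, b)] is the number of respondents who preferred a over b,
    given once for each pair in the order produced by combinations(objects, 2).
    """
    inv_cdf = NormalDist().inv_cdf
    deviates = {obj: [] for obj in objects}
    for a, b in combinations(objects, 2):
        p = prefs[(a, b)] / n_respondents
        p = min(max(p, 0.01), 0.99)   # assumed clipping for unanimous pairs
        z = inv_cdf(p)                # normal deviate for "a preferred over b"
        deviates[a].append(z)
        deviates[b].append(-z)
    # Dividing by the number of objects includes the zero self-comparison.
    return {obj: sum(zs) / len(objects) for obj, zs in deviates.items()}


print(n_pairs(5), n_pairs(10))  # 10 45

# Hypothetical results from 100 respondents comparing three brands.
counts = {("A", "B"): 70, ("A", "C"): 80, ("B", "C"): 60}
values = case_v_scale_values(counts, ["A", "B", "C"], n_respondents=100)
print(values)
```

The resulting values sum to zero, so only the differences between objects are meaningful, not their absolute levels.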
