# The Science of Writing Clarity

Research-backed principles for maximum comprehension and readability.

## Contents
- Sentence Length Research
- Readability Metrics
- Cognitive Load Theory
- Plain Language Studies
- Active vs Passive Voice
- Word Complexity Research
- Line Length Research
- Practical Application

## Sentence Length Research

### American Press Institute Study

Comprehension rates by sentence length:
- **8 words or fewer**: Very easy (near 100% comprehension)
- **14 words**: 90%+ comprehension (optimal)
- **17 words**: Standard difficulty
- **21 words**: Fairly difficult
- **25 words**: Difficult (comprehension drops significantly)
- **43 words**: Less than 10% comprehension

### Screen Reader Accessibility Study (2020)

21 blind participants tested sentence lengths with screen readers:
- **16-20 words**: Highest comprehension, lowest cognitive workload
- **Over 25 words**: Significant comprehension impairment

### Practical Guidelines

**Target Averages:**
- General audience: 15-20 words
- Professional audience: 18-22 words
- Technical content: 20-25 words maximum

**Variation is key**: Mix short (5-10), medium (15-20), and occasional longer sentences (25-30) for rhythm.
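
The guidelines above can be checked mechanically. The following is a minimal sketch, assuming a naive sentence splitter (it breaks on `.`, `!`, or `?` followed by whitespace, so abbreviations such as "e.g." will be miscounted). The function names `sentence_lengths` and `summarize` are illustrative, not from any library.

```python
import re


def sentence_lengths(text: str) -> list[int]:
    """Return the word count of each sentence, using naive punctuation splitting."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def summarize(text: str, limit: int = 25) -> dict:
    """Report the average length, the longest sentence, and any sentences over the limit."""
    lengths = sentence_lengths(text)
    return {
        "average": sum(lengths) / len(lengths) if lengths else 0.0,
        "longest": max(lengths, default=0),
        "over_limit": [n for n in lengths if n > limit],
    }


sample = (
    "Short sentences help. "
    "Readers follow them easily and remember more of what they read."
)
print(summarize(sample))
```

The default `limit` of 25 words mirrors the point at which the studies above report a significant drop in comprehension.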

## Readability Metrics

### Flesch Reading Ease (FRE)

Scale: 0-100 (higher = easier)

| Score | Reading Level | Description |
|-------|--------------|---------|
| 90-100 | 5th grade | Very easy, understood by an average 11-year-old |
| 80-90 | 6th grade | Easy, conversational |
| 70-80 | 7th grade | Fairly easy |
| 60-70 | 8th-9th grade | Standard (optimal for most content) |
| 50-60 | 10th-12th grade | Fairly difficult |
| 30-50 | College | Difficult |
| 0-30 | College graduate | Very difficult |

**Formula factors:**
- Average sentence length (words per sentence)
- Average syllables per word
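
The two factors combine in the standard Flesch Reading Ease formula. Here is a minimal sketch that takes pre-computed counts as input; counting syllables reliably needs a dictionary or a heuristic, which is left out. The function name and sample counts are illustrative.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts (higher score = easier text)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word


# Illustrative counts: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1))
```

Because syllables per word carries the larger coefficient, swapping long words for short ones moves this score more than a comparable change in sentence length.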

### Flesch-Kincaid Grade Level

Estimates US grade level required to understand text.

**Targets by audience:**
- General public: 6th-8th grade
- Business communication: 8th-10th grade
- Technical professional: 10th-12th grade
- Academic: 12th+ grade
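
The grade level uses the same two inputs as Flesch Reading Ease, with different coefficients. This is a minimal sketch of the standard Flesch-Kincaid formula, again assuming the counts are already available. The function name and sample counts are illustrative.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (higher = harder text)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Illustrative counts: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(grade, 1))
```

A result near 10 would sit at the top of the business communication target and the bottom of the technical professional target listed above.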

### Validation

More than 70 years of research supports these formulas as practical predictors of reading difficulty because:
- Syllable complexity correlates with word familiarity
- Sentence length correlates with cognitive load
- Both are measurable and actionable

## Cognitive Load Theory

### Core Principles (John Sweller, 1980s)

Three types of cognitive load:

1. **Intrinsic load**: Inherent complexity of the content (unavoidable)
2. **Extraneous load**: Mental effort from poor presentation (reducible)
3. **Germane load**: Productive effort building understanding (desirable)

**Goal**: Minimize extraneous load to maximize comprehension.

### Working Memory Constraints

George Miller's research (1956):
- Working memory holds 5-9 "chunks" of information (the "seven, plus or minus two" finding)
- Each clause or phrase competes for limited slots
- Complex sentences overload working memory

**Practical implications:**
- One idea per sentence
- Chunk related information together
- Use headings to create mental categories

### Application to Writing

**High extraneous load (avoid):**
- Complex sentence structures
- Jargon requiring mental translation
- Dense paragraphs without visual breaks
- Passive voice requiring agent identification

**Low extraneous load (prefer):**
- Simple subject-verb-object structure
- Familiar vocabulary
- Visual hierarchy with headings and lists
- Active voice with clear actors

## Plain Language Studies

### Joseph Kimble's Research (50+ studies, 18 on legal documents)

**Study 1: Judicial Opinions (251 lawyers)**
- 61% preferred plain language versions
- Average sentence reduction: 25 → 19 words (24%)
- Rating improvement: 6/10 → 7/10

**Study 2: Lawsuit Papers (292 judges)**
- 66% preferred plain English
- 58% preferred informal plain English
- Page reduction: 3.25 → 2.5 pages (23%)
- Sentence reduction: 25.2 → 17.8 words (29%)

**Study 3: Court Forms (controlled experiment, 60 participants)**
- Plain language proof of service: 81% vs 61% accuracy (+33%)
- Plain language subpoena: 95% vs 65% accuracy (+46%)
- Document length reduction: ~40%

**Study 4: FCC Marine Radio Rules (Government regulation)**
- Correct answers: 10.66/20 → 16.85/20 (+58%)
- Response time: 2.97 → 1.62 minutes (-45%)
- Difficulty rating: 4.59/5 → 1.88/5 (-59%)
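
The percentages in parentheses in Studies 3 and 4 are relative changes from the original version, not percentage-point differences. A short sketch of the arithmetic, using the figures quoted above (the helper name `relative_change` is illustrative):

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100


# (original, plain-language) pairs taken from the study summaries above.
figures = {
    "proof of service accuracy": (61, 81),
    "subpoena accuracy": (65, 95),
    "FCC correct answers": (10.66, 16.85),
    "FCC response time": (2.97, 1.62),
    "FCC difficulty rating": (4.59, 1.88),
}

for label, (before, after) in figures.items():
    print(f"{label}: {relative_change(before, after):+.0f}%")
```

For example, accuracy rising from 61% to 81% is a gain of 20 percentage points, which is about 33% relative to the starting value.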

### Key Finding

Plain language consistently outperforms complex prose across all reader types—including lawyers, judges, and specialists who work with complex language daily.

## Active vs Passive Voice

### Experimental Evidence

Systematic review (2024) analyzing 9 studies:

**Consistent findings:**
- Active voice = lower grade-level readability scores (easier to read)
- Active voice = faster processing (15-20% improvement)
- Active voice = higher comprehension scores
- Active voice = fewer grammatical errors by readers
- Active voice = rated as more coherent

### Cognitive Explanation

Active voice matches natural language processing:
- Brain expects: subject → verb → object
- Passive requires mental reordering to identify agent
- Additional transformation step consumes working memory

### When Passive is Acceptable

Use passive voice only when:
- Actor is unknown: "The window was broken"
- Actor is irrelevant: "The experiment was conducted"
- Object emphasis needed: "The proposal was rejected"
- Hedging is appropriate: "Mistakes were made"

**Default to active** in all other cases.

### Conversion Examples

| Passive | Active |
|---------|--------|
| The report was written by the team | The team wrote the report |
| Errors were found in the code | We found errors in the code |
| The meeting was scheduled for 3pm | I scheduled the meeting for 3pm |
| A decision was made to proceed | The board decided to proceed |
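
Passive constructions like those in the table can be flagged with a rough pattern match. The sketch below is a heuristic only, assuming that a form of "to be" followed by a word ending in `-ed` or `-en` (or one of a few listed irregular participles) signals passive voice. It will miss some passives and flag some adjectives, so treat its output as candidates for review. All names are illustrative.

```python
import re

# A form of "to be" followed by a likely past participle.
PASSIVE = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+"
    r"(?:\w+ed|\w+en|made|found|done|sent|built)\b",
    re.IGNORECASE,
)


def looks_passive(sentence: str) -> bool:
    """Return True if the sentence matches the passive-voice heuristic."""
    return PASSIVE.search(sentence) is not None


def passive_share(sentences: list[str]) -> float:
    """Fraction of sentences flagged as passive (0.0 for an empty list)."""
    if not sentences:
        return 0.0
    return sum(looks_passive(s) for s in sentences) / len(sentences)


passive_examples = [
    "The report was written by the team",
    "Errors were found in the code",
    "The meeting was scheduled for 3pm",
    "A decision was made to proceed",
]
active_examples = [
    "The team wrote the report",
    "We found errors in the code",
    "I scheduled the meeting for 3pm",
    "The board decided to proceed",
]
print(passive_share(passive_examples + active_examples))
```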

## Word Complexity Research

### Ohio State Jargon Study (2019, N=650)

**Key finding**: Jargon reduced processing fluency by 76% even when definitions were provided.

**Effects of jargon:**
- Greater resistance to persuasion
- Increased risk perceptions
- Lower support for proposals
- Effect persisted across multiple topics

**Mechanism**: Jargon creates disfluency (a difficult processing experience). Readers misattribute the difficulty of the jargon to the content itself, which triggers skepticism.

### Syllable Reduction Impact

Each syllable reduction improves readability scores:

| Complex | Simple | Syllable Reduction |
|---------|--------|-------------------|
| utilize | use | 3 → 1 |
| commence | start | 2 → 1 |
| terminate | end | 3 → 1 |
| demonstrate | show | 3 → 1 |
| facilitate | help | 4 → 1 |
| subsequent | later | 3 → 2 |
| approximately | about | 5 → 2 |
| endeavor | try | 3 → 1 |
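
The substitutions in the table can be applied automatically as a first pass. This is a minimal sketch, assuming whole-word matches on the base forms only; inflected forms such as "utilized" are deliberately left alone because a blind swap would produce wrong grammar. The names `SIMPLER` and `simplify` are illustrative.

```python
import re

# Complex word -> simpler replacement, taken from the table above.
SIMPLER = {
    "utilize": "use",
    "commence": "start",
    "terminate": "end",
    "demonstrate": "show",
    "facilitate": "help",
    "subsequent": "later",
    "approximately": "about",
    "endeavor": "try",
}

PATTERN = re.compile(r"\b(" + "|".join(SIMPLER) + r")\b", re.IGNORECASE)


def simplify(text: str) -> str:
    """Replace listed complex words, preserving a leading capital letter."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = SIMPLER[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement

    return PATTERN.sub(swap, text)


print(simplify("We will commence testing and utilize the results."))
```

Review the output by hand, since a simpler word does not always fit the sentence it lands in.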

### Compound Effect

Small edits compound across documents. For example:
- 8,000-word document
- Replace 50 complex words
- Reduce 100 syllables
- Combined with shorter sentences, improve readability by 1-2 grade levels
- Increase comprehension by 10-20%
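
As a rough check on this example, the Flesch-Kincaid grade formula can be split into its two terms to see how much each kind of edit contributes. The sketch below assumes one-for-one word swaps (so the word count is unchanged) and the standard coefficients 0.39 and 11.8. The function names are illustrative.

```python
def grade_shift_from_syllables(words: int, syllables_removed: int) -> float:
    """Grade-level drop from removing syllables while keeping the word count."""
    return 11.8 * syllables_removed / words


def grade_shift_from_sentences(old_avg_words: float, new_avg_words: float) -> float:
    """Grade-level drop from shortening the average sentence."""
    return 0.39 * (old_avg_words - new_avg_words)


syllable_shift = grade_shift_from_syllables(words=8000, syllables_removed=100)
sentence_shift = grade_shift_from_sentences(old_avg_words=25, new_avg_words=20)
print(round(syllable_shift, 2), round(sentence_shift, 2))
```

Under these assumptions, removing 100 syllables from 8,000 words lowers the grade by about 0.15, while cutting the average sentence from 25 to 20 words lowers it by about 2. Most of a 1-2 grade improvement would therefore come from sentence length, with word swaps adding a smaller gain on top.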

## Line Length Research

### Optimal Character Count

- **Desktop**: 50-75 characters per line (66 optimal)
- **Mobile**: 30-50 characters per line
- **WCAG Standard**: Maximum 80 characters
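
For plain-text output, these limits can be applied with Python's standard `textwrap` module. This is a minimal sketch, assuming a 66-character target and an 80-character ceiling as listed above. The helper names are illustrative.

```python
import textwrap


def wrap_for_reading(text: str, width: int = 66) -> list[str]:
    """Wrap text so that no line exceeds `width` characters."""
    return textwrap.wrap(text, width=width)


def long_lines(lines: list[str], limit: int = 80) -> list[int]:
    """Return 1-based numbers of lines longer than `limit` characters."""
    return [i for i, line in enumerate(lines, start=1) if len(line) > limit]


paragraph = (
    "Lines that run too long make it hard for readers to find the start "
    "of the next line, while very short lines break the reading rhythm."
)
for line in wrap_for_reading(paragraph):
    print(line)
```

On the web, line length is usually controlled in the stylesheet rather than by hard-wrapping the text, so this approach suits terminals, emails, and plain-text files.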

### Eye-Tracking Evidence (Baymard Institute)

**Too long (100+ characters):**
- Hard to focus on text
- Difficult to find next line
- Eye fatigue increases 43%
- Comprehension drops 20%

**Too short (<40 characters):**
- Breaks reading rhythm
- Causes premature line jumps
- Increases cognitive load

**Optimal range (50-75 characters):**
- Natural eye movement
- Reader's attention is refreshed at the start of each new line
- 28% improvement in reading speed

## Practical Application

### Before Writing

1. Define target audience reading level
2. Set readability goal (FRE 60-70 for general)
3. Plan average sentence length (15-20 words)

### During Writing

1. Write in active voice by default
2. Use familiar words over jargon
3. Limit sentences to one main idea
4. Break complex ideas into multiple sentences

### After Writing

1. Run readability analysis
2. Identify sentences over 25 words → split
3. Find passive constructions → convert
4. Replace complex words → simplify
5. Re-check readability score

### Validation Metrics

Your content is optimized when:
- Average sentence length: 15-20 words
- Maximum sentence length: 25-30 words
- Passive voice: <10% of sentences
- Flesch Reading Ease: 60-70 (adjust for audience)
- Flesch-Kincaid Grade: 8-9 (adjust for audience)
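
These thresholds can be gathered into a single check. The sketch below assumes the metrics have already been measured by other tools and uses the general-audience targets listed above. The `Metrics` class and `check` function are illustrative names, and the ranges should be adjusted for other audiences.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    avg_sentence_words: float
    max_sentence_words: int
    passive_share: float  # fraction of sentences, 0.0 to 1.0
    flesch_reading_ease: float
    fk_grade: float


def check(m: Metrics) -> list[str]:
    """Return a list of targets the document misses (empty if all are met)."""
    problems = []
    if not 15 <= m.avg_sentence_words <= 20:
        problems.append("average sentence length outside 15-20 words")
    if m.max_sentence_words > 30:
        problems.append("longest sentence exceeds 30 words")
    if m.passive_share >= 0.10:
        problems.append("passive voice in 10% or more of sentences")
    if not 60 <= m.flesch_reading_ease <= 70:
        problems.append("Flesch Reading Ease outside 60-70")
    if not 8 <= m.fk_grade <= 9:
        problems.append("Flesch-Kincaid grade outside 8-9")
    return problems


draft = Metrics(24.0, 41, 0.20, 45.0, 12.0)
for problem in check(draft):
    print(problem)
```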
