Optimisation

A/B Testing Your Email Subject Lines: A Complete Guide

A systematic approach to subject line testing that compounds results over time, without burning your list on bad experiments.

Pete Devkota

Founder, emailOptimize · 30 January 2025 · 5 min read


Most email A/B testing is random. Two subject lines written in the same 20 minutes, split 50/50, declared a winner based on whichever opened 2% higher. Then the test is forgotten and next month’s campaign starts from scratch.

That approach wastes tests. Consider: 43% of people open an email based on the subject line alone (HubSpot), yet only 47% of marketers currently A/B test their subject lines. The majority are sending without testing, which means the upside for brands that do it properly is significant. Here’s a structured method that actually compounds.


What You’re Actually Testing

A subject line is not a standalone asset. It’s a promise that has to match three things:

  1. The sender: who the subscriber believes they’re hearing from
  2. The content: what the email actually delivers
  3. The subscriber’s context: what they care about right now

Tests that ignore this three-way relationship produce unreliable results. A subject line that wins in a brand-launch context can fail for a seasonal promotion. The winning variable isn’t the subject line. It’s the match between subject line and context.

Good A/B testing disciplines itself around one question at a time.


The Variables Worth Testing (In Order)

1. Personalisation vs Generic

Does adding the subscriber’s first name improve open rate? For most ecommerce lists, the answer is yes on average: personalising the subject line with the recipient’s first name increases open rates by 13–28%, but with significant segment variation. Personalisation works better with engaged subscribers than with cold or lapsed ones. Test this once per major segment tier rather than globally.

2. Curiosity vs Clarity

“You forgot something” versus “Your cart is waiting. Save 10%.” Curiosity works when there’s enough sender trust to make the subscriber want to know more. Clarity works when the subscriber needs a reason to open. New subscribers typically respond better to clarity. Champions respond better to curiosity.

This is the most impactful variable to test and the most context-dependent. Don’t test this once and globalise the result. Test it by segment and revisit quarterly.

3. Emoji vs No Emoji

Results vary enormously by brand, list age, and inbox provider. Generally: emoji in subject lines improve open rates on mobile for younger audiences, and have neutral-to-negative effects on older business-professional lists. Test this once, define a winner, and set a standard.

4. Length: Short vs Long

Short subject lines (20–35 characters) perform better in environments where the full subject line is visible in the inbox preview. Long subject lines (50–65 characters) can perform better when the additional specificity creates relevance. The data suggests the most-opened campaigns are 45% more likely to use subject lines between 20 and 40 characters, but the preview text interaction matters here, so test long subject lines with short preview text and vice versa.

5. Question vs Statement

“Ready to try our best seller?” versus “Our best seller is back in stock.” Questions engage curiosity and invite participation. Statements communicate urgency or news. For promotional campaigns, statements typically win. For educational content, questions often win.


How to Set Up a Valid Test in Klaviyo

Sample size: As a practical rule, you need at least 1,000 recipients per variant to detect typical open rate differences with statistical confidence. With smaller lists, use the same variant system but don’t declare a definitive winner. Treat the results as directional.

Winner metric: Use open rate for subject line tests. Click rate introduces too many variables (content, offer, CTA placement) to attribute to the subject line.

Wait time before calling the winner: Minimum 4 hours. Ideally 8–12 hours to capture late openers. Klaviyo’s automatic winner feature defaults to 4 hours, which is adequate for most lists.

Sample percentage: Test 50/50 on your full send list if your list is under 20,000. For larger lists, test on 20% (10% per variant), then send the winner to the remaining 80%. This reduces the risk of deploying a loser to your full list.
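The split logic above is simple arithmetic, but it's easy to get the thresholds wrong in a send tool. Here is a minimal sketch (the function name and 20,000 cutoff mirror the rule above; everything else is illustrative):

```python
def split_test_plan(list_size, test_fraction=0.20):
    """Plan a subject line test: lists under 20,000 split 50/50 on the
    full send; larger lists split test_fraction evenly between two
    variants and hold the remainder back for the winner."""
    if list_size < 20_000:
        per_variant = list_size // 2
        winner_pool = 0  # no hold-out; the test IS the full send
    else:
        test_pool = int(list_size * test_fraction)
        per_variant = test_pool // 2
        winner_pool = list_size - per_variant * 2
    return per_variant, winner_pool

# A 50,000-subscriber list: 5,000 per variant, 40,000 held for the winner
print(split_test_plan(50_000))
```

For the 50,000-subscriber case this reproduces the 10%-per-variant, 80%-to-the-winner split described above.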


Building a Testing Log

This is the part most brands skip. It’s also where the compounding happens.

For every test, record:

  • Hypothesis (what you expected and why)
  • Segment (who received the test)
  • Variables (exactly what changed)
  • Result (open rate for each variant, confidence interval if calculable)
  • Context (campaign type, send day, time, offer)
  • Conclusion (what this means for future tests)

After 10–15 tests, patterns emerge. You might find that curiosity-based subject lines consistently outperform clarity-based ones on your Champions segment, but underperform on new subscribers. That’s a finding you can act on systematically, and it’s more valuable than any single test result.
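A testing log doesn't need special software; an append-only CSV covers it. A sketch below, with field names mirroring the record list above (the names and file layout are illustrative, not a prescribed schema):

```python
import csv
import os
from dataclasses import dataclass, asdict, fields

@dataclass
class SubjectLineTest:
    # Fields mirror the log entries above; names are illustrative
    hypothesis: str
    segment: str
    variable: str       # the one thing that changed
    variant_a: str
    variant_b: str
    open_rate_a: float
    open_rate_b: float
    context: str        # campaign type, send day, time, offer
    conclusion: str

def log_test(path, test):
    """Append one test record to a CSV log, writing a header row
    only when the file is new or empty."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(test)])
        if new_file:
            writer.writeheader()
        writer.writerow(asdict(test))
```

Because each record carries its segment and context, the 10–15-test pattern analysis described above becomes a simple filter-and-compare over one file.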


What Not to Test

Don’t test two entirely different concepts. If variant A is a curiosity hook and variant B is a personalised urgency subject line, you don’t know which variable drove the difference. Test one change at a time.

Don’t test on lapsed segments. Open rates in lapsed segments are noisy and don’t reflect engaged subscriber behaviour. Tests run here produce unreliable data that generalises poorly.

Don’t declare a winner too early. A 1–2 percentage point open rate difference on a sample of 500 per variant is not statistically significant. The threshold for a meaningful difference on open rate is typically 3–4 percentage points with a sample of 1,000+ per variant.
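You can check this yourself with a standard two-proportion z-test rather than eyeballing the gap. A self-contained sketch (the function name is illustrative; the statistics are standard):

```python
import math

def open_rate_significance(opens_a, sends_a, opens_b, sends_b):
    """Two-proportion z-test on open rates.
    Returns the z statistic and the two-sided p-value."""
    p_a = opens_a / sends_a
    p_b = opens_b / sends_b
    # Pooled proportion under the null hypothesis of no difference
    p_pool = (opens_a + opens_b) / (sends_a + sends_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / sends_a + 1 / sends_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF (via the error function)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 22% vs 20% on 500 per variant: p > 0.05, directional at best
print(open_rate_significance(110, 500, 100, 500))
# 24% vs 20% on 1,200 per variant: p < 0.05, a real winner
print(open_rate_significance(288, 1200, 240, 1200))
```

Running the two examples confirms the rule of thumb above: a 2-point gap on 500 per variant is noise, while a 4-point gap on 1,200 per variant clears conventional significance.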


A Testing Calendar

Run one subject line test per campaign, without exception. Over a 12-month period with 2–3 campaigns per week, that’s 100–150 tests. Even with only 30% yielding directional learnings, you’ll have built a brand-specific playbook for subject line performance that no generic best practice can replicate.

The baseline you’re trying to beat: recent benchmarks put the average email open rate at around 30.7%, up from 26.6% in 2024. Regular A/B testing of subject lines can unlock 20–40% relative lifts in open rates on clean, engaged lists, which puts the best-performing programs well above that average.

That’s the compounding. Not any single winning subject line, but the systematic knowledge of what works for your subscribers, built over time.

Need help structuring your testing framework? Book a free strategy call.


