Why the Testing Method Matters
FTP is supposed to represent the highest power you can sustain for roughly one hour. But almost nobody tests with an actual 60-minute effort. Instead, we use shorter protocols with correction factors to estimate that threshold.
The problem is that each protocol leans on different energy systems. A 20-minute test is mostly aerobic with a small anaerobic contribution. A ramp test relies heavily on your VO2max and anaerobic capacity. Critical Power sidesteps the issue by modeling the whole power-duration relationship mathematically instead of relying on a single effort.
For athletes with balanced physiology, these methods converge. For everyone else, the differences can be 10-20 watts — enough to put every training zone in the wrong place.
The 20-Minute Test
This is the Coggan protocol from Training and Racing with a Power Meter and remains the most widely validated field test. The full protocol takes about 50 minutes including warm-up.
Protocol
- 20 minutes easy spin warm-up
- 3 × 1-minute high-cadence efforts (100+ rpm)
- 5 minutes all-out — this depletes anaerobic capacity (W') so the 20-minute effort reflects aerobic output
- 5 minutes easy recovery
- 20 minutes as hard as you can sustain — aim for even or slightly negative-split pacing
- Cool-down
FTP = 20-minute average power × 0.95
The 5% correction accounts for the anaerobic energy that still contributes to a 20-minute effort, even after the blow-out. Research by Borszcz et al. (2018) found that 20-minute power × 0.95 correlates at r = 0.97 with lab-measured MLSS in trained cyclists.
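As a quick worked example of the arithmetic above, here is a minimal Python sketch. The function name `ftp_from_20min` and the 300 W figure are illustrative assumptions, not part of the protocol itself.

```python
def ftp_from_20min(avg_power_w, factor=0.95):
    """Estimate FTP from a 20-minute average power.

    `factor` defaults to the standard 0.95 correction. The function name
    and the sample numbers below are illustrative assumptions.
    """
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return avg_power_w * factor


# Hypothetical rider who averaged 300 W for the 20-minute effort.
print(round(ftp_from_20min(300), 1))  # 285.0
```

Keeping the factor as a parameter makes it easy to try a different correction without touching the rest of the calculation.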
Who It Favors and Fails
The 0.95 factor is a population average. Athletes with a strong anaerobic system — sprinters, track cyclists — may produce a 20-minute number that overstates their true threshold. For these riders, 0.92-0.93 is more accurate.
Conversely, pure diesel-type time trialists who lack anaerobic punch may find 0.95 slightly conservative. The test also demands good pacing discipline. Going out too hard and fading 10 watts over the second half is the single most common mistake.
Key takeaway
The 20-minute test is the gold standard for a reason: it has the highest correlation with lab-tested lactate threshold. But the 0.95 factor is not universal — adjust it based on your rider type and whether you did the 5-minute blow-out.
The Ramp Test
Popularized by platforms like Zwift and TrainerRoad, the ramp test is the simplest protocol. You start easy, the power target increases every minute, and you ride until you physically cannot hold the target. No pacing decisions. No suffering for 20 minutes.
Protocol
- Start at approximately 100W (or ~50% of estimated FTP)
- Increase by a fixed increment every minute — typically 20W for men, 10-15W for women
- Continue until failure (you cannot maintain the target power or cadence drops below ~60 rpm)
- Total duration is usually 12-25 minutes
FTP = highest 1-minute average power (MAP) × 0.75
The 75% factor derives from the assumption that FTP is approximately 75% of Maximal Aerobic Power (MAP). This holds reasonably well at the population level but varies from 72% to 80% across individuals.
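The calculation can be sketched in Python as follows. This assumes power is recorded once per second, and the function names and the synthetic ramp (100 W start, 20 W steps, 15 completed steps) are made up for illustration. It is not how any particular platform implements its ramp test.

```python
def best_average_power(samples, window):
    """Highest mean over any `window` consecutive samples (1 Hz assumed)."""
    if len(samples) < window:
        raise ValueError("not enough samples for the requested window")
    window_sum = sum(samples[:window])
    best = window_sum
    # Slide the window one sample at a time, updating the running sum.
    for i in range(window, len(samples)):
        window_sum += samples[i] - samples[i - window]
        best = max(best, window_sum)
    return best / window


def ftp_from_ramp(samples, factor=0.75):
    """FTP estimate = best 1-minute average power (MAP) x factor."""
    return best_average_power(samples, 60) * factor


# Synthetic ramp: 100 W start, +20 W every minute, last full step at 380 W.
steps = [100 + 20 * i for i in range(15)]
samples = [watts for watts in steps for _ in range(60)]
print(ftp_from_ramp(samples))  # 285.0
```

Passing `factor=0.72` or `factor=0.80` to the same function shows how wide the estimate becomes across the individual range quoted above: roughly 274 W to 304 W for this one ramp.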
Who It Over- and Underestimates
Overestimates for: Athletes with high anaerobic capacity (sprinters, punchy riders). They push further into the ramp using anaerobic energy, inflating MAP and producing an FTP that is 5-15% too high. They then fail every threshold workout.
Underestimates for: Time trialists and endurance riders with poor anaerobic capacity but excellent fatigue resistance. They quit the ramp earlier (low MAP) but can sustain a higher fraction of it for an hour. Their true FTP/MAP ratio may be 0.80-0.82, not 0.75.
A 2020 study by Sitko et al. found that ramp-derived FTP overestimated MLSS by an average of 10.2% in well-trained cyclists. The effect was largest in riders with high anaerobic work capacity.
Key takeaway
The ramp test is convenient but the least individualized. The 0.75 factor does not account for anaerobic capacity differences. Use it for tracking trends over time, but validate with a longer test if your threshold workouts feel impossible (too high) or trivial (too low).
The 8-Minute Test
The 8-minute test was developed as a middle ground between the demanding 20-minute test and the less precise ramp test. You perform two 8-minute maximal efforts with 10 minutes of recovery between them.
Protocol
- Thorough warm-up (15-20 minutes including short efforts)
- 8 minutes all-out — steady, maximal effort
- 10 minutes easy recovery
- 8 minutes all-out — repeat
- Cool-down
FTP = average of the two 8-minute powers × 0.90
The 10% correction is larger than the 20-minute test's 5% because a shorter effort draws more from anaerobic reserves. Taking the average of two efforts reduces the impact of a single pacing error.
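A minimal sketch of that calculation in Python, with a hypothetical function name and made-up effort values:

```python
def ftp_from_8min(first_w, second_w, factor=0.90):
    """Average the two 8-minute powers, then apply the 0.90 correction."""
    mean_power = (first_w + second_w) / 2
    return mean_power * factor


# Hypothetical efforts: 320 W on the first, 310 W on the second.
print(round(ftp_from_8min(320, 310), 1))  # 283.5
```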
Pros and Cons
Pros: Easier to pace than 20 minutes. Two attempts average out mistakes. Less mentally daunting.
Cons: The 0.90 factor has less research validation than the 0.95 factor for 20-minute efforts. Two hard efforts in one session are mentally taxing in a different way. The test is less commonly used, meaning fewer benchmarks to compare against.
Critical Power: The Mathematical Approach
Critical Power (CP) is fundamentally different from the protocols above. Instead of a single test with a correction factor, CP uses a mathematical model fitted to multiple maximal efforts across different durations.
The Model
The CP model defines the relationship between power and duration using a two-parameter hyperbolic equation:
t = W' / (P − CP)
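Here t is time to exhaustion in seconds, P is power output in watts, CP is Critical Power (the asymptote of the curve, the power you could theoretically sustain indefinitely), and W' is the finite amount of work you can do above CP. W' is usually reported in kilojoules; in the equation it is in joules (1 kJ = 1,000 J, and one joule is one watt held for one second).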
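To make the equation concrete, here is a small Python sketch that evaluates it for one rider. The function name and the values (CP of 280 W, W' of 20 kJ, target of 330 W) are assumptions chosen for illustration.

```python
import math


def time_to_exhaustion_s(power_w, cp_w, w_prime_kj):
    """Seconds until W' is spent, per t = W' / (P - CP).

    At or below CP the two-parameter model predicts no exhaustion,
    so this returns infinity. Real riders do tire eventually.
    """
    if power_w <= cp_w:
        return math.inf
    w_prime_j = w_prime_kj * 1000.0  # convert kJ to joules
    return w_prime_j / (power_w - cp_w)


# Hypothetical rider: CP 280 W, W' 20 kJ, riding at 330 W.
print(time_to_exhaustion_s(330, 280, 20))  # 400.0
```

Riding 50 W above CP burns 50 joules of W' every second, so 20,000 joules last 400 seconds, a little under 7 minutes.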
To derive CP, you need at least three maximal efforts at different durations — typically 3, 7, and 12 minutes, or 2, 5, and 15 minutes. A curve-fitting algorithm solves for both CP and W' simultaneously.
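One simple way to do the fit, sketched below in Python, is to rearrange the hyperbola into its linear form: total work = CP × t + W'. An ordinary least-squares line through the (duration, work) points then gives CP as the slope and W' as the intercept. This is one common route rather than the only one, the function name is hypothetical, and the three efforts are synthetic values generated from a known CP and W' so the result can be checked.

```python
def fit_critical_power(efforts):
    """Fit CP (watts) and W' (joules) from maximal efforts.

    `efforts` is a list of (duration_s, avg_power_w) pairs. Uses the
    linear work-time form: work = CP * t + W'.
    """
    if len(efforts) < 3:
        raise ValueError("need at least three efforts")
    times = [t for t, _ in efforts]
    work = [t * p for t, p in efforts]  # joules done in each effort
    n = len(efforts)
    mean_t = sum(times) / n
    mean_w = sum(work) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("efforts must have different durations")
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(times, work))
    cp = sxy / sxx                  # slope of the work-time line
    w_prime = mean_w - cp * mean_t  # intercept of the work-time line
    return cp, w_prime


# Synthetic 3-, 7- and 12-minute efforts from CP = 280 W, W' = 20 kJ.
true_cp, true_w_prime = 280.0, 20000.0
efforts = [(t, true_cp + true_w_prime / t) for t in (180, 420, 720)]
cp, w_prime = fit_critical_power(efforts)
print(round(cp, 1), round(w_prime / 1000, 1))  # 280.0 20.0
```

Real test data will not sit exactly on the line, so the fitted values depend on how truly maximal each effort was.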
CP vs. FTP: Are They the Same?
No. CP is typically 3-8% higher than FTP. Research by Jones et al. (2019) found that CP overestimates MLSS by approximately 8% on average, while FTP (20-min × 0.95) overestimates by about 4%.
The difference exists because CP is a mathematical asymptote, not a physiological steady state. You can sustain CP-level power for 20-40 minutes, not indefinitely. For training zone purposes, many coaches apply a 5% reduction: FTP ≈ CP × 0.95.
Why CP Is More Robust
CP does not depend on a single correction factor for a single duration. It models the entire power-duration curve, making it less sensitive to day-to-day variation in motivation or anaerobic depletion. It also provides W' as a bonus — a direct measurement of your anaerobic battery.
The downside: it requires multiple maximal efforts, ideally on separate days. And the standard two-parameter model breaks down at very short (<2 min) and very long (>30 min) durations.
Key takeaway
Critical Power models the entire power-duration relationship rather than relying on a single effort and correction factor. It is the most robust approach, but CP typically sits 3-8% above FTP and overestimates MLSS, so use CP × 0.95 for training zone calculation.
Which Test Should You Use?
There is no single best test for every athlete. The right choice depends on your experience, equipment, and goals.
| Scenario | Best Protocol | Why |
|---|---|---|
| First time testing | Ramp test | No pacing skill needed. Gets you a starting number. |
| Setting accurate zones | 20-minute test | Highest correlation with MLSS. Best for zone accuracy. |
| Pacing is a weakness | 8-minute test | Two shorter efforts reduce pacing error. |
| Advanced / data-driven | Critical Power | Most robust. Also gives W' for anaerobic profiling. |
| Sprinter / high anaerobic | 20-min (0.92 factor) | Ramp will overestimate. Use a lower correction factor. |
| Tracking progress monthly | Ramp test | Quick and repeatable. Good for trend detection. |
One practical rule: whatever test you choose, use the same protocol every time. Switching between protocols makes it impossible to compare results across time. Consistency matters more than which specific test you pick.
Paincave's Approach: No Formal Test Needed
Paincave takes a different approach entirely. Instead of asking you to schedule a formal test day, it automatically detects your FTP from your ride data using a rolling 90-day window.
The algorithm finds your best 20-minute power from any ride in the last 90 days and applies the 0.95 correction factor. When you produce a new best effort — whether in a race, a group ride, or a hard solo session — Paincave detects the breakthrough and updates your FTP automatically.
This approach has three advantages over formal testing:
- No dedicated test days — your FTP updates organically from normal training and racing
- Always current — if your fitness changes, your FTP follows within the 90-day window
- Race-day accuracy — maximal efforts in competition often exceed what athletes produce in isolated tests, because external motivation is higher
The trade-off is that you need at least one near-maximal 20-minute effort in your recent training history. If you only ride at endurance pace, automatic detection will not capture your true threshold. But for most athletes who include any intensity in their training, the rolling-window approach keeps zones accurate without the mental and physical cost of formal testing.
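The rolling-window idea can be illustrated with a short Python sketch. This is not Paincave's actual code: the function name, the `(date, best_20min_power)` data shape, and the sample rides are all assumptions made for the example.

```python
from datetime import date, timedelta


def rolling_ftp(rides, today, window_days=90, factor=0.95):
    """Best 20-minute power inside the window, times the correction factor.

    `rides` is a list of (ride_date, best_20min_power_w) pairs, a
    hypothetical data shape. Returns None if no ride falls in the window.
    """
    cutoff = today - timedelta(days=window_days)
    in_window = [power for ride_date, power in rides
                 if cutoff <= ride_date <= today]
    if not in_window:
        return None
    return max(in_window) * factor


rides = [
    (date(2024, 1, 5), 320.0),   # older than 90 days, so ignored
    (date(2024, 4, 20), 290.0),
    (date(2024, 5, 18), 300.0),  # best effort inside the window
]
print(round(rolling_ftp(rides, today=date(2024, 6, 1)), 1))  # 285.0
```

The example also shows the other side of the window: the 320 W effort from January no longer counts, so the estimate drops once an old best ages out.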
Key takeaway
Paincave tracks your best 20-minute power over a rolling 90-day window and updates FTP automatically when breakthroughs happen. No test days, no pacing anxiety, no disrupted training weeks.
Common Testing Mistakes
1. Skipping the 5-Minute Blow-Out
The 5-minute all-out effort before a 20-minute test is not a warm-up luxury. It depletes W' (your anaerobic battery) so that the 20-minute result reflects aerobic capacity. Without it, your 20-minute power may be 3-5% too high, and the 0.95 factor will not fully correct for it.
2. Poor Pacing
Starting a 20-minute test 10% too hard is the most common error. You feel strong for the first 5 minutes, then fade progressively for the remaining 15. A well-paced test is even or slightly negative-split, with the second half marginally harder than the first. As an opening target, ride at your estimated FTP plus 3-5%, which sits just under the 20-minute power that FTP implies (FTP ÷ 0.95 is roughly FTP plus 5%), and treat your estimated FTP as the floor.
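Both ideas are easy to check after the fact. The Python sketch below computes an opening power range from an estimated FTP and compares the two halves of a test. It assumes one power sample per second, and the function names and sample data are hypothetical.

```python
def opening_power_range(estimated_ftp_w):
    """Opening target: estimated FTP plus 3% to 5%."""
    return estimated_ftp_w * 1.03, estimated_ftp_w * 1.05


def half_split_delta(samples):
    """Second-half average minus first-half average, in watts.

    Positive means a negative split (finished harder than started);
    negative means the rider faded.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mid = len(samples) // 2
    first = sum(samples[:mid]) / mid
    second = sum(samples[mid:]) / (len(samples) - mid)
    return second - first


# Hypothetical rider with an estimated FTP of 280 W.
print([round(w) for w in opening_power_range(280)])  # [288, 294]

# A faded test: 10 minutes at 310 W, then 10 minutes at 290 W.
faded = [310] * 600 + [290] * 600
print(half_split_delta(faded))  # -20.0
```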
3. Indoor vs. Outdoor Mismatch
Most cyclists produce 5-15% less power indoors due to heat buildup, reduced inertia, and absence of external cues. Testing indoors and then using that FTP for outdoor riding means your zones are too easy. Test in the environment where you primarily train, or maintain separate values.
4. Testing Too Often
An FTP test is a maximal effort that generates significant fatigue. Testing every week wastes training time and creates unnecessary recovery debt. Test every 4-6 weeks during a training block, or let automatic detection handle it.
5. Testing When Fatigued
If you test at the end of a heavy training week, your result will be suppressed by accumulated fatigue. Schedule tests after 1-2 rest days when your legs are fresh. The number should reflect your capability, not your current tiredness.
6. Using the Wrong Factor for Your Physiology
The standard correction factors (0.95 for 20 min, 0.75 for ramp) are population averages. If your threshold workouts consistently feel impossible, your FTP is too high, so lower the factor by 2-3 points (for example, 0.95 down to 0.92-0.93). If they feel too easy, it is too low, so raise the factor by a similar margin. If neither adjustment helps, switch protocols.
The Bottom Line
Every FTP testing method is an estimate. The 20-minute test has the strongest research backing. The ramp test is convenient but biased by anaerobic capacity. The 8-minute test splits the difference. Critical Power is the most scientifically rigorous but requires the most effort.
What matters most is not which test you choose — it is that your zones feel right in practice. If your threshold workouts are barely survivable for 10 minutes, your FTP is too high. If you can hold "threshold" for 45 minutes comfortably, it is too low. The test gives you a starting point. Your body tells you whether it is correct.