Wednesday, July 01, 2009

The Sins of Sins (Testing FTP #2)

One of the most frequently referenced items on this good ol’ blog of mine (I can call it old, it’s in its fourth year – that’s officially old in interwebby speak) was an item I penned about ways to estimate your Functional Threshold Power (FTP - maximal quasi steady state average power one can sustain for about an hour).

That post was basically an expansion on the original information provided by Dr Andrew Coggan, publicly posted many years ago on the Wattage Forum and dubbed “The Seven Deadly Sins”.

Indeed since writing it, this one blog item has been viewed nearly 45,000 times.
Here is the link to the original post:
The Seven Deadly Sins

It was recently suggested to me (by Steve Palladino) that it might be worthwhile to pen a follow up to that post. One that explores some of the common mistakes people make when attempting to estimate their FTP. So here are a few thoughts on the subject.

As is often the case, none of this is particularly original. Most of these are just accumulated tidbits of information and knowledge, and it is by no means an exhaustive list. I may even have some of it wrong. You may have others worth adding, or corrections – by all means let me know – happy to add them to the examples listed.

Before getting into the list – The Sins of Sins – I will say that estimating FTP is important and the reasons for that are outlined in my previously linked post. It's not important in a “curing cancer” kind of way, but getting it right to at least a reasonable level of accuracy is pretty darn handy as there are many other very useful facets of training and racing with power that rely on having a good FTP estimate.

One doesn’t need to be completely anal about it, and testing really often is not typically necessary (a few times a year is usually enough – the appropriate frequency depends on individual circumstances). Nailing it down to the watt is not necessary either; the nearest five watts is typically more than sufficient.

The Sins of Sins – Top 10 (in no specific order):
SOS #1 – Not testing at all
SOS #2 – Not using an accurate power meter
SOS #3 – Using inconsistent methodologies
SOS #4 – Not replicating riding conditions in testing
SOS #5 – Ignoring signs that FTP has changed
SOS #6 – 95% of a 20-min mean maximal power = FTP
SOS #7 – Using NP from rides << 1-hour
SOS #8 – Inappropriate use of the CP model
SOS #9 – Not performing maximal efforts
SOS #10 – “I’ve got an NP buster!”

OK, let’s examine each in a little more detail...

Sin of Sins #1 – Not testing at all
OK, this might seem a bit redundant, but honestly there are people who think they can get away with no testing at all but still want to know what their FTP is. Or that testing is such an impost in the training / racing schedule that it is “harmful” to schedule it. Bollocks.

Given the adage “training is testing, testing is training” then really there’s no excuse for never doing an effort or two in order to nail down one’s FTP more tightly than a lame guess. Stop wondering and go and do it. Gee, I feel better already.

Of course, an experienced eye can often inspect the mass of an individual’s power meter data and probably come up with a reasonable SWAG. But far better to schedule a test and be certain.

Sin of Sins #2 – Not using an accurate power meter
(and/or not using a power meter at all)

This is also a pretty obvious sin of sins but it happens. If you are going to use a power meter, it makes a lot of sense to ensure you are collecting accurate data. Otherwise how are you going to be sure that changes in power output as reported are in fact representative of actual changes in performance?

Check your meter’s calibration and make sure you perform the appropriate torque zero / zero offset procedure so that the data can be considered reliable. Neither is hard to do nor time consuming.

And if you don’t have a power meter, sure, go time yourself up a long steep hill climb and make an estimate of power output using, but then what? Without a reliable means to collect power data at other times, then the primary benefits of knowing your FTP and all that flows from it are not accessible. So use the hill climb as a good fitness test but the power estimate is essentially for satisfying curiosity or bragging rights at the coffee shop.

Sin of Sins #3 – Using inconsistent methodologies
This is pretty common. When you start out with a power meter, naturally you’ll want to work out the best, most reliable method for your particular circumstances. Everyone has different terrain to ride on, levels of traffic to contend with, opportunities to do a time trial, or time in which they can safely perform a test where they live, or can’t get outside for months on end, etc etc, so the sin(s) they choose to use as most appropriate to estimate FTP are different.

But once you have settled on a good method, then stick with it and replicate the same protocol each time. Reducing the number of variables that can influence the outcome makes the data, and what can be interpreted from it, more reliable.

Examples of consistency might include:
- Using the same venue
- Using the same number of light, recovery ride or rest day(s) before the test(s)
- Performing tests in the same order, with the same break in between
- Performing the tests on the same number of days apart (or always on the same day)
- Using the same equipment
- Looking for similar environmental conditions if possible
- Performing tests over the same distance/duration

Of course it is not always easy or practical to replicate everything, every time, but at least consider these factors when deciding on a test method. Some methods lend themselves to a more consistent protocol than others. A time trial over the same course, or a Maximal Aerobic Power test, are examples of those which enable consistency without too much thinking involved.

Sin of Sins #4 – Not replicating riding conditions in testing
This might not be as bad as it can seem at first but it makes sense to at least use a test method using the bike/equipment/terrain/location/bike position etc that comprises the majority of your riding at that stage of your training/season.

This is especially the case when there is likely to be a significant difference in the performance (power) using the test method versus what you would ordinarily be able to produce. For example, if you only ride indoors occasionally and know you struggle to generate the same power as you typically do outdoors, then don’t use the indoor trainer to test FTP.

Sin of Sins #5 – Ignoring signs that FTP has changed
“I had a two hour group run today and my Intensity Factor was 1.07”.
Provided you are not falling for SOS #1 or SOS #2, then be on the lookout for signs that FTP may indeed have shifted significantly. There are a number of them and they include:

- Actual performance not consistent with current FTP estimate, such as AP/NP from a 40km TT that is significantly different from FTP

- An Intensity Factor (IF) > 1.05 for any ride or section of a ride of about an hour

- Regular long intervals at/near FTP becoming “easy(ish)”

- Perceived exertion for rides not consistent with intended level (e.g. a tempo power ride feels more like an endurance ride)

- A steeper than typically sustainable medium term rise in Chronic Training Load, e.g. your CTL has apparently risen at a much higher rate than you would normally expect to sustain without getting ill/niggles/overly fatigued (e.g. > 8 TSS/day/week but maybe less for some)

Now these are signs that FTP may need retesting but are not necessarily good tests in themselves. So ignore them at your peril but don’t jump to inappropriate conclusions or immediately adjust FTP. Gather some additional evidence.
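
As a rough illustration of the Intensity Factor flag in the list above, here is a minimal Python sketch. The function names and the 50-minute lower bound used for "about an hour" are assumptions of mine for illustration; the 1.05 threshold is the one quoted in the list.

```python
def intensity_factor(normalised_power_w, ftp_w):
    """Intensity Factor: ratio of Normalised Power to the current FTP estimate."""
    return normalised_power_w / ftp_w


def ftp_may_have_changed(normalised_power_w, ftp_w, duration_min,
                         if_threshold=1.05, min_duration_min=50):
    """Flag a ride (or section of a ride) of about an hour or more with a high IF.

    min_duration_min is an assumed reading of "about an hour".
    A True result is a prompt to gather more evidence, not a new FTP value.
    """
    long_enough = duration_min >= min_duration_min
    return long_enough and intensity_factor(normalised_power_w, ftp_w) > if_threshold
```

With an FTP estimate of 280 W, a two hour ride at an NP of 300 W gives an IF of about 1.07 and would be flagged, while a 20-minute section at the same NP would not.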

Sin of Sins #6 – 95% of a 20-min mean maximal power = FTP
Well, this method of establishing FTP isn’t one of the listed Seven Deadly Sins in the first place, but it has become such a commonly referred to/utilised method (mainly due to its publication in the excellent book, Training and Racing with a Power Meter) that it gets its own SOS number.

Firstly, the main issue with this common Sin of Sins is that the ratio between 20-min power (or other similar shorter TT duration power) and FTP is not the same for everybody, and neither does the ratio remain static for an individual. One should recognise that, due to several factors, not least of which are the contribution of anaerobic capacity and the exact protocol used (e.g. performing a pre-ride blowout effort), the ratio is likely to be within a range, and where someone is within that range is anyone’s guess.

So, FTP might be anywhere in the range of, say 90% to 98% of 20-min max average power. Personally, my FTP has been at both 92% and 96% of my then 20-min max average power. So, by all means use 95% of 20-min max power as a starting point but remember it may well be out by some margin and it would be wise to use an additional or alternative method to validate your FTP estimate.
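
To put numbers on that range, here is a small Python sketch. The function names are hypothetical; the 90%, 95% and 98% factors and the nearest-five-watts rounding are the ones mentioned in this post.

```python
def ftp_range_from_20min(p20_w, low=0.90, starting_point=0.95, high=0.98):
    """Return (low, starting point, high) FTP estimates in watts from 20-min power."""
    return (p20_w * low, p20_w * starting_point, p20_w * high)


def round_to_5w(watts):
    """Round to the nearest 5 W, which is plenty precise for an FTP estimate."""
    return 5 * round(watts / 5)


low_w, start_w, high_w = ftp_range_from_20min(300)
print(round_to_5w(low_w), round_to_5w(start_w), round_to_5w(high_w))
```

For a 20-min max average power of 300 W, the plausible FTP range runs from roughly 270 W to 294 W, with 285 W as the 95% starting point. That 24 W spread is why a second method to validate the estimate is worthwhile.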

Sin of Sins #7 – Using NP from rides << 1-hour
“My 20-min max NP from that crit was 378 watts, so is my FTP 95% of that, i.e. 359 watts?”

Er, no.

Apart from falling for SOS #6, the efficacy of the Normalised Power algorithm in providing a “normalised iso-power equivalent” begins to drop somewhat as the duration shortens to substantially less than one hour. 20-minutes is in that grey zone. 30-minutes ain’t too shabby but I think anything less than 40-50 minutes is stretching the envelope a bit much for a reliable number from which to make an estimate of FTP.
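
For reference, here is a minimal Python sketch of the Normalised Power calculation as it is commonly published: a 30-second rolling average, raised to the 4th power, averaged, then the 4th root taken. It assumes clean 1 Hz power samples with no gaps, and analysis software may differ in details. The 40-minute cut-off in the helper is simply my lower bound from the paragraph above, and both function names are hypothetical.

```python
def normalised_power(watts_1hz, window_s=30):
    """Normalised Power from 1 Hz power samples (assumes no gaps in the data)."""
    if len(watts_1hz) < window_s:
        raise ValueError("need at least one full rolling-average window")
    # Rolling average over the trailing window, updated incrementally.
    window_sum = sum(watts_1hz[:window_s])
    rolling = [window_sum / window_s]
    for i in range(window_s, len(watts_1hz)):
        window_sum += watts_1hz[i] - watts_1hz[i - window_s]
        rolling.append(window_sum / window_s)
    # Mean of the 4th powers, then the 4th root.
    mean_fourth = sum(p ** 4 for p in rolling) / len(rolling)
    return mean_fourth ** 0.25


def np_long_enough_for_ftp(duration_min, min_duration_min=40):
    """True if the effort is long enough for NP to be a sensible guide to FTP."""
    return duration_min >= min_duration_min
```

A perfectly steady effort returns an NP equal to its average power, while a stop-start effort with the same average returns a noticeably higher NP. The shorter the effort, the more that gap can be inflated by anaerobic contribution, which is the reason for the duration check.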

Sin of Sins #8 – Inappropriate use of the CP model
The Critical Power (CP) model is a useful way to estimate FTP. See my previously linked item on the Seven Deadly Sins to find out a bit more on how it works.

The calculation of CP is sensitive to both the way data is collected and the data chosen to input into the model. So ignoring reasons for these sensitivities can introduce unwanted errors. Common SOS#8 mistakes are:

- Using data from inappropriate test durations. Ideally you will want data from within a range of durations – typically tests should be at least 3 minutes and no longer than 30 minutes duration. Tests from very short (e.g. 1-minute) or long durations (e.g. 60-min) tend to skew the calculations somewhat. Besides, if you have a 60-min test, then CP is somewhat redundant.

- Using data from test durations that are too close to each other, e.g. 3-min and 6-min. It is far better to use one test of ~ 3-6 min and one of ~ 20-30-min. Can also include another from a duration in between but two really good points with sufficient spread between them is all that's really needed.

- Using multiple data points which include unreliable data, such as a test that was not truly a maximal effort for the duration or was tainted due to the protocol/method used to collect the data. Far better to have two very good data points than four data points with one or two suspect numbers.

- Not using the same test durations each time. E.g. using a 6-min and a 20-min test and next time using a 3-min and 28-min test. Pick your sample durations and stick with them, within reason. This is not as easy as it seems, since if you are doing a 5-min test, how hard do you go? It can be easier to pick a power level you expect to maintain for the duration and go ’til you blow. But if it becomes a significantly different duration, it may affect the outcome.

- Using a different protocol to collect the data. Principles of SOS #3 apply. If you perform both, say a 5-min and a 25-min test on the same day, then next time do it the same way and in the same order. If you perform the tests on different days, then be consistent about that protocol.

- Similarly, avoid cherry picking mean maximal power data from different rides, e.g. a local TT and last week’s crit and then next time a Level 4 training effort and the hillclimb during the local world’s bunch ride.

- Selecting non-contemporaneous data. Now that’s a big word. What I mean is, you don’t select your best 5-min power from three months ago and combine it with a 25-min test from last week. The data must be from the same time period (I suggest the limit for data collection be approximately one ATL time constant or around 7-10 days)

- Using Normalised Power. Don't. Use Average Power.

- Not weighing yourself or using the wrong body mass for the model (note that this doesn't affect CP calculations, just some versions of the model also quote or calculate CP in W/kg terms).
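
Several of the checks in the list above are mechanical enough to sketch in Python. This is only an illustration: the function name, the tuple layout and the warning wording are my own assumptions, while the 3 to 30 minute limits, the short and long test bands and the roughly 10-day window come from the points above. It cannot tell whether an effort was truly maximal or whether Average Power was used rather than NP.

```python
from datetime import date


def check_cp_inputs(tests, max_days_apart=10):
    """Return a list of warnings for CP model inputs.

    tests: list of (test_date, duration_min, average_power_w) tuples.
    """
    warnings = []
    if len(tests) < 2:
        return ["need at least two tests"]
    durations = [duration for _, duration, _ in tests]
    if any(d < 3 or d > 30 for d in durations):
        warnings.append("a test duration is outside 3-30 min")
    if not any(3 <= d <= 6 for d in durations):
        warnings.append("no short test of about 3-6 min")
    if not any(20 <= d <= 30 for d in durations):
        warnings.append("no long test of about 20-30 min")
    dates = [test_date for test_date, _, _ in tests]
    if (max(dates) - min(dates)).days > max_days_apart:
        warnings.append("tests are not contemporaneous")
    return warnings


print(check_cp_inputs([(date(2009, 6, 1), 5, 350), (date(2009, 6, 3), 25, 290)]))
```

An empty list means none of these particular mistakes were detected, not that the data is good.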

Note that the CP value calculated by the model is typically a better estimate of FTP than the 60-min power predicted by the model. The 60-min power prediction is usually a bit higher than the CP value.
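
To see why the 60-min prediction sits above CP, here is a minimal Python sketch of a two-point fit. It assumes the standard two-parameter form of the model, in which total work = CP x time + AWC (anaerobic work capacity). The spreadsheet versions people use may be set up differently, and the function names and test numbers are made up for illustration.

```python
def two_point_cp(t1_s, p1_w, t2_s, p2_w):
    """Fit CP (W) and AWC (J) through two maximal tests: work = CP * t + AWC."""
    if t1_s == t2_s:
        raise ValueError("the two tests must have different durations")
    work1_j = p1_w * t1_s
    work2_j = p2_w * t2_s
    cp_w = (work2_j - work1_j) / (t2_s - t1_s)  # slope of the work-time line
    awc_j = work1_j - cp_w * t1_s               # intercept of the work-time line
    return cp_w, awc_j


def predicted_power(cp_w, awc_j, duration_s):
    """Model-predicted mean maximal power for a given duration."""
    return cp_w + awc_j / duration_s


# Hypothetical tests: 5 min at 350 W and 25 min at 290 W (Average Power, not NP).
cp_w, awc_j = two_point_cp(300, 350, 1500, 290)
print(cp_w, awc_j, predicted_power(cp_w, awc_j, 3600))
```

With these made-up numbers CP works out to 275 W and AWC to 22,500 J, so the model's 60-min prediction is 275 + 22,500 / 3,600 = 281.25 W. The AWC term never quite disappears at one hour, so the 60-min prediction always lands a little above CP.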

Note added June 2013:
The Golden Cheetah power meter analysis software has a built in feature that uses the principles of the critical power model to provide a CP estimate based on your power meter files. I am not exactly sure of the means by which GC's implementation derives its estimate, but I suspect it is susceptible to the problem of cherry picking data, using inconsistent data, and possibly not including data from efforts of sufficient duration as mentioned above.

As a result, use of the CP model implemented in this manner routinely overestimates FTP. Initial data as assessed by Dr Coggan indicates a typical overestimation of around 5%. This presumes there is sufficient actual data with maximal efforts across various durations.

Sin of Sins #9 – Not performing maximal efforts
Testing performance requires one to go to the limit, otherwise one can never know where that limit is. There is some sub-maximal testing one can do, such as determining lactate threshold in the lab, but for the purposes of using a power meter to ascertain FTP, one does need to lay it all on the line.

Of course it goes without saying that one should be sufficiently fit and healthy to perform maximal effort testing. Undergoing testing while health concerns exist may well end up being the biggest mistake of all!

Sin of Sins #10 – "I’ve got an NP buster!"
No you don’t*.
It is 99.99% likely that:
(i) your FTP is underestimated, or
(ii) the duration you are referring to is not about an hour, or
(iii) your power meter data is suspect – reference SOS #2.

* OK it is possible, just highly improbable and some substantive evidence is required before making such a declaration and joining this rare club.

Finally, there’s not much point in taking your track bike to the local velodrome, doing a whole bunch of anaerobic efforts while tooling around the infield in between efforts, racking up some weirdo NP number due to all the breaks and then seeking to use it as a guide to FTP. The test needs to be realistic for the purpose. This is a variant of SOS #4.

I’d expand some more on this, like “what the %&%$ is an NP buster?” and “I do so have an NP buster” but perhaps I’ll save that for another day.

OK, that’s enough for today. It was a bit long but hopefully it can help you to avoid some of the more common pitfalls when attempting to estimate your FTP. It's not all that hard.

Good luck and safe riding!


AH said...

As always, Alex, very educational. Thanks for sharing.

Anonymous said...

Is body mass needed for the Critical Power Model?

Groover said...

I just ordered Coggan's and Allen's book because my head is all dizzy from the FTPs and NPs. :-) One question though and I hope it's not a silly one: Once you have determined your FTP correctly, what do you do with it? Do you base your power training zones on it like you do with heart rate zones?

Alex Simmons said...

Body mass is not required to determine CP, however since CP is expressed as W/kg, then body mass is required to determine an FTP estimate or an estimate of mean maximal power for other durations from, say, 3 to 60 minutes.

Alex Simmons said...

Well FTP provides you with a number of things, including an indication of how hard you could ride a time trial (so it is a good pacing tool), a benchmark for fitness changes as well as a method to determine power training levels, analogous to setting HR zones, although there are a number of important distinctions when training using HR & power.

There are other methods to set training levels (e.g. using Maximal Aerobic Power) but reference to FTP is very common and well understood by most experienced power meter users.

Probably the most important concept to get your head around is that of Normalised Power, and the ratio of NP to FTP for your rides (the Intensity Factor). There is a whole new world to be opened up on training management with those indicators.

rmur said...

"Body mass is not required to determine CP, however since CP is expressed as W/kg, ...."

hmmm .. it doesn't have to be so Alex. I know one, perhaps THE original, spreadsheet from Eddie Monnier expressed AWC and CP in per kg terms but that's no absolute. My own old CP-AWC spreadsheet makes nary a mention of mass/kg and works just fine :-0)


Alex Simmons said...

Hi Ric
Yeah - you're right, you don't need body mass to calculate CP (I think I said that already) but some models do have CP expressed as W/kg (and my version of the model is also from Monnier).

Since I have seen a model with the wrong mass used giving a false impression of FTP, I just marked that down as one item to check if that's how your version of the CP model works. It's no biggie.

Unknown said...

On this one:
"Note that the CP value calculated by the model is typically a better estimate of FTP than the 60-min power predicted by the model. The 60-min power prediction is usually a bit higher than the CP value."

Why is CP not equal to 60' power, hence FTP estimate? And why use CP, not 60' power estimate?

Thanks for the great post.

Xplora said...

Hi Alex, you mentioned you shouldn't use NP for FTP estimation... I had a best average power of 232W at the West Head race, and a 1 hour best NP of 303W (WOW). Race was 90 minutes all up. If I had my FTP estimate at 265W, would this be justification to increase it further?

Alex Simmons said...

I haven't said you shouldn't use NP. I've said that NP isn't a valid input for the critical power model and that you shouldn't consider NP from efforts much less than an hour's duration for the purpose of assessing your FTP.

If you had a *hard* ride of about an hour, then FTP will far more likely be closer to NP than AP.

So 90-min AP of 232W < FTP <= 1-hour NP of 303W.

I would say that, provided the power data is accurate and NP has been correctly calculated, it's indicative of an FTP higher than 265W, and that you might want to consider other sins to validate.

Alex Simmons said...

CP is a model based on inputs from rides of much less than an hour's duration, whereas FTP is what we can actually do.

The CP model tends to over estimate what we can actually do for longer durations, how much it over estimates depends on the duration of input chosen (and is why it's suggested test durations should include an effort of 20-30 minutes).

Fred said...

Hello Alex. Congratulations on the blog. I found it by chance when searching on MAP. I rather enjoy the content. I know this post is rather old, but I can not leave it blank.
As I am also new to power meters, I'm still assimilating the terms. In January-14 I did the FTP test and found a value of 268w. In a recent ride my 60s peak was 415w.
If FTP is 75% of MAP, and MAP is the max power in your 60s peak, should my FTP be 311? I know that the paradigm of the MAP is not correct, but it gives a guideline. Right?

Alex Simmons said...

Hi Fred

MAP is not the maximal 1-minute power you can generate, but rather the highest power for 1-minute *during an incremental test to exhaustion*.

The pre-fatiguing nature of such a test prior to that final minute means that MAP will be significantly lower than the power you could hold for 1-minute without pre-fatiguing.

Most people could maximally sustain MAP for somewhat longer than 1-minute, usually for a few minutes. Exactly how long is variable, but I'd say your 3-minute max power will be closer approximation of MAP. Some might last less than 3 minutes, some longer.
Cheers, Alex

Derek said...

Hi Alex. Just discovered your blog and absolutely love it! So helpful, especially as I've just started training with power this winter. Hope you will see this comment even though it's an older blog...

I have a Stages left crank meter on my mountain bike and Powertap P1 pedals on my road bike. I realise I'm committing several sins here but as an experiment I did a 30min FTP test with the mtb on rollers and four weeks later a 20min test with the road bike on a turbo trainer using Coggan's routine from the book. I appreciate they are different tests, different power meters and different bikes, but I was surprised how far out they were with the FTP values. Four weeks later I have supposedly lost over 10% of my FTP. I didn't think the power meters would be that far out or that the test difference would be that great. Can the turbo give lower results than rollers? I'll use the same test from now on but which one is likely to be more accurate?

Alex Simmons said...

Hi Derek, glad to hear you find some value here.

1. Yes you can get different power output on different training set ups. Read my post "Turbocharged training" to get a sense of some of the issues:

2. The Stages is left crank arm only, and so any power asymmetry (which is normal) will result in a doubling of the error. e.g. if you are left side dominant, say 53/47, then Stages doubles the 53 to get 106, when your actual power is 100. Read my post "Left right out of Balance" to learn some more about power asymmetry:

3. Even if there is no asymmetry, different power meters may read a bit differently but a pedal and crank meter should be very close if they are both correctly calibrated. If we assume they are both correctly calibrated, then it may indicate you have some asymmetry, or that there may be power difference due to the trainer set up, and of course fitness changes. Which of these is hard to tell.
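
The left-only doubling in point 2 can be worked through with a short Python sketch. The function names are hypothetical, and it assumes the meter simply doubles the measured left leg power, as described above.

```python
def left_only_reading(true_total_w, left_share):
    """Power reported by a left-only meter that doubles the left leg's power."""
    return 2 * true_total_w * left_share


def reading_error_pct(true_total_w, left_share):
    """Percentage by which the left-only reading differs from true total power."""
    reported_w = left_only_reading(true_total_w, left_share)
    return 100 * (reported_w - true_total_w) / true_total_w


print(left_only_reading(100, 0.53), reading_error_pct(100, 0.53))
```

A 53/47 left/right split turns a true 100 W into a reported 106 W, i.e. a 6% over-read. A 3 point imbalance away from 50/50 therefore shows up as a 6% error, which is the doubling of the error referred to above.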

Perhaps try the P1s with your Stages and compare the data on the rollers and on the trainer.
Cheers, Alex

Derek said...

Thanks Alex. Really appreciate your quick response and I think it could indeed be down to asymmetry as the P1s do show that I have a left side dominance. I'll do as you suggested to compare the P1s with the Stages.

Final question if you don't mind - if I'm now going to stick to a consistent testing method on the turbo trainer with the P1s, what is the "better" test? Coggan's protocol with the 5 min all-out, 10 mins easy, then 20 mins FTP test? Or the one from Friel's book where he simply uses average power for a 30 min effort (following a warm-up)?


Alex Simmons said...

Choose your poison really. The Coggan test you refer to is really Hunter Allen's test protocol, or one that Hunter has used. Both are fine. As a general rule of thumb, it helps to have several tests that help to inform about performance across the power-duration spectrum.

The manner in which you test is a function of the information on yourself for which you are seeking to gain insight, the nature of your target events (IOW what matters most for your events), and the environmental conditions and options available to you for testing (e.g. indoors, outdoors, climbs, sufficient length of road to go for X minutes or y km, etc) and how those affect the test results.

Really though, just get into it and find a method that works for you and your circumstances. It may evolve with time.

With the new(ish) power-duration models, we are able to infer more and more from general training and racing data and to a degree it lessens the need for formal testing. But formal testing still has its place, especially since at times we are not always including maximal efforts in our training mix.

Cheers, Alex

Derek said...

Thanks again Alex. Really helpful and I'll be keeping an eye on your blog for more useful advice.