Working With Workout Data

In late January 2017 a new pool opened in my neighbourhood. This left me no excuse not to get back into the water after a six-year hiatus--the longest break since I started competitive swimming at age nine.

 

During this lay-off I've been pretty good about making it to the gym for treadmill running and weights 3-or-so times a week. But I was in anything but water shape, so it was definitely time to go back.

 

I also wanted to take advantage of the new fitness tracker technology. I used a FitBit for a year or so a while back, and make sure I run with my phone to log all the steps. But now I wanted to track my progress in the water in a little closer detail. I figured I would give myself a year and then take a look. The data below is the product of that curiosity. 

Garmin Vivoactive HR+

I ordered a Garmin Vivoactive HR+ (and a screen protector) from Amazon pretty much as soon as I started swimming. There wasn't much science behind the choice, other than the lap counting seemed pretty straightforward and I have a long history with the Garmin brand through their GPS line. You can pick one up from Amazon or SportChek for around $200.

 

The device itself is very intuitive: set the sport you're going to train (running, biking, swimming, stand-up paddle boarding, etc.) and start the timer. If you're doing sets, click a split after you finish each interval (and click again when you leave the wall once the rest is up). The Vivo's GPS and accelerometer will do the rest, tracking your distance, counting strokes and strides, and so forth.

 

It even has some swimming-specific features like 'SWOLF', which is 'swim golf': your time for a length (in seconds) plus the number of strokes you took in that length. The idea here is, just like in golf, to get your score as low as possible. The accelerometer is also designed to decipher what stroke you're doing, apart from kick or IM (for the former there is no forearm motion, and the latter is just too confusing for the algorithms).
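To make the SWOLF arithmetic concrete, here is a minimal Python sketch. The function name and the sample lengths are invented for illustration; this is only the time-plus-strokes definition given above, not Garmin's own code.

```python
def swolf(seconds: float, strokes: int) -> float:
    """SWOLF for one length: time in seconds plus strokes taken."""
    return seconds + strokes

# Made-up lengths for illustration: (seconds, strokes)
lengths = [(20.0, 9), (21.5, 10), (19.0, 11)]
scores = [swolf(t, s) for t, s in lengths]
print(scores)  # [29.0, 31.5, 30.0]
```

Note how the third length is the fastest but not the lowest score: the extra strokes cost more than the time saved.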

 

I haven't seen any estimates of how much a watch adds to your drag, but I wear a drag suit pretty much every workout in any case. The device is also, at least in terms of distance, extremely accurate. At the start it would over- or under-count by 25m on several sets each time out. However, a software update a few months ago really curbed that; nowadays it's rare for a workout to be off by even a single 25.

 

One thing the Vivo doesn't capture is (in-water) heart rate. For this you need an extra monitoring band. I planned to pick one up (and still do, to better target pace times to heart rate zones), but I was a little uncertain about compatibility and just never got around to it. Plus, as we'll see below, Garmin has not designed its split-level data very well. I'd like that to be a bit smoother before I worry about more gear.

 

You can add after-workout comments, such as kick distances, whether or not you wore a drag suit, or general notes on conditions and self-assessment. The problem is these don't download in the aggregate data, and, as with any data collection, if it's not done automatically, there's a good chance it's not going to get done at all.

Method

There wasn't much of a plan when I got back into the water; no macrocycles, no championship meet to peak towards. I just added the basic swim workout pattern I've been doing since I was 19 (200 Fr/200 Br/200 Bk/200 K/200 IM warmup, followed by 50s, the occasional set of 100s, and 400-500m paddle sets)* to the alternating days around my time in the gym. My only real addition was tacking on an 8x50 @1min paddles Free test set after each warm-up, with the idea being to track my peak speed and average pace improvement over the year.

 

*I get that, given the body's homeostatic tendency, doing essentially the same workout week after week is the exact opposite of good self-coaching. But 1. I'm not very imaginative; and 2. it makes for a good sample size!

 

Thinking about the drawbacks of the data harvested, I've come up with the following caveats:

 

  • Validity: A big thing to keep in mind is that workout success doesn't necessarily mean *race* success. The conditions are slightly different, including psychological stresses. Lots of Workout Kings find race day unpleasant or worse, meaning past workout performance is not a perfect race predictor variable. Even more problematic is that swimming has a terrible culture of overtraining aerobic systems at the cost of ignoring race-specific, anaerobic capacity. The Vivo also doesn't record whether paddles were used, meaning someone could look at my 8x50 test set splits and assume velocity jumps that are actually due to the paddles, rather than my own stroke. In a similar way the raw data won't reveal if I was wearing a drag suit or not. Last is the problem that the Vivo can't record kick sets, something I can confirm biases you away from kick metres during the workout (why do metres, so the natural inclination goes, if you're not going to get credit for them?!)--a definite negative training effect.

 

  • Reliability: As mentioned above, at the outset the distance error from all the orphan 25s was noticeable. But from about last summer that cleared up pretty well. (There was also a tendency for the orphan 25s to be both over- and under-counts, so they would have at least somewhat cancelled themselves out). Less easy to fix is the departure and arrival error--the gap between when you click start on the watch and when you actually leave the wall, and the gap between when you touch the wall on the finish and when you finally click lap finish on the Vivo. I can imagine the finish error growing larger the harder the lap, as you prioritize catching your breath over fiddling with the watch buttons. But I haven't spent much time trying to quantify these errors; instead I've just--reasonably, I think--assumed that they will be both small and consistent, so are of no major consequence to the analysis here.

 

Overall, we can be reasonably confident in the precision and usefulness of what the Vivo captures. The data here isn't perfect, but it is a long ways from the log books and journals coaches asked us to keep (mainly in vain) way back in the day. 

Using Garmin Connect

Garmin's data interface software is a mobile app and online portal called Garmin Connect. I use my phone to sync the data, then the web browser version to download the data for analysis.

 

As shown below, the site does a good job of categorizing each workout. There is also, in the top right corner, an 'Export CSV' button that will dump all the *aggregate* data (more on this later) into a single CSV file.

You can also drill down into each workout and look at all the split data:

There is lots of interesting stuff in here. The only problem is that it's ridiculously(!) difficult to pull out the data from across all workouts. The 'Export to CSV' button inside a given workout ONLY downloads data from that specific workout. Pulling out the fine-grained data (like, say, lap speed in a specific test set) across the whole year means painstakingly clicking through the download process for *each* workout (or training a web scraper like ParseHub to do the work for you).

 

Not being able to download all split data at once is a pretty glaring failure--so obvious I actually contacted the company to see if I was missing something. But alas, no salvation.

Even crazier is that once you do manually download a workout's splits, there's no ID variable--such as date--to uniquely identify the workout. This means when you download you also have to *manually* change the filename, otherwise it will get lost in the mass of new files.

 

This is really poor data management, and I wasn't prepared to take the time to fix the failings manually. The result is that for the time being I'm limited to the aggregate data--leaving all my 8x50 test sets locked in a folder full of disorganized spreadsheets. 
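For anyone who does go the manual route, the missing ID can at least be recovered from the filename afterwards. The Python sketch below assumes each split file was renamed to include the workout date (for example `splits_2017-02-15.csv`, a hypothetical naming scheme); it makes no assumptions about the column names inside the files, and `stack_splits` and `workout_date` are illustrative names only.

```python
import csv
import re
from pathlib import Path

# Hypothetical naming scheme: the workout date appears somewhere in the filename
DATE_IN_NAME = re.compile(r"(\d{4}-\d{2}-\d{2})")

def stack_splits(folder):
    """Read every CSV in `folder`, add a workout_date column taken from the
    filename, and return all rows as one list of dicts."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        match = DATE_IN_NAME.search(path.name)
        if match is None:
            continue  # skip files that were never renamed
        with path.open(newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                row["workout_date"] = match.group(1)
                rows.append(row)
    return rows
```

Files without a date in the name are skipped rather than guessed at, so nothing gets attached to the wrong workout.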

Total Workouts

Having downloaded the aggregate data, which runs from February 15, 2017 (when the Vivo arrived, about two weeks after I got back into the water) to March 2, 2018 (an arbitrary, 12-month cut-off), I ran it through Stata to clean it up a bit and produce the following charts. 

 

The basic template is swimming and gym on alternate days, with a treadmill run and weight workout comprising the latter. (In this way pretty much all of the 29 runs and 28 weights were done back-to-back, with the few outliers being workouts on the road, an occasional swim on holidays, and the like).

 

Graphing the total makes it apparent that it's easy to overestimate how many workouts you put in over a year. I hit 48 swims, for example, which is fewer than one per week--this, when the aim was to make at least two and try to hit three. Ouch.
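The per-week figure follows directly from the dates and the swim count given above. As a quick check in Python (variable names are just for illustration):

```python
from datetime import date

start = date(2017, 2, 15)  # first day of the aggregate data
end = date(2018, 3, 2)     # cut-off
swims = 48

weeks = (end - start).days / 7
print(round(weeks, 1))          # 54.3
print(round(swims / weeks, 2))  # 0.88
```

So the sample covers a little over 54 weeks, and 48 swims works out to roughly 0.88 per week.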

Graphing on a monthly basis shows how things can go sideways real quick. Four months had complete no-shows, while only three of the 12 full months in the sample managed to hit the targeted number of workouts. 

 

The reason for the 0-count months was travel and some paternity leave, but I think it's fair to say that once you miss a workout, it's frighteningly easy to miss a second...and a third as well.
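A monthly tally like this has one trap: months with no workouts simply don't appear in the data, so they have to be filled in as zeros explicitly. Here is a minimal Python sketch of that step. The function name is hypothetical and the sample dates are invented, not taken from the real log.

```python
from collections import Counter
from datetime import date

def monthly_counts(dates, start, end):
    """Count workouts per calendar month from start to end, keeping empty months."""
    tally = Counter((d.year, d.month) for d in dates)
    counts = {}
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        counts[(year, month)] = tally.get((year, month), 0)
        # step to the next calendar month
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return counts

# Made-up dates for illustration only
sample = [date(2017, 2, 15), date(2017, 2, 20), date(2017, 4, 3)]
counts = monthly_counts(sample, date(2017, 2, 1), date(2017, 5, 31))
print(counts)  # {(2017, 2): 2, (2017, 3): 0, (2017, 4): 1, (2017, 5): 0}
```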

Weights

Though it has its own setting for weights workouts, there's not much data captured by the Vivo other than duration and estimated calorie burn.

 

I do a pretty regular set of exercises, and more time spent in the gym means more of them get done. So the inference here is that the longer the duration, the longer and harder I've worked.

 

Still, while I have a rough sense of how much I've increased the weight of the free weights and bench press I use, a more careful analysis would record specific exercises and contain the occasional test set. I can't claim to have done any of that here.

Running

As a data-obsessed former swimmer, I find treadmills my favourite means of running. Monotony is second nature to a lap swimmer, so I don't gain much from running outside (to say nothing of Canadian winters). I also like the ability to train at specific speeds and for precise amounts of time. Pretty much every workout here was done on one.

I'm a little disappointed at how little distance I've gained in recent months. I've lowered my peak pace in return for some more steady threshold work, but have very little kilometre improvement to show for it.

 

If I smooth out the mileage trend (using Lowess here) there is some overall distance improvement, at least until I caught a cold in late February. But it's come at the cost of a pace that's nothing to brag about.
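For readers who haven't met Lowess before, the idea is to fit a small weighted straight line around each point, with nearby workouts counting more than distant ones, and to join up the fitted values. The Python sketch below is a simplified version of that idea (tricube weights, local linear fit, no robustness iterations). It is not the code behind the charts here and may differ in detail from what Stata's `lowess` command does; the function name and the `frac` default are assumptions for illustration.

```python
import math

def lowess(xs, ys, frac=0.8):
    """Simplified LOWESS: tricube-weighted local linear fit at each x.
    No robustness iterations; assumes len(xs) >= 2."""
    n = len(xs)
    k = max(2, math.ceil(frac * n))  # number of neighbours in each local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Made-up example: a perfectly linear trend should come back unchanged
weeks = list(range(10))
km = [2 * w + 1 for w in weeks]
smooth = lowess(weeks, km)
```

A smaller `frac` uses fewer neighbours per fit and follows the bumps more closely; a larger one gives a flatter, smoother curve.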

Swimming

The swimming front is where I've really been interested, both because my starting point was essentially zero and because I'd like to get back into halfway decent racing shape.

 

Note that I've adjusted the distances here to include kick sets, which I recorded in Garmin Connect's manual notes section.

 

Looking at workout distance, two stories jump out at me. First is the steady ramp-up that's necessary when you get back into the water. That first kilometre is a hard one (and remember, when the data starts here I was already two weeks in!).

 

The second story is the difference that swimming with a club can make. The red line below marks when the local Masters club started up, and you can see how the mileage rockets accordingly.

I drew a Lowess curve again to give a clearer sense of the trend. The steepness surprised me, even with a persistent cold dragging down the metres on a few Sunday swims in late February.  

Follow-up

All of this work is pretty basic. To dig further into the data I'll have to ID the splits files (perhaps by merging them on distance?). I would hate to have to go back and relabel from the individual downloads. Maybe Garmin will release an update....
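The merge-on-distance idea could look something like the Python sketch below. It assumes the total distance of each splits file has already been summed, and that the aggregate data is available as (date, metres) pairs of raw device distances; the function name and both data structures are hypothetical. A file is only matched when its total distance appears exactly once on each side, and everything else is set aside.

```python
from collections import Counter, defaultdict

def match_on_distance(aggregate, split_totals):
    """aggregate: list of (date_str, metres); split_totals: dict filename -> metres.
    Returns (matched, ambiguous): matched maps filename -> date where the
    distance is unique on both sides; ambiguous lists the remaining files."""
    by_distance = defaultdict(list)
    for day, metres in aggregate:
        by_distance[metres].append(day)
    file_counts = Counter(split_totals.values())
    matched, ambiguous = {}, []
    for name, metres in split_totals.items():
        days = by_distance.get(metres, [])
        if len(days) == 1 and file_counts[metres] == 1:
            matched[name] = days[0]
        else:
            ambiguous.append(name)  # no match, or more than one candidate
    return matched, ambiguous

# Made-up values for illustration only
aggregate = [("2017-02-15", 1000), ("2017-02-17", 1200), ("2017-02-20", 1200)]
split_totals = {"a.csv": 1000, "b.csv": 1200, "c.csv": 900}
matched, ambiguous = match_on_distance(aggregate, split_totals)
print(matched)    # {'a.csv': '2017-02-15'}
print(ambiguous)  # ['b.csv', 'c.csv']
```

Because the workouts follow a similar pattern week to week, a fair number of files may land in the ambiguous list, so distance alone is unlikely to be a complete fix.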

 

With the micro data in hand we can look at the test sets and do some inferential work alongside. I'd like in particular to test whether an increase in run mileage--and maybe work in the weight room--leads to any improved top speed or average pace in the 8x50s. But that will have to wait for another day!

Conclusions

First, the proliferation of cheap video recording equipment (hello, cell phones!) and personal fitness trackers makes this a fascinating time to be both athlete and coach. I do a bit of both these days, and the availability of data today has to be sport's biggest difference from back when I was an age grouper. I'm not fully certain as to the implications of this, but it sure is exciting to think about what it might make possible.

 

Second, if the timing at all works for you, join a Masters club! Again, humans are homeostatic creatures--we try our best to get by with the minimum exertion required. This means you can push yourself when training alone, but there's always a little something in the back of your brain holding you back. When you train with others, by contrast, another force is at play: we are competitive creatures, and now the hidden call in the recesses of your mind is to push harder, stroke ahead of the guy beside you, or even just get out of bed so that Coach won't get angry. 

 

Funny thing, that.

Appendix

Here are the data and do files:

Sean Clark - Garmin Aggregate Workout Data
Activities.csv
Comma Separated Value File 32.5 KB
Sean Clark - Garmin Workout Data Analysis .Do File
Activities.do
Text Document 12.8 KB