Running

Clustering running data using Dynamic Time Warping

Motivation

I recently needed to implement some time-series clustering at work on a high-dimensional dataset. Having not tackled this specific problem before, I wanted to practice on a smaller, univariate dataset where I already had a strong intuition about the underlying structure. I turned to my exported Strava running data as a testbed to get familiar with clustering using Dynamic Time Warping (DTW).

Heartrate data

library(tidyverse)
library(duckdb)
library(plotly)
library(dtw)
library(parallel)
library(tidytext)
library(ggwordcloud)
library(patchwork)
library(dendextend)
library(ggridges)
library(broom)
library(cluster)

I started by pulling data from my existing database, restricting the scope to runs that contain both heart rate and GPS location data.

StravaR

I’ve finally managed to combine my two main hobbies, data science and running, into one project: a Shiny web app that lets you explore your Strava fitness data and provides a local database of all your activities that you can analyse with R. My initial motivation was quick access to certain visualisations and metrics that Strava either doesn’t provide or makes awkward to get, so I wrote a basic app for my own use.