R

StravaR

I’ve finally managed to combine my two main hobbies of data science and running into one project: a Shiny web-app that allows you to explore your Strava fitness data and provides a local database of all your activities that you can analyse with R. My initial motivation was that I wanted quick access to certain visualisations and metrics that Strava either doesn’t provide or makes awkward to get, so I wrote a basic app for my own use.

Speeding up R workshop

Just a quick update, more to test that the website infrastructure is still running than anything else, as it’s been 4 years since my last post. Last month I ran a workshop (slides here) at the University’s Research Coding Club on speeding up data analysis in R, which might be useful for anyone who stumbles across this page in the future. The Research Coding Club is an informal collective of people from across the entire University who write software to aid their research.

Is the Andrew Marr show biased towards women?

I came across a tweet from Piers Morgan this morning in which he suggested that the BBC is favouring women, since 43 out of the 53 paper reviewers on The Andrew Marr Show in 2019 were women. Unfortunately I was a day late to this hot-take; fortunately, this is because I don’t follow Piers Morgan. However, I knew that there must be more to it than a single PC-baiting statistic, and knowing that I had a ~3 hour train journey coming up this evening, I thought I’d look into it a bit more.

5 reasons to move from a Shiny website to a static site

Back in March I rewrote thepredictaball.com from its original R Shiny implementation into a static website using the Vue JavaScript framework. I intended to write about it at the time but I’ve been busy and hadn’t made time for it until now, which is handy given that the football season has just finished! Excuse the clickbait title, but I genuinely couldn’t think of a better way of organising this post.

Fixing bug with predicting clusters in flexmix

A second post in 2 days on mixture modelling? No awards for guessing what type of analysis I’ve been preoccupied with recently! Today’s post provides an ugly hack to fix a bug in the R flexmix package for likelihood-based mixture modelling and provides a cautionary tale about environments. In short, I’ve encountered problems when trying to predict the cluster membership for out-of-sample data using this package, and judging from a couple of posts I found online, I’m not the only one.

multistateutils v1.2.0 released

A new version of multistateutils has been released onto CRAN containing a few new features. I’ll give a quick overview of them here, but have a look at the vignette for more examples. msprep2: The first is a replacement for the mstate::msprep function that converts data into the long transition-specific format required for fitting multi-state models. msprep requires the input data to be in a wide format, where each row corresponds to an individual and each possible state has a column for entry time and a status indicator.

Evaluating the Predictaball football rating system - 2018

Having become interested in football again due to the World Cup, I was thinking about Predictaball and how I never wrapped up the season with a brief review. It’s been a big season for Predictaball, with the move to an Elo-based system, as well as the launch of a website. However, is the new match forecasting method any good? Model accuracy: Fortunately, to help answer this question, a very generous Twitter user by the name of Alex B has been collecting weekly Premiership match predictions from around 30 models and tracking their progress.

multistateutils: functions for using multi-state models in R

A month ago I mentioned that I’d been using a discrete event simulation for estimating transition probabilities from parametric multi-state models. I’ve now turned this code into a general package containing resources for multi-state modelling, called multistateutils (I know, I’m very imaginative), which may be of interest to other people working with multi-state models in R. The current release is available on CRAN, while development continues on GitHub.

rprev 1.0.0 released with lots of new features

I’m very happy to announce the first ‘official’ release of rprev, version 1.0.0, the R package for estimating disease prevalence by simulation. This is useful for epidemiologists who have registry data and want to know disease prevalence over time periods longer than the registry covers. I first released it almost exactly two years ago but had always intended to update it with the features in this release.

rdes: Discrete event simulation in R for estimating transition probabilities from a multi-state model

I’ve just released an R package for estimating transition probabilities from multi-state models onto GitHub, found at https://github.com/stulacy/RDES. It’s not a package with a large potential audience, so I don’t intend to release it onto CRAN; rather, it has a highly specific role that I developed for my own use and thought could prove useful for someone else. Essentially, it extends the simulation functionality offered by the fantastic flexsurv package for obtaining predicted outcomes from multi-state models.