Python
Today I gave a talk introducing Python to early-stage researchers in my department. It’s always hard deciding what to include in an hour’s talk, particularly when the subject is so vast. It didn’t help that programming experience in my department varies widely, from researchers with backgrounds in Computer Science to Electronic Engineers who are only comfortable with Matlab. I attempted to address both groups by first introducing Python as a language in terms of its syntax, data structures and control flow, and then discussing how you can emulate Matlab by using the SciPy stack.
Now that I’ve finished my teaching qualification (the York Learning and Teaching award) I’ve had some time to get back into research. I’ve been updating various bits of software that I haven’t used much over the last month or so, including PyPy, which I upgraded from version 2.3 to 2.5, skipping a version in the process. I expected that I might get a few speed bonuses but that there wouldn’t be a significant improvement from 2.
Receiver Operating Characteristic (ROC) analysis is becoming increasingly common in machine learning, as it offers valuable insight into how your model is performing that isn’t captured by log-loss alone, which helps in diagnosing any issues. I won’t go into much detail about what ROC actually is here, as this post is mainly intended to help people looking for a MAUC Python implementation. If, however, you are looking for an overview of ROC then I’d recommend Fawcett’s tutorial here.
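For readers who want something concrete, here is a minimal sketch of a multi-class AUC. It assumes that MAUC refers to the Hand and Till (2001) measure, which averages pairwise AUCs over every pair of classes; the excerpt above does not confirm which definition the post uses. The function names `pairwise_auc` and `mauc` and the input layout (a list of class indices plus a list of per-class probability lists) are illustrative choices, not the post's code.

```python
from itertools import combinations


def pairwise_auc(scores_pos, scores_neg):
    """Estimate the probability that a random positive outscores a random
    negative, counting ties as half a win (the rank-based AUC)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def mauc(labels, probs):
    """Multi-class AUC, assuming the Hand and Till definition.

    labels: list of integer class indices, one per instance.
    probs:  list of lists, probs[k][c] = estimated P(class c) for instance k.
    Needs at least two classes, each with at least one instance.
    """
    classes = sorted(set(labels))
    total = 0.0
    for i, j in combinations(classes, 2):
        # A(i|j): separate class i from class j using the class-i probability.
        a_ij = pairwise_auc(
            [p[i] for y, p in zip(labels, probs) if y == i],
            [p[i] for y, p in zip(labels, probs) if y == j],
        )
        # A(j|i): the same pair, ranked by the class-j probability.
        a_ji = pairwise_auc(
            [p[j] for y, p in zip(labels, probs) if y == j],
            [p[j] for y, p in zip(labels, probs) if y == i],
        )
        total += (a_ij + a_ji) / 2.0
    c = len(classes)
    return 2.0 * total / (c * (c - 1))
```

The double loop in `pairwise_auc` is quadratic in the number of instances, which is fine for a sketch; a rank-based version would be preferable for large datasets.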
Over the weekend I was wondering if there were any useful APIs out there for retrieving information about the weather for a given location. I was particularly interested in current weather but this post also applies to forecasting. I had a Google and found the OpenWeatherMap service, which impressed me for several reasons:
- It is open: anyone can add a weather station to it, provided they fulfill certain criteria
- It provides a lot of information related to the weather!
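As a rough illustration of querying the service for current weather, the sketch below builds a request against OpenWeatherMap's version 2.5 current-weather endpoint using only the standard library. It is not the code from the post: the helper names `build_weather_url` and `fetch_current_weather` are hypothetical, the API key is a placeholder you would need to obtain yourself, and the choice of metric units is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Current-weather endpoint of the OpenWeatherMap 2.5 API.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city, api_key, units="metric"):
    """Return the request URL for the current weather in `city`.

    `api_key` is a placeholder for a key issued by OpenWeatherMap.
    """
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return BASE_URL + "?" + query


def fetch_current_weather(city, api_key):
    """Fetch and decode the JSON response (needs network access and a valid key)."""
    with urlopen(build_weather_url(city, api_key), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping URL construction separate from the network call makes the request easy to inspect without contacting the service.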
I encountered an odd bug recently in my GP (genetic programming) code. Large amounts of memory were being used, and usage was increasing substantially with each generation. I realised the problem was most likely in how the tree was traversed to obtain an output for each input data pattern. At the time this was done with an iterative approach (which with post-order traversal is not fun!), as I’d assumed that recursive methods would use more memory.
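To make the two approaches concrete, here is a minimal sketch of evaluating an expression tree both recursively and with an explicit-stack post-order traversal. It is an illustration under assumptions, not the code from the post: the `Node` class, the `FUNCTIONS` table and the convention that string terminals are variable names are all hypothetical.

```python
import operator


class Node:
    """A tree node: a function symbol with children, or a terminal without."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)


# Hypothetical function set for the illustration.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def terminal_value(node, inputs):
    """String terminals are looked up as variables; anything else is a constant."""
    if isinstance(node.value, str):
        return inputs[node.value]
    return node.value


def evaluate_recursive(node, inputs):
    """Post-order evaluation by recursion: children first, then the function."""
    if not node.children:
        return terminal_value(node, inputs)
    args = [evaluate_recursive(child, inputs) for child in node.children]
    return FUNCTIONS[node.value](*args)


def evaluate_iterative(root, inputs):
    """The same post-order evaluation using an explicit stack."""
    stack = [(root, False)]  # (node, have its children already been pushed?)
    results = []             # values of completed subtrees, left to right
    while stack:
        node, children_pushed = stack.pop()
        if not node.children:
            results.append(terminal_value(node, inputs))
        elif children_pushed:
            # All children are done: their values are on top of `results`.
            n = len(node.children)
            args = results[-n:]
            del results[-n:]
            results.append(FUNCTIONS[node.value](*args))
        else:
            # Revisit this node after its children; push them reversed so
            # the leftmost child is processed first.
            stack.append((node, True))
            for child in reversed(node.children):
                stack.append((child, False))
    return results[0]
```

Both functions hold at most one pending entry per level of the tree at a time, so neither should grow memory from one generation to the next unless references to old trees or results are retained elsewhere.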