National Learn to Code Day

On Saturday, I took part as a mentor in National Learn to Code Day with Ladies Learning Code. There were about 50 learners of all ages. The topic was “Using Data to Solve Problems: An Introduction to Artificial Intelligence and Machine Learning for Beginners”, led by Solmaz Shahalizadeh, the data team director at Shopify in Ottawa.

The topic seemed like an oxymoron at first – how can beginners possibly cover in one day what could easily span multiple graduate courses? But actually, it worked great, and I think everyone learned a lot.

How to learn to code

I just read a great post by Jessica Duarte on teaching beginners to code. It is all so true. Especially #5, making mistakes:

You [the instructor] have to ride out the mistakes. Make them often. Let the class fix them.

It’s essential for students to see and experience the process of working through mistakes. Right now, I am starting to use git and GitHub for our manakin research at the Smithsonian. A major benefit is that it allows us to make mistakes safely.

Duarte is organizing the 2017 National Learn to Code Day on Intro to Machine Learning and Artificial Intelligence. I look forward to helping out with the Ottawa chapter on September 23!

Data sharing, reproducibility and peer review

I just reviewed my first manuscript where the authors provided a reproducible analysis (i.e., they shared their data and analysis script with the reviewers). This is something my coauthors and I have tried to provide with our recent studies, but it was my first time experiencing it as a referee.

I think it really helped, but it also raised new questions about traditional peer review.

Speeding up loops in R

This is from a session I did with the UBC R Study Group. Loops can be convenient for applying the same steps to big/distributed datasets, running simulations, and writing your own resampling/bootstrapping analyses. Here are some ways to make them faster.

1. Don’t grow things in your loops.
2. Vectorize where possible, i.e., pull things out of the loop.
3. Do less work in the loop if you can.
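As a rough illustration of the first two tips, here is a minimal sketch that computes a running total three ways. The data and the function names (grow_total, prealloc_total, vector_total) are made up for this example and are not from the original session.

```r
# Hypothetical data: 10,000 simulated values.
set.seed(1)
x <- rnorm(10000)

# Slow: the result vector is grown (and copied) on every iteration.
grow_total <- function(x) {
  out <- c()
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
    out <- c(out, total)
  }
  out
}

# Better: allocate the result once, then fill it in place.
prealloc_total <- function(x) {
  out <- numeric(length(x))
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
    out[i] <- total
  }
  out
}

# No loop at all: cumsum() is the vectorized equivalent.
vector_total <- function(x) cumsum(x)

# All three should agree up to floating-point tolerance.
same_grow <- isTRUE(all.equal(grow_total(x), prealloc_total(x)))
same_vec <- isTRUE(all.equal(prealloc_total(x), vector_total(x)))
```

Wrapping each call in system.time() is one way to compare how long the three versions take on your own machine.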

Animating time series in R

Frame-blending is a great way to illustrate animal behaviour and other things that change over time. This got me thinking about ways to animate time series data. In R, the animation package has lots of options, but you can also build your own just by plotting over the same device window. If you save each iteration in a loop, the resulting images can be used as frames in a video or gif.
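Here is a minimal sketch of the save-each-iteration approach, using only base R graphics. The noisy sine wave, the file names, and the frame size are assumptions for illustration; they are not the hummingbird data or the settings used for the figure in this post.

```r
# Hypothetical data: a noisy sine wave standing in for a real time series.
set.seed(42)
t <- seq(0, 4 * pi, length.out = 40)
y <- sin(t) + rnorm(length(t), sd = 0.1)

# Write the frames to a temporary folder (an assumption; use any folder).
frame_dir <- file.path(tempdir(), "frames")
dir.create(frame_dir, showWarnings = FALSE)

for (i in seq_along(t)) {
  frame_file <- file.path(frame_dir, sprintf("frame_%03d.png", i))
  png(frame_file, width = 480, height = 360)
  # Fixed axis limits keep the view steady from one frame to the next.
  plot(t[1:i], y[1:i], type = "l",
       xlim = range(t), ylim = range(y),
       xlab = "Time", ylab = "Value")
  # Mark the most recent observation.
  points(t[i], y[i], pch = 16, col = "red")
  dev.off()
}

# One PNG per time step, named so that they sort in playback order.
frames <- list.files(frame_dir, pattern = "^frame_[0-9]{3}[.]png$")
```

The zero-padded file names keep the frames in order when they are passed to whatever tool you use to assemble the video or gif.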

Hummingbirds deviate away from vertical stripe patterns

Here is an example using recordings that track hummingbirds flying in our tunnel here at UBC. This animation shows a bird’s-eye view of 50 flights by 10 birds. In half of the flights (the red ones), the birds had horizontal stripes on their left side and vertical stripes on their right, and the other half (blue) had the reverse. The subtle difference between the red and blue trajectories (red ones tend to have more positive y values) shows that on average, birds tend to deviate away from vertical stripes, and towards horizontal ones. The histogram that builds up on the right side of the figure shows the mean lateral (y) position for each trajectory as it finishes.

How to loop efficiently in R

My learning curve with the statistical software R has been a long one, but one of the steepest and most exciting times was learning how to write functions and loops. Suddenly I could do all kinds of things that used to seem impossible. Since then, I’ve learned to avoid for loops whenever possible. Why? Because doing things serially is slow. With R, you can almost always reduce a big loop to just a few lines of vectorized code.
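To give a sense of what that reduction can look like, here is a small sketch that computes the step lengths along a made-up 2D trajectory, first with a for loop and then with vectorized code. The data and variable names are assumptions for illustration only.

```r
# Hypothetical trajectory: 1000 (x, y) positions from a random walk.
set.seed(7)
x <- cumsum(rnorm(1000))
y <- cumsum(rnorm(1000))

# Loop version: distance between each pair of consecutive points.
step_loop <- numeric(length(x) - 1)
for (i in seq_len(length(x) - 1)) {
  step_loop[i] <- sqrt((x[i + 1] - x[i])^2 + (y[i + 1] - y[i])^2)
}

# Vectorized version: diff() handles every pair in a single call.
step_vec <- sqrt(diff(x)^2 + diff(y)^2)

# The two versions should agree up to floating-point tolerance.
steps_match <- isTRUE(all.equal(step_loop, step_vec))
```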

But there’s one situation where I can’t avoid the dreaded for loop. Recently, I learned how to make for loops run hundreds of times faster in this situation.
