Caitlin Menzies nails her 3-minute thesis!

Caitlin Menzies gave an excellent 3-minute thesis at Carleton’s annual 3MT competition.

Caitlin's 3MT slide

Fitting your entire MSc thesis into one slide

What to Read for new graduate students

Ilias and I have been talking about papers each week. Most recently, we read Platt’s Strong Inference paper about the scientific method and Doug Fudge’s engaging 50-year anniversary essay about it.

What are some great articles that new graduate students should read? This is a rough list-in-progress…

Stephen C. Stearns “Designs for Learning” (and “Some Modest Advice for Graduate Students”)

Platt (1964) “Strong Inference” (and Fudge’s 2014 essay “50 Years of JR Platt’s Strong Inference”)

Tinbergen (1963) “On Aims and Methods in Ethology”

Srinivasan et al. (1996) “Honeybee Navigation en route to the Goal: Visual Flight Control and Odometry”

Esch et al. (2001) “Honeybee Dances Communicate Distances Measured by Optic Flow”

Gould and Lewontin (1979) “The Spandrels of San Marco and the Panglossian paradigm…”

Ducrest et al. (2008) “Pleiotropy in the melanocortin system, coloration and behavioural syndromes”

Ioannidis (2005) “Why Most Published Research Findings are False”

Burnham and Anderson “Model Selection and Multimodel Inference”

Gelman and Stern “The Difference Between ‘Significant’ and ‘Not Significant’ is Not Itself Statistically Significant”

Gelman “The Problems with P-values are not just with P-values”

Gelman and Loken “The Garden of Forking Paths…”

Loken and Gelman (2017) “Measurement Error and the Replication Crisis”

Gopen and Swan (1990) “The Science of Scientific Writing”

National Learn to Code Day

On Saturday, I took part as a mentor in National Learn to Code Day with Ladies Learning Code. There were about 50 learners of all ages. The topic was “Using Data to Solve Problems: An Introduction to Artificial Intelligence and Machine Learning for Beginners”, led by Solmaz Shahalizadeh, the data team director at Shopify in Ottawa.

The topic seemed like an oxymoron at first – how can beginners possibly cover in one day what could easily span multiple graduate courses? But actually, it worked great, and I think everyone learned a lot.


How to learn to code

I just read a great post by Jessica Duarte on teaching beginners to code. It is all so true. Especially #5, making mistakes:

You [the instructor] have to ride out the mistakes. Make them often. Let the class fix them.

It’s essential for students to see and experience the process of working through mistakes. Right now, I am starting to use git and Github for our manakin research at the Smithsonian. A major benefit is that it allows us to make mistakes safely.

Duarte is organizing the 2017 National Learn to Code Day on Intro to Machine Learning and Artificial Intelligence. I look forward to helping out with the Ottawa chapter on September 23!

How do babies learn words?

My 10-month old daughter just proved that she understands some words. Now, when we tell her to “clap your hands”, or even just talk about clapping, we get a round of applause. Pretty cute! This wasn’t one of the things we were actively trying to teach her, like “daddy”, “mommy”, “dog”, or “milk” – I haven’t seen evidence that she knows those yet.

It just goes to show how learning works: motivation trumps deliberate efforts to teach. Clapping is just plain fun.

It’s spooky to think about what else she might come to understand without us knowing.

Troubleshooting and iteration in science

The scientific method is taught as far back as elementary school. But students almost never get to experience what I think is the best part: what you do when something goes wrong. That’s too bad because self-correction is a hallmark of science.

In ecology and evolution, most graduate students don’t get to experience iteration firsthand, because they are often collecting data right up until the end of their degree. I didn’t experience it until my postdoc, when we failed to repeat a previous experiment. It took several experiments and a lot of time – two years! – to figure out why. In the end, it was one of the most rewarding things I’ve done.

Wouldn’t it be great if undergraduate students actually got to do this as part of their lab courses (i.e., revise and repeat an experiment), rather than just writing about it?

One thing that can come close – teaching you how to revise and repeat when something doesn’t work – is learning to code.

How to mentor

Yesterday I was asked about how I mentor in research. This is an area where I still have a lot to learn. However, there are at least four things that I think are really important:

1. Confidence**.
Instilling confidence is probably the most important thing a mentor can do. Science is about unknowns and learning how to become an expert. And that requires confidence.

So how do you instill confidence?

2. Basic programming and learning how to “script”.
This was a real catalyst for me and a huge boost to my confidence. Once I had mastered some basic programming in R, it allowed me to start treating data like an experimental subject. Want to understand what happens when you ignore pseudoreplication in your data? What about how collinearity might influence the results of your analysis? It’s not too hard to write a simulation to figure that out. A lot of basic programming is troubleshooting, a useful and transferable skill. Acting like an experimenter also comes naturally – I see it all the time with my 4-month-old daughter!

Learning how to write scripts is also key to making your workflow efficient and reproducible. Filtering, tidying, and graphing your data is 90% of the work. Doing that through code is way more efficient and leaves a record of what you did, making it easier to correct errors later on. And if you can generate publication-quality graphs purely through code, it will save you a huge amount of time making tweaks. And believe me, you will have to make a lot of tweaks. Finally, scripting means your work can be used by others (including, and perhaps especially, your future self).
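
To make the simulation idea above concrete, here is a minimal sketch of one way to ask the collinearity question in R. Everything in it is an assumption for illustration: the function name simulate_slope_sd, the sample size, the number of simulations, and the true slopes of 1 are all made up, and the data are simulated rather than real. It repeatedly generates two predictors with a chosen correlation, fits a linear model, and records how much the estimated slope for x1 bounces around from one simulated dataset to the next.

```r
# Hypothetical simulation (all names and settings are illustrative):
# how does collinearity between two predictors affect the precision
# of an estimated regression slope?
simulate_slope_sd <- function(rho, n = 100, n_sims = 500) {
  slopes <- numeric(n_sims)                      # preallocate the results
  for (i in seq_len(n_sims)) {
    x1 <- rnorm(n)
    # x2 is built to have an expected correlation of rho with x1
    x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)
    y  <- 1 * x1 + 1 * x2 + rnorm(n)             # true slopes are both 1
    slopes[i] <- coef(lm(y ~ x1 + x2))[["x1"]]   # estimated slope for x1
  }
  sd(slopes)                                     # spread of the estimates
}

set.seed(1)
sd_low  <- simulate_slope_sd(rho = 0.1)    # nearly independent predictors
sd_high <- simulate_slope_sd(rho = 0.95)   # strongly collinear predictors
c(low = sd_low, high = sd_high)
```

Under these assumptions the slope estimates should be noticeably more variable when the predictors are strongly correlated. A similar loop could be adapted to pseudoreplication by simulating repeated measures per individual and then analysing them as if they were independent.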

3. Students are scientists, too.
There is nothing I’ve done that couldn’t be done by an undergraduate, if they had enough time. One of the best things about grad school was our weekly seminar series. We’d have an MSc exit seminar one week followed by a distinguished visiting professor the next. As a student, your work is every bit as important.

4. Treating feedback as an opportunity.
I think it’s important to provide students with lots of constructive feedback – and also, to help them develop an ability to deal with it. In science (and in life), rejection happens. I got another huge boost when I stopped worrying about negative feedback and started looking at it as a problem-solving opportunity. This is a broadly transferable skill.

Taken together, the points above are pretty circular: it takes confidence to handle feedback, but also dealing with feedback forces you to gain confidence. So “fake it until you make it” really works. As a mentor, I think it’s important to treat students as fellow scientists, to provide them with lots of opportunities to act as peer reviewers and reviewees, and to model the process of using feedback to solve problems.

Update to #1 above, on confidence: I also try to emphasize that the value of science is based on the quality of the data collected and clear dissemination of the results – and not whether it supports a particular hypothesis, or has a p-value < 0.05. I think this is a major stumbling block for a lot of students. Your thesis does not hang on the results of one test! The cure to this kind of thinking includes a better understanding of what p-values really mean and the limitations of null hypothesis statistical testing (NHST), and a focus on reporting the data (including effect sizes, confidence intervals, and individual variation).

** Related: I think a lack of confidence is a major cause of the leaky pipeline for women in STEM (and perhaps other under-represented groups). Many women choose careers outside of science despite aptitude (see for example this 2009 study by Ceci et al.). There’s some very recent evidence that gender stereotypes about aptitude – which could shape children’s interests as well as their confidence – begin as early as 6 years old (see here).

Learning to science

From Alison Gopnik’s The Gardener and the Carpenter:

Imagine if we taught baseball the way we teach science. Until they were twelve, children would read about baseball technique and history, and occasionally hear inspirational stories of the great baseball players. They would fill out quizzes about baseball rules. College undergraduates might be allowed, under strict supervision, to reproduce famous historic baseball plays. But only in the second or third year of graduate school, would they, at last, actually get to play a game.

Girls do science

One of the best things about maternity leave is watching my daughter learn new things, almost daily. A few weeks ago she realized she could control her feet. This week she’s using her hands to grab at objects and starting to pull them in for further, mouth-based inspection. It really is exponential – the more she learns, the more she is able to figure out.

Children also learn a lot from what they hear. And they are apparently sensitive to the particulars at a surprisingly young age. Take, for example, the phrase “some birds fly” vs. the generic version “birds fly”. Psychologists have shown that children as young as two years old can tell the difference between these two phrases, and they can also use the generic version appropriately. What’s more, when adults use generic language in conversation with very young children, the children are able to infer new categories and make predictions about the world. This has been shown in experiments where psychologists talk about new, fictional categories (like Zarpies and Ziblets) with children. The results of these studies suggest that children are essentialists: i.e., they tend to carve up the world into categories, and view members of the same category as sharing a deeper, inherent nature. And these categories are easily transmitted through language.

This can have some unintended consequences. In her book The Gardener and the Carpenter, Alison Gopnik describes a study by Susan Gelman and colleagues where mothers and their children were given pictures of people doing stereotyped (a girl sewing) and non-stereotyped (a girl driving a truck) activities, and their conversations were recorded and quantified. It turns out that even mothers who were feminists used generic language most of the time. Moreover, there was a correlation between how often mothers used generic language and how often their children did.

Worst of all, moms used generics that reinforced the very stereotypes they were trying to combat. As Gopnik puts it:

Saying “Girls can drive trucks” still implies that girls all belong in the same category with the same deep, underlying essence.

I can’t help but wonder how this might affect our daughter as she grows up.

Although her book is not meant to be prescriptive, Gopnik does say that we probably can’t avoid this by careful wording – it just wouldn’t work to try to consciously control our language. Instead, the best antidote may be to have children observe many examples and talk to many different people.

Speeding up loops in R

This is from a session I did with the UBC R Study Group. Loops can be convenient for applying the same steps to big/distributed datasets, running simulations, and writing your own resampling/bootstrapping analyses. Here are some ways to make them faster.

1. Don’t grow things in your loops.
2. Vectorize where possible, i.e., pull things out of the loop.
3. Do less work in the loop if you can.
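
The three tips above can be sketched with a toy task. This is an illustrative example rather than the code from the session: the function names grow_result, prealloc_result and vectorized_result are made up, and the input is random numbers. All three functions compute the same thing (each value multiplied by the overall mean), so the timings can be compared directly.

```r
set.seed(1)
n <- 1e4
x <- runif(n)   # toy input, simulated

# Slow: the result is grown with c(), so it is copied on every iteration,
# and mean(x) is recomputed each time even though it never changes.
grow_result <- function(x) {
  out <- c()
  for (i in seq_along(x)) {
    out <- c(out, x[i] * mean(x))
  }
  out
}

# Tips 1 and 3: preallocate the result and compute the mean once,
# outside the loop.
prealloc_result <- function(x) {
  out <- numeric(length(x))
  x_mean <- mean(x)
  for (i in seq_along(x)) {
    out[i] <- x[i] * x_mean
  }
  out
}

# Tip 2: no explicit loop at all.
vectorized_result <- function(x) x * mean(x)

system.time(a <- grow_result(x))
system.time(b <- prealloc_result(x))
system.time(v <- vectorized_result(x))

# All three approaches should agree.
stopifnot(isTRUE(all.equal(a, v)), isTRUE(all.equal(b, v)))
```

Exact timings depend on the machine and on n, so none are quoted here. The gap between the first version and the other two typically widens as n grows.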


Animating time series in R

Frame-blending is a great way to illustrate animal behaviour and other things that change over time. This got me thinking about ways to animate time series data. In R, the animation package has lots of options, but you can also build your own just by plotting over the same device window. If you save each iteration in a loop, the resulting images can be used as frames in a video or gif.
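
Here is a minimal sketch of the save-each-iteration approach, using only base R graphics. It is not the hummingbird code: the time series is a simulated random walk, and the frame count, file names and output folder (a temporary directory) are arbitrary choices for illustration. It also assumes your R installation can write PNG files.

```r
set.seed(42)
n_frames <- 20
y <- cumsum(rnorm(n_frames))   # simulated random walk, not real data

# Write frames to a temporary folder (an arbitrary choice for this sketch)
frame_dir <- file.path(tempdir(), "frames")
dir.create(frame_dir, showWarnings = FALSE)

for (i in seq_len(n_frames)) {
  # One image file per iteration, zero-padded so the files sort in order
  png(file.path(frame_dir, sprintf("frame_%03d.png", i)),
      width = 480, height = 360)
  # Fix the axis limits so the view does not jump between frames
  plot(seq_len(i), y[seq_len(i)], type = "l",
       xlim = c(1, n_frames), ylim = range(y),
       xlab = "Time step", ylab = "Value")
  invisible(dev.off())
}

frames <- list.files(frame_dir, pattern = "^frame_[0-9]{3}\\.png$")
length(frames)
```

The numbered PNG files can then be stitched into a video or gif with whatever external tool you prefer.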

Hummingbirds deviate away from vertical stripe patterns


Here is an example using recordings that track hummingbirds flying in our tunnel here at UBC. This animation shows a bird’s eye view of 50 flights by 10 birds. In half of the flights (the red ones), the birds had horizontal stripes on their left side and vertical stripes on their right, and the other half (blue) had the reverse. The subtle difference between the red and blue trajectories (red ones tend to have more positive y values) shows that on average, birds tend to deviate away from vertical stripes, and towards horizontal ones. The histogram that builds up on the right side of the figure shows the mean lateral (y) position for each trajectory as it finishes.


How to loop efficiently in R

My learning curve with the statistical software R has been a long one, but one of the steepest and most exciting times was learning how to write functions and loops. Suddenly I could do all kinds of things that used to seem impossible. Since then, I’ve learned to avoid for loops whenever possible. Why? Because doing things serially is slow. With R, you can almost always reduce a big loop to just a few lines of vectorized code.
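
As a small illustration of what reducing a loop to vectorized code can look like, here is a sketch that computes the mean of every row of a matrix two ways. The matrix is simulated and the variable names are made up for this example.

```r
set.seed(123)
m <- matrix(rnorm(1000 * 50), nrow = 1000)   # simulated data: 1000 rows, 50 columns

# Loop version: visit the rows one at a time
row_means_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_means_loop[i] <- mean(m[i, ])
}

# Vectorized version: a single call does the same job
row_means_vec <- rowMeans(m)

# The two approaches should give the same answer
stopifnot(isTRUE(all.equal(row_means_loop, row_means_vec)))
```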

But there’s one situation where I can’t avoid the dreaded for loop. Recently, I learned how to make for loops run 100s of times faster in these situations.


What will you have to take with you?

My guest post for my university’s School of Graduate Studies blog is up! (You can read it here.) The inspiration was a new radio podcast that we have in the works on research here at Queen’s – scientific and otherwise. I’ve been working on the concept with Vee, an English PhD, and Savita, an undergraduate student who is keen to make top-notch radio documentaries.

I wrote the blog post to try to drum up some interest in being a subject of the radio show, but I hope it has a few nuggets of advice for those finishing and/or considering grad school as well.

Brawn over brains?

There’s no question that broadly speaking, big brains are smart. Take humans, for instance: our brains weigh in at about 3 pounds on average, nearly four times the size of the brains of chimpanzees (whose brains weigh in at less than a pound apiece).

What’s less clear is why. There are a number of theories: maybe intelligence evolved to give us a competitive edge in foraging, or maybe it helped us keep track of increasingly complex social interactions. Ideally, we’d like a theory to explain the evolution of intelligence broadly, so researchers have tried to test these hypotheses across multiple species (for instance, comparing relative brain size and social group size among hoofed mammals like horses and deer).

But brain size alone – even when scaled as a proportion of overall body size – is not an ideal measure of intelligence. The trouble is that small animals often have considerably higher brain-to-body mass ratios – ant brains, for instance, can weigh nearly 15% of their total body mass (the equivalent of a 20 pound human head!), and mice have about the same brain-to-body mass ratio as we do. So how can we study brain evolution, when even primates span a 3000-fold difference in body size (comparing a gray mouse lemur and a gorilla)?

Enter the encephalization quotient, or EQ, a measure of brain size relative to what we would predict, given that there is a curved relationship between brain size and body size (allometry is the technical term for this). It’s the best yardstick we have for the evolution of intelligence. Until now, that is.
