National Learn to Code Day

On Saturday, I took part as a mentor in National Learn to Code Day with Ladies Learning Code. There were about 50 learners of all ages. The topic was “Using Data to Solve Problems: An Introduction to Artificial Intelligence and Machine Learning for Beginners”, led by Solmaz Shahalizadeh, the data team director at Shopify in Ottawa.

The topic seemed like an oxymoron at first – how can beginners possibly cover in one day what could easily span multiple graduate courses? But actually, it worked great, and I think everyone learned a lot.

Continue reading →

How to learn to code

I just read a great post by Jessica Duarte on teaching beginners to code. It is all so true. Especially #5, making mistakes:

You [the instructor] have to ride out the mistakes. Make them often. Let the class fix them.

It’s essential for students to see and experience the process of working through mistakes. Right now, I am starting to use Git and GitHub for our manakin research at the Smithsonian. A major benefit is that they allow us to make mistakes safely.

Duarte is organizing the 2017 National Learn to Code Day on Intro to Machine Learning and Artificial Intelligence. I look forward to helping out with the Ottawa chapter on September 23!

How do babies learn words?

My 10-month-old daughter just proved that she understands some words. Now, when we tell her to “clap your hands”, or even just talk about clapping, we get a round of applause. Pretty cute! This wasn’t one of the things we were actively trying to teach her, like “daddy”, “mommy”, “dog”, or “milk” – I haven’t seen evidence that she knows those yet.

It just goes to show how learning works: motivation trumps deliberate efforts to teach. Clapping is just plain fun.

It’s spooky to think about what else she might come to understand without us knowing.

Troubleshooting and iteration in science

The scientific method is taught as far back as elementary school. But students almost never get to experience what I think is the best part: what you do when something goes wrong. That’s too bad because self-correction is a hallmark of science.

In ecology and evolution, most graduate students don’t get to experience iteration firsthand, because they are often collecting data right up until the end of their degree. I didn’t experience it until my postdoc, when we failed to repeat a previous experiment. It took several experiments and a lot of time  – two years! – to figure out why. In the end, it was one of the most rewarding things I’ve done.

Wouldn’t it be great if undergraduate students actually got to do this as part of their lab courses (i.e., revise and repeat an experiment), rather than just writing about it?

One thing that can come close – teaching you how to revise and repeat when something doesn’t work – is learning to code.

How to mentor

Yesterday I was asked about how I mentor in research. This is an area where I still have a lot to learn, but there are at least four things that I think are really important:

1. Confidence**.
Instilling confidence is probably the most important thing a mentor can do. Science is about unknowns and learning how to become an expert. And that requires confidence.

So how do you instill confidence?

2. Basic programming and learning how to “script”.
This was a real catalyst for me and a huge boost to my confidence. Once I had mastered some basic programming in R, it allowed me to start treating data like an experimental subject. Want to understand what happens when you ignore pseudoreplication in your data? What about how collinearity might influence the results of your analysis? It’s not too hard to write a simulation to figure that out. A lot of basic programming is troubleshooting, a useful and transferable skill. Acting like an experimenter also comes naturally – I see it all the time with my 4-month-old daughter!

Learning how to write scripts is also key to making your workflow efficient and reproducible. Filtering, tidying, and graphing your data is 90% of the work. Doing that through code is way more efficient and leaves a record of what you did, making it easier to correct errors later on. And if you can generate publication-quality graphs purely through code, it will save you a huge amount of time making tweaks. And believe me, you will have to make a lot of tweaks. Finally, scripting means your work can be used by others (including, and perhaps especially, your future self).
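To make the simulation idea in point 2 concrete, here is a minimal sketch (every number in it is arbitrary, chosen just for illustration). We generate data with no true treatment difference, but with repeated, correlated measurements within each subject – and then run a t-test that wrongly treats every measurement as an independent replicate:

```r
# Sketch: what happens if you ignore pseudoreplication?
# Two treatments with NO true difference, but n_reps correlated
# measurements per subject, analyzed as if all were independent.
set.seed(42)
n_subjects <- 10   # true replicates per treatment
n_reps     <- 20   # pseudoreplicates per subject

p_naive <- replicate(1000, {
  mu_a <- rnorm(n_subjects)  # subject-level means, treatment A
  mu_b <- rnorm(n_subjects)  # subject-level means, treatment B
  # repeat measurements cluster tightly around each subject's mean
  a <- rnorm(n_subjects * n_reps, rep(mu_a, each = n_reps), sd = 0.1)
  b <- rnorm(n_subjects * n_reps, rep(mu_b, each = n_reps), sd = 0.1)
  t.test(a, b)$p.value
})

# With no real effect, "significant" results should turn up ~5% of
# the time -- the naive test finds them far more often than that:
mean(p_naive < 0.05)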

3. Students are scientists, too.
There is nothing I’ve done that couldn’t be done by an undergraduate, if they had enough time. One of the best things grad school was our weekly seminar series. We’d have an MSc exit seminar one week followed by a distinguished visiting professor the next. As a student, your work is every bit as important.

4. Treating feedback as an opportunity.
I think it’s important to provide students with lots of constructive feedback – and also, to help them develop an ability to deal with it. In science (and in life), rejection happens. I got another huge boost when I stopped worrying about negative feedback and started looking at it as a problem-solving opportunity. This is a broadly transferable skill.

Taken together, the points above are pretty circular: it takes confidence to handle feedback, but dealing with feedback also builds confidence. So “fake it until you make it” really works. As a mentor, I think it’s important to treat students as fellow scientists, to provide them with lots of opportunities to act as peer reviewers and reviewees, and to model the process of using feedback to solve problems.

Update to #1 above, on confidence: I also try to emphasize that the value of science is based on the quality of the data collected and clear dissemination of the results – and not whether it supports a particular hypothesis, or has a p-value < 0.05. I think this is a major stumbling block for a lot of students. Your thesis does not hang on the results of one test! The cure to this kind of thinking includes a better understanding of what p-values really mean and the limitations of null hypothesis statistical testing (NHST), and a focus on reporting the data (including effect sizes, confidence intervals, and individual variation).

** Related: I think a lack of confidence is a major cause of the leaky pipeline for women in STEM (and perhaps other under-represented groups). Many women choose careers outside of science despite aptitude (see for example this 2009 study by Ceci et al.). There’s some very recent evidence that gender stereotypes about aptitude – which could shape children’s interests as well as their confidence – begin as early as 6 years old (see here).

Learning to science

From Alison Gopnik’s The Gardener and the Carpenter:

Imagine if we taught baseball the way we teach science. Until they were twelve, children would read about baseball technique and history, and occasionally hear inspirational stories of the great baseball players. They would fill out quizzes about baseball rules. College undergraduates might be allowed, under strict supervision, to reproduce famous historic baseball plays. But only in the second or third year of graduate school, would they, at last, actually get to play a game.

Girls do science

One of the best things about maternity leave is watching my daughter learn new things, almost daily. A few weeks ago she realized she could control her feet. This week she’s using her hands to grab at objects and starting to pull them in for further, mouth-based inspection. It really is exponential – the more she learns, the more she is able to figure out.

Children also learn a lot from what they hear. And they are apparently sensitive to the particulars at a surprisingly young age. Take, for example, the phrase “some birds fly” vs. the generic version “birds fly”. Psychologists have shown that children as young as two years old can tell the difference between these two phrases, and they can also use the generic version appropriately. What’s more, when adults use generic language in conversation with very young children, the children are able to infer new categories and make predictions about the world. This has been shown in experiments where psychologists talk about new, fictional categories (like Zarpies and Ziblets) with children. The results of these studies suggest that children are essentialists: i.e., they tend to carve up the world into categories, and view members of the same category as sharing a deeper, inherent nature. And these categories are easily transmitted through language.

This can have some unintended consequences. In her book The Gardener and the Carpenter, Alison Gopnik describes a study by Susan Gelman and colleagues where mothers and their children were given pictures of people doing stereotyped (a girl sewing) and non-stereotyped (a girl driving a truck) activities, and their conversations were recorded and quantified. It turns out that even mothers who were feminists used generic language most of the time. Moreover, there was a correlation between how often mothers used generic language and how often their children did.

Worst of all, moms used generics that reinforced the very stereotypes they were trying to combat. As Gopnik puts it:

Saying “Girls can drive trucks” still implies that girls all belong in the same category with the same deep, underlying essence.

I can’t help but wonder how this might affect our daughter as she grows up.

Although her book is not meant to be prescriptive, Gopnik does say that we probably can’t avoid this by careful wording – it just wouldn’t work to try to consciously control our language. Instead, the best antidote may be to have children observe many examples and talk to many different people.

Speeding up loops in R

This is from a session I did with the UBC R Study Group. Loops can be convenient for applying the same steps to big/distributed datasets, running simulations, and writing your own resampling/bootstrapping analyses. Here are some ways to make them faster.

1. Don’t grow things in your loops.
2. Vectorize where possible, i.e., pull things out of the loop.
3. Do less work in the loop if you can.
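A minimal sketch of rules 1 and 2 (the function names and the squaring computation are just for illustration):

```r
# Rule 1: growing a vector inside a loop forces R to copy it on
# every iteration -- preallocate instead.
squares_grow <- function(n) {
  out <- c()                         # starts empty...
  for (i in 1:n) out <- c(out, i^2)  # ...and is copied n times
  out
}

squares_prealloc <- function(n) {
  out <- numeric(n)                  # allocate once
  for (i in 1:n) out[i] <- i^2       # fill in place
  out
}

# Rule 2: here the loop can be dropped entirely
squares_vec <- function(n) (1:n)^2

# All three give identical results -- at very different speeds
identical(squares_grow(1000), squares_prealloc(1000))
identical(squares_prealloc(1000), squares_vec(1000))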

Continue reading →

Animating time series in R

Frame-blending is a great way to illustrate animal behaviour and other things that change over time. This got me thinking about ways to animate time series data. In R, the animation package has lots of options, but you can also build your own just by plotting over the same device window. If you save each iteration in a loop, the resulting images can be used as frames in a video or gif.
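As a minimal sketch of the save-each-iteration idea (the time series, folder, and file names here are made up), each pass through the loop redraws the plot one step further and writes it out as a numbered image:

```r
# Sketch: build animation frames by replotting a growing slice of a
# time series and saving each iteration as a numbered PNG.
x <- cumsum(rnorm(100))  # an arbitrary example time series

dir.create("frames", showWarnings = FALSE)
for (i in seq_along(x)) {
  png(sprintf("frames/frame_%03d.png", i), width = 480, height = 320)
  # fix the axis limits so the frames line up when played back
  plot(x[1:i], type = "l", xlim = c(1, length(x)), ylim = range(x),
       xlab = "time", ylab = "value")
  dev.off()
}
```

The numbered PNGs can then be stitched into a gif or video with an external tool (e.g., ImageMagick’s `convert -delay 5 frames/frame_*.png out.gif`), or you can hand the same loop to the animation package and let it do the stitching.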

Hummingbirds deviate away from vertical stripe patterns

Here is an example using recordings that track hummingbirds flying in our tunnel here at UBC. This animation shows a bird’s eye view of 50 flights by 10 birds. In half of the flights (the red ones), the birds had horizontal stripes on their left side and vertical stripes on their right, and the other half (blue) had the reverse. The subtle difference between the red and blue trajectories (red ones tend to have more positive y values) shows that on average, birds tend to deviate away from vertical stripes, and towards horizontal ones. The histogram that builds up on the right side of the figure shows the mean lateral (y) position for each trajectory as it finishes.

Continue reading →

How to loop efficiently in R

My learning curve with the statistical software R has been a long one, but one of the steepest and most exciting times was learning how to write functions and loops. Suddenly I could do all kinds of things that used to seem impossible. Since then, I’ve learned to avoid for loops whenever possible. Why? Because doing things serially is slow. With R, you can almost always reduce a big loop to just a few lines of vectorized code.
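For example, here is a sketch comparing a serial loop with its vectorized equivalent (the computation itself is arbitrary, just something with a per-element condition):

```r
# Sketch: the same computation done serially and vectorized.
set.seed(1)
x <- rnorm(1e5)

# Loop version: handle one element at a time
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- if (x[i] > 0) sqrt(x[i]) else 0
}

# Vectorized version: clamp negatives to zero, then take square
# roots of the whole vector in one call
out_vec <- sqrt(pmax(x, 0))

all.equal(out_loop, out_vec)  # same answer, a tiny fraction of the time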

But there’s one situation where I can’t avoid the dreaded for loop. Recently, I learned how to make for loops run hundreds of times faster in that situation.

Continue reading →

What will you have to take with you?

My guest post for my university’s School of Graduate Studies blog is up! (You can read it here.) The inspiration was a new radio podcast that we have in the works on research here at Queen’s – scientific and otherwise. I’ve been working on the concept with Vee, an English PhD, and Savita, an undergraduate student who is keen to make top-notch radio documentaries.

I wrote the blog post to try to drum up some interest in being a subject of the radio show, but I hope it has a few nuggets of advice for those finishing and/or considering grad school as well.

Brawn over brains?

There’s no question that broadly speaking, big brains are smart. Take humans, for instance: our brains weigh in at about 3 pounds on average – nearly four times the size of a chimpanzee’s, which weighs less than a pound.

What’s less clear is why. There are a number of theories: maybe intelligence evolved to give us a competitive edge in foraging, or maybe it helped us keep track of increasingly complex social interactions. Ideally, we’d like a theory to explain the evolution of intelligence broadly, so researchers have tried to test these hypotheses across multiple species (for instance, comparing relative brain size and social group size among hoofed mammals like horses and deer).

But brain size alone – even when scaled as a proportion of overall body size – is not an ideal measure of intelligence. The trouble is that small animals often have considerably higher brain-to-body mass ratios – ant brains, for instance, can weigh nearly 15% of their total body mass (the equivalent of a 20 pound human head!), and mice have about the same brain-to-body mass ratio as we do. So how can we study brain evolution, when even primates span a 3000-fold difference in body size (comparing a gray mouse lemur and a gorilla)?

Enter the encephalization quotient, or EQ, a measure of brain size relative to what we would predict, given that there is a curved relationship between brain size and body size (allometry is the technical term for this). It’s the best yardstick we have for the evolution of intelligence. Until now, that is.

Continue reading →

The currency on campus

My article on the behavioural economics of grades is out, and it’s the cover story this month in University Affairs magazine!

I had a blast doing interviews for this story. I tried to pick profs with a reputation for being great teachers in classes that are popular despite being tough. I learned a ton talking to them, but I have to say I was disappointed that I couldn’t take this story further. I was hoping for something more conclusive about how behavioural economics could be applied to grades. We know that humans aren’t particularly rational when it comes to incentives, and grades perform a dual feedback/incentive role – and yet we have no idea how students respond to grading schemes, or whether some of the most common practices might be entirely counterproductive. In the end, I think the incentive effect of grades is something that we should be studying experimentally.

The Owl: why kids make great science writers

I finally had a chance to watch Steven Pinker’s excellent lecture on science communication this weekend. Pinker, a psychologist, linguist and top-notch writer, argues that psychology can help us tune up our writing and become better communicators.

His first point is that cognitive psychology points to the model that we should be aiming for: prose that directs the reader’s attention to something in the world that they can then come to understand on their own.

He also discusses why this is so hard to do: The Curse of Knowledge. Once you know a lot about something, it’s hard to put yourself in the mindset of your readers – i.e., the people who don’t know anything about the thing you are trying to write about. This is because it’s hard work, cognitively, to keep track of what other people know. The classic example of this is the false belief task in psychology. If you show a child a box of Smarties (the chocolate candy), and then ask him or her what might be in the box, the child will say candy. Suppose you then reveal that the box actually contains something else – coal. Then close the box and ask the child what another person would think is inside. A 7-year-old will correctly say candy, but a child younger than 4 or so will claim that others would think it contains coal. Up until about age 4, we don’t seem to grasp that other people can have false beliefs about the world. Pinker’s point is that this ability – also known as theory of mind – isn’t a cut and dried thing that we suddenly achieve at age 4. It’s a sophisticated skill that proves to be a challenge even for adults.

His advice on writing? It’s pretty standard stuff. Pinker enlists his mom – or in other words, an intelligent reader who just happens to not know a lot about his particular topic already. His other point is to take a break from your writing before you edit, to give yourself time to shift away from the mental state you were in when you wrote it. You can also read your work aloud, since that seems to engage a different mental state as well (I wonder why?). It makes me wonder whether there is anything we can do to harness this mind reboot effect more efficiently. Say you don’t have a lot of time and your mom is not available. How can you reset your brain on demand? I’m thinking of a 20-minute nap, reading some fiction, or doing some physical exercise before editing your paper – which is best? I imagine this is something that cognitive neuroscientists will be able to tell us pretty soon.

Pinker ends with some sage advice: most good writers learn by example. So find a bit of writing that you admire, and try to figure out what makes it great. His choice? The short essay called “The Owl”. It’s remarkable for its clarity and worth checking out in the video below:

If only it were that easy for the rest of us to escape the curse of knowledge.

You can watch the whole lecture by Steven Pinker here. (The Owl is at the 57 minute mark.)