
Therese Anders | Peer review in data science courses | RStudio (2020)
Peer review enables instructors of large data science classes to provide substantive feedback to students beyond what is feasible with standard code review via automated grading and continuous integration. It facilitates peer learning, which is shown in the literature to have positive learning outcomes, and can reduce the burden of grading by course staff. The ghclass package provides a suite of functions to manage courses via GitHub repositories. The package has recently been supplemented with the functionality to implement peer review. Developed during my 2019 summer internship with RStudio in collaboration with my mentor Mine Çetinkaya-Rundel, the peer review functions in ghclass interface with the GitHub API to create review repositories, move files between authors and reviewers, submit feedback, and collect grades. In this presentation, I will give a demonstration of the peer review functions in ghclass. A set of six functions allows instructors to 1) create a random review roster, 2) set up the review repository infrastructure within a GitHub organization, 3) move assignments from authors to reviewers, 4) collect grades, 5) return the feedback, and 6) obtain a rating of the review from the authors. I reflect on the pedagogy of implementing peer review in introductory data science classes and talk about lessons learned from a real-world test run of the package in the Fall semester 2019 at the University of Edinburgh, conducted by Mine Çetinkaya-Rundel. The presentation highlights ghclass as an R command-line based, open source, low-profile, and powerful solution to enable peer review in classes ranging from a size of two to approximately 400 students. A 5-minute presentation in our Lightning Talks series.
Transcript
This transcript was generated automatically and may contain errors.
Hi everyone, my name is Therese Anders and I'm a postdoctoral research fellow at the Hertie School in Berlin. And today I will talk to you about how to conduct peer review in data science classes using the ghclass package.
So why does peer review matter? Well, it's a great active learning tool for students. And for instructors, in particular in really large data science classes, peer review enables them to provide their students with meaningful, substantive feedback beyond what might be possible with standard automated code review tools.
The ghclass package now offers software solutions to do this in R, and this is work that I did during my internship with RStudio this past summer together with Mine Çetinkaya-Rundel and Colin Rundel. And so in a nutshell, ghclass is a package that allows you to manage a class that is run as a GitHub organization via R commands interfacing with the GitHub API. And as a new feature, we have now added the ability to also conduct peer review with this structure.
Key features of ghclass
So what are some of the key features of ghclass? Well, first and foremost, ghclass will allow you to automate the process of distributing and subsequently collecting all the reviews on GitHub repositories. The package will allow you to create a randomized review roster for your class, and you can also assign more than one reviewer to an assignment.
The package allows you to automatically collect the scores that students gave to each other. You as the instructor can determine whether you want the review to be single- or double-blind. And last but certainly not least, this works for a number of different assignment types. So this could work for a very small coding homework or even an entire final project.
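To make the roster idea concrete, here is a minimal sketch in Python of one way to build a randomized review roster with more than one reviewer per assignment. It is not the ghclass implementation or its API: the function name `make_review_roster`, its parameters, and the placeholder student `student_c` are assumptions for illustration. The approach shuffles the class once and then assigns reviewers at fixed offsets in the shuffled order, which guarantees that nobody reviews their own work and that everyone gets the same number of reviews to write.

```python
import random


def make_review_roster(students, n_reviewers=1, seed=None):
    """Return {author: [reviewers]} with no self-review (illustrative only)."""
    if not 1 <= n_reviewers < len(students):
        raise ValueError("n_reviewers must be between 1 and len(students) - 1")

    rng = random.Random(seed)   # seeded so the roster can be reproduced
    order = list(students)
    rng.shuffle(order)          # randomize who ends up next to whom

    n = len(order)
    roster = {}
    for i, author in enumerate(order):
        # The k-th reviewer sits k places further along the shuffled circle,
        # so an author can never be their own reviewer.
        roster[author] = [order[(i + k) % n] for k in range(1, n_reviewers + 1)]
    return roster


# "student_c" is a placeholder for the third, unnamed student in the demo.
roster = make_review_roster(["anya", "bruno", "student_c"], n_reviewers=2, seed=1)
for author, reviewers in roster.items():
    print(author, "is reviewed by", reviewers)
```

Passing a seed keeps the roster reproducible, which is useful if the assignment has to be re-created later in the semester.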
Demo: the four main steps
So let me briefly demo for you the four main steps that you need to know in order to do this with ghclass. So suppose we have three students in our class and they submitted their homework 1 via GitHub repositories. We can initiate the peer review process by creating so-called review repositories for each of these students, and the peer_init() function will automatically do this for you.
The second step is to actually assign the review. And so in my little graphic on my slides, I'm going to be focusing on the example of Bruno as the assignment author in red and Anya as the randomly assigned reviewer for this assignment in blue. So what peer_assign() does is it takes Bruno's assignment, makes a copy of it, and stores it in a separate folder on Anya's review repository. And this folder is anonymized, so Anya doesn't know whose assignment she's reviewing. And what peer_assign() also does is it adds an R Markdown feedback form to this repository that we're going to ask Anya to fill out.
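The assignment step can be pictured with a small Python sketch that uses local directories as stand-ins for GitHub repositories. This is not how ghclass does it (the package works through the GitHub API); the function `assign_for_review`, the repository names, the `author1` label, and the feedback form's file name and contents are all assumptions for illustration. The point is only that the reviewer sees an anonymized folder name plus a form to fill out.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical form contents; the real feedback form is set by the instructor.
FEEDBACK_FORM = "---\ntitle: Reviewer feedback\n---\n\nComments:\n"


def assign_for_review(author_dir, reviewer_dir, anon_label):
    """Copy an author's files into an anonymized folder and add a feedback form."""
    reviewer_dir = Path(reviewer_dir)
    target = reviewer_dir / anon_label          # e.g. "author1", not the author's name
    shutil.copytree(author_dir, target)         # copy the whole assignment
    form_path = reviewer_dir / f"{anon_label}_feedback.Rmd"  # hypothetical file name
    form_path.write_text(FEEDBACK_FORM)
    return target, form_path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    author_repo = root / "hw1-bruno"            # stand-in for Bruno's repository
    review_repo = root / "hw1-anya-review"      # stand-in for Anya's review repository
    author_repo.mkdir()
    review_repo.mkdir()
    (author_repo / "assignment.Rmd").write_text("# Homework 1\n")

    target, form_path = assign_for_review(author_repo, review_repo, "author1")
    review_contents = sorted(p.name for p in review_repo.iterdir())
    copied_files = sorted(p.name for p in target.iterdir())
    print(review_contents, copied_files)
```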
So we're going to do a little bit of time traveling here, because this is the part in the review process where our students would actually conduct the review. And so what our students are going to do is they're going to edit the original assignment of the authors and also fill out the review form that we asked them to complete.
Once this process is done, you as the instructor can now collect the scores that the students gave to each other. And so the peer_score_review() function will essentially extract the scores from the R Markdown files of the reviewers and store the information in either a data frame or a CSV file.
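As a rough Python illustration of what collecting scores involves, the sketch below pulls numeric scores out of feedback form text and writes them to CSV. It does not reproduce the ghclass function or the actual form layout: the `q1_score: 4` line format, the field names, and the sample form text are assumptions made up for this example.

```python
import csv
import io
import re

# Assumed form layout: one "qN_score: <integer>" line per question.
SCORE_LINE = re.compile(r"^(q\d+_score):[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def extract_scores(form_text):
    """Return {question: score} for every score line found in the form."""
    return {name: int(value) for name, value in SCORE_LINE.findall(form_text)}


# Made-up form text keyed by (author, reviewer), standing in for files in review repos.
forms = {
    ("bruno", "anya"): "---\nq1_score: 4\nq2_score: 5\n---\nNice plots.\n",
}

rows = []
for (author, reviewer), text in forms.items():
    row = {"author": author, "reviewer": reviewer}
    row.update(extract_scores(text))
    rows.append(row)

# Write the collected scores as CSV (to an in-memory buffer here).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["author", "reviewer", "q1_score", "q2_score"]
)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```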
So the last step in this review process is to take all the review files from the reviewers and move them all back to the authors. And so this is what peer_return() will do for you. So it will essentially clone the "author one" folder from Anya's review repository and make a copy of it on Bruno's original assignment repository. And so now Bruno could go and actually take a look at the feedback that he got.
What does this look like from the student's perspective? Well, peer_return() will also automatically open an issue on Bruno's GitHub repository that gives him some instructions on what to do and also has links to all the great feedback that Anya gave him. And so, for example, if he were to click on this assignment.Rmd file, he would automatically see a GitHub diff between his original assignment and all the feedback that Anya gave him.
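For readers who have not seen a diff view, the following Python sketch produces a comparable line-by-line comparison with the standard library's `difflib`. This is a stand-in for what GitHub renders, not GitHub's or ghclass's own code, and the file contents are invented for illustration.

```python
import difflib

# Invented file contents: the author's original and the reviewer's edited copy.
original = [
    "x <- c(1, 2, 3)\n",
    "mean(x)\n",
]
reviewed = [
    "x <- c(1, 2, 3)\n",
    "# Reviewer: consider handling missing values\n",
    "mean(x, na.rm = TRUE)\n",
]

# Lines starting with "-" were removed, lines starting with "+" were added.
diff_lines = list(
    difflib.unified_diff(
        original,
        reviewed,
        fromfile="assignment.Rmd (author)",
        tofile="assignment.Rmd (reviewer)",
    )
)
print("".join(diff_lines))
```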
So now you've learned how to conduct peer review with the ghclass package. And we have a pkgdown site available if you want to know more, where we have a ton of really great vignettes, step-by-step guides, and also instructions that you can pass on to your students. Thank you very much.

