Professor Reviews and Learning Python

Last fall, I decided to learn Python, with a desire to analyze text and implement some machine learning. So, I decided to start by learning BeautifulSoup and using the tools there to scrape a professor rating site. The project went well, and I was able to write some code (that you can find on my GitHub). I got the hang of scraping and wrote code to collect numeric and text information from reviews of professors by school or by state.

Next, I began to analyze those reviews. I started the project intending to look at gender differences in ratings, following other reports of differences, such as Sidanius & Crane, 1989, and Anderson & Miller, 1997. So, I had to have the gender of the professors, something that was not available in the dataset that I had scraped. I decided to use pronouns to assess gender, and in case there were no pronouns in the text or the pronoun use was unclear, I assigned gender based on name.
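A minimal sketch of this kind of pronoun-first, name-second assignment might look like the following. The pronoun lists, the rule for "unclear" (a tie), and the tiny NAME_GENDER table are all assumptions for illustration; they are not the code from the project.

```python
import re
from collections import Counter

# Assumed pronoun lists; the project may have used different ones.
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}

# Hypothetical fallback table; a real one would be far larger.
NAME_GENDER = {"mary": "F", "susan": "F", "john": "M", "robert": "M"}


def gender_from_pronouns(reviews):
    """Return 'F', 'M', or None when pronouns are absent or evenly split."""
    counts = Counter()
    for text in reviews:
        # Splitting on letters only turns "she's" into "she" and "s".
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in FEMALE_PRONOUNS:
                counts["F"] += 1
            elif word in MALE_PRONOUNS:
                counts["M"] += 1
    if counts["F"] > counts["M"]:
        return "F"
    if counts["M"] > counts["F"]:
        return "M"
    return None


def assign_gender(first_name, reviews):
    """Prefer pronoun evidence, fall back to the first name, else None."""
    gender = gender_from_pronouns(reviews)
    if gender is None:
        gender = NAME_GENDER.get(first_name.lower())
    return gender


print(assign_gender("Pat", ["She explains things clearly.", "Her exams are fair."]))
print(assign_gender("John", ["Great class, tough exams."]))
```

Professors for whom this returns None would be the ones discarded as having no gender assigned.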

I compiled reviews for schools in Rhode Island, New Hampshire, and Maine, discarded reviews with no gender assigned, and began by looking at differences in the numeric ratings, which include an overall score and a difficulty score. Male and female professors have nearly identical mean scores: women’s mean overall score is 3.71 and men’s is 3.75, while women’s mean difficulty is 2.91 and men’s is 2.90. The overall and difficulty means for each professor are correlated, as you can see here:

Plot of histograms, scatter with linear, and residuals
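The summary numbers above (means by gender, and the correlation between each professor's mean overall and mean difficulty) can be sketched with the standard library alone. The rows below are made-up toy data, not the scraped reviews, and the column layout is an assumption, so the sign and size of the correlation here say nothing about the real dataset.

```python
from math import sqrt
from statistics import mean

# Made-up rows: (professor_id, gender, overall, difficulty). Not the real data.
reviews = [
    ("p1", "F", 4.5, 2.0),
    ("p1", "F", 4.0, 2.5),
    ("p2", "M", 2.0, 4.5),
    ("p2", "M", 3.0, 4.0),
    ("p3", "F", 3.5, 3.0),
    ("p4", "M", 5.0, 1.5),
]


def mean_by_gender(rows, column):
    """Mean of one numeric column (by index), grouped by gender."""
    groups = {}
    for row in rows:
        groups.setdefault(row[1], []).append(row[column])
    return {gender: mean(values) for gender, values in groups.items()}


def professor_means(rows):
    """Mean overall and mean difficulty for each professor."""
    by_prof = {}
    for prof, _gender, overall, difficulty in rows:
        by_prof.setdefault(prof, []).append((overall, difficulty))
    return {
        prof: (mean(o for o, _ in scores), mean(d for _, d in scores))
        for prof, scores in by_prof.items()
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


means = professor_means(reviews)
r = pearson([m[0] for m in means.values()], [m[1] for m in means.values()])
print("mean overall by gender:", mean_by_gender(reviews, 2))
print(f"correlation of per-professor means: {r:.2f}")
```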

Interestingly, women seem to have fewer reviews than men. On average, female professors have about 9 reviews and male professors have about 12. This difference seems to be stable when looking at each individual year, and could be due to male professors teaching larger classes (but I have no data on that). The end result is that there are far fewer reviews of female professors. In my dataset, there are 107,930 reviews of male professors and just 64,799 reviews of female professors.

The dataset also has a self-report of grades from most reviewers. You can see in the data that overall scores go down when reviewers get a bad grade, but women seem to be hit harder by this than men.

Bar graph showing overall mean score by grade and gender.
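The numbers behind a bar graph like this come from grouping on two keys at once. Here is a small sketch, again on made-up rows rather than the real reviews; the tuple layout is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: (gender, self-reported grade, overall score). Not the real data.
rows = [
    ("F", "A", 5.0), ("F", "A", 4.0), ("F", "C", 2.0),
    ("M", "A", 4.5), ("M", "C", 3.0), ("M", "C", 2.0),
]


def mean_overall_by_grade_and_gender(rows):
    """Map (grade, gender) to the mean overall score for that group."""
    groups = defaultdict(list)
    for gender, grade, overall in rows:
        groups[(grade, gender)].append(overall)
    return {key: mean(values) for key, values in sorted(groups.items())}


table = mean_overall_by_grade_and_gender(rows)
for (grade, gender), avg in table.items():
    print(grade, gender, round(avg, 2))
```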

Note also that far more reviewers report receiving high grades. In fact, over 160,000 of the 173,000 reviews in my dataset report getting A’s.

The overall scores show a bimodal distribution, as you can see in the histogram of overall scores (reviewers can report scores from 0 to 5 with half-points possible). The next thing I decided to do was to categorize these reviews as positive or negative, getting rid of reviews in the middle, and then to do some analysis of the text in those reviews. I’ll report on that next.

Histogram of overall scores, showing bimodal distributions for both men and women.
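The positive/negative split described above could be sketched like this. The cutoffs are assumptions: the post does not say where "the middle" begins or ends, so 2.0 and 4.0 are placeholders to adjust after looking at the histogram.

```python
# Assumed cutoffs for this sketch; the post does not specify them.
NEGATIVE_MAX = 2.0
POSITIVE_MIN = 4.0


def label_review(overall):
    """Return 'positive', 'negative', or None for a middle score."""
    if overall >= POSITIVE_MIN:
        return "positive"
    if overall <= NEGATIVE_MAX:
        return "negative"
    return None


def split_reviews(reviews):
    """Keep only clearly positive or negative reviews as (label, text) pairs."""
    labeled = []
    for overall, text in reviews:
        label = label_review(overall)
        if label is not None:
            labeled.append((label, text))
    return labeled


sample = [(5.0, "Loved this class."), (3.0, "It was fine."), (1.5, "Avoid.")]
print(split_reviews(sample))
```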



Disappointment and Hiding in the Classroom

I’ve been noticing lately my disappointment in students. I don’t want to feel disappointed in students. Honestly, I don’t want to feel disappointed in anyone. Who does? But you might argue that we have certain expectations for how the people around us will act, and that people don’t always meet those expectations. When they don’t, I am justified in feeling disappointed, at least provided that my expectations were reasonable. The trouble is that disappointment is counterproductive, and for me it is part of an overall tendency I have to disconnect from people.

Let me look at this a little closer. I have certain expectations for my students. I set those out for the students by giving them specific assignments (“turn this worksheet in on Monday” or “write a blog post about your problem-solving process”), and I lay them out on the course syllabus by telling students to come to class, check their email regularly, participate, and so forth. There is also a collection of expectations that go unspoken by me. I expect that students will be thinking about what they need to do to prepare for upcoming exams, even if I don’t give them explicit assignments. I expect that students will ask for help and support when they don’t understand something after class. I expect that students will monitor what they do and don’t understand. I expect that students will give me their best work, and won’t piece together something at the last minute. I often say things which imply these expectations, but I’m not always explicit about them. Also notice that not all of these expectations are realistic.

If a student doesn’t meet these expectations, I get cranky. In between classes, if I am expecting work and participation from students that I don’t see, I start to worry, and to run my “disappointment tape.” Typically it involves me getting frustrated and making up a lot of things that I imagine to be happening with the students. I imagine them as uninterested in the course, not dedicated, not hard-working, wanting to get away with not doing work, not caring about thinking deeply, not caring about interacting with me or other students. Yes, there’s some really ugly stuff hiding in there. The thing is that I don’t know that any of that is really happening. Mostly, I think what is happening with me is that I want this connection with students, and most of what I have to connect with is their work. When the work isn’t there, I feel rejected. I imagine the students pulling away from me, and I rush to pull away from them first, by getting “disappointed” in them. Most of the time, I can get back my connection with the students simply by being around them — it is the time in between classes that provides a space for these feelings to grow.

Students don’t always do what we teachers want them to do. In fact, people in general don’t always do what other people want them to do. So we get anxious about our relationships and our standing with other people. In school, this means teachers get frustrated with and disappointed in students. What do students do? Students learn to hide from the disappointment of teachers. They hide and they lie so they can save themselves from the consequences of expectations unmet. Students hide so that their grades aren’t in jeopardy and they hide so that they can maintain positive relationships with the powerful people that are important to them. Students get into a habit of hiding, so that it seems as natural as breathing. I remember it well from the last time I was a student: doing work I wasn’t proud of and hoping it would slip by without notice, making up excuses for doing work late or stretching excuses that were technically true but not really accurate, trying to look good in order to get away with things. As a teacher, I know that students are doing these things, but I ignore it, acting as if students are going to meet all of my expectations, and then getting disappointed when they don’t. Because I am required to assign grades to students, I maintain and perpetuate the fiction that grades mean something objective, when the reality is that they’re just a somewhat arbitrary record of how well a student met my somewhat arbitrary standards about a somewhat arbitrary collection of activities and topics.

What if I stopped doing this? It’s hard to imagine. Could I stop having expectations of students? What would happen to me and to the students if I did? What if I kept having my expectations, but was more honest about the fact that I know students won’t always meet them? What’s so bad about the students not meeting them anyway? Could I keep the expectations, but let go of the disappointment, simply connecting with students about what happened and deciding what to do next? Could I let my students be honest with me about the unrealistic nature of my expectations and about what really happens for them in a class? Could I let students formulate their own expectations, help them to make those expectations realistic, and then help them to live up to those expectations? Could I create a classroom environment in which I helped my students evaluate themselves? Wouldn’t this cause the very foundation of objective and rational subjects like math and science to crumble because students would start writing expressive poetry about how math makes them feel and giving themselves an A++ on every assignment?


Reading: Grading Student Writing by Peter Elbow

On the recommendation of Jesse Stommel, I’m reading this paper about grading student writing by Peter Elbow, and I’m trying to figure out what it might say about my own grading practices. First, let me say that the problems of grading writing may be qualitatively different from the problems of grading mathematics. Mathematics has this wonderful and horrible right/wrong duality in it, and it is often set up as an objective arbiter. Emotions and opinions don’t come into mathematics grading, because how can they? It is always true that 2+2=4, and it is never true that 2+2=0 (except of course if you are working mod 4, but that’s just me being an obnoxious mathematician). I suppose that is the truth if you are either machine grading, or are grading with no “partial credit.” But that is almost never true for me, because I don’t think that the most important thing about a solution on an exam or in homework is the final answer. The process is far more important, and gives me more clues about what the student is thinking and what they have learned. Add to that the fact that I typically give projects and other more subjective assignments for at least part of a student’s grade, and the situation gets quite muddled. And of course I accumulate a large list of quantitative measures during a semester and combine them in an arbitrary way that I determine at the beginning of the semester, with each grade making up a certain percent of the final grade. All of that is to say: mea culpa, I may need a better theoretical framework here.
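To make the "certain percent of the final grade" scheme concrete, here is a sketch of the usual weighted average. The category names, weights, and scores are invented for the example; they are not an actual syllabus.

```python
def final_percent(scores, weights):
    """Weighted average of category percentages; weights must sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical weights, fixed at the start of the semester.
weights = {"homework": 0.20, "projects": 0.30, "exams": 0.50}
# Hypothetical category percentages for one student.
scores = {"homework": 95.0, "projects": 80.0, "exams": 70.0}

print(round(final_percent(scores, weights), 1))
```

With these made-up numbers the result is about 78 percent, and nothing in the formula records why the exams went worse than the homework.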

Right away the paper grabs me, first with the discussion of the difficulty and unreliability of grading, and even more with the wall that grading puts up between teacher and student. As Elbow says on the first page, “Students resent the grades we give or haggle over them and, in general, see us as people they have to deceive and hide from rather than people they want to take into their confidence.” I’m in, but what do I do?

Elbow recommends using minimal grades, like pass/fail or strong/satisfactory/weak. He recommends this for low-stakes writing, and I could see it working perfectly for low-stakes assignments. In fact, I rarely grade homework. Most is graded on completion only, or if I actually want to provide feedback I use a 0-2 or a 0-3 scale. But really, maybe the words work better (only what do I write in my gradebook?). Elbow says that we can judiciously increase the number of levels in higher-stakes situations if we want, still without resorting to the eight levels of the traditional letter grades with pluses and minuses. Honestly for a test, this would be harder for me than what I already do. I tend to grade student work on each problem using a rough rubric that tells me how many points to give what kind of work: I might subtract points for certain kinds of errors, or give a certain number of points if the student made a correct start to a problem. So when grading is done, I have a bunch of numbers to add up, and presto, I have a grade! And arguably that grade gives me an idea of how well they were able to demonstrate their knowledge on that particular test. Moving to a fuzzier system would be more work for me, but I can still see some advantages. I would likely still grade in much the same way, but I’d have less fiddliness over the small numbers of points, with all questions being strong/satisfactory/weak. Then I have to think of a way to get the exam assessment overall.

Elbow can help here again, and maybe help with my poor Excel gradebook. He suggests looking at all of the grades in aggregate. Say you have a lot of low-stakes grades. Doing “satisfactory” on all of those might be a B, and then looking at the smaller number of higher-stakes pieces could pull that B up or down. Being a math person, that screams out to me to make up a formula, and you again get into the whole problem with grades. Wouldn’t a narrative evaluation simply be better and more nuanced, allowing me to say to a student, “You did great with all of the lower-stakes pieces, but once the stakes were raised, you struggled to show your competence and understanding”? Then the student and I could both think about why that was. Perhaps the higher-stakes assignments required putting more concepts together, or maybe the pressure negatively impacted the student’s ability to think and communicate clearly.
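For the record, the formula that the math person is tempted by might look something like this sketch. Only the "all satisfactory is a B, higher-stakes work pulls it up or down" idea comes from the paragraph above; the drop to C when a low-stakes piece is weak, the one-letter shift, and the function name are assumptions of the sketch, not Elbow's method.

```python
LEVEL_VALUE = {"weak": -1, "satisfactory": 0, "strong": 1}
LETTERS = ["F", "D", "C", "B", "A"]


def aggregate_grade(low_stakes, high_stakes):
    """Combine minimal grades into a letter (one possible rule of many)."""
    # Baseline: B when every low-stakes piece is at least satisfactory.
    # Falling to C otherwise is an assumption made only for this sketch.
    all_ok = all(LEVEL_VALUE[level] >= 0 for level in low_stakes)
    index = LETTERS.index("B") if all_ok else LETTERS.index("C")

    # High-stakes pieces pull the baseline up or down by one letter,
    # depending on whether strong or weak marks dominate (also assumed).
    balance = sum(LEVEL_VALUE[level] for level in high_stakes)
    if balance > 0:
        index += 1
    elif balance < 0:
        index -= 1
    return LETTERS[max(0, min(index, len(LETTERS) - 1))]


print(aggregate_grade(["satisfactory"] * 10, ["strong", "strong", "satisfactory"]))
print(aggregate_grade(["satisfactory"] * 10, ["weak", "satisfactory"]))
```

Every branch in that function is a judgment call hidden inside arithmetic, which is exactly the problem with grades that the paragraph points to.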

Elbow advocates for portfolios, which I think are a good idea, but which I have only occasionally used. He also discusses the use of contracts for grading, which I think I last encountered in high school. I could see contracts being a way of being up-front in my manipulation of students, as Elbow suggests. In doing so, I could clearly spell out my expectations for behaviors associated with a passing grade. My only question there is what happens if the student has all of the behaviors associated with passing, but still doesn’t learn the material? What if they still can’t do any math? Honestly, I don’t think that really happens, at least not if I choose the right behaviors. But I worry about whether I have a clear leg to stand on if criticized for this kind of grading practice. Is it “rigorous enough?” Don’t I want students to come out of the class with some products, rather than just a process and effort? I think what I am struggling with is the student that just does the motions as they go through my class, appearing to really engage without really engaging. I suppose that such students pass through my classes all of the time, and there is no fool-proof method for bending them to my will and forcing them to engage in the ways that I desire. And when I put it that way, perhaps there shouldn’t be. Maybe the real problem is in trying to manipulate students into doing what I want them to do at all.

SJSA Grade Six - The Year I Rebelled

Photo credit: Michael 1952

Elbow also suggests being explicit about criteria. I tend to have rubrics when I grade project work that spell out what I am looking for, and Elbow’s minimal grading would make this easier and less rigid. I could also give criteria on exam problems, or I could split each problem up into multiple criteria. In a calculus class exam problem I might be looking for the method of solution, setting up the solution in a reasonable way, and executing that method including getting algebra correct. I could be clear about each of these criteria and evaluate each problem on each criterion.

Part of what makes grading hard is being the person who holds the power of judgment, and that’s just part of being a teacher. The power is mine to hold and negotiate, since I have to write down a letter grade at the end of the semester. I want to use student assessments in a way that is helpful to the students, and to determine letter grades in a way that doesn’t create excessive distance between me and the student, or between me and the task of judgment. Honestly, right now I use my grading system as a very long arm that allows me to avoid the uncomfortable position of judge. I don’t really determine the grades: a lot of numbers determine the grades, and I have very little to do with it. I can hide behind those numbers. I can even advocate for and advise students about how to beat those numbers, ignoring the fact that I’m the one writing down that letter grade. Once again, it all comes down to the relationships in my classroom and how I navigate them and engage with the students, and I can see that I have some work to do here.