Last fall, I decided to learn Python, hoping to analyze text and try out some machine learning. I started by learning BeautifulSoup and using it to scrape a professor rating site. The project went well: I got the hang of scraping and wrote code (which you can find on my GitHub) to collect numeric and text information from reviews of professors by school or by state.
Next, I began to analyze those reviews. I started the project intending to look at gender differences in ratings, following other reports of such differences (Sidanius & Crane, 1989; Anderson & Miller, 1997). That meant I needed the gender of each professor, which was not available in the dataset I had scraped. I decided to infer gender from the pronouns used in the review text, and where there were no pronouns or the pronoun use was unclear, I assigned gender based on the professor's name.
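To make the pronoun approach concrete, here is a minimal sketch of how it could work. This is not the code from my project: the pronoun lists, the min_ratio threshold for deciding when pronoun use is "clear", and the tiny NAME_GENDER lookup are all assumptions made up for illustration.

```python
import re

FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}

# Hypothetical name lookup; a real one would come from a much larger name list.
NAME_GENDER = {"mary": "female", "john": "male"}

def infer_gender(review_texts, first_name=None, min_ratio=2.0):
    """Guess a professor's gender from pronouns in their reviews.

    Falls back to the (hypothetical) name lookup when there are no
    pronouns or neither set outnumbers the other by min_ratio.
    Returns "female", "male", or None if nothing can be assigned.
    """
    female = male = 0
    for text in review_texts:
        # Split on non-letters so "she's" still yields the token "she".
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in FEMALE_PRONOUNS:
                female += 1
            elif token in MALE_PRONOUNS:
                male += 1

    if female > 0 and female >= min_ratio * male:
        return "female"
    if male > 0 and male >= min_ratio * female:
        return "male"

    # No pronouns, or mixed pronoun use: try the name instead.
    if first_name is not None:
        return NAME_GENDER.get(first_name.lower())
    return None
```

For example, infer_gender(["She is great. Her lectures are clear."]) returns "female", while a review with no pronouns and no name returns None, which is the kind of review I discarded.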
I compiled reviews for schools in Rhode Island, New Hampshire, and Maine, discarded reviews with no gender assigned, and began by looking at differences in the numeric ratings, which include an overall score and a difficulty score. Male and female professors have nearly identical mean scores: women’s mean overall score is 3.71 and men’s is 3.75, while women’s mean difficulty is 2.91 and men’s is 2.90. The overall and difficulty means for each professor are correlated, as you can see here:
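For anyone who wants to run the same kind of comparison, here is a rough sketch of the two calculations: a mean per gender and a Pearson correlation between per-professor overall and difficulty scores. The records below are made-up toy values, not my scraped data, and the field names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Made-up per-professor records, purely for illustration (not the scraped data).
professors = [
    {"gender": "female", "overall": 4.5, "difficulty": 2.0},
    {"gender": "female", "overall": 3.0, "difficulty": 3.5},
    {"gender": "male", "overall": 4.0, "difficulty": 2.5},
    {"gender": "male", "overall": 2.5, "difficulty": 4.0},
]

def mean_by_gender(records, field):
    """Return {gender: mean of `field`} over the given records."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["gender"]].append(rec[field])
    return {gender: mean(values) for gender, values in groups.items()}

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

overall_means = mean_by_gender(professors, "overall")
difficulty_means = mean_by_gender(professors, "difficulty")
r = pearson(
    [p["overall"] for p in professors],
    [p["difficulty"] for p in professors],
)
```

I wrote the correlation by hand here to keep the sketch free of dependencies; with pandas or NumPy installed, their built-in correlation functions would do the same job.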
Interestingly, women seem to have fewer reviews than men. On average, female professors have about 9 reviews and male professors have about 12. This difference appears to hold in each individual year, and could be due to male professors teaching larger classes (but I have no data on that). The end result is that there are far fewer reviews of female professors: in my dataset, there are 107,930 reviews of male professors and just 64,799 reviews of female professors.
The dataset also includes a self-reported grade from most reviewers. Overall scores go down when reviewers report getting a bad grade, and women seem to be hit harder by this than men.
Note also that far more reviewers report receiving high grades than low ones. In fact, over 160,000 of the 173,000 reviews in my dataset report getting A’s.
The overall scores show a bimodal distribution, as you can see in the histogram of overall scores (reviewers can report scores from 0 to 5, with half-points possible). The next thing I decided to do was to categorize these reviews as positive or negative, dropping the reviews in the middle, and then to analyze the text of the remaining reviews. I’ll report on that next.
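As a preview, the categorization step could look something like the sketch below. The cutoffs (2.0 and below is negative, 4.0 and above is positive) are placeholder assumptions for illustration, not the values I settled on, and the "overall" field name is likewise hypothetical.

```python
def label_review(overall, low=2.0, high=4.0):
    """Label a review by its overall score.

    The low/high cutoffs are illustrative assumptions. Scores strictly
    between them count as "middle" and get no label.
    """
    if overall >= high:
        return "positive"
    if overall <= low:
        return "negative"
    return None

def split_reviews(reviews, low=2.0, high=4.0):
    """Keep only clearly positive or negative reviews, adding a "label" key."""
    labeled = []
    for review in reviews:
        label = label_review(review["overall"], low, high)
        if label is not None:
            labeled.append({**review, "label": label})
    return labeled

# Toy reviews, made up for illustration.
sample = [
    {"overall": 5.0, "text": "Loved this class."},
    {"overall": 3.0, "text": "It was okay."},
    {"overall": 1.5, "text": "Would not take again."},
]
kept = split_reviews(sample)
```

Here the middle review is dropped and the other two come back labeled "positive" and "negative", which gives a clean two-class dataset for the text analysis.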