Perceptron

I’m learning about various machine learning algorithms, so I want to record what I have learned and where I learned it, in part so that I can relearn it after I inevitably forget it!

The perceptron is “baby’s first neural network.” It can be used successfully for learning binary classification of data that is linearly separable. The basic idea is that you have some training data that comes to you as vectors. You can start by guessing a weighting for those vectors, which is basically a guess at the subspace that separates your data (the weighting gives the normal vector for the subspace), or you can just initialize the weights to 0. Then you look at a random data point. First, you have to see how your current perceptron categorizes the data point, which you can do by taking a dot product of the weight vector with the data point vector and just looking at its sign.

If it is incorrectly classified, you need to adjust your weighting, which you do by adding (a scaling of) the vector of your current data point to your weighting vector, or subtracting it, depending on which class the point actually belongs to. This gives the normal vector a bit of a bump toward correctly classifying the current data point. Then you pick another point and do the whole thing again. You are continuously adjusting your weights, so presumably your perceptron is getting better all the time. It is also useful to note that you need some kind of activation function to turn the dot product into a predicted class, and it seems pretty typical to use a threshold step function.
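
Here is a minimal sketch of the loop described above, in plain Python. The function names, the learning rate, the +1/-1 label encoding, and the trick of appending a constant 1 to each point (so one weight acts as a bias) are all assumptions made for illustration, not taken from the sources cited in this post.

```python
import random

def predict(weights, x):
    """Threshold step activation: +1 if the dot product is positive, else -1."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score > 0 else -1

def train_perceptron(points, labels, learning_rate=1.0, max_epochs=100, seed=0):
    """Train on labels in {+1, -1}; stops early once a full pass makes no mistakes."""
    rng = random.Random(seed)
    # Assumption: append a constant 1 so the last weight plays the role of a bias.
    data = [(list(x) + [1.0], y) for x, y in zip(points, labels)]
    weights = [0.0] * len(data[0][0])  # initialize the weights to 0
    for _ in range(max_epochs):
        rng.shuffle(data)  # visit the points in random order
        mistakes = 0
        for x, y in data:
            if predict(weights, x) != y:
                # Add the point for label +1, subtract it for label -1.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return weights

# Made-up, linearly separable toy data.
points = [(2, 1), (3, 4), (-1, -2), (-3, -1)]
labels = [1, 1, -1, -1]
weights = train_perceptron(points, labels)
```

The `max_epochs` cap is there because the loop is only guaranteed to stop on its own when the data is linearly separable.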

Does this process ever end? Yes, it will, provided that your data is linearly separable. You can even find the proof here. How long does it really take? I don’t know. Presumably it’s not the worst thing to do since lots of people reference it. What if your data isn’t really linearly separable? Well, it will go on forever, so you had better pick a maximum number of iterations. Will it give you something reasonable after a reasonable number of iterations if your data is linearly separable-ish? No idea, but it seems like it might.

I read several useful pieces to figure out what I do know. I found this material from a presentation in a graduate course on machine learning (there’s a lot of other interesting stuff on the webpage for the 2007 course, http://aass.oru.se/~lilien/ml/). I also relied heavily on this material on the perceptron from a CMU course in computer vision. Both of these sources have useful illustrations that I decided not to replicate here, so you should go look at them. The Wikipedia page on the perceptron had some good material, and I got a little curious about the history, so I read http://web.csulb.edu/~cwallis/artificialn/History.htm.

Predicting Sentiment from Text

After having scraped and analyzed some professor review data, I wanted to know if I could predict sentiment from the text. Reading the reviews as a human, it certainly seems like you can tell a good review from a bad review without looking at the overall score, but could I do this through a machine learning algorithm? First, I used overall score to distinguish good reviews from bad (see the spread here). Since there are so many “5” scores, those became the good reviews. Then I counted “1” and “2” scores as bad reviews and threw out the rest of the scores because they were ambiguous.
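
The labeling rule above can be sketched in a few lines. The function name is hypothetical, and treating every score other than 5, 1, and 2 (including half-point scores) as ambiguous is an assumption based on the description here, not a copy of the original code.

```python
def label_review(overall_score):
    """Return 'good', 'bad', or None (discard) for a review's overall score."""
    if overall_score == 5:
        return "good"
    if overall_score in (1, 2):
        return "bad"
    return None  # ambiguous score: thrown out

# Made-up scores for illustration.
scores = [5, 1, 3.5, 2, 4, 5]
labeled = [(s, label_review(s)) for s in scores if label_review(s) is not None]
```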

To use the text to predict the sentiment, I decided to use a “bag of words,” in which I would disregard grammar and word order and just count how often a word appeared in a review. I also threw out “stopwords,” which are common words like “these” and “am.” This loses a lot of information but can make the analysis much more computationally tractable. Each of the remaining words then becomes a feature that can help us predict the sentiment.
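
As a rough illustration of the bag-of-words idea, here is a small sketch using only the standard library. The stopword list is a tiny made-up one (real lists are much longer), and the tokenizing regex is an assumption.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list.
STOPWORDS = {"these", "am", "the", "a", "is", "was", "and", "i", "this", "of"}

def bag_of_words(text):
    """Count each non-stopword in the text, ignoring grammar and word order."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

counts = bag_of_words("This was the worst class, and the lectures were the worst.")
```

Each key in `counts` is one feature, and its value is how often that word appeared in the review.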

One way to predict the sentiment from these features is to form a decision tree. For example, I could predict sentiment with the tree below. This predicts sentiment correctly about half the time.

Sample decision tree

To do a better job than this we can sample the reviews (and the features) and use each sample to make a different decision tree. We train the decision tree by splitting the space of features up in the best way possible (so that the bad and good reviews are separated, as unmixed as possible). We do this splitting by looking at all of the values of the features and deciding where to place the split, then we evaluate how good the split is and move on to the next candidate split. Eventually we are able to choose amongst the splits, selecting the best one. For example, for a particular sample, it could end up that the best first split is whether the review contains the word “worst.” Then we proceed recursively, looking at the two buckets of reviews that we have and deciding how to best split those, and so on. This will train a single tree (for instance like the tree pictured).
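
To make "as unmixed as possible" concrete, here is a sketch that scores candidate splits with Gini impurity and picks the best one. Gini impurity is an assumption here (the text above does not name a criterion), and the function names and toy reviews are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 'good'/'bad' labels (0 means unmixed)."""
    if not labels:
        return 0.0
    p_good = labels.count("good") / len(labels)
    return 2 * p_good * (1 - p_good)

def split_impurity(reviews, word):
    """Weighted impurity after splitting on whether a review contains `word`."""
    with_word = [label for words, label in reviews if word in words]
    without = [label for words, label in reviews if word not in words]
    n = len(reviews)
    return (len(with_word) / n) * gini(with_word) + (len(without) / n) * gini(without)

def best_split(reviews, vocabulary):
    """Pick the word whose presence/absence split leaves the buckets least mixed."""
    return min(vocabulary, key=lambda w: split_impurity(reviews, w))

# Made-up toy reviews: (set of words in the review, sentiment label).
reviews = [
    ({"worst", "class"}, "bad"),
    ({"boring", "worst"}, "bad"),
    ({"great", "class"}, "good"),
    ({"great", "helpful"}, "good"),
    ({"boring", "helpful"}, "good"),
]
vocabulary = ["worst", "class", "great", "boring", "helpful"]
first_split = best_split(reviews, vocabulary)
```

In this toy data, splitting on "worst" separates the bad reviews from the good ones completely, so it wins the first split; a full tree would repeat the same search inside each of the two buckets.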

But one tree isn’t good enough, so we select another sample of reviews and a sample of features and do this recursive “best” splitting again and again, with each sample making a new tree. We end up with a whole “forest” of trees and we use this forest by running a new review through each tree, determining whether each tree says the review is bad or good, and then going with the sentiment of the majority of trees to predict the sentiment of this review.
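
The voting step can be sketched like this. The three "trees" are hypothetical hand-written rules standing in for trained trees, and `forest_predict` is a made-up name.

```python
from collections import Counter

def forest_predict(trees, review_words):
    """Run the review through every tree and return the majority sentiment."""
    votes = [tree(review_words) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for trained trees: each maps a set of words to a label.
trees = [
    lambda words: "bad" if "worst" in words else "good",
    lambda words: "bad" if "boring" in words else "good",
    lambda words: "good" if "great" in words else "bad",
]

prediction = forest_predict(trees, {"great", "but", "boring"})
```

Using an odd number of trees avoids ties between the two labels.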

I implemented this with the RandomForestClassifier from sklearn and it is pretty accurate, around 94% on the data I set aside for testing the model. You can find the code in my GitHub (look at sentimentFromText.ipynb).
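
For readers who want to see the general shape of such a pipeline, here is a minimal sketch using scikit-learn's `CountVectorizer` and `RandomForestClassifier`. It is not the code from sentimentFromText.ipynb; the toy reviews, the label encoding (1 for good, 0 for bad), and the parameter choices are assumptions for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Made-up toy reviews; 1 = good, 0 = bad.
train_texts = [
    "great professor, clear lectures",
    "worst class I have ever taken",
    "helpful and fair grading",
    "boring lectures and unfair exams",
]
train_labels = [1, 0, 1, 0]

# Bag of words with English stopwords removed.
vectorizer = CountVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(train_texts)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, train_labels)

# New reviews must go through the same fitted vectorizer.
X_new = vectorizer.transform(["the worst and most boring class"])
predictions = forest.predict(X_new)
```

With real data you would hold out a test set (for example with `sklearn.model_selection.train_test_split`) and compare `forest.predict` against the held-out labels to estimate accuracy.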

Professor Reviews and Learning Python

Last fall, I decided to learn Python, with a desire to analyze text and implement some machine learning. So, I decided to start by learning BeautifulSoup and using the tools there to scrape a professor rating site. The project went well, and I was able to write some code (that you can find on my GitHub). I got the hang of scraping and wrote code to collect numeric and text information from reviews of professors by school or by state.

Next, I began to analyze those reviews. I started the project intending to look at gender differences in ratings, following other reports of differences, such as Sidanius & Crane, 1989, and Anderson & Miller, 1997. So, I had to have the gender of the professors, something that was not available in the dataset that I had scraped. I decided to use pronouns to assess gender, and in case there were no pronouns in the text or the pronoun use was unclear, I assigned gender based on name.
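
A minimal sketch of the pronoun-counting idea follows. The pronoun sets, the function name, and the rule of returning `None` on a tie are assumptions for illustration; the name-based fallback mentioned above is not shown.

```python
import re

FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}

def gender_from_pronouns(review_texts):
    """Return 'female', 'male', or None when pronouns are absent or tied."""
    female = male = 0
    for text in review_texts:
        # Match whole words only, so "there" does not count as "he".
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in FEMALE_PRONOUNS:
                female += 1
            elif word in MALE_PRONOUNS:
                male += 1
    if female > male:
        return "female"
    if male > female:
        return "male"
    return None  # unclear: fall back to another method, such as the name

# Made-up review snippets.
guess = gender_from_pronouns(["She explains things well.", "Her exams are fair."])
```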

I compiled reviews for schools in Rhode Island, New Hampshire, and Maine, discarded reviews with no gender assigned, and began by looking at differences in numeric rankings, which include an overall score and a difficulty score. Male and female professors have nearly identical mean scores — women’s mean overall is 3.71 and men’s is 3.75 and women’s difficulty is 2.91 and men’s is 2.90. The overall and difficulty means for each professor are correlated, as you can see here:

Plot of histograms, scatter with linear, and residuals

Interestingly, women seem to have fewer reviews than men. On average, female professors have about 9 reviews and male professors have about 12. This difference seems to be stable when looking at each individual year, and could be due to male professors teaching larger classes (but I have no data on that). The end result is that there are far fewer reviews of female professors. In my dataset, there are 107,930 reviews of male professors and just 64,799 reviews of female professors.

The data set also has a self-report of grades from most reviewers. You can see in the data that overall scores go down when reviewers get a bad grade, but women seem to be hit harder by this than men.

Bar graph showing overall mean score by grade and gender.

Note also that far more reviewers report receiving high grades. In fact, over 160,000 of the 173,000 reviews in my dataset report getting A’s.

The overall scores show a bimodal distribution, as you can see in the histogram of overall scores (reviewers can report scores from 0 to 5 with half-points possible). The next thing I decided to do was to categorize these reviews into positive or negative, getting rid of reviews in the middle, and then to do some analysis of the text in those reviews. I’ll report on that next.

Histogram of overall scores, showing bimodal distributions for both men and women.

 

Design for Learning Stats

From my course blog for Math | Art | Design.

A student at Brown, Daniel Kunin, has created a terrific visual resource for explaining statistics. It is called Seeing Theory, and it is hosted at Brown. Under the hood, it features Mike Bostock’s JavaScript library for creating visualizations, D3. For anyone interested in visualizing quantitative information, it’s a delight!

Reading Mathematics: Click/Clunk

Years ago, I did some work to help students read mathematics textbooks. I gave a presentation on the material at an NCTM conference and wrote a piece for students that is still in use by the Harvard Bureau of Study Counsel. I was recently reminded of the work because of an email I received about it, so I’m going to look at making use of the work again, perhaps writing up something additional about it and getting the work out more broadly. It is based on the “click/clunk” method, which is used in some approaches to teaching reading.

Woman sitting in a motorized chair

#9: Park on a Hill

When I was writing my dissertation I was part of the PhinisheD community, and there I learned the concept of “parking on a hill.” The idea is to not end your day’s work with the completion of a task, but rather to end it with a task started, ready to move on the next day. Much in the way that gravity will help you get started if you park a car on a hill, the gravity of having a task in progress can help you get started when you park a project on a hill.

This is a technique I could use more of in my life right now, so I am going to record some of my thoughts on this technique here. Right now I am working on a paper that I think I’ve been writing for three years. I’m very slow. This is a “back burner” project which means I often go for days or weeks without touching it. So if I can be sure to have it set up with something that will move me forward whenever I leave it then I think two things will happen. First, part of my brain will already be engaged in the next idea or task, which may push me toward getting back to the project and may help good ideas to percolate in my brain. Second, if I have jotted down something that will help me get started I may find it easier to remember and reconnect with the project the next time I sit down.

Here are some ideas for parking on a hill:

  • Write half a sentence to give yourself a jumping off point. This can produce some anxiety — you probably have an idea of how you are going to finish the sentence, but what if you never get that idea back? But the fact is that leaving openness allows something new to come into your work. If you over-plan your next steps you may miss the exciting accidents and new ideas that come to you when you least expect it.
  • Briefly outline the thing you want to do or write. Don’t over-plan since that won’t produce energy or excitement, but do be sure you have a next job that is easy and fun to dive into.
  • Ask a question. For instance, I am working on developing some materials for a class this fall, so I might end my work day with this question, “How can I refresh students’ understanding of percentages without talking down to my students and ignoring the understanding they already have?” Then I can start my day with some freewriting in response to that question, which may help me set up the activities I want to develop for the class.

#8: Don’t choose, just do and notice

Lately, I have been meditating, and I’m up to six minutes a day, which is OK. In meditation the point is to be present to the now, to observe and only to observe. There are many different techniques to carry this out and to some extent it doesn’t matter what technique you choose. You could do anything. You could practice by wiggling your toes and observing that. But it is useful to have a particular practice; it gives you a focus and a way to discipline your brain toward mindfulness. Whatever particular method you have chosen, you simply keep coming back to the present. If your mind strays, come back to your breath or your mantra, or whatever. Keep coming back.

I’d love to find such a simple practice for all of life. But life isn’t so easy because life isn’t about watching, it’s about action. In meditation, on the other hand, the point is not to act but to observe. One of our actions can be sitting down to meditate (or engaging in a meditation practice in a way that does not involve sitting), but for most of us it would be a poor life if that was the only action we took. We have to take life-sustaining actions like making and eating food, caring for our immediate environment, and caring for our bodies. We take actions to care for others, like children. We take actions to connect with friends and family. We take actions to bring in money so that we can keep a roof over our heads, put food on the table, and have discretionary income. Life is full of actions. To some extent, life is action.

Personally, that’s where I run into trouble. The actions available to us are nearly limitless. I could write this right now, I could get up and eat, I could go to the bathroom, I could write an email that I want to send, I could work on a research project, I could clean the house, I could take a shower, I could buy clothes, I could watch a movie, I could make food, I could garden, I could volunteer, I could plan the work I am going to do this summer, I could buy groceries, I could play bass. I could keep this list going for a long, long time, and these are just actions that are available to me right at this minute.

I always get stuck picking an action. Most actions are not essential; they don’t absolutely have to be done. Occasionally I have a task that absolutely must be done at a particular time, and I both hate and love those. I have to get my kids from school at 2:55 even if I might want to stay home and work on something else. I have to teach class at 9:30 even though I might want to spend time with my colleagues at work. Of course, I could choose not to pick up my kids and not to teach class, but the consequences of those choices are unpleasant enough that I don’t have to make a decision; I just do the thing that needs to be done. But I spend most of my day in a state of choice. Do I want to do this or that? And then I get stuck. How do I choose between this and that? Sometimes, like this morning, I have something pressing enough that I have to do it. In other words, I have put it off for a long enough time that the choice is now made for me. But when I have a choice, I am stuck. Having the choice produces a great deal of anxiety, because I worry about making the wrong choice.

To reduce my own anxiety and be more satisfied, I need to remove the choice, or at least find a way to not find myself in that choice state very often. Choosing is uncomfortable for me. I have sometimes done this using randomness, particularly using Mark Forster’s random method (I will probably write more about this sometime). I would love to find a way to use the meditation trick of always bringing myself back to one particular focus, as well as connecting to the idea of observation.

So here is my idea. It doesn’t really matter what actions I take. That’s not entirely true of course since there are bad actions I can take, but let’s assume I’m choosing between reasonably good possible actions. I can choose to do this or to do that, much like I can choose any of a variety of meditation methods. But like with meditation, it is important to have some particular discipline to practice. What is that discipline that I can bring myself back to repeatedly? I can keep coming back to a focus on what I am doing right now.

So during my day, I am going to work on letting go of choosing the right action. I am going to remind myself that the choice doesn’t really matter in the long run. Rather than choosing what to do next, I am going to simply pay attention to my action at a particular moment. I’m going to do, and to notice what is happening. The noticing shouldn’t be about judging or weighing whether I’m doing the right thing because that’s not the point. The point is simply to be present.

It’s not very exciting as a system because there are no lists and no reward system, but I’ll see how it goes. And of course, I’m working on another system, inspired by software development, that does involve lists.