DonorsChoose is an online charity where teachers post projects to request funding, and donors choose their favourite projects to donate to. About a month ago, DonorsChoose released much of their data on projects and donations going as far back as 2002. With a data set that size, something interesting is sure to pop up. The “Hacking Education” series attempts to find that something interesting.
In part one, we look at the choices that donors have and how they chose: that is, the kinds of projects that teachers post on DonorsChoose, and the kinds of projects that donors decided were their favourites.
It is worth noting how far DonorsChoose now reaches: the above map shows the locations of DonorsChoose projects in the continental USA. Project postings have grown exponentially in the past few years: the number of projects posted annually went up 19x between 2004 and 2010. This growth is fuelled by a 17x increase in total donation amount (and a 65x increase in the raw number of donations).
The Donors’ Choices
In its earlier years, most DonorsChoose projects came from high-poverty schools in urban areas. As the organization became better known, more and more projects came from suburban or rural areas and from areas with less poverty. In 2010, 15% of projects came from low-poverty schools, and close to half of the projects came from suburban and rural schools. This shift is accompanied by an increased percentage of project requests for technology, which is what one would expect from lower-poverty schools in our increasingly technological world. Despite this shift, there appears to be a growing focus on core subjects: math, science, literacy and language.
But there is always more than meets the eye. The chart below shows the percentage increase in the raw number of projects in each subject area and resource type. The categories showing the largest increase are core-subject projects asking for “other” types of resources.
Just what kind of projects are these? Each project lists the actual names of the resources requested. Below are two word clouds created using those resource names. The first word cloud is generated from resource names of projects in core areas with resource type “other”. The second is generated from projects in the same subject areas, but with every resource type except “other”. More frequently occurring words appear bigger, shedding some light on the most frequently requested resources. [Note: the word "Illustrator" was removed from the second word cloud, since it was so big that it obscured other words. "Illustrator" occurs frequently because many books in the resource list include the illustrator in their names.]
The difference between the two word clouds is quite obvious. Funding requests for activities and games seem to be what drove the growth we saw in the core areas. (Unless it’s more complicated than that…)
It’s also interesting how “bean bag” and “beanbag” are both small but visible in the first plot.
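Word sizes in a word cloud come from word frequencies, so the counting step behind the two clouds can be sketched in a few lines. This is only an illustration, not the code from the repository: the resource names are made up, and `word_counts` and its `drop` parameter (which plays the role of removing "Illustrator") are hypothetical.

```python
import re
from collections import Counter

def word_counts(resource_names, drop=()):
    """Count how often each lowercased word appears across resource names."""
    counts = Counter()
    for name in resource_names:
        # Keep runs of letters only, so punctuation and digits are ignored
        words = re.findall(r"[a-z]+", name.lower())
        counts.update(w for w in words if w not in drop)
    return counts

# Made-up resource names, for illustration only
names = [
    "Bean Bag Chair",
    "Math Game Set",
    "Beanbag Toss Game",
    "Reading Game",
]
counts = word_counts(names, drop={"set"})
print(counts.most_common(3))
```

Note that a plain word count treats “bean bag” and “beanbag” as different words, which is why both spellings can show up separately in a cloud.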
What Donors Chose
A project is considered “completed” when it receives 100% of the funding requested. To study the choices donors made, we look at the various factors affecting the likelihood of a project becoming fully funded: the plot below shows how the school metro, poverty level, and grade level change the odds of project completion.
As one would expect, donors are most likely to fully fund projects from urban and high-poverty schools. Donors are also much more likely to fully fund projects geared towards students in high school grades.
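For readers less used to thinking in odds, here is a small sketch of what “odds of project completion” means and how two groups are compared. The counts are invented for illustration and do not come from the DonorsChoose data; the `odds` helper is hypothetical.

```python
def odds(completed, total):
    """Odds of completion: completed projects per project that was not completed."""
    return completed / (total - completed)

# Made-up (completed, total) counts per school metro type
groups = {
    "urban": (800, 1000),
    "suburban": (700, 1000),
    "rural": (600, 1000),
}

baseline = odds(*groups["suburban"])
odds_ratios = {}
for metro, (completed, total) in groups.items():
    # An odds ratio above 1 means better odds than the baseline group
    odds_ratios[metro] = odds(completed, total) / baseline
    print(f"{metro}: odds={odds(completed, total):.2f}, "
          f"odds ratio vs suburban={odds_ratios[metro]:.2f}")
```

An 80% completion rate corresponds to odds of 4 to 1, so odds move faster than percentages as completion rates approach 100%.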
What about a project’s subject area and resource type? How do they affect project completion?
Music and arts are among the favourite subjects for donors. This isn’t too surprising: for a donor wishing to make schoolchildren happier, funding music and art programs is a sure way to go. What is surprising is that the odds of project completion for literacy and language are much lower than those for other subjects requesting the same resource type. Is it that these projects are not as high in quality? Or is it just not as exciting to donate to a project to improve literacy?
The other thing to note is the lower odds of completion for projects asking for technology. This is hardly surprising either: though everyone touts the importance of technology these days, it’s difficult not to think of it as something “extra” or “optional” in schools. But is it really more optional than an extra book set for the library?
More on Resources
The particular words used in the resource name correlate with donor behaviour as well. The effects of the 200 most “informative” words (in terms of tf-idf) on project completion are calculated below. Only words that are statistically significant are shown.
While art supplies (tempera, pencil, crayola) and books (illustrator, carle, books, library) are favoured by donors, photos, cartridges and quills are not. Also, including the word “smart” in the resource name does not appear to be a smart thing to do.
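As a rough idea of how words can be ranked by tf-idf, here is a minimal sketch. The exact weighting used for the charts is not stated here, so this assumes one common variant (term frequency within a document times the log of inverse document frequency, summed over documents). The documents are made up, the `tfidf_scores` helper is hypothetical, and every document is assumed to be non-empty.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Return {word: tf-idf summed over all documents} for tokenized documents."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each word
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = Counter()
    for doc in documents:
        for word, count in Counter(doc).items():
            tf = count / len(doc)                     # share of this document
            idf = math.log(n_docs / doc_freq[word])   # rarer words score higher
            scores[word] += tf * idf
    return scores

# Made-up tokenized resource names
docs = [
    ["tempera", "paint", "set"],
    ["book", "set"],
    ["ink", "cartridge", "set"],
]
scores = tfidf_scores(docs)
top_words = [word for word, _ in scores.most_common(2)]
print(top_words)
```

A word that appears in every document, like “set” here, gets a score of zero, which is the sense in which it carries no information.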
Included in teachers’ requests for funding are short essays describing the project and the students. The essays are how teachers communicate with donors, so it is not surprising that essays affect project completion.
In the first analysis of how the types of words used in essays affect project completion, words are categorized according to the LIWC dictionary categories. LIWC categorizes words based on psychological and linguistic constructs, e.g. part of speech, emotional content, and word meaning. For example, the category “negations” includes negation words (e.g. “not”) and the category “positive emotions” includes words like “adore”, “happy”, “eager”.
The chart below shows the change in odds of project completion when there is a 1% increase in the use of words in a word category (i.e. 1 more word out of 100). Only word categories that are statistically significant are shown.
While words related to “death” and “negation” increase odds of project completion, positive emotional words decrease it. (Perhaps pessimism communicates urgency and optimism signals contentment?) Words that are descriptive in nature (relating to “seeing”, “hearing” and “feeling”) also decrease project completion.
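To make the “1 more word out of 100” unit concrete, the sketch below computes the percentage of an essay’s words that fall in each category. The word lists are tiny hypothetical stand-ins (the real LIWC dictionary is far larger), the essay is invented, and the 1.05 odds ratio is a made-up number used only to show how a per-point effect compounds in a logistic model.

```python
import re

# Hypothetical stand-ins for two categories, not the actual LIWC word lists
CATEGORIES = {
    "negations": {"not", "no", "never"},
    "positive_emotions": {"adore", "happy", "eager"},
}

def category_percentages(essay):
    """Percentage of the essay's words that fall in each category."""
    words = re.findall(r"[a-z']+", essay.lower())
    return {
        name: 100 * sum(w in vocab for w in words) / len(words)
        for name, vocab in CATEGORIES.items()
    }

def scaled_odds_ratio(odds_ratio_per_point, points):
    """In a logistic model, a per-point odds ratio compounds multiplicatively."""
    return odds_ratio_per_point ** points

essay = "My students are eager readers but we have no books"
pct = category_percentages(essay)
print(pct)

# Made-up effect: odds multiply by 1.05 for each extra percentage point
print(scaled_odds_ratio(1.05, 2))
```

In this ten-word essay a single negation already accounts for 10% of the words, so a one-point change is a small shift in wording.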
If word categories can be correlated with project completion, then individual words can be, too. Once again, the effects of the appearance of certain words are shown below. (Again, only words with high tf-idf and enough statistical significance are shown.)
The above chart is quite telling: helping students get into college is a priority for donors, as is providing “new” things, so long as that new thing isn’t a digital camera.
Most of the analysis was done on projects posted between 2004 and 2010. I used R and ggplot2 to generate graphs, Wordle to generate the word clouds, and Python for data manipulation. All the code used for the above analysis is on GitHub. (So if I did something wrong, please let me know.)
For the “What Donors Chose” section I used logistic regression, and only used projects posted before 9/30/2010 (since some projects posted after that date were still live on the site). For most parts of the analysis, causation should not be implied, even though I wasn’t too careful with the language when causation seemed extremely likely.
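For anyone who wants to try the same kind of model, here is a self-contained sketch of a logistic regression with a single yes/no predictor, including the date cutoff. It is not the code from the repository: the projects are invented, `fit_logistic` is a hypothetical helper that uses plain gradient ascent, and a real analysis would use a statistics package with many predictors and significance tests.

```python
import math
from datetime import date

CUTOFF = date(2010, 9, 30)

# Made-up projects: (date posted, is_urban, fully_funded)
projects = (
    [(date(2010, 3, 1), 1, 1)] * 40 + [(date(2010, 3, 1), 1, 0)] * 10
    + [(date(2010, 3, 1), 0, 1)] * 30 + [(date(2010, 3, 1), 0, 0)] * 20
    + [(date(2010, 11, 1), 1, 0)] * 5  # posted after the cutoff, so excluded
)

# Keep only projects posted before the cutoff
rows = [(x, y) for posted, x, y in projects if posted < CUTOFF]

def fit_logistic(rows, steps=2000, rate=0.5):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in rows:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1

b0, b1 = fit_logistic(rows)
odds_ratio = math.exp(b1)  # change in odds when is_urban goes from 0 to 1
print(round(odds_ratio, 3))
```

With one binary predictor, the exponentiated coefficient matches the odds ratio computed directly from the counts, here (40/10) / (30/20). That is a handy sanity check before adding more predictors.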