Michael Kowal, Bowdoin’s fellow in Digital and Computational Studies, recently led an open Twitter data workshop on how to analyze big data. “Big data,” Kowal explained, are data sets that are too large for traditional analysis. The massive amount of information on Twitter is an example.
While faculty and student participants munched on pizza, Kowal discussed the purpose, benefits and difficulties of working with big data. He showed workshop participants how mining Twitter data for information about specific hashtags or keywords can reveal public opinion on current events and politics. (A hashtag, for those who don’t know, is a Twitter term for a searchable tag denoting tweets about the same subject matter.)
One of the best ways to see patterns in data is to create visual representations. Kowal displayed three graphs he had created from Twitter data using the analysis software RStudio, all based on tweets tagged #puppies, and described what each one showed. The first was a polarity map showing whether the people who tweeted with the hashtag expressed negative or positive emotions. The next was an emotional map breaking down tweets into categories of emotional reaction and displaying the percentage of tweets in each category. The third was a word cloud showing which words were used in the hashtagged tweets, with each word drawn larger or smaller depending on how often it appeared.
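For readers curious about what sits behind such graphs, here is a minimal sketch in Python, the other language demonstrated at the workshop. It is not Kowal's code: the sample tweets, the tiny positive and negative word lists, and the function names are all assumptions made up for illustration. It labels each tweet by polarity, computes the percentage of tweets in each category, and counts word frequencies, which are the numbers a word cloud would use to size its words.

```python
import re
from collections import Counter

# Hypothetical sample tweets; the workshop's actual data is not reproduced here.
TWEETS = [
    "I love my new #puppies so much",
    "Cute #puppies make me happy",
    "My #puppies chewed my shoes, terrible day",
]

# Tiny illustrative word lists (assumptions), not a real sentiment lexicon.
POSITIVE = {"love", "cute", "happy"}
NEGATIVE = {"terrible", "sad", "awful"}


def tokenize(text):
    """Lowercase the text and return its words, dropping '#' and punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(tweet):
    """Label a tweet by comparing counts of positive and negative words."""
    words = tokenize(tweet)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def polarity_shares(tweets):
    """Return the percentage of tweets that fall under each polarity label."""
    counts = Counter(polarity(t) for t in tweets)
    return {label: 100 * n / len(tweets) for label, n in counts.items()}


def word_frequencies(tweets):
    """Count how often each word appears across all tweets."""
    return Counter(word for t in tweets for word in tokenize(t))


if __name__ == "__main__":
    print(polarity_shares(TWEETS))
    print(word_frequencies(TWEETS).most_common(5))
```

Real sentiment tools rely on much larger lexicons or trained models, so this word-counting approach should be read only as a picture of the general idea.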
After Kowal introduced participants to the concepts behind Twitter data computation, he helped them install RStudio on their own computers so they could create similar graphs. RStudio is free and available on the Internet. Then Kowal showed how people can access data from Twitter’s online database. The class got to work manipulating the R code to analyze hashtags of their choice (#trump and #alldaybreakfast were a couple of examples). Following the RStudio demonstration, Kowal showed participants an alternative method for big data analysis using the computer language Python.
The workshop was sponsored by Bowdoin’s Digital and Computational Studies Initiative.