Measuring the Mood of the Nation via Twitter
This project will develop methods to gauge public opinion on topics of current interest using Twitter messages and sentiment analysis algorithms. Tweets matching selected topics will be collected and analyzed, and the end result will be visualized on a map using geo-tags contained in the Tweets. Topics could be related to politics (“Obama”, “Romney”), public policy (“deficit reduction”, “alternative energy”), or scientific exploration (“Mars rover”). Such tools help understand the mood in different parts of the US, and could also help understand shifts in sentiment over time.
1. Turney P, “Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews,” Proceedings of the Association for Computational Linguistics (2002), pp. 417–424.
2. Gansner E, Hu Y, and North S, “Visualizing Streaming Text Data with Dynamic Maps,” to appear in Proceedings of the 20th International Symposium on Graph Drawing (best paper award), (2012). TwitterScope website.
3. Abello J, van Ham F, and Krishnan N, “ASK-GraphView: A Large Scale Graph Visualization System,” IEEE Transactions on Visualization and Computer Graphics, 12(5) (2006), pp. 669–676.
4. Batagelj V and Zaversnik M, “An O(m) Algorithm for Cores Decomposition of Networks,” (2002).
- Week 1:
- I met my advisor, Dr. James Abello, and my partner, Mika Sumida. First we discussed the details of our project and what we hope to accomplish, then we prepared for our first presentation on Friday. We went through code, generously provided by Adam Feldman, that builds a co-occurrence graph from streaming Twitter data. We plan to adapt it to our needs.
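As a rough illustration of what a co-occurrence graph of tweets can look like, here is a minimal Python sketch. It is not Adam Feldman's code: the function name `cooccurrence_graph`, the tokenization rule, and the `min_len` cutoff are all assumptions made for this example. Two words are linked when they appear in the same tweet, and the edge weight counts how many tweets contain both.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_graph(tweets, min_len=3):
    """Map each unordered word pair to the number of tweets containing both.

    Hypothetical helper: the tokenization and the min_len cutoff are
    illustrative assumptions, not taken from the project code.
    """
    edges = Counter()
    for text in tweets:
        # Lowercase, keep word-like tokens (with optional # or @ prefix),
        # and deduplicate within the tweet.
        words = {w for w in re.findall(r"[#@]?\w+", text.lower()) if len(w) >= min_len}
        # Every pair of distinct words in the same tweet gains one unit of weight.
        for a, b in combinations(sorted(words), 2):
            edges[(a, b)] += 1
    return edges


tweets = [
    "Hurricane warning issued for the coast",
    "Coast guard responds to hurricane",
]
graph = cooccurrence_graph(tweets)
print(graph[("coast", "hurricane")])
```

Pairs are stored in sorted order so that each undirected edge has a single key. For a live stream, the same counter could be updated one tweet at a time.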
- Week 2:
- This week I worked on a small 3D Java application that animates spheres falling onto a grid. We plan to use this animation as a metaphor for incoming tweets. I have also been familiarizing myself with the Java3D API.
- Week 3:
- I refined my applet somewhat in the first half of the week. I was having an issue with the appearance of the squares in the grid. They are supposed to grow when a ball falls on them, but the bottom of the square is growing in addition to the top for some reason. On Friday we had a conference call with the University of Maryland professors who provided our Twitter data, and decided on a direction in which to move forward.
- Week 4:
- We talked with the University of Maryland professors again and began looking at data on followers and retweets as well. I wrote code to go through the hurricane data sets and show the IDs of the most retweeted tweets. The graph of all users in the data set connected by followers, which Mika created, looks (somewhat unsurprisingly) like a huge blob. We wanted a way to "peel off" users that were less connected and get a sense for who the most influential people are in our data set. To accomplish this I implemented a cores decomposition algorithm developed by Vladimir Batagelj and Matjaz Zaversnik at the University of Ljubljana in Slovenia.
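The "peeling" idea can be sketched in Python as follows. This is a minimal illustration of the bin-sort approach described by Batagelj and Zaversnik, not the code written for the project. The function name `core_numbers`, the adjacency-dictionary input format, and the toy graph are assumptions. The core number of a vertex is the largest k such that the vertex belongs to a subgraph in which every vertex has at least k neighbours.

```python
def core_numbers(adj):
    """Return {vertex: core number} for an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Sketch of the bin-sort peeling scheme; names are illustrative.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    if not degree:
        return {}
    max_deg = max(degree.values())

    # bin_start[d] = first position in `order` of the vertices with degree d.
    bin_start = [0] * (max_deg + 1)
    for d in degree.values():
        bin_start[d] += 1
    start = 0
    for d in range(max_deg + 1):
        count = bin_start[d]
        bin_start[d] = start
        start += count

    # Place vertices into `order`, sorted by degree, remembering positions.
    order = [None] * len(degree)
    pos = {}
    fill = list(bin_start)
    for v, d in degree.items():
        pos[v] = fill[d]
        order[pos[v]] = v
        fill[d] += 1

    # Peel vertices in nondecreasing order of current degree.
    for i in range(len(order)):
        v = order[i]
        for u in adj[v]:
            if degree[u] > degree[v]:
                du, pu = degree[u], pos[u]
                pw = bin_start[du]
                w = order[pw]
                if u != w:
                    # Move u to the front of its bin by swapping with w.
                    order[pu], order[pw] = w, u
                    pos[u], pos[w] = pw, pu
                # Shrink the bin from the left: u now belongs to bin du - 1.
                bin_start[du] += 1
                degree[u] -= 1
    return degree


# Toy graph: a triangle a-b-c with a tail c-d-e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
cores = core_numbers(adj)
print(cores)
```

In the toy graph, the triangle vertices end up with core number 2 and the tail vertices with core number 1, so the tail is what gets peeled off first.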
- Week 5:
- I continued refining my code for the cores decomposition algorithm and testing it on the follower data. Using its output, I assigned a value to each edge and obtained a clean partition of the graph into a handful of subgraphs, which makes the large amount of data a little easier to analyze.
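The log does not say how the edge values were computed. One plausible rule, assumed here purely for illustration, is to label each edge with the smaller core number of its two endpoints and group the edges by that label. The function name `partition_edges_by_core` and the hard-coded core numbers for a toy graph are hypothetical.

```python
from collections import defaultdict


def partition_edges_by_core(edges, core):
    """Group edges by min(core[u], core[v]).

    The min-of-endpoints rule is an assumption for this sketch; the
    project may have used a different edge value.
    """
    layers = defaultdict(list)
    for u, v in edges:
        layers[min(core[u], core[v])].append((u, v))
    return dict(layers)


# Core numbers for a toy graph: a triangle a-b-c with a tail c-d-e.
core = {"a": 2, "b": 2, "c": 2, "d": 1, "e": 1}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
layers = partition_edges_by_core(edges, core)
print(layers)
```

Every edge receives exactly one label, so the groups form a partition of the edge set, while a vertex such as `c` can appear in more than one subgraph.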
- Week 6:
- My goal was to take the graph partitions and visualize them by stacking the subgraphs on top of each other, creating a three-dimensional structure. I would like to position them in such a way that vertices which are shared by multiple subgraphs are lined up directly over one another, to get a sense of which vertices appear in many subgraphs and which appear in only a few. This should give us an idea of the most "important", i.e. well-connected, users in the network.
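One simple way to get this vertical alignment, sketched below in Python, is to give every vertex a single (x, y) position and to use the subgraph index only for the z coordinate. The circular placement, the function name `stacked_layout`, and the `spacing` parameter are placeholder assumptions, not the layout actually used.

```python
import math
from collections import Counter


def stacked_layout(layers, spacing=1.0):
    """Return (coords, counts) for a stack of subgraphs.

    layers maps a level label to a list of (u, v) edges.
    coords maps (vertex, level) to (x, y, z); counts maps each vertex
    to the number of subgraphs it appears in. Illustrative sketch only.
    """
    vertices = sorted({v for edge_list in layers.values() for e in edge_list for v in e})
    n = len(vertices)
    # One shared (x, y) per vertex, placed on a unit circle as a placeholder.
    xy = {
        v: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, v in enumerate(vertices)
    }
    coords = {}
    counts = Counter()
    for z_index, level in enumerate(sorted(layers)):
        members = {v for e in layers[level] for v in e}
        for v in members:
            x, y = xy[v]
            # Same (x, y) in every layer, so copies of v stack vertically.
            coords[(v, level)] = (x, y, z_index * spacing)
            counts[v] += 1
    return coords, counts


layers = {
    1: [("c", "d"), ("d", "e")],
    2: [("a", "b"), ("a", "c"), ("b", "c")],
}
coords, counts = stacked_layout(layers)
print(counts.most_common(1))
```

Vertices with the highest counts form the tallest vertical columns in the stack, which matches the notion of "important" users described above.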
- Week 7:
- Mika and I spent some time this week preparing for our final presentation on Friday morning.
- Week 8: