DIMACS REU 2011

General Information

Student: Maria Taranov
Office: CoRE 450
School: Rutgers University
E-mail: mtaranov@eden.rutgers.edu
Project: Anomaly Detection
Mentor: Dr. James Abello
Joint Project With: Michael Tsamis

Project Description

This project addresses the problem of finding persistent patterns in evolving networks. A central question is the characterization of patterns that can be used as the basis to detect anomalous activities in time-evolving networks.


Weekly Log

Week 1 (June 6-June 10):
I began by working with Twitter data provided by our mentor. The data consisted of networks of retweets within specific hashtags. During the first week, I began a visual analysis of these networks using ASK-GraphView, one of the two large-scale graph visualization tools we used. Additionally, Michael Tsamis and I wrote two basic client-server programs: the first exchanged messages between the client and the server, and the second sent a file from the server to the client. These programs will later be modified to carry out more advanced tasks.
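For illustration only, the first kind of program (a message exchange between a client and a server) could look roughly like the Java sketch below. The log does not say which language or protocol the original programs used, so the class name MessageExchange, the "ack: " reply format, and the single-connection design are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the project's code: one server, one client, one message.
class MessageExchange {

    // Server side: accept a single client, read one line, reply with an acknowledgement.
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line = in.readLine();
            out.println("ack: " + line); // assumed reply format
        }
    }

    // Client side: connect to the local server, send one line, return the reply.
    static String sendMessage(int port, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        }
    }

    // Runs the server in a background thread on a free loopback port and sends one message to it.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            String reply = sendMessage(server.getLocalPort(), message);
            serverThread.join();
            return reply;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Sending a file instead of a line of text would follow the same pattern, with the server writing the file's bytes to the socket's output stream.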
Week 2 (June 13-June 17):
I continued to visually analyze the Twitter data from the first week and began some basic mathematical analysis of the clusters in the graphs. This allowed me to choose a reasonable algorithm for analyzing large graphs. In particular, we will create vectors that measure each node's membership in particular clusters, and then use the norms of these vectors as a basis for analysis.
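The log does not give the exact definition of a membership vector, so the following Java sketch rests on an assumption: entry k of a node's vector is the fraction of that node's neighbors that belong to cluster k, and the norm is the ordinary Euclidean norm. The class and method names are hypothetical. Under this assumed definition, a node whose neighbors all sit in one cluster has norm 1, while a node whose neighbors are spread over many clusters has a smaller norm.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of membership vectors and their norms; the definition is an assumption.
class MembershipNorms {

    // Assumed definition: v[k] = (number of neighbors in cluster k) / (number of neighbors).
    // Every neighbor is expected to have an entry in clusterOf.
    static double[] membershipVector(int node,
                                     Map<Integer, List<Integer>> adjacency,
                                     Map<Integer, Integer> clusterOf,
                                     int numClusters) {
        double[] v = new double[numClusters];
        List<Integer> neighbors = adjacency.getOrDefault(node, Collections.<Integer>emptyList());
        for (int neighbor : neighbors) {
            v[clusterOf.get(neighbor)] += 1.0;
        }
        if (!neighbors.isEmpty()) {
            for (int k = 0; k < numClusters; k++) {
                v[k] /= neighbors.size();
            }
        }
        return v; // all zeros for an isolated node
    }

    // Euclidean norm of a vector.
    static double norm(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // Toy graph: node 0 is connected to nodes 1, 2, and 3.
        Map<Integer, List<Integer>> adjacency = new HashMap<>();
        adjacency.put(0, Arrays.asList(1, 2, 3));

        // Nodes 1 and 2 are in cluster 0; node 3 is in cluster 1.
        Map<Integer, Integer> clusterOf = new HashMap<>();
        clusterOf.put(1, 0);
        clusterOf.put(2, 0);
        clusterOf.put(3, 1);

        double[] v = membershipVector(0, adjacency, clusterOf, 2);
        System.out.println(Arrays.toString(v) + " norm=" + norm(v));
    }
}
```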
Week 3 (June 20-June 24):
I began automating the calculation of the membership vector and its norm for every node.
Week 4 (June 27-July 1):
I finished implementing Java code to calculate the membership vector norms and began work on calculating the dot products. I was also able to print each node's cluster and norm in a format that would allow Mike to add them to the ASK-GraphView labels file, which affects how the graph can be viewed and analyzed in ASK-GraphView.
Week 5 (July 4-July 8):
This week, we discovered that the data sets were larger than originally expected. I spent the week modifying the code so that it could handle the larger data sets. I was able to calculate and print the norm for each node and the dot product for each pair of connected nodes, allowing us to finish the files needed to view the graph in ASK-GraphView.
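As a minimal sketch of the per-edge computation, assuming each node already has a membership vector of the same length, the dot product for every pair of connected nodes can be computed in one pass over the edge list. The class name, the edge representation (an array of node-id pairs), and the toy vectors are assumptions for this example; the actual file formats are not described in this log.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: dot product of membership vectors across each edge.
class EdgeDotProducts {

    // Standard dot product of two equal-length vectors.
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Returns one dot product per edge, in the same order as the edge list.
    static double[] edgeDotProducts(int[][] edges, Map<Integer, double[]> vectors) {
        double[] result = new double[edges.length];
        for (int i = 0; i < edges.length; i++) {
            double[] u = vectors.get(edges[i][0]);
            double[] v = vectors.get(edges[i][1]);
            result[i] = dot(u, v);
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy membership vectors over two clusters (illustrative values only).
        Map<Integer, double[]> vectors = new HashMap<>();
        vectors.put(0, new double[] {1.0, 0.0});
        vectors.put(1, new double[] {0.5, 0.5});
        vectors.put(2, new double[] {0.0, 1.0});

        int[][] edges = { {0, 1}, {1, 2}, {0, 2} };
        double[] products = edgeDotProducts(edges, vectors);
        for (int i = 0; i < edges.length; i++) {
            System.out.println(edges[i][0] + " " + edges[i][1] + " " + products[i]);
        }
    }
}
```

Because the pass touches each edge once and looks vectors up by node id, its cost grows linearly with the number of edges, which matters once the data sets become large.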
Week 6 (July 11-July 15):
I visually analyzed the new graphs, identified the nodes with the highest norms, and investigated why those nodes were important within their hashtag topic. I also prepared for the final presentation and worked on the final report.
Week 7 (July 18-July 22):
I spent this week in Prague.

Presentations