Matt Lu, DIMACS REU 2023

DIMACS REU 2023

General Information

Student:	Matt Lu
Office:	CoRE 450
School:	Washington University in St Louis
E-mail:	mattlu@wustl.edu
Project:	Truth Learning in a Social Setting
Mentor:	Dr. Jie Gao

Project Description

The structure of a social network and the ordering in which agents within that network make decisions are crucial to how the network aggregates information as a whole. Given a ground truth value, agents within the network learn based on their private signal of the truth as well as the observations of the actions their neighbors. For this project, we will approach the problem of robust learning in two ways: network structure and agent ordering. We will explore the different network structures using randomly generated and real world network data. By running simulations, we can deduce important network features and properties that inform successful learning. Furthermore, we will explore the different orderings in which agents make decision within simple graph structures. Building on existing literature, we want to find orderings in polynomial time that generates good approximations. Specifically, we will focus on structures containing "opinion leaders," or highly influential agents/hubs of agents.

Weekly Log

Week 1:: I spent the majority of the week reviewing existing literature regarding learning in social networks, with specific emphasis on understanding the local learning requirement and robust learning. The local learning requirement states that whether an agent learns or not is determined by the local structure of the network as opposed to the global one. Robust network structure is one that allows for successful learning even if a majority of agents are chosen adversarially not to participate [1][2]. I also reviewed different network models that we could potentially simulate, including the Erdos-Renyi model, Watts-Strogatz model, preferential attachment model [2][3].
Week 2:: At the beginning of the week, we built a basic program using the networkX package in Python to simulate information cascades in different network structures. The model we implemented used a naive majority rule for decision making and a fixed private signal across all agents. We observed how many agents in the network learned successfully and found the average learning rate of the network over 1000 trials. Additionally, we found the minimum and maximum learning rate over all the trials to get a better intuition about the possible lower bounds. We found that, over a random ordering of nodes, sparser networks, such as a 3-ary tree, generally had a lower learning rate and lower bounds compared to more complex networks like the Watts-Strogatz model.; In addition to running simulations, I also read some papers that informed the direction of our project. Notably, I made sure I really understood the local learning requirement proposed by [1], which basically says that the learning quality of an agent is bounded by its neighborhood, as opposed to agents far away. This is an important property because it can be used to explain why adversarial orderings for certain network structures don't have as much effect as others. Another important paper I read was [4], which proposes the idea of multi-layer guinea pig graphs. A guinea pig graph is a graph containing a single layer of agents that make decisions solely based on their own private signals (guinea pigs) and a separate layer that is able to better aggregate the information. The paper claims that a multi-layer guinea pig graph performs better than a single-layer guinea pig graph, given a deterministic ordering. This is useful because we can find the optimality of common graphs, such as a celebrity graph, by reducing them to a k-layer guinea pig graph. Finally, I read about the hardness of influence maximization models [5]. While influence maximization and information cascade problems are similar, they differ in focus and the way the spreading process is modeled. However, they are similar enough that we could look into finding a reduction to show the hardness for our ordering problem.
Week 3:: One of the main tasks for this week was to construct a protocol for the node ordering that would lead to effective learning. From our earlier exploration, we concluded that an effective algorithm on node ordering involves a structure. A lot of the intuition builds on the analysis of multi-layer graphs, which concluded that an optimal ordering follows a “back and forth” ordering, where the ordering alternates between certain agents who make independent decisions and other agents who aggregate and disseminate the learned information [4]. In particular, we believe that the multi-layer analysis informs certain networks that follow a power-law distribution. A distinguishing characteristic of these networks is that they contain vertices with a degree that greatly exceeds the average, also known as hubs. Hubs possess the ability to influence a high proportion of the network, which makes them an important point for aggregating information.; Our protocol separates the learning process into two phases: aggregation and dissemination. For aggregation, we find a portion of the highest degree nodes in the network and have a set of their lowest degree neighbors make decisions first. Ideally, this set will be an independent set, but since finding a k-independent set is NP-hard, we just chose a portion of them at random. The agent then takes these independent decisions and makes its own decision based on a simple majority. Notice that our “aggregation ordering” follows the “back and forth” idea from earlier as we alternate between low degree and high degree nodes. While we are still looking for the best way to construct the dissemination phase, we simulated the aggregation ordering on a randomly generated preferential network and compared to a completely random ordering. Over 100 trials on a network with 500 nodes and a private signal of 0.7, our ordering had an average learning rate of around 81 percent while the random ordering had an average of arouond 77 percent. Our next step is to provide the mathematical justification for this difference.
Week 4: We successfully built our protocol for aggregating and disseminating information using hubs in a preferential attachment network. We added an additional wrinkle where we mark these aggregators of information as “high value” agents. All subsequent agents who observe these “high value” agents will immediately bypass their private signals and make the same decision. As a result, they also become “high value” agents and will disseminate the decisions to the rest of the network in a similar fashion, causing a “positive cascade.” Over 100 trials with 1000 nodes and a q=0.7, our model achieved a 0.997 success rate compared to the 0.95 rate for the random ordering baseline. The main goal of this protocol is to find a theoretical optimal by limiting the randomness within the network and fixing an agent’s decision based on a these “high value” aggregators. The ordering is optimal because the final learning rate of the network becomes subject to the k independent decisions made by the initial agents, which can be expressed by a simple binomial distribution. More specifically, by finding k initial independent neighbors of the highest degree node v in the PA network, we can say with high probability that the rest of the network will learn according to the aggregation results of v. We found a table for the k neighbors required to achieve a success rate of 1- delta, where delta > 0.; We also started looking into how to build a protocol for networks that don’t exhibit scale-free qualities. As mentioned in [1], graphs that satisfy the local learning requirement will successfully learn within their local r-hop neighborhoods. We adjusted our protocol to aggregate information from independent nodes within the r-hop neighborhood with an arbitrary node v acting as the aggregator. While this doesn’t achieve the aggregation and dissemination guarantees that a scale-free network does, we believe that by aggregating within a subset of these neighborhoods, the network will become more conducive to learning overall. Our next steps are to fine tune our model parameters and find the mathematical justification.
Week 5: We began the week by running simulations on random graphs using the protocol outlined in the previous week. On average, the protocol performed better than a random ordering by around 2-5%, which isn’t as drastic as the difference in performance achieved using scale-free networks. This makes sense as random graphs are not conducive to the formation of high degree nodes, through which we can aggregate and disseminate information. As such, we rely more on local clusters of nodes, which can be bounded using the local learning requirement [1]. Even then, the local nature of decisions doesn’t provide any independence or clustering guarantees, which might explain the smaller learning rate compared to a preferential attachment network, for example.; To gain a better intuition of local learning properties, we returned our attention to the Watts Strogatz small worlds graphs. We can quantify the structural properties of these graphs by their characteristic path length L(p) (the longest path in the graph) and clustering coefficient C(p) (how connected a nodes neighbors are) as p grows from zero to one. Notably, as p grows, the graph C(p) decreases and L(p) increases. In other words, as randomness of the graph increases, the less clusters form and the diameter of the graph increases. It follows that for a random ordering on a Watts Strogatz graph, there exists a negative correlation between the learning rate and the C(p) and a positive correlation between learning rate and L(p). This gives us insight into negative clusters induced by local clusters can negatively impact the learning rate of the network as a whole.; In addition to the Watts Strogatz graph, we also tried to find “good graphs” that produce good learning rates with high probability under a fixed ordering. Looking through simple graph structures, we found that a connected caveman graph provided some robustness guarantees. A connected caveman graph is a set of k-cliques connected in a sequential order. In a k-clique, the probability of a negative cascade can depend on as little as the two starting nodes. However, if we connected this to other k-cliques, we can provide a corrective measure under a sequential ordering of cliques.
Week 6: We began the week looking into more “good graphs.” On top of the connected caveman graph, we discovered that a grid graph also had great learning properties given a spiral ordering. A spiral ordering begins at a corner of the grid and continues along a path that resembles a spiral until it reaches the center of the grid. Under this ordering, the nodes on the boundary (the perimeter of the grid) will learn independently while all non-boundary nodes can potentially partake in a cascade. The ordering induces an aggregation mechanism that passes from the boundary to the innermost interior nodes. In particular, the aggregation slowly filters through each of the layers ensuring that the learning rate will improve from layer to layer in a similar fashion to the multi-layer guinea pig model as outlined in [4].; Also, notice that as the size of the grid increases, the boundary increases in theta(n), whereas the interior nodes increase in theta(n^2). So as n goes to infinity for a n x n grid, the learning rate of the interior will depend on an increasingly smaller number of independent nodes on the outermost boundary and will dominate the learning rate of the network. I wrote a python script to find the expected learning rate of a n x n grid, which involved calculating probabilities for distinct positions in the grid. For n=20, the expected learning rate comes out to around 0.89. This serves as a lower bound on the learning rate induced by an optimum ordering
Week 7: This week, we did extensive work on trying to bound the expected learning rates for the connected caveman and grid graphs. This analysis proved much more difficult than we initially thought and we ran into a lot of barriers. Thankfully, we were able to get past some of them and are still in the process of trying to solve some of the others. For the connected caveman, we were able to prove optimality of the sequential ordering (i.e. from first clique sequentially down to the last). We found that the learning rate for the final clique converges to q^3/(1-3q+3q^2), which is bounded away from one. We were also able to use induction to prove optimality.; For the grid graph, we were able to show that the learning rate for the second layer converges to q^2/(1-2q+2q^2) under a spiral ordering. We first showed that the learning rate of each node in the second layer increases monotonically, with exception of the corners. In the simulations, we observed that all subsequent layers follow a similar pattern in terms of learning rate improvement, however, the analysis isn’t quite as straightforward as it requires long and complicated calculations based on the dependencies of each node. Currently, our plan of attack consists of two different steps. First, we want to analyze the learning rate for a skewed tree, which will provide some insight into the mechanism of grid graphs. We were also thinking about analysis via sampling nodes in each layer that are relatively independent. On the brighter side, we were able to prove that inducing a random order on the grid bounds the learning rate by a constant fraction from 1. This implies ssuming that the learning rate of the grid converges to 1 under a spiral ordering, a spiral ordering will perform strictly better on expectation than random ordering. As such, our ordering demonstrates an important intuitive mechanism for maximizing learning rate in low degree graphs.

Week 8

We spent most of this week preparing for the presentations. It felt good to see the culmination of all the work we’ve put in over the past two months. A huge shout out to Professor Jie and the Jie lab for taking the time to let us do a trial run and for giving valuable feedback. The final presentations went without a hitch and it was exciting to see what every one else in the REU was working on!

Presentations

References

[1] "On social networks that support learning" by Arieli et al.
[2] Networks, Crowds, and Markets: Reasoning About a Highly Connected World by David Easley and Jon Kleinberg
[3] Network Science by Albert-Laszlo Barabasi
[4] "Multi-issue social learning" by Bahar et al.
[5] "Maximizing the Spread of Influence through a Social Network" by Kempe et al.
[6] "How Complex Contagions Spread Quickly in Preferential Attachment Models and Other Time-Evolving Networks" by Ebrahimi et al.
[7] "How to schedule a cascade in an arbitrary grap" by Chierichetti et al.
[8] "Optimizing Information in the Herd: Guinea Pigs, Profits, and Welfare" by Daniel Sgroi

Acknowledgements

A special thank you to my mentor Dr. Jie Gao and my lab partner Jordan Chong

This work was carried out while the authors Jordan Chong and Matt Lu were participants in the 2023 DIMACS REU program at Rutgers University, supported by NSF HDR TRIPODS award CCF-1934924