Angelo Mele is an Associate Professor of Economics at Johns Hopkins University - Carey Business School. His research analyses how social and strategic interactions affect individual and aggregate socioeconomic outcomes. His work has been published in Econometrica, American Economic Journal: Economic Policy, Review of Economics and Statistics, and Regional Science and Urban Economics. He has a PhD in Economics from the University of Illinois at Urbana-Champaign.
structural model of business network formation...
- Based on Mele (2017)
- Adapted to local dependence models (Schweinberger & Handcock, 2015)
...using Eight business network for 2019...
...employing scalable algorithms.
- Model-based clustering (Vu et al., 2013) and pseudo-likelihood estimation (Babkin et al., 2020).
Understanding about:
- The cost of connections.
- The role of externalities and homophily.
- A way to simulate networks:
- For measuring causal effects
- Evaluating the impact of new services before launching them
- To contribute to society
- ERGM (Exponential Random Graph Model): Models the probability distribution of networks as a function of network statistics (edges, triangles, shared partners, stars, etc.)
- Can incorporate third-party effects on link formation like “common friends”
- A popular method in network science
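To make the definition concrete, here is a minimal sketch that builds the full ERGM distribution on a tiny undirected network by brute-force enumeration. The choice of statistics (edges and triangles), the parameter values, and all function names are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations, product
from math import exp

def network_stats(edges, n):
    """Return (number of edges, number of triangles) for an undirected graph."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    triangles = sum(
        1 for i, j, k in combinations(range(n), 3)
        if j in adj[i] and k in adj[i] and k in adj[j]
    )
    return len(edges), triangles

def ergm_distribution(n, theta_edges, theta_triangles):
    """Enumerate every undirected graph on n nodes; return {edge tuple: probability}."""
    dyads = list(combinations(range(n), 2))
    weights = {}
    for present in product([0, 1], repeat=len(dyads)):
        edges = tuple(d for d, x in zip(dyads, present) if x)
        n_edges, n_tri = network_stats(edges, n)
        # Unnormalized weight: exp(theta . statistics)
        weights[edges] = exp(theta_edges * n_edges + theta_triangles * n_tri)
    z = sum(weights.values())  # normalizing constant (the likelihood denominator)
    return {g: w / z for g, w in weights.items()}

# Hypothetical parameter values, chosen only for illustration.
dist = ergm_distribution(4, theta_edges=-1.0, theta_triangles=0.5)
complete = tuple(combinations(range(4), 2))
print(len(dist), dist[()], dist[complete])
```

With 4 nodes there are only 64 graphs, so the normalizing constant can be summed directly; this is exactly the step that stops being possible as the network grows.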
- ERGMs are prone to degeneracy, i.e. they tend to put very large probability mass on a few networks that are not representative of what we see in reality.
- Impossible to compute the denominator of the likelihood function due to the combinatorial explosion of possible network configurations.
  - Even when N = 10, it would take 40 million years to compute it for just one iteration.
  - Due to the problem above, MCMC-based methods are often applied, but they are computationally burdensome.
- Theoretically, the model assumes that all nodes know the state of all others when considering forming a new connection.
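The slide does not state the assumptions behind the 40-million-year figure. One reading that gives the same order of magnitude is a directed network with 10 nodes, evaluated at 10^12 network configurations per second. Both the directedness and the evaluation rate are assumptions made here for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def n_networks(n, directed=True):
    """Number of distinct networks on n labeled nodes (no self-loops)."""
    dyads = n * (n - 1) if directed else n * (n - 1) // 2
    return 2 ** dyads

def years_to_enumerate(n, evals_per_second=1e12, directed=True):
    """Years needed to visit every network once at the assumed evaluation rate."""
    return n_networks(n, directed) / evals_per_second / SECONDS_PER_YEAR

print(n_networks(10))                      # 2**90 directed networks
print(f"{years_to_enumerate(10):.2e}")     # tens of millions of years
```

Under these assumptions the count is 2^90 networks, which is what makes exact evaluation of the normalizing constant infeasible even for very small N.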
- Hierarchical ERGM (HERGM): Schweinberger & Handcock (2015)
- Agents belong to communities:
  - Connections across communities happen by luck, influenced by homophily.
  - Connections within communities also consider externalities (friends of friends, stars, etc.)
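One way to write down the structure described above is the local-dependence factorization. The notation is an assumption introduced here: $Z$ is the community assignment, $\mathcal{A}_k$ is the set of nodes in community $k$, and $X_{kk}$ is the subnetwork within community $k$.

```latex
\[
P_\theta(X = x \mid Z = z)
  = \prod_{k=1}^{K} P_\theta\!\left(X_{kk} = x_{kk} \mid z\right)
    \prod_{k < l} \;
    \prod_{i \in \mathcal{A}_k,\; j \in \mathcal{A}_l}
      P_\theta\!\left(X_{ij} = x_{ij} \mid z\right)
\]
```

The first product allows dependence among links inside a community (where externalities enter), while the between-community links factor into independent terms that can depend on covariates such as homophily.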
- Step 1: Estimate a block structure using a stochastic block model
  - Applies minorization-maximization (Vu et al., 2013)
- Step 2: Given the estimated block structure, estimate between- and within-block parameters by maximum pseudo-likelihood estimation
  - Parameters like “homophily” and the effect of common friends

We tried the R package hergm (Schweinberger and Luna, 2018), but the most recent implementation cannot handle large networks even on a high-performance machine.
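Step 2 above can be sketched for a single, already-estimated block. Maximum pseudo-likelihood treats each dyad as a logistic regression of the tie indicator on its change statistics. This is a toy illustration, not the authors' implementation: the two statistics (edges and common neighbors), the Newton solver, and all names are assumptions made here.

```python
from itertools import combinations
from math import exp, log

def dyad_rows(adj, block):
    """One row per within-block dyad: (tie indicator, [edge change, common-neighbor change])."""
    members = set(block)
    rows = []
    for i, j in combinations(sorted(members), 2):
        # Toggling tie i-j changes the triangle count by the number of common neighbors.
        common = len(adj[i] & adj[j] & members)
        rows.append((1 if j in adj[i] else 0, [1.0, float(common)]))
    return rows

def fit_mple(rows, max_iter=50, tol=1e-12):
    """Maximize the pseudo-likelihood (a 2-parameter logistic regression) by Newton's method."""
    theta = [0.0, 0.0]
    for _ in range(max_iter):
        g = [0.0, 0.0]                    # score vector
        h = [[0.0, 0.0], [0.0, 0.0]]      # information matrix
        for y, d in rows:
            p = 1.0 / (1.0 + exp(-(theta[0] * d[0] + theta[1] * d[1])))
            for a in range(2):
                g[a] += (y - p) * d[a]
                for b in range(2):
                    h[a][b] += p * (1.0 - p) * d[a] * d[b]
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # Solve h * step = g for the 2x2 system.
        step = [(h[1][1] * g[0] - h[0][1] * g[1]) / det,
                (h[0][0] * g[1] - h[1][0] * g[0]) / det]
        theta = [theta[0] + step[0], theta[1] + step[1]]
        if max(abs(s) for s in step) < tol:
            break
    return theta

# Hypothetical 6-node block: two triangles joined by the tie 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {v: set() for v in range(6)}
for i, j in edges:
    adj[i].add(j)
    adj[j].add(i)

theta_edges, theta_common = fit_mple(dyad_rows(adj, range(6)))
print(theta_edges, theta_common)
# In this toy graph every dyad has 0 or 1 common neighbors, so the fit is
# saturated and has a closed form: 1 of 5 dyads with 0 common neighbors is
# tied, and 6 of 10 dyads with 1 common neighbor are tied.
print(log(0.25), log(6.0))
```

Because each dyad contributes an independent logistic term, the computation splits naturally across blocks, which is what makes the approach attractive for large networks.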
estimation techniques stop being feasible at the scale of a few hundred nodes.
- Our approach:
  - Scalable model-based variational clustering algorithm in Vu et al. (2013)
  - We augmented that model:
    - Works with hundreds of thousands of nodes and thousands of communities
    - Fully parallelized
    - Low memory usage
    - Allows discrete covariates to be included
  - Pseudo-likelihood estimation based on Babkin et al. (2020)
- We thank Michael Schweinberger for his comments on our fixes to the original hergm package.
- 1,000 clusters are obtained (median size: 212 nodes)
- A few large clusters are obtained
- The largest cluster is not so large
- Otherwise, sizes are very stable
- Industries sharing many clusters tend to be “similar”.
network is VERY sparse.
Communities are tighter subnetworks.
Significant homophily based on observables.
Externalities (popularity, friends-of-friends) are important determinants of connections within communities.
home in South Kanto
- Randomly move 15% of nodes based in Tokyo to other prefectures in South Kanto proportionally.
- Simulate many networks of individuals using HERGM estimates.
- Aggregate them into cities to create networks of cities.
- Calculate the degree centrality of each city, and take the mean across all simulations.
- Calculate its change with respect to simulations in the absence of the treatment.
- Degree increases more in smaller cities.
- The increase more than compensates for the loss in access to cities in Tokyo.

[Figure: Counterfactual change in city degree. Red for negative, blue for positive, darker for larger values.]
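The aggregation and comparison steps above can be sketched as follows. This is a minimal illustration under stated assumptions: each simulation is a list of node-level ties, a city's degree is taken to be the number of distinct other cities it is linked to (the slide does not specify the exact definition), and all names and data are hypothetical.

```python
from collections import defaultdict

def city_degrees(edges, city_of):
    """Degree of each city = number of distinct other cities it is linked to."""
    neighbors = defaultdict(set)
    for i, j in edges:
        a, b = city_of[i], city_of[j]
        if a != b:  # ties inside one city do not create a city-level link
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {c: len(neighbors[c]) for c in set(city_of.values())}

def mean_city_degree(simulations, city_of):
    """Average city degree across simulated node-level networks."""
    totals = defaultdict(float)
    for edges in simulations:
        for city, deg in city_degrees(edges, city_of).items():
            totals[city] += deg
    return {c: t / len(simulations) for c, t in totals.items()}

def counterfactual_change(base_sims, base_city, treated_sims, treated_city):
    """Mean city degree with treatment minus mean city degree without it."""
    base = mean_city_degree(base_sims, base_city)
    treated = mean_city_degree(treated_sims, treated_city)
    return {c: treated.get(c, 0.0) - base.get(c, 0.0) for c in set(base) | set(treated)}

# Toy data: four nodes in three hypothetical cities; the treatment moves node 1 from A to C.
base_city = {0: "A", 1: "A", 2: "B", 3: "C"}
treated_city = {0: "A", 1: "C", 2: "B", 3: "C"}
sims = [[(0, 1), (0, 2)], [(0, 1), (1, 2)]]

change = counterfactual_change(sims, base_city, sims, treated_city)
print(change)
```

For brevity the toy example reuses the same simulated ties with and without the treatment; in the exercise described above, the treated networks would be drawn anew from the estimated model after relocating the nodes.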
- Move 5% of the nodes in Shikoku to Kamiyama-cho.
- Same simulation method as the last slide.
- There's a drop in degree in cities that “donated” nodes.
- However, many cities, including Tokushima-shi, see an increase.
- We can learn about how agglomeration leads to industrial clustering.

[Figure panels: Percentile of counterfactual change in mean city degree (red for negative, blue for positive, darker for larger absolute values); Mean city degree without treatment. Kamiyama-cho is labeled.]
networks are:
  - Large
  - Sparse
  - Clustered
- HERGM allows us to estimate a structural network formation model in a large network.
- Results can be used to simulate counterfactual networks to assess the impact of events, new services, etc.
J. R., Long, X., & Schweinberger, M. (2020). Large-scale estimation of random graph models with local dependence. Computational Statistics & Data Analysis, 152, 107029.
- Schweinberger, M., & Luna, P. (2018). HERGM: Hierarchical exponential-family random graph models. Journal of Statistical Software, 85(1).
- Schweinberger, M., & Handcock, M. S. (2015). Local dependence in random graph models: Characterization, properties and statistical inference. Journal of the Royal Statistical Society, Series B, 77, 647–676.
- Vu, D. Q., Hunter, D. R., & Schweinberger, M. (2013). Model-based clustering of large networks. The Annals of Applied Statistics, 7(2), 1010.
- Stivala, A., Robins, G., & Lomi, A. (2020). Exponential random graph model parameter estimation for very large directed networks. PLoS ONE, 15(1), e0227804. https://doi.org/10.1371/journal.pone.0227804
- Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.