Probability and Statistics Seminar

Network Inference from Co-Occurrences

Inferring network structures is a central problem arising in many fields of science and technology, including communication systems, biology, sociology, and neuroscience. In this talk, after briefly reviewing several network inference problems, we will focus on that of inferring network structure from co-occurrence" observations. These observations identify which network components (e.g., switches, routers, genes) co-occur in a path, but do not indicate the order in which they occur in that path. Without order information, the number of structures that are data-consistent grows exponentially with the network size. Yet, the basic engineering/evolutionary principles underlying most networks strongly suggest that no all data-consistent structures are equally likely. In particular, nodes that often co-occur are probably closer than nodes that rarely co-occur. This observation suggests modeling co-occurrence observations as independent realizations of a random walk on the network, subjected to random permutations. Treating these permutations as missing data, allows deriving an expectation–maximization (EM) algorithm for estimating the random walk parameters. The model and EM algorithm significantly simplify the problem, but the computational complexity still grows exponentially in the length of each path. We thus propose a polynomial-time Monte Carlo EM algorithm based on importance sampling and derive conditions that ensure convergence of the algorithm with high probability. Finally, we report simulations and experiments with Internet measurements and inference of biological networks that demonstrate the performance of this approach.The work reported in this talk was done in collaboration with Michael Rabbat (McGill University, Canada) and Robert D. Nowak (University of Wisconsin, USA).