Scalable Top-K Structural Diversity Search
This paper studies the problem of top-k structural diversity search, which is to compute k users with the highest structural diversities that is measured by the number of connected components in the neighborhood of a user. As the existing algorithms are not scalable for processing large graphs due to their limits, in this paper we propose a scalable algorithm Div-TriE to improve the efficiency. Div-TriE has two optimal features compared with the existing algorithms. Firstly, we show that as a key building block, we only need to enumerate each triangle at most once in Div-TriE, in contrast to the up-to three times in the existing techniques. Secondly, we develop e cient techniques so that the computation against each enumerated triangle is (amortized) constant, in contrast to the non-constant costs in the corresponding costs of the existing techniques. Extensive experimental results on real graphs show that Div-TriE outperforms the existing techniques by one order of magnitude.