Efficiently Finding a Maximal Clique Summary via Effective Sampling

Li, Xiaofan; Zhou, Rui; Chen, Lu; Liu, Chengfei; He, Qiang; Yang, Yun

arXiv:2009.10376 (cs)

[Submitted on 22 Sep 2020 (v1), last revised 28 Nov 2020 (this version, v2)]

Abstract: Maximal clique enumeration (MCE) is a fundamental problem in graph theory with many applications, including social network analysis, bioinformatics, intelligent agent systems, and cyber security. Most existing MCE algorithms focus on improving efficiency rather than reducing the output size, yet the output can consist of a very large number of maximal cliques. In this paper, we study how to report a summary of less-overlapping maximal cliques. The problem has been studied before; however, after examining the pioneering approach, we consider it still unsatisfactory. To advance research along this line, our paper attempts to make four contributions: (a) we propose a more effective sampling strategy, which produces a much smaller summary while still ensuring that the summary can witness all the maximal cliques and that the expectation of each maximal clique being witnessed by the summary is above a predefined threshold; (b) we prove that the sampling strategy is optimal under certain optimality conditions; (c) we apply clique-size bounding and design a new enumeration order to approach the optimality conditions; and (d) we verify the approach experimentally on eight real benchmark datasets with a variety of graph characteristics. The results show that our new sampling strategy consistently outperforms the state-of-the-art approach, producing smaller summaries and running faster on all the datasets.
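
To make the output-size concern concrete, the following is a minimal sketch of plain maximal clique enumeration using the classical Bron-Kerbosch algorithm with pivoting. It is not the sampling strategy proposed in the paper; the function name `maximal_cliques`, the adjacency-set representation, and the toy graph are assumptions chosen purely for illustration.

```python
from typing import Dict, List, Set

# Assumed representation: each vertex maps to the set of its neighbours.
Graph = Dict[int, Set[int]]


def maximal_cliques(graph: Graph) -> List[Set[int]]:
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates to extend r, x: already-explored vertices.
        if not p and not x:
            cliques.append(set(r))  # nothing can extend r, so r is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(graph), set())
    return cliques


# Toy graph (hypothetical): vertices 0 and 1 are adjacent to every other vertex,
# while 2, 3 and 4 are pairwise non-adjacent.
graph: Graph = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3, 4},
    2: {0, 1},
    3: {0, 1},
    4: {0, 1},
}

cliques = maximal_cliques(graph)
for clique in sorted(sorted(c) for c in cliques):
    print(clique)
```

On this toy graph the enumeration reports three maximal cliques that all share vertices 0 and 1. Heavy overlap of this kind is what motivates reporting a smaller summary instead of the full output.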

Comments:	20 pages, 80 figures
Subjects:	Databases (cs.DB)
ACM classes:	H.2
Cite as:	arXiv:2009.10376 [cs.DB]
	(or arXiv:2009.10376v2 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2009.10376

Submission history

From: Xiaofan Li
[v1] Tue, 22 Sep 2020 08:07:16 UTC (2,950 KB)
[v2] Sat, 28 Nov 2020 08:59:54 UTC (5,298 KB)
