Hybrid Parallelization for Accelerating Visibility-Graph Construction and Community Detection on Temporal Data
Keywords:
High Performance Computing, Temporal Series Analysis, Visibility Graph, Community DetectionSynopsis
This is a Chapter in:
Book:
Competitive Tools, Techniques, and Methods
Print ISBN 978-1-6692-0008-6
Online ISBN 978-1-6692-0007-9
Series:
Chronicle of Computing
Chapter Abstract:
In the present era, vast amounts of time series data, particularly in biology, require efficient analysis due to high-dimensional datasets from advanced technologies. Transforming time series into networks and applying community detection methods can uncover dynamic patterns as temporal network communities. However, the large size of these datasets often extends analysis time. High Performance Computing (HPC) addresses this by accelerating traditional applications. This article presents an HPC-optimized version of the visibility-graph-based temporal community detection method. Enhancing the original algorithm with parallel processing, including shared memory and message passing models, improves adaptability. Experiments using an artificial sine series and two real-world datasets—biological time series and power consumption patterns—on Clemson's Palmetto Cluster demonstrate significant performance improvements and scalability over the original approach.
Cite this paper as:
Jia X., Zheng M., Wang L., Jin S. (2024) Hybrid Parallelization for Accelerating Visibility-Graph Construction and Community Detection on Temporal Data. In: Tiako P.F. (ed) Competitive Tools, Techniques, and Methods. Chronicle of Computing. OkIP. AHPC24#06. https://doi.org/10.55432/978-1-6692-0007-9_8
Presentated at:
The 2024 OkIP International Conference on Advances in High-Performance Computing (AHPC) in Oklahoma City, Oklahoma, USA, and Online on October 3, 2024.
Contact:
Xun Jia
xunj@g.clemson.edu
References
Langran, G. (1989). A review of temporal database research and its use in GIS applications. International journal of geographical information system, 3(3), 215-232.
Lacasa, L., Luque, B., Ballesteros, F., Luque, J., & Nuno, J. C. (2008). From time series to complex networks: The visibility graph. Proceedings of the National Academy of Sciences, 105(13), 4972-4975.
Jin, D., Yu, Z., Jiao, P., Pan, S., He, D., Wu, J., ... & Zhang, W. (2021). A survey of community detection approaches: From statistical modeling to deep learning. IEEE Transactions on Knowledge and Data Engineering, 35(2), 1149-1170.
Wang, Y., Jin, D., Musial, K., & Dang, J. (2019, July). Community detection in social networks considering topic correlations. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 321-328).
Jin, D., Wang, H., Dang, J., He, D., & Zhang, W. (2016, February). Detect overlapping communities via ranking node popularities. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
Ramirez-Gargallo, G., Garcia-Gasulla, M., & Mantovani, F. (2019, May). TensorFlow on state-of-the-art HPC clusters: a machine learning use case. In 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (pp. 526-533). IEEE.
Jiang, L., Liu, Y., & Cheng, M. C. (2022). Fast-Accurate Full-Chip Dynamic Thermal Simulation With Fine Resolution Enabled by a Learning Method. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(8), 2675-2688.
Wang, C., Jin, S., Huang, R., Huang, Q., & Chen, Y. (2022). A configurable hierarchical architecture for parallel dynamic contingency analysis on gpus. IEEE Open Access Journal of Power and Energy, 10, 187-194.
Sanbonmatsu, K. Y., & Tung, C. S. (2007). High performance computing in biology: multimillion atom simulations of nanoscale systems. Journal of structural biology, 157(3), 470-480.
Dagum, L., & Menon, R. (1998). OpenMP: an industry standard API for shared-memory programming. IEEE computational science and engineering, 5(1), 46-55.
Nichols, B., Buttlar, D., & Farrell, J. (1996). Pthreads programming: A POSIX standard for better multiprocessing. " O'Reilly Media, Inc.".
Luebke, D., & Harris, M. (2004, June). General-purpose computation on graphics hardware. In Workshop, SIGGRAPH (Vol. 33, p. 6).
Gabriel, E., Fagg, G. E., Bosilca, G., Angskun, T., Dongarra, J. J., Squyres, J. M., ... & Woodall, T. S. (2004). Open MPI: Goals, concept, and design of a next generation MPI implementation. In Recent Advances in Parallel Virtual Machine and Message Passing Interface: 11th European PVM/MPI Users’ Group Meeting Budapest, Hungary, September 19-22, 2004. Proceedings 11 (pp. 97-104). Springer Berlin Heidelberg.
Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
Apache sparkTM - unified engine for large-scale data analytics. Apache SparkTM - Unified Engine for large-scale data analytics. (n.d.). https://spark.apache.org/
Abhyankar, S., Brown, J., Constantinescu, E. M., Ghosh, D., Smith, B. F., & Zhang, H. (2018). PETSc/TS: A modern scalable ODE/DAE solver library. arXiv preprint arXiv:1806.01437.
Harris, C. R., Millman, K. J., Van Der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., ... & Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357-362.
Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM review, 59(1), 65-98.
Blas (Basic Linear Algebra Subprograms). (n.d.). https://www.netlib.org/blas/
LAPACK - linear algebra package. (n.d.). https://www.netlib.org/lapack/
Masoudi-Sobhanzadeh, Y., Gholaminejad, A., Gheisari, Y., & Roointan, A. (2022). Discovering driver nodes in chronic kidney disease-related networks using Trader as a newly developed algorithm. Computers in Biology and Medicine, 148, 105892.
Kutluana, G., & Türker, İ. (2024). Classification of cardiac disorders using weighted visibility graph features from ECG signals. Biomedical Signal Processing and Control, 87, 105420.
Bonin-Font, F., & Burguera, A. (2020). Towards multi-robot visual graph-SLAM for autonomous marine vehicles. Journal of Marine Science and Engineering, 8(6), 437.
Rahnavard, A., Chatterjee, S., Sayoldin, B., Crandall, K. A., Tekola-Ayele, F., & Mallick, H. (2021). Omics community detection using multi-resolution clustering. Bioinformatics, 37(20), 3588-3594.
Gasparetti, F., Sansonetti, G., & Micarelli, A. (2021). Community detection in social recommender systems: a survey. Applied Intelligence, 51(6), 3975-3995.
Guerrero-Solé, F. (2017). Community detection in political discussions on Twitter: An application of the retweet overlap network method to the Catalan process toward independence. Social science computer review, 35(2), 244-261.
Al-Sharoa, E., Al-Khassaweneh, M., & Aviyente, S. (2018). Tensor based temporal and multilayer community detection for studying brain dynamics during resting state fMRI. IEEE Transactions on Biomedical Engineering, 66(3), 695-709.
Zheng, M., Domanskyi, S., Piermarocchi, C., & Mias, G. I. (2021). Visibility graph based temporal community detection with applications in biological time series. Scientific reports, 11(1), 5623.
Amdahl, G. M. (1967, April). Validity of the single processor approach to achieving large scale computing capabilities. In Proceedings of the April 18-20, 1967, spring joint computer conference (pp. 483-485).
Zhou, W., Sailani, M. R., Contrepois, K., Zhou, Y., Ahadi, S., Leopold, S. R., ... & Snyder, M. (2019). Longitudinal multi-omics of host–microbe dynamics in prediabetes. Nature, 569(7758), 663-671.
Hebrail, G., & Berard, A. (2012). Individual household electric power consumption data set. UCI Machine Learning Repository.
