Parallel Implementation of Minimum Spanning Tree Algorithm on CPU and GPU

Alexander S. Kolganov
Keldysh Institute of Applied Mathematics


Abstract


Solution of the finding a minimum spanning tree problem is common in various areas of research: recognition of different objects, computer vision, analysis and construction of networks (eg, telephone, electrical, computer, travel, etc.), chemistry and biology, and many others. Processing large graphs is a quite time-consuming task for the central processor (CPU), and in high demand at the present moment. The usage of Graphics processing units (GPUs) as a mean to solve general-purpose problems grows every day, because GPUs have more computing power than CPUs. This article describes the methods of compression and conversion of graphs in standard formats to increase the efficiency of their processing. The search algorithm of minimum spanning trees has been used for researching the proposed approaches. The possibility of a hybrid implementation of this algorithm has been investigated. The highest results were obtained on the large R-MAT and SSCA2 graphs.

Keywords


MST, parallel graph processing, Boruvka algorithm, CUDA, large graphs

References


Chazelle, B.A. Minimum spanning tree algorithm with inverse-ackermann type complexity /

B.A. Chazelle — Journal of the ACM. — 2000. — Vol. 47, №. 6. — P. 1028–1047.

Pettie, S. An inverse-Ackermann style lower bound for the online minimum spanning tree

verification problem / S. Pettie — Foundations of Computer Science. — 2002. — P. 155–163.

Wei, W. GPU-Based Fast Minimum Spanning Tree Using Data Parallel Primitives / W.

Wei, G. Shaozhong, Y. Fan, C. Jianxun // Information Engineering and Computer Science

(ICIECS), 2nd International Conference (TBD Wuhan, China). — 2010. — P. 1–4.

Arefin, A.S. kNN-MST-Agglomerative: A fast and scalable graph-based data clustering

approach on GPU / A.S. Arefin, C. Riveros, R. Berretta, P. Moscato // Computer Science &

Education (ICCSE), 7th International Conference (The University of Melbourne, Australia).

— 2012. — P. 585–590.

Vineet, V. Fast Minimum Spanning Tree for Large Graphs on the GPU / V. Vineet, P. Harish,

S. Patidar, P.J. Narayanan // HPG ’09 Proceedings of the Conference on High Performance

Graphics (New Orleans, LA, USA). — 2009. — P. 167–171.

Gibbs, N.E. A comparison of several bandwidth and profile reduction algorithms / N.E. Gibbs,

W.G. Poole, P.K. Stockmeyer — ACM Transactions on Mathematical Software. — 1976. —

Vol. 2, №. 4. — P. 322–330.

Gilbert, J.R. Sparse matrices in MATLAB: Design and Implementation / J.R. Gilbert, C.

Moler, R. Schreiber — SIAM Journal on Matrix Analysis and Applications. — 1992. — Vol.

, №. 1. — P. 333—356.

Chakrabarti, D. R-MAT: A Recursive Model for Graph Mining / D. Chakrabarti, Y. Zhan,

C. Faloutsos // In Proceedings of 4th International Conference on Data Mining (Brighton,

UK). — 2004. — P. 442–446.

Bader D.A., Madduri K. Design and implementation of the HPCS graph analysis benchmark

on symmetric multiprocessors / D.A. Bader , K. Madduri — Technical report. — 2005.

Bor˚uvka. O Jist´em Probl´emu Minim´aln´ım. / Bor˚uvka, Otakar — Pr´ace Moravsk´e

Pˇr´ırodovˇedeck´e Spoleˇcnosti III. — 1926. — №. 3. — P. 37–58.

Описание алгоритмов для структуры Union-Find. URL: https://www.topcoder.com/

community/data-science/data-science-tutorials/disjoint-set-data-structures/

(дата обращения: 30.11.2015)

Описание стандарта OpenMP. URL: http://www.openmp.org/mp-documents/openmp-4.

pdf (дата обращения: 30.11.2015)

Compute Unified Device Architecture. URL: http://www.nvidia.ru/object/

cuda-parallel-computing-ru.html (дата обращения: 30.11.2015)

Unified Memory in CUDA 6. URL: http://devblogs.nvidia.com/parallelforall/

unified-memory-in-cuda-6/ (дата обращения: 30.11.2015)

Nobari, S. Scalable Parallel Minimum Spanning Forest Computation / S. Nobari, T. Cao,

P. Karras, S. Bressan // PPoPP’12 Proceedings of the 17th ACM SIGPLAN symposium

on Principles and Practice of Parallel Programming (New York, NY, USA). — 2012. — P.

–214.

Wei, W. Design and Implementation of GPU-Based Prim’s Algorithm / W. Wei, G.

Shaozhong, Y. Fan, C. Jianxun // Information Engineering and Computer Science (ICIECS)

nd International Conference (TBD Wuhan, China). — 2010. — P. 1–4.




DOI: http://dx.doi.org/10.14529/cmse160301