Самая быстрая и энергоэффективная реализация алгоритма поиска в ширину на одноузловых различных параллельных архитектурах согласно рейтингу Graph500
Аннотация
Ключевые слова
Полный текст:
PDFЛитература
Cherkassky B.V., Goldberg A.V., Radzik T. Shortest Paths Algorithms: Theory and Experimental Evaluation. Math. Program. 1996. vol. 73. pp. 129–174. DOI: 10.1007/BF02592101.
Moore E.F. The Shortest Path Through a Maze. In Int. Symp. on Th. of Switching. 1959 pp. 285–292.
Chazelle B.A., Minimum Spanning Tree Algorithm with Inverse-Ackermann Type Complexity. Journal of the ACM. 2000. vol. 47, no. 6. pp. 1028–1047.
Kolganov A.S. Parallel Implementation of Minimum Spanning Tree Algorithm on CPU and GPU. Vestnik Yuzho-Uralskogo gosudarstvennogo universiteta. Seriya: Vychislitel’naja matematika i informatika [Bulletin of South Ural State University. Series: Computational Mathematics and Software Engineering]. 2016, vol. 5, no. 3. pp. 5–19. DOI: 10.14529/cmse160301 (in Russian).
Graph500. Available at: http://graph500.org/ (accessed 01.12.2017)
GreenGraph500. Available at: http://green.graph500.org/ (accessed 01.12.2017)
Cormen T., Leiserson C., Rivest R. Introduction to Algorithms. MIT Press, Cambridge. 1990.
Edmonds J., Karp R.M. Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems. Journal of the ACM. 1972. vol. 19, no. 2. pp. 248–264. DOI: 10.1007/3-540-36478-1_4.
Brandes U. A Faster Algorithm for Betweenness Centrality. J. Math. Sociol. 2001. vol. 25, no. 2. pp. 163–177. DOI: 10.1080/0022250X.2001.9990249.
Frasca M., Madduri K., Raghavan P. NUMA-Aware Graph Mining Techniques for Performance and Energy Efficiency. Proc. ACM/IEEE Int. Conf. High Performance Computing, Networking, Storage and Analysis (Salt Lake City, Utah, USA, November 10–16, 2012). pp. 1–11. DOI: 110.1109/SC.2012.81.
Girvan M., Newman M.E. Community Structure in Social and Biological Networks. Proc. Natl. Acad. Sci. (USA, June 11, 2002). vol. 99, no. 12. pp. 7821 7826. DOI: 10.1073/pnas.122653799.
Dijkstra E.W. A Note on Two Problems in Connexion with Graphs. Numerische Mathematik. 1959. vol. 1, no. 1. pp. 269–271. DOI: 10.1007/BF01386390.
Top500. Available at: https://www.top500.org/ (accessed 01.12.2017)
Bader D.A., Madduri K. Designing Multithreaded Algorithms For Breadth-First Search and St-Connectivity on the Cray MTA-2. 2006. pp. 523–530. DOI: 10.1109/ICPP.2006.34.
Korf R.E., Schultze P. Large-Scale Parallel Breadth-First Search. AAAI. 2005. pp. 1380–1385.
Yoo A., Chow E., Henderson K., McLendon W., Hendrickson B., Catalyurek U. A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L. Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (Seattle, Washington, USA, November 12–18, 2005). DOI: 10.1109/SC.2005.4.
Zhang Y., Hansen E.A. Parallel Breadth-First Heuristic Search On A Shared-Memory Architecture. In AAAI Workshop on Heuristic Search, Memory-Based Heuristics and Their Applications. 2006.
Yasui Y., Fujisawa K., Sato Y. Fast and Energy-efficient Breadth-First Search on a Single NUMA System. Lecture Notes in Computer Science. 2014. vol. 8488. pp. 365–381. DOI: 10.1007/978-3-319-07518-1_23.
Hiragushi T., Takahashi D. Efficient Hybrid Breadth-First Search on GPUs. Lecture Notes in Computer Science. 2013. vol. 8286. pp. 40–50. DOI: 10.1007/978-3-319-03889-6_5.
Merrill D., Garland M., Grimshaw A. Scalable GPU graph traversal. Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (New Orleans, Louisiana, USA, February 25–29, 2012). pp. 117–128. DOI: 10.1145/2370036.2145832.
Chakrabarti D., Zhan Y., Faloutsos C. R-MAT: A Recursive Model for Graph Mining. Proceedings of the 2004 SIAM International Conference on Data Mining (Florida, USA, April 22–24, 2004). pp. 442–446. DOI: 10.1137/1.9781611972740.43.
Pissanetzky S. Sparse Matrix Technology. Academic Press. 1984.
Posledovatel’naja realizacija BFS. Available at: https://en.wikipedia.org/wiki/ Breadth-first_search (accessed 01.12.2017)
CUDA Dynamic Parallelism. Available at: https://devblogs.nvidia.com/parallelforall/cuda-dynamic-parallelism-api-principles/ (accessed 01.12.2017)
Intel Xeon Phi 7250. Available at: https://ark.intel.com/ru/products/94035/Intel-Xeon-Phi-Processor-7250-16GB-1_40-GHz-68-core (accessed 01.12.2017)
Intel Xeon E5 2699 v3. Available at: https://ark.intel.com/ru/products/81061/Intel-Xeon-Processor-E5-2699-v3-45M-Cache-2_30-GHz (accessed 01.12.2017)
IBM Power8 and NVidia Tesla P100. Available at: https://www-03.ibm.com/systems/ru/power/hardware/s822lc-hpc/ (accessed 01.12.2017)
Nvidia NVLink. Available at: http://www.nvidia.com/object/nvlink.html (accessed 01.12.2017)
DOI: http://dx.doi.org/10.14529/cmse180201