Trends in the Development of Compute Nodes in Modern Supercomputers
Abstract
This paper analyzes the compute nodes of modern supercomputers from two perspectives: hardware components and infrastructure. Based on this analysis, it identifies the main structural elements that a modern compute node should include. The paper presents classifications of the architectures of modern general-purpose and specialized cores, with examples; surveys current trends in the organization of the memory subsystem and the intra-node interconnect; and mentions ways of using node-local non-volatile storage devices in building modern high-performance storage systems. It also examines the main requirements for the infrastructure of a modern supercomputer node and, in particular, gives a brief classification of current approaches to liquid cooling and to monitoring of compute nodes. The identified trends lead to the main compute node design options, which consist of an energy-efficient general-purpose processor and a set of energy-efficient specialized accelerators. The paper focuses on modern technologies that have reached production or, at a minimum, the working-prototype stage. Modern supercomputing workloads and their mapping onto the compute node architecture are discussed. The conclusion briefly discusses current technological challenges and the main directions for sustaining progress in the computer industry.
DOI: http://dx.doi.org/10.14529/cmse190305