Running Distributed Computational Experiments on the HSE University MLOps Platform
References
Korenkov V. GRID technologies: status and prospectives. Herald of the International Academy of Science. Russian Section. 2010. No. 1. P. 41–44. (in Russian) DOI: 10.3997/2214-4609.20142827.
Pimenov A., Fedorov I., Bezzateev S. Fog computing architecture using blockchain technology. Information and Control Systems. 2022. Oct. No. 5. P. 40–48. (in Russian) DOI: 10.31799/1684-8853-2022-5-40-48.
Sukhoroslov O.V., Afanasiev A. Everest: A Cloud Platform for Computational Web Services. CLOSER. 2014. P. 411–416. DOI: 10.5220/0004941404110416.
Centre of Artificial Intelligence – HSE University. 2024. URL: https://cs.hse.ru/aicenter/ (in Russian).
Antonenko V., Chupakhin A., Kolosov A., et al. On HPC and Cloud Environments Integration. Performance Evaluation Models for Distributed Service Networks. Springer, 2021. P. 159–185. DOI: 10.1007/978-3-030-67063-4_8.
Ejarque J., Badia R.M., Albertin L., et al. Enabling dynamic and intelligent workflows for HPC, data analytics, and AI convergence. Future generation computer systems. 2022. Vol. 134. P. 414–429. DOI: 10.1016/j.future.2022.04.014.
Sukhoroslov O. Combined use of high-performance resources and Grid infrastructures within the Everest cloud platform. Supercomputer Days in Russia. 2015. P. 706–711. (in Russian).
Velikhov V., Klimentov A., Mashinistov R., et al. Integration of heterogeneous computing resources at NRI “Kurchatov Institute” for large-scale scientific computations. Izvestiya SFedU. Engineering Sciences. 2016. No. 11 (184). P. 88–100. (in Russian).
Kutovskiy N., Mitsyn V., Moshkin A., et al. Integration of distributed heterogeneous computing resources for the MPD experiment with DIRAC Interware. Physics of Particles and Nuclei. 2021. Vol. 52, no. 4. P. 999. (in Russian).
Feoktistov A.G., Sidorov I.A., Sergeev V.V., et al. Virtualization of heterogeneous HPC clusters based on OpenStack platform. Bulletin of the South Ural State University. Series: Computational Mathematics and Software Engineering. 2017. Vol. 6, no. 2. P. 37–48. (in Russian) DOI: 10.14529/cmse170203.
Silva R.F.D., Badia R.M., Bard D., et al. Frontiers in scientific workflows: Pervasive integration with high-performance computing. Computer. 2024. Vol. 57, no. 8. P. 36–44. DOI: 10.1109/mc.2024.3401542.
Stubbs J., Cardone R., Packard M., et al. Tapis: An API platform for reproducible, distributed computational research. Advances in Information and Communication: Proceedings of the 2021 Future of Information and Communication Conference (FICC), Vol. 1. Springer, 2021. P. 878–900. DOI: 10.1007/978-3-030-73100-7_61.
Vorontsov K., Iglovikov V., Strijov V., et al. Roundtable: Challenges in repeatable experiments and reproducible research in data science. Proceedings of Moscow Institute of Physics and Technology. 2021. Vol. 13, no. 2 (50). P. 100–108. (in Russian) DOI: 10.53815/20726759_2021_13_2_100.
Khritankov A., Pershin N., Ukhov N., Ukhov A. MLDev: Data Science Experiment Automation and Reproducibility Software. International Conference on Data Analytics and Management in Data Intensive Domains. Springer, 2021. P. 3–18. DOI: 10.1007/978-3-031-12285-9_1.
Alam K., Roy B. Challenges of provenance in scientific workflow management systems. 2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS). IEEE, 2022. P. 10–18. DOI: 10.1109/works56498.2022.00007.
Dhruv A., Dubey A. Managing software provenance to enhance reproducibility in computational research. Computing in Science & Engineering. 2023. Vol. 25, no. 3. P. 60–65. DOI: 10.1109/mcse.2023.3314288.
Zybin R., Shvetsova V., Badalyan D., et al. Cloud environment “Asperitas”. 2022. (in Russian). Certificate of state registration of a computer program RU 2022682679.
Grushin D., Samovarov O., Hashba E. SaaS platform for organizing a unified web environment for research, development and education “Fanlight”. 2018. (in Russian). Certificate of state registration of a computer program RU 2018615444.
Nasonov D., Butakov N., Bukhanovsky A., et al. Technology for organizing management and processing big data – DataMall. 2020. (in Russian). Certificate of state registration of a computer program RU 2020664222.
Kreuzberger D., Kühl N., Hirschl S. Machine learning operations (MLOps): Overview, definition, and architecture. IEEE Access. 2023. Vol. 11. P. 31866–31879. DOI: 10.1109/access.2023.3262138.
Tyutlyaeva E.O., Odintsov I.O., Marmuzov G.V., et al. Development trends of modern supercomputers. Bulletin of the South Ural State University. Series: Computational Mathematics and Software Engineering. 2019. Vol. 8, no. 3. P. 92–114. (in Russian) DOI: 10.14529/cmse190305.
Wilkinson M.D., Dumontier M., Aalbersberg I.J., et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. 2016. Vol. 3, no. 1. P. 1–9. DOI: 10.1038/sdata.2016.18.
Kostenetskiy P., Shamsutdinov A., Chulkevich R., et al. HPC TaskMaster - Task Efficiency Monitoring System for the Supercomputer Center. Parallel Computational Technologies / ed. by L. Sokolinsky, M. Zymbler. Cham: Springer International Publishing, Jan. 2022. P. 17–29. DOI: 10.1007/978-3-031-11623-0_2.
Kostenetskiy P., Kozyrev V., Chulkevich R., Raimova A. Enhancement of the Data Analysis Subsystem in the Task-Efficiency Monitoring System HPC TaskMaster for the cHARISMa Supercomputer Complex at HSE University. Parallel Computational Technologies / ed. by L. Sokolinsky, M. Zymbler, V. Voevodin, J. Dongarra. Cham: Springer Nature Switzerland, 2024. P. 49–64. DOI: 10.1007/978-3-031-73372-7_4.
Lyu C., Zhang W., Huang H., et al. RTMDet: An Empirical Study of Designing Real-Time Object Detectors. CoRR. 2022. Vol. abs/2212.07784. DOI: 10.48550/ARXIV.2212.07784. arXiv: 2212.07784.
Slastnikov S., Chertova E. Machine vision model synthesis module for object and action detection. 2024. URL: https://cs.hse.ru/aicenter/rid_detection (in Russian).
Slastnikov S., Chertova E. A program for synthesis of machine vision models to detect objects and activities. 2023. (in Russian). Certificate of state registration of a computer program RU 2023660157.
Khritankov A.S. A method for performance analysis of distributed applications based on reference models. Parallel Computational Technologies (PCT’2011). 2011. P. 343–354. (in Russian).
Kostenetskiy P., Chulkevich R., Kozyrev V. HPC Resources of the Higher School of Economics. Journal of Physics: Conference Series. 2021. Jan. Vol. 1740, no. 1. P. 012050. DOI: 10.1088/1742-6596/1740/1/012050.
DOI: http://dx.doi.org/10.14529/cmse250203