Методы управления work-stealing деками в динамических планировщиках многопроцессорных параллельных вычислений

Елена Алексеевна Аксёнова, Андрей Владимирович Соколов

Аннотация


В параллельных планировщиках задач, работающих по стратегии work-stealing, каждый процессор имеет свой дек задач. Один конец дека используется для добавления и извлечения задач только владельцем, а другой — для перехвата задач другими процессорами. В статье предлагается обзор методов управления work-stealing деками, которые используются при реализации work-stealing планировщиков параллельных задач, а также представлено описание поставленных и решенных нашим коллективом задач оптимального управления деками для стратегии work-stealing. Принцип алгоритмов оптимального управления деками в двухуровневой памяти заключается в том, что при переполнении выделенного участка быстрой памяти происходит перераспределение элементов (задач) дека между уровнями памяти. В быстрой памяти остаются элементы из концов дека, так как с ними будет происходить работа в ближайшее время, а элементы средней части дека хранятся в медленной памяти. В таком случае необходимо определить оптимальное количество элементов, которое нужно оставить в быстрой памяти, в зависимости от критерия оптимальности и параметров системы.

Ключевые слова


имитационные и марковские модели оптимального управления структурами данных; оптимальное кэширование деков; оптимальное управление work-stealing деками; оптимизация work-stealing балансировщиков; управляемые случайные блуждания

Полный текст:

PDF

Литература


Herlihy M., Shavit N. The Art of Multiprocessor Programming. Elsevier, 2008. 508 p.

Yu-Kwong K., Ishfaq A. Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors. ACM Computing Surveys. 1999. Vol. 31, no. 4. P. 406–471. DOI: 10.1145/344588.344618.

Alakeel A.M. A Guide to Dynamic Load Balancing in Distributed Computer Systems. International Journal of Computer Science and Network Security. 2010. Vol. 10, no. 6. P. 153–160.

Beaumont O., Carter L., Ferrante J., et al. Centralized versus Distributed Schedulers for Bagof-Tasks Applications. IEEE Transactions on Parallel and Distributed Systems. 2008. Vol. 19, no. 5. P. 698–709. DOI: 10.1109/TPDS.2007.70747.

Xia Y., Prasanna V. Hierarchical Scheduling of DAG Structured Computations on Manycore Processors with Dynamic Thread Grouping. Job Scheduling Strategies for Parallel Processing. Springer, 2010. Lecture Notes in Computer Science. Vol. 6253, P. 154–174. DOI: 10.1007/978-3-642-16505-4_9.

Hendler D., Shavit N. Non-Blocking Steal-Half Work Queues. Proceedings of the 21st annual symposium on Principles of distributed computing, PODC ’02, Monterey, California, July 21–24, 2002. New York, ACM, 2002. P. 280–289. DOI: 10.1145/571825.571876.

Hendler D., Shavit N. Work Dealing. Proceedings of the 14th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’02, Winnipeg, Canada, August 10–13, 2002. New York, ACM, 2002. P. 164–172. DOI: 10.1145/564870.564900.

Acar U.A., Chargueraud A., Rainey M. Scheduling Parallel Programs by Work Stealing with Private Deques. ACM SIGPLAN Notices. 2013. Vol. 48, no. 8. P. 219–228. DOI: 10.1145/2517327.2442538.

Arora N.S., Blumofe R.D., Plaxton C.G. Thread Scheduling for Multiprogrammed Multiprocessors. Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’98, Puerto Vallarta, Mexico, June 28 – July 2, 1998. New York, ACM, 1998. P. 119–129. DOI: 10.1145/277651.277678.

Yang J., He Q. Scheduling Parallel Computations by Work Stealing: A Survey. International Journal of Parallel Programming. 2018. Vol. 46, no. 2. P. 173–197. DOI: 10.1007/s10766-016-0484-8.

Blumof R.D., Joerg C.F., Kuszmaul B.C., et al. Cilk: An Efficient Multithreaded Runtime System. ACM SIGPLAN Notices. 1995. Vol. 30, no. 8. P. 207–216. DOI: 10.1145/209937.209958.

Leijen D., Schulte W., Burckhardt S. The Design of a Task Parallel Library. ACM SIGPLAN Notices. 2009. Vol. 44, no. 10. P. 227–242. DOI: 10.1145/1639949.1640106.

Tardieu O., Wang H., Lin H. A Work-Stealing Scheduler for X10’s Task Parallelism with Suspension. Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’12, New Orleans, Louisiana, USA, February 25–29, 2012. New York, ACM, 2012. P. 267–276. DOI: 10.1145/2145816.2145850.

Knuth D. The Art of Computer Programming. Volume 1. Addison-Wesley Professional, 1997. 672 p.

Blumofe R.D., Leiserson C.E. Scheduling Multithreaded Computations by Work Stealing. Journal of the ACM. 1999. Vol. 46, no. 5. P. 720–748. DOI: 10.1145/324133.324234.

Alam M., Varshney A.K. A New Approach of Dynamic Load Balancing Scheduling Algorithm for Homogeneous Multiprocessor System. International Journal of Applied Evolutionary Computation. 2016. Vol. 7, no. 2. P. 61–75. DOI: 10.4018/IJAEC.2016040104.

Amelina N., Fradkov A., Jiang Y., Vergados D.J. Approximate Consensus in Stochastic Networks With Application to Load Balancing. IEEE Transactions on Information Theory. 2015. Vol. 61, no. 4. P. 1739–1752. DOI: 10.1109/TIT.2015.2406323.

Li J., Agrawal K., Elnikety S., et al. Work Stealing for Interactive Services to Meet Target Latency. ACM SIGPLAN Notices. 2016. Vol. 51, no. 8. P. 1–13. DOI: 10.1145/3016078.2851151.

Robison A., Voss M., Kukanov A. Optimization via Reflection on Work Stealing in TBB. Proceedings of the IEEE International Parallel & Distributed Processing Symposium, IPDPS ’08, Miami, FL, USA, April 14–18, 2008. IEEE, 2008. P. 1–8. DOI: 10.1109/IPDPS.2008.4536188.

Dijk T., Pol J.C. Lace: Non-blocking Split Deque for Work-Stealing. Parallel Processing Workshops. Springer, 2014. Lecture Notes in Computer Science. Vol. 8806, P. 206–217. DOI: 10.1007/978-3-319-14313-2_18.

Faxen K.-F. Wool-A work stealing library. ACM SIGARCH Computer Architecture News. 2008. Vol. 36, no. 5. P. 93–100. DOI: 10.1145/1556444.1556457.

Gmys J., Leroy R., Mezmaz M., et al. Work Stealing with Private Integer-Vector-Matrix Data Structure for Multi-core Branch-and-Bound Algorithms. Concurrency and Computation Practice and Experience. 2016. Vol. 28, no. 18. P. 4463–4484. DOI: 10.1002/cpe.3771.

Wimmer M., Versaci F., Traff J.L., et al. Data Structures for Task-based Priority Scheduling. ACM SIGPLAN Notices. 2014. Vol. 49, no. 8. P. 379–380. DOI: 10.1145/2692916.2555278.

Chen Q., Guo M., Guan H. LAWS: Locality-Aware Work-Stealing for Multi-socket Multi-core Architectures. Proceedings of the 28th ACM International Conference on Supercomputing, ICS’14, Munich, Germany, June 10–13, 2014. New York, ACM, 2014. P. 3–12. DOI: 10.1145/2597652.2597665.

Guo H., Chen Q., Guo M., Xu L. SAWS: Selective Asymmetry-Aware Work-Stealing for Asymmetric Multi-core Architectures. Proceedings of the IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016, Sydney, Australia, December 12–14, 2016. IEEE, 2016. P. 116–123. DOI: 10.1109/HPCC-SmartCity-DSS.2016.0027.

Guo Y., Zhao J., Cave V., Sarkar V. SLAW: A Scalable Locality-aware Adaptive Workstealing Scheduler for Multi-core Systems. ACM SIGPLAN Notices. 2010. Vol. 45, no. 5. P. 341–342. DOI: 10.1145/1837853.1693504.

Mitzenmacher M. Analyses of Load Stealing Models Based on Differential Equations. Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA/PODC98, Puerto Vallarta, Mexico, June 28 – July 2, 1998. New York, ACM, 1998. P. 212–221. DOI: 10.1145/277651.277687.

Wangkai J., Xiangjun P. SLITS: Sparsity-Lightened Intelligent Thread Scheduling. Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’23, Orlando, Florida, United States, June 19–23, 2023. New York, ACM, 2023 P. 21–22. DOI: 10.1145/3578338.3593568.

Kuchumov R.I. Implementation and Analysis of the Work-Stealing Task Scheduler. Stochastic Optimization in Computer Science. 2016. Vol. 12, no. 1. P. 20–39. (in Russian).

Sokolov A.V., Drac A.V. The Linked List Representation of n LIFO-Stacks and/or FIFO-Queues in the Single-Level Memory. Information Processing Letters. 2013. Vol. 113, no. 19–21. P. 832–835. DOI: 10.1016/j.ipl.2013.07.021.

Aksenova E.A., Lazutina A.A., Sokolov A.V. On Optimal Methods of Representing Dynamic Data Structures. Review of Applied and Industrial Mathematics. 2003. Vol. 10, no. 2. P. 375–376. (in Russian).

Hendler D., Lev Y., Moir M., Shavit N. A Dynamic-Sized Nonblocking Work Stealing Deque. Distributed Computing. 2006. Vol. 18. P. 189–207. DOI: 10.1007/s00446-005-0144-5.

Sokolov A.V. Mathematical Models and Algorithms of Optimal Control of Dynamic Data Structures. PetrSU Publishing, 2002. 215 p. (in Russian).

Drac A.V., Sokolov A.V. Control of Two FIFO-Queues in the Case of Their Movement One After Another in a Circle. Mathematical Methods of Pattern Recognition. 2011. Vol. 15, no. 1. P. 315–317. (in Russian).

Drac A.V., Sokolov A.V. The Circular Representation of 2 FIFO-Queues in Single Level Memory. Proceedings of the International Conference on Numerical Analysis and Applied Mathematics, ICNAAM-2014, Rhodes, Greece, September 22–28, 2014. AIP Publishing LLC, 2015. Vol. 1648. P. 520002. DOI: 10.1063/1.4912732.

Barkovsky E.A., Sokolov A.V. A Method of Memory Control of a Computer System (RU 2647627 C1). URL: http://www1.fips.ru/registers-doc-view/fips_servlet?DB=RUPAT&DocNumber=2647627&TypeFile=html (in Russian).

Sokolov A., Barkovsky E. The Mathematical Model and the Problem of Optimal Partitioning of Shared Memory for Work-Stealing Deques. Parallel Computing Technologies (PaCT 2015): Proceedings of the 13th International Conference, Petrozavodsk, Russia, August 31 – September 4, 2015. Springer, 2015. Lecture Notes in Computer Science. Vol. 9251, P. 102–106. DOI: 10.1007/978-3-319-21909-7_11.

Barkovksy E., Kuchumov R., Sokolov A. Optimal Control of Two Deques in Shared Memory with Various Work-Stealing Strategies. Program Systems: Theory and Applications. 2017. Vol. 8, no. 1. P. 83–103. DOI: 10.25209/2079-3316-2017-8-1-83-103. (in Russian).

Aksenova E.A., Sokolov A.V. Modeling of the Memory Management Process for Dynamic Work-Stealing Schedulers. Proceedings of Ivannikov ISPRAS Open Conference, ISPRAS 2017, Moscow, Russian Federation, November 30 – December 1, 2017. IEEE, 2017. P. 12–15. DOI: 10.1109/ISPRAS.2017.00009.

Barkovsky E., Lazutina A., Sokolov A. The Optimal Control of Two Work-Stealing Deques, Moving One After Another in a Shared Memory. Program Systems: Theory and Applications. 2019. Vol. 10, no. 1. P. 19–32. DOI: 10.25209/2079-3316-2019-10-1-3-17.

Aksenova E.A., Barkovsky E.A., Sokolov A.V. The Models and Methods of Optimal Control of Three Work-Stealing Deques Located in a Shared Memory. Lobachevskii Journal of Mathematics. 2019. Vol. 40. P. 1763–1770. DOI: 10.1134/S1995080219110052.

Kuchumov R., Sokolov A., Korkhov V. Staccato: Shared-Memory Work-Stealing Task Scheduler with Cache-Aware Memory Management. International Journal of Web and Grid Services. 2019. Vol. 15, no. 4. P. 394–407. DOI: 10.1504/IJWGS.2019.103233.

Work-Stealing Task Scheduler. URL: https://github.com/rkuchumov/staccato.

CPP Distributed Scheduler. URL: https://gitlab.com/mildlyparallel/cpp-distributed-scheduler.

Khaydarova R.R., Mouromtsev D.I., Lapaev M.V., Fishchenko V.D. Distributed convolutional neural network model on resource-constrained cluster. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, Vol. 20, no. 5, P. 739–746. DOI: 10.17586/2226-1494-2020-20-5-739-746.

Saddik A., Latif R., Ouardi A.E., et al. Computer development based embedded systems in precision agriculture: tools and application. Acta Agriculturae Scandinavica, Section B – Soil & Plant Science. 2022. Vol. 72, no. 1. P. 589–611. DOI: 10.1080/09064710.2021.2024874.

Lazutina A.A., Sokolov A.V. On Optimal Control of Work-Stealing Deques in Two-Level Memory. Bulletin of Computer and Information Technologies. 2020. Vol. 17, no. 4. P. 51–60. DOI: 10.14489/vkit.2020.04.pp.051-060. (in Russian).

Aksenova E.A, Lazutina A.A., Sokolov A.V. Minimizing the average cost of redistribution when working with work-stealing deques in two-level memory. Program Systems: Theory and Applications. 2021. Vol. 12, no. 2. P. 53–71. DOI: 10.25209/2079-3316-2021-12-2-53-71. (in Russian).

Aksenova E.A., Lazutina A.A., Sokolov A.V. About Optimal Management of Work-Stealing Deques in Two-Level Memory. Lobachevskii Journal of Mathematics. 2021. Vol. 42. P. 1475–1482. DOI: 10.1134/S1995080221070027.

Aksenova E., Sokolov A. Minimizing the Average Cost of Redistribution when Working with Two Work-Stealing Deques in Two-Level Memory. Russian Supercomputing Days: Proceedings of Russian Supercomputing Days, Moscow, Russia, September 26–27, 2022. Moscow, MAKS Press, 2022. P. 4–12. URL: https://russianscdays.org/files/2022/RuSCDays22_Proceedings.pdf




DOI: http://dx.doi.org/10.14529/cmse230403