A Neural Network Method for Anomaly Detection in Multivariate Streaming Time Series

Yana Aleksandrovna Kraeva

Abstract


This article addresses the detection of anomalous subsequences in a multivariate streaming time series whose elements arrive in real time. The problem currently arises in a wide range of application domains, such as the industrial Internet of Things and personal healthcare. A new method for solving this problem is proposed, called mDiSSiD (Discord, Snippet, and Siamese Neural Network-based Detector of multivariate anomalies). The method relies on the concept of a time series discord (a subsequence whose nearest neighbor is the most dissimilar to it), generalized to the multivariate case. A multidimensional discord is an N-dimensional subsequence of a d-dimensional time series (where 1 ⩽ N ⩽ d) that is the most dissimilar to all other subsequences of the N-dimensional time series obtained by forming all possible combinations of N out of the d series. Anomaly detection is performed by a neural network model based on Siamese neural networks. Computational experiments on real time series from various application domains showed that mDiSSiD on average outperforms, in anomaly detection accuracy, state-of-the-art counterparts that use other neural network approaches (convolutional and recurrent neural networks, autoencoders, and generative adversarial networks).
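The discord definition above can be illustrated with a brute-force sketch. It is not the mDiSSiD method (no snippets and no Siamese network are involved); it only shows what "the subsequence with the most dissimilar nearest neighbor" means and how the combinations of N out of d series are enumerated. The function names, the z-normalized Euclidean distance, the exclusion of overlapping (trivial) matches, and the choice to compare N-dimensional subsequences by concatenating their per-dimension windows are all assumptions made for this illustration.

```python
from itertools import combinations

import numpy as np


def window_matrix(series, m):
    """Return one row per length-m window; each dimension is z-normalized
    separately and the dimensions are concatenated (illustrative assumption)."""
    series = np.atleast_2d(np.asarray(series, dtype=float))  # shape: dims x n
    n_sub = series.shape[1] - m + 1
    rows = []
    for i in range(n_sub):
        w = series[:, i:i + m]
        mu = w.mean(axis=1, keepdims=True)
        sd = w.std(axis=1, keepdims=True)
        sd[sd == 0] = 1.0  # constant window: avoid division by zero
        rows.append(((w - mu) / sd).ravel())
    return np.array(rows)


def top_discord(windows, m):
    """Brute-force search for the window whose nearest non-overlapping
    neighbor is farthest away. Returns (start index, distance)."""
    n_sub = len(windows)
    best_idx, best_dist = -1, -np.inf
    for i in range(n_sub):
        dists = np.linalg.norm(windows - windows[i], axis=1)
        # Ignore trivial matches: windows that overlap window i.
        dists[max(0, i - m + 1):min(n_sub, i + m)] = np.inf
        nn = dists.min()
        if np.isfinite(nn) and nn > best_dist:
            best_idx, best_dist = i, nn
    return best_idx, best_dist


def discords_by_combination(series, m, n_dims):
    """Top discord for every combination of n_dims rows out of the d rows."""
    series = np.atleast_2d(np.asarray(series, dtype=float))
    d = series.shape[0]
    return {
        combo: top_discord(window_matrix(series[list(combo)], m), m)
        for combo in combinations(range(d), n_dims)
    }
```

For example, `discords_by_combination(series, m=20, n_dims=2)` on a 3-dimensional series evaluates the pairs (0, 1), (0, 2), and (1, 2). The quadratic cost of this search makes it suitable only for short series.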

Keywords


multivariate time series; anomaly detection; discord; snippet; Siamese neural network






DOI: http://dx.doi.org/10.14529/cmse240403