A Comprehensive Review and Empirical Assessment of Data Augmentation Techniques in Time-Series Classification

Dr. Elena M. Petrovic; Dr. Rajan V. Subramaniam

doi:10.55640/irjaet-v02i09-01

Authors

Dr. Elena M. Petrovic Department of Informatics, ETH Zurich, Switzerland
Dr. Rajan V. Subramaniam Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Bombay, India

Keywords:

Time-Series Classification, Data Augmentation, Empirical Evaluation

Abstract

Time-series data is ubiquitous across domains ranging from healthcare and finance to industrial monitoring and human activity recognition, and accurate classification of such data is crucial for informed decision-making and automated systems. However, a common challenge in developing robust time-series classification models, especially deep-learning-based ones, is the scarcity of sufficiently large and diverse labeled datasets. Data augmentation addresses this limitation by synthetically expanding the training data, which can improve model generalization and reduce overfitting. While data augmentation has been studied extensively in image processing and natural language processing, its application to time-series classification presents unique challenges and opportunities. This article surveys existing data augmentation techniques tailored for time-series classification. It also synthesizes empirical findings from prior studies, discussing the efficacy of different augmentation strategies across datasets and model architectures. We categorize augmentation methods, analyze their underlying principles, and highlight their impact on classification performance. Finally, we identify current limitations and propose future research directions toward more effective and broadly applicable time-series data augmentation methods.

International Research Journal of Advanced Engineering and Technology
