Comparative study of Transformer- and LSTM-based machine learning methods for transient thermal field reconstruction

UKAEA Opendata, 18th December 2023

UKAEA-CCFE-PR(23)187

Wiera Bielajewa, Michelle Tindall, Perumal Nithiarasu

Preprint Published

Reconstructing a solution from a limited number of measurements is useful in many heat transfer applications. Unlike standard problems, such reconstruction problems are ill-posed; the non-uniqueness of the solution and its inherent instability severely complicate the modelling process. Consequently, more conventional inverse analysis methods for reconstructing solutions remain computationally intractable and lack sufficient flexibility, especially for time-dependent problems. Aided by powerful Graphics Processing Units (GPUs), Machine Learning (ML) methods have risen in popularity owing to their flexibility and their ability to process large amounts of data efficiently. In recent years, Transformer-based ML models have gained recognition for their remarkable performance in Natural Language Processing (NLP) tasks as well as time-series analysis, overshadowing the ML models conventionally used for sequence processing, such as long short-term memory (LSTM) models. These achievements make Transformer-based models seemingly ideal candidates for reconstructing full solutions from a few measurements. This article compares the performance of these novel Transformer-based models with that of a simple LSTM model in reconstructing transient one-dimensional (1D) and two-dimensional (2D) thermal fields from sparse spatial measurements. Counterintuitively, the simple LSTM model achieves higher or comparable prediction accuracy relative to the complex Transformer-based models while also exhibiting shorter or comparable training times, which may render Transformer-based models a suboptimal choice for reconstructing transient solutions. Instead, more traditional sequence-processing ML models, such as LSTM, might be preferred for this purpose.
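To make the reconstruction task concrete, the following is a minimal sketch of how input/target pairs for a transient 1D problem could be generated. It is not the authors' setup: the explicit finite-difference solver, the grid size, the diffusivity, the initial and boundary conditions, the sensor locations, and the function names `simulate_heat_1d` and `make_reconstruction_pairs` are all assumptions chosen for illustration. A sequence model such as an LSTM or a Transformer would then be trained to map the sparse sensor readings over time to the full temperature field.

```python
# Illustrative sketch only: grid size, diffusivity, sensor positions and the
# initial/boundary conditions are assumptions, not values from the paper.

def simulate_heat_1d(n_nodes=21, n_steps=50, alpha=1.0, length=1.0, dt=None):
    """Explicit finite-difference solution of u_t = alpha * u_xx on [0, length]
    with fixed end temperatures (Dirichlet boundaries)."""
    dx = length / (n_nodes - 1)
    if dt is None:
        # Stay below the explicit-scheme stability limit alpha*dt/dx^2 <= 0.5.
        dt = 0.4 * dx * dx / alpha
    r = alpha * dt / (dx * dx)
    # Assumed initial condition: hot left wall (1.0), cold interior and right wall.
    u = [0.0] * n_nodes
    u[0] = 1.0
    history = [u[:]]
    for _ in range(n_steps):
        new = u[:]  # boundary nodes keep their fixed values
        for i in range(1, n_nodes - 1):
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = new
        history.append(u[:])
    return history  # n_steps + 1 snapshots, each of length n_nodes


def make_reconstruction_pairs(history, sensor_idx):
    """Split each snapshot into sparse sensor readings (model input)
    and the full temperature field (reconstruction target)."""
    inputs = [[snap[i] for i in sensor_idx] for snap in history]
    targets = [snap[:] for snap in history]
    return inputs, targets


history = simulate_heat_1d()
sensor_idx = [0, 10, 20]  # hypothetical sensor locations: both walls and the midpoint
inputs, targets = make_reconstruction_pairs(history, sensor_idx)
print(len(inputs), len(inputs[0]), len(targets[0]))  # 51 3 21
```

With these assumed settings, each time step provides 3 measured values from which 21 nodal temperatures must be recovered, which illustrates why the problem is underdetermined without additional information such as the temporal history of the measurements.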