2025 (English) In: Mechanical systems and signal processing, ISSN 0888-3270, E-ISSN 1096-1216, Vol. 239, article id 113226. Article in journal (Refereed) Published
Abstract [en]
Bearing systems are critical components in rotating machinery, and their health directly affects operational safety. To address faults arising from bearing wear, and to overcome the limitation that conventional deep-learning methods struggle to exploit time- and frequency-domain information simultaneously, we propose FD-MVLLM, a novel reprogramming framework that combines a large language model (LLM) with multimodal vibration data for fault diagnosis. First, raw vibration signals are preprocessed to produce three distinct modalities: the original time series and two time–frequency representations. Convolutional layers then extract features from the time–frequency images. We then reprogram both the raw time series and the image features using carefully designed text prototypes, yielding patch embeddings. To fully leverage the LLM's reasoning capabilities and boost diagnostic accuracy, we also integrate key time-domain and frequency-domain evaluation metrics into the prompt context, producing prompt embeddings. These patch and prompt embeddings are fed into an LLM fine-tuned via low-rank adaptation (LoRA); a final linear output layer maps the LLM's output to fault classes. We validate FD-MVLLM on simulated rolling bearing fault data (generated using Hertz contact theory and Runge-Kutta numerical integration) as well as on several public benchmark datasets. Experimental results demonstrate that FD-MVLLM substantially outperforms fault diagnosis methods based on single-modal vibration data and LLMs, highlighting its promise as a new paradigm for multimodal data-driven fault diagnosis. This work is open-sourced at https://github.com/youngpy996/FD-MVLLM.
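As an illustration of the prompt-context step described in the abstract, the following minimal sketch computes a few evaluation metrics from a vibration segment and formats them as prompt text. It is not taken from the FD-MVLLM repository. The abstract does not say which metrics are used, so the choice of RMS, peak, crest factor, kurtosis, and dominant frequency is an assumption, and the function names `time_domain_metrics`, `dominant_frequency`, and `build_prompt` are hypothetical.

```python
import math


def time_domain_metrics(signal):
    """Return assumed time-domain indicators for one vibration segment."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    var = sum((x - mean) ** 2 for x in signal) / n
    # Non-excess kurtosis: fourth central moment divided by squared variance.
    kurtosis = (sum((x - mean) ** 4 for x in signal) / n) / (var ** 2)
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis,
    }


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC bin of a naive DFT."""
    n = len(signal)
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * fs / n


def build_prompt(signal, fs):
    """Format the metrics as plain text to prepend to the LLM input (assumed wording)."""
    m = time_domain_metrics(signal)
    f_dom = dominant_frequency(signal, fs)
    return (
        "Bearing vibration segment statistics: "
        f"rms={m['rms']:.4f}, peak={m['peak']:.4f}, "
        f"crest_factor={m['crest_factor']:.4f}, kurtosis={m['kurtosis']:.4f}, "
        f"dominant_frequency={f_dom:.1f} Hz."
    )


if __name__ == "__main__":
    # Synthetic 50 Hz sine sampled at 1 kHz, standing in for a real vibration signal.
    fs = 1000
    segment = [math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
    print(build_prompt(segment, fs))
```

In a full pipeline the resulting string would be tokenized and embedded by the LLM to form the prompt embeddings; the naive DFT here is only for self-containment, and an FFT routine would normally be used for longer segments.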
Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Bearing system, Fault diagnosis, Fine-tuning, Large language model, Multimodal data, Vibration
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-369607 (URN) 10.1016/j.ymssp.2025.113226 (DOI) 001565443600001 () 2-s2.0-105014735903 (Scopus ID)
Note
QC 20250912
2025-09-12 2025-09-12 2025-12-08 Bibliographically approved