A Comparison Between Convolutional and Transformer Architectures for Speech Emotion Recognition

Iyer, Shreyah, Glackin, Cornelius, Cannings, Nigel, Veneziano, Vito and Sun, Yi (2022) A Comparison Between Convolutional and Transformer Architectures for Speech Emotion Recognition. Institute of Electrical and Electronics Engineers (IEEE).

Creating speech emotion recognition models that are comparable to humans in their ability to recognise emotions is a long-standing challenge in the field of speech technology, with many potential commercial applications. As transformer-based architectures have recently become the state of the art for many natural language processing applications, this paper investigates their suitability for acoustic emotion recognition and compares them to the well-known AlexNet convolutional approach. The comparison is made using several publicly available speech emotion corpora. Experimental results demonstrate the efficacy of the different architectural approaches for particular emotions. The results show that the transformer-based models outperform their convolutional counterparts, yielding F1-scores in the range [70.33%, 75.76%]. The paper further provides insights via dimensionality reduction analysis of output layer activations in both architectures; this analysis reveals significantly improved clustering in transformer-based models and highlights nuances in the separability of different emotion classes.
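
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a per-emotion F1-score and its unweighted (macro) average can be computed. The abstract does not state which averaging the authors used, so the macro average here is an assumption; the function names and the toy labels are hypothetical and are not data or code from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gets a false positive
            fn[truth] += 1          # true class gets a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels for illustration only; not results from the paper.
y_true = ["angry", "happy", "sad", "neutral", "happy", "sad"]
y_pred = ["angry", "happy", "neutral", "neutral", "sad", "sad"]

if __name__ == "__main__":
    for emotion, score in per_class_f1(y_true, y_pred).items():
        print(f"{emotion}: {score:.2%}")
    print(f"macro F1: {macro_f1(y_true, y_pred):.2%}")
```

Reporting the per-class scores alongside the average is what makes it possible to see which architecture works better for a particular emotion, as the abstract describes.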

Item Type	Other
Uncontrolled Keywords	alexnet; convolutional neural networks; mel spectrograms; speech emotion recognition; transfer learning; transformers; wav2vec2
Subjects	Computer Science(all) > Software; Computer Science(all) > Artificial Intelligence
Divisions	dep_cs; sbu_specs; rc_csir; rg_bio_comp
Date Deposited	18 Nov 2024 12:43
Last Modified	18 Nov 2024 12:43
