Automatic Classification of Strain in the Singing Voice Using Machine Learning

dc.contributor.author: Liu, Yuanyuan
dc.contributor.author: Mittapalle, Kiran Reddy
dc.contributor.author: Yagnavajjula, Madhu Keerthana
dc.contributor.author: Räsänen, Okko
dc.contributor.author: Alku, Paavo
dc.contributor.author: Ikävalko, Tero
dc.contributor.author: Hakanpää, Tua
dc.contributor.author: Öyry, Aleksi
dc.contributor.author: Laukkanen, Anne-Maria
dc.contributor.organization: fi=opettajankoulutuslaitos (Rauma)|en=Department of Teacher Education (Rauma)|
dc.contributor.organization-code: 1.2.246.10.2458963.20.99310884848
dc.converis.publication-id: 491612849
dc.converis.url: https://research.utu.fi/converis/portal/Publication/491612849
dc.date.accessioned: 2025-08-27T23:47:50Z
dc.date.available: 2025-08-27T23:47:50Z
dc.description.abstract: Objectives: Classifying strain in the singing voice can help protect professional singers from vocal overuse and support singing training. This study investigates whether machine learning can automatically classify singing voices into two levels of perceived strain. The singing samples represent two genres: classical and contemporary commercial music (CCM). Methods: A total of 324 singing voice samples from 15 professional normophonic singers (nine female, six male) were analyzed. Nine singers were classical, and six were CCM singers. The samples consisted of syllable strings produced at three to six pitches and three loudness levels. Based on expert auditory-perceptual ratings, the samples were categorized into two strain levels: normal-mild and moderate-severe. Three acoustic feature sets (mel-frequency cepstral coefficients (MFCCs), the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), and wavelet scattering features) were compared using two classifier models [support vector machine (SVM) and multilayer perceptron (MLP)]. Feature selection was performed using recursive feature elimination, and the Mann-Whitney U test was used to assess the discriminative power of the selected features. Results: The highest classification accuracy of 86.1% was achieved using a subset of wavelet scattering features with the MLP classifier. A comparison between individual features showed that the first MFCC coefficient, representing spectral tilt, exhibited the greatest between-class separation. Conclusion: This study demonstrates that machine learning models utilizing selected acoustic features can classify perceptual strain of singing voices automatically with high accuracy. These preliminary findings highlight the potential for larger studies involving more diverse singer groups across different genres.
dc.identifier.eissn: 1873-4588
dc.identifier.jour-issn: 0892-1997
dc.identifier.olddbid: 204638
dc.identifier.oldhandle: 10024/187665
dc.identifier.uri: https://www.utupub.fi/handle/11111/53206
dc.identifier.url: https://www.jvoice.org/article/S0892-1997(25)00134-1/fulltext
dc.identifier.urn: URN:NBN:fi-fe2025082790510
dc.language.iso: en
dc.okm.affiliatedauthor: Hakanpää, Tua
dc.okm.discipline: 112 Statistics and probability [en_GB]
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 6131 Theatre, dance, music, other performing arts [en_GB]
dc.okm.discipline: 616 Other humanities [en_GB]
dc.okm.discipline: 112 Tilastotiede [fi_FI]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.discipline: 6131 Teatteri, tanssi, musiikki, muut esittävät taiteet [fi_FI]
dc.okm.discipline: 616 Muut humanistiset tieteet [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Elsevier BV
dc.publisher.country: United States [en_GB]
dc.publisher.country: Yhdysvallat (USA) [fi_FI]
dc.publisher.country-code: US
dc.relation.doi: 10.1016/j.jvoice.2025.03.040
dc.relation.ispartofjournal: Journal of Voice
dc.source.identifier: https://www.utupub.fi/handle/10024/187665
dc.title: Automatic Classification of Strain in the Singing Voice Using Machine Learning
dc.year.issued: 2025

Files

Name: PIIS0892199725001341.pdf
Size: 923.63 KB
Format: Adobe Portable Document Format