Hyppää sisältöön
    • Suomeksi
    • In English
  • Suomeksi
  • In English
  • Kirjaudu
Näytä aineisto 
  •   Etusivu
  • 3. UTUCris-artikkelit
  • Rinnakkaistallenteet
  • Näytä aineisto
  •   Etusivu
  • 3. UTUCris-artikkelit
  • Rinnakkaistallenteet
  • Näytä aineisto
JavaScript is disabled for your browser. Some features of this site may not work without it.

An Approach to the Frugal Use of Human Annotators to Scale up Auto-coding for Text Classification Tasks

Chen Li'An; Suominen Hanna

An Approach to the Frugal Use of Human Annotators to Scale up Auto-coding for Text Classification Tasks

Chen Li'An
Suominen Hanna
Katso/Avaa
LiAnChen_HannaSuominen_2021_Approach_to_the_Frugal_Use_of_Human.pdf (442.4Kb)
Lataukset: 

URI
https://aclanthology.org/2021.alta-1.2/
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on:
https://urn.fi/URN:NBN:fi-fe2023022128008
Tiivistelmä

Human annotation for establishing the training data is often a very costly process in natural language processing (NLP) tasks, which has led to frugal NLP approaches becoming an important research topic. Many research teams struggle to complete projects with limited funding, labor, and computational resources. Driven by the Move-Step analytic framework theorized in the applied linguistics field, our study offers a rigorous approach to the frugal use of two human annotators to scale up autocoding for text classification tasks. We applied the Linear Support Vector Machine algorithm to text classification of a job ad corpus. Our Cohen’s Kappa for inter-rater agreement and Area Under the Curve (AUC) values reached averages of 0.76 and 0.80, respectively. The calculated time consumption for our human training process was 36 days. The results indicated that even the strategic and frugal use of only two human annotators could enable the efficient training of classifiers with reasonably good performance. This study does not aim to provide generalizability of the results. Rather, it is proposed that the annotation strategies arising from this study be considered by our readers only if they are fit for one’s specific research purposes.

Kokoelmat
  • Rinnakkaistallenteet [27094]

Turun yliopiston kirjasto | Turun yliopisto
julkaisut@utu.fi | Tietosuoja | Saavutettavuusseloste
 

 

Tämä kokoelma

JulkaisuajatTekijätNimekkeetAsiasanatTiedekuntaLaitosOppiaineYhteisöt ja kokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy

Turun yliopiston kirjasto | Turun yliopisto
julkaisut@utu.fi | Tietosuoja | Saavutettavuusseloste