Assessing Deepfake Detection Models: A Comparative Study for Misinformation Detection and Prevention
Liyana Gamage, Chathura (2025-06-02)
This publication is subject to copyright. The work may be read and printed for personal use. Commercial use is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025061166652
Abstract
Deepfake videos are proliferating faster than existing detection tools can keep pace, yet most prior studies benchmark detectors on single datasets (with few exceptions that evaluate generalized models), ignore bitrate degradation, and omit resource costs, leaving practitioners uncertain about real-world reliability. Addressing this gap, this study evaluates the reliability and deployability of selected deepfake detectors and formulates actionable countermeasures against synthetic media misinformation. Three state-of-the-art models, BA-TFD+, Convolutional Cross Efficient ViT, and CLRNet, were analysed on five publicly available datasets, DeeperForensics-1.0, Celeb-DF (v2), LAV-DF, DFD, and DFW, under three H.264 compression settings. Extensive inference runs measured precision, recall, AUC, F1-score, latency, and memory usage, and calibration gaps were highlighted through heat maps of AUC and F1 together with AUC-versus-F1 scatter plots. Results show that at least one model achieved an F1 of approximately 0.80 on every dataset at every compression level, and all detectors remain insensitive to bitrate reduction (i.e., more heavily compressed data). Performance is nevertheless fragmented: Cross ViT generalizes best but has a significant memory footprint of 3–5 GB, BA-TFD+ offers near-perfect accuracy on its familiar dataset (LAV-DF) yet suffers severe threshold bias elsewhere, and CLRNet is lightweight and fast but underperforms in detection accuracy across datasets. This study concludes that no single detector emerges as a clear winner in isolation. A combined or layered approach is recommended, such as edge-level pre-filters with cloud-side ensembles, continuous threshold calibration, and cryptographic watermarking techniques. Policy proposals include, but are not limited to, mandatory C2PA signatures, obligatory AI labeling, and worldwide availability of common compliance standards.
Together, these technical and policy-wise measures provide a solid roadmap for sustaining trust in visual evidence as generative media evolves.
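The abstract reports F1-scores of approximately 0.80 across datasets. As a minimal illustration of how that figure relates to the measured precision and recall, the sketch below computes the metrics from hypothetical confusion-matrix counts; the counts are illustrative only and not taken from the thesis.

```python
# Illustrative sketch: relating precision, recall, and F1.
# The confusion counts below are hypothetical, not the study's data.

def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical example: 80 true positives, 20 false positives,
# 20 false negatives gives precision = recall = F1 = 0.80,
# roughly the level the study reports for its best model per dataset.
p, r, f1 = precision_recall_f1(80, 20, 20)
print(round(p, 2), round(r, 2), round(f1, 2))  # → 0.8 0.8 0.8
```

Because F1 is the harmonic mean of precision and recall, the threshold bias the study observes (e.g., for BA-TFD+ off its training distribution) shows up as a precision/recall imbalance that drags F1 down even when AUC stays high.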