Using Large Language Models for Thematic Analysis. Open-Ended Prompts, Better Terminologies and Thematic Maps

Authors

  • Stefano De Paoli, Abertay University

DOI:

https://doi.org/10.17169/fqs-25.3.4196

Keywords:

thematic analysis, semi-structured interviews, large language models, initial coding, thematic maps

Abstract

In this article, I build on initial research in which I developed procedures for using large language models (LLMs) in qualitative data analysis, by carrying out a thematic analysis (TA) with LLMs. TA identifies patterns through an initial labelling of qualitative data, followed by the organization of the labels/codes into themes.

First, I propose a new set of LLM prompts for initial coding and the generation of themes. These new prompts differ from the prompts typically used for such analysis in that they are fully open-ended and rely on TA language. Second, I examine the process of removing duplicate initial codes through a comparative analysis of the codes of each interview against a cumulative codebook. Third, I examine the construction of thematic maps from the themes elicited by the LLM. Fourth, I evaluate the themes produced by the LLM against themes produced manually by humans. To conduct this investigation, I used a commercial LLM via an application programming interface (API). Two datasets of openly available semi-structured interviews were analyzed to demonstrate the methodological possibilities of this approach. I conclude with practical reflections on performing TA with LLMs, in order to extend our knowledge of the field.
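The second step, comparing the codes of each interview against a cumulative codebook, can be illustrated with a minimal sketch. It is not the procedure used in the article: the abstract indicates that the analysis relies on an LLM, whereas this sketch swaps in a plain string-similarity check from Python's standard library as a stand-in. The function names, the 0.8 threshold, and the example codes are all hypothetical.

```python
from difflib import SequenceMatcher


def is_duplicate(code, codebook, threshold=0.8):
    """Return True if `code` closely matches an entry already in the codebook.

    Assumption: string similarity stands in for the LLM-based comparison.
    """
    return any(
        SequenceMatcher(None, code.lower(), existing.lower()).ratio() >= threshold
        for existing in codebook
    )


def build_cumulative_codebook(codes_per_interview, threshold=0.8):
    """Process interviews in order, keeping only codes not yet in the codebook."""
    codebook = []
    for codes in codes_per_interview:
        for code in codes:
            if not is_duplicate(code, codebook, threshold):
                codebook.append(code)
    return codebook


# Hypothetical initial codes from two interviews.
interviews = [
    ["Lack of time for training", "Data sharing concerns"],
    ["Lack of time for training", "Need for institutional support"],
]
print(build_cumulative_codebook(interviews))
```

A surface-level measure such as this only catches near-identical wording. Codes that express the same idea in different words would need a semantic comparison, which is presumably the reason for delegating the comparison to an LLM.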

Author Biography

Stefano De Paoli, Abertay University

Stefano DE PAOLI is professor of digital society at Abertay University in Dundee (Scotland). Stefano is interested in codesign, user research and qualitative methods. More recently he started working on large language models for data analysis.

Published

2024-09-29

Suggested Citation

De Paoli, S. (2024). Using Large Language Models for Thematic Analysis. Open-Ended Prompts, Better Terminologies and Thematic Maps. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 25(3). https://doi.org/10.17169/fqs-25.3.4196

Section

Single Contributions