Patricia Lucas | 04 Mar 2025
There is plenty of hype about the potential of Generative AI, but just as much doubt and fear.
In the world of qualitative research, the emergence of Large Language Models, and ChatGPT in particular, has introduced the potential for machine-led automation into a system of inquiry that seeks to be context-based and naturalistic. There is lively debate in the academic literature and in online spaces about whether AI approaches are appropriate or ethical, or whether they are even any good.
As a researcher, I think the useful question isn't 'can we trust AI' but 'how can we make it trustworthy?'. It's a technological shift that is here to stay. What I want to know is what does a trustworthy AI tool look like, and how can I use AI in trustworthy and reliable ways?
Trustworthiness is a keystone of quality in qualitative research. Qualitative research does not seek to establish generalisability, measure validity, or expect results to be replicable over time or place. These are the markers of quality in quantitative research. Instead, qualitative researchers want rich data, for which the markers of quality are credibility, transferability, dependability and confirmability (per Lincoln & Guba).
Typical actions to increase credibility and trustworthiness in qualitative research include: engaging with the context, triangulation (comparing different points of view through different sources of data or different analytic approaches), dependability checking (quality assurance processes such as peer review and audit trails), transferability (describing context, and seeking diversity when collecting data so that similarity to other contexts can be judged), and member checking (confirming whether participants feel their views and experiences have been accurately represented).
At Colectiv, we have built tools to both collect and analyse qualitative data.
Our AI interviewer is designed to ask questions following a topic guide, much like a traditional semi-structured interview. Each participant experiences a unique conversation because new questions are generated each time, responding to answers they provide and probing for further details. Credibility is increased because:
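The interviewer flow described above can be sketched in a few lines. This is an illustrative toy, not Colectiv's actual system: `generate_question` stands in for an LLM call that would condition on the full transcript, and the `Interview` class, its method names, and `probes_per_topic` are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Interview:
    """Minimal semi-structured interview loop (illustrative sketch only)."""
    topic_guide: list                     # topics the interviewer must cover
    transcript: list = field(default_factory=list)

    def generate_question(self, topic, last_answer):
        # Placeholder for an LLM call that drafts a context-aware question;
        # stubbed here so the flow can run without a model.
        if last_answer is None:
            return f"To begin, could you tell me about {topic}?"
        return (f"You mentioned '{last_answer[:30]}'. "
                f"Could you say more about that, in relation to {topic}?")

    def run(self, answer_fn, probes_per_topic=2):
        # Walk the topic guide; each new question responds to the last answer,
        # so every participant sees a unique conversation.
        for topic in self.topic_guide:
            last = None
            for _ in range(probes_per_topic):
                question = self.generate_question(topic, last)
                last = answer_fn(question)
                self.transcript.append((question, last))
        return self.transcript
```

Because questions are generated from the previous answer rather than read from a fixed script, two participants covering the same topic guide produce different transcripts.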
We also use AI to support analysis. Generative AI is particularly good at summarising large amounts of text, but it is also lazy: it has a bias towards the information that appears first; it tends to stop when the summary appears sufficient rather than complete; and it may oversimplify, or ignore differing and nuanced viewpoints. It may also hallucinate, generating false data. We take a number of steps to avoid these risks, and to build trust in our outputs:
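One common way to counter the primacy bias and early stopping described above is map-reduce summarisation: every chunk of text is summarised independently before the partial summaries are combined, so material late in a transcript gets the same attention as the opening. The sketch below is a generic illustration under that assumption, not Colectiv's pipeline, and `summarise_chunk` is a stub where a real model call would go.

```python
def chunk(text, size=50):
    """Split text into roughly equal word chunks so no part is privileged."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def summarise_chunk(chunk_text):
    # Placeholder for an LLM summarisation call; here we just keep the
    # first sentence as a stand-in summary.
    return chunk_text.split(".")[0].strip() + "."

def map_reduce_summary(text, size=50):
    # Map: summarise every chunk independently, so the end of a long
    # transcript is weighted the same as its beginning.
    partials = [summarise_chunk(c) for c in chunk(text, size)]
    # Reduce: combine the partial summaries. A production pipeline would
    # pass these through the model again, followed by human review.
    return " ".join(partials)
```

The design choice matters: a single pass over the full text lets the model stop early, while forcing a summary per chunk guarantees coverage, at the cost of a second combining step.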
We believe that our processes contribute to credibility and dependability (using project bespoking, piloting, individual-level analysis and quality assurance), confirmability (pilot, access to individual level insights and team checking), and transferability (findings linked to participant characteristics, and available in a searchable, sortable format).
We know from the transcripts that our AI interviewer asks relevant and in-depth questions, focussed on the topics we set. But, just as importantly for quality assurance, the feedback from interviews suggests that people understand the experience, and find it easy and enjoyable to take part:
"it was nice talking with you. You are a good qualitative researcher."
"This is my first interview with AI, and I really enjoyed it. For example, when I didn’t understand the question, its ability to quickly rephrase and clarify made me feel good."
"it worked well and avoided errors such as asking for 1-5 rankings without saying which is low and high, unlike the human powered survey I answered just before this one."
Building context into the interviewer training improves the quality of the interviews. When we compare responses to our project interviews with responses to the short, context-free demo interviews we use on our website and elsewhere, people who test the context-free interviews are more likely to tell us the conversation felt less natural:
"I think it might need to be a bit more conversational. It appeared as if the bot was trying to base the conversation on a pre-defined set of questions or inquiry areas, which makes sense if it is an interviewer"
We have maximised trustworthiness by keeping human hands on the tiller of the AI machine. At the moment, this means holding back on fully automating some steps. But we can still complete and release quality-assured coding of hundreds of interviews in more than 20 languages within 24 hours of interview completion.
Some of those who argue against the use of AI for qualitative research do so because they find examples of low quality outputs. But not all research is good research, and being entirely conducted by humans is no protection against unreliable or poor quality research. Different AI tools are going to perform differently, just as different people and teams do. They should be judged individually. We have put trustworthiness and quality assurance at the heart of our tools, but we continue to look for improvements and are open to suggestions and challenge.
Just as every tool should be judged on its merit, its application should be evaluated in context. AI tools are not appropriate for every research context or question. They can come into their own when scale or speed is imperative, where there are barriers to traditional methods (such as language or distance), or when used alongside traditional qualitative methods to reach different groups of participants.
When you want to reach more people, in multiple languages, in a digital native mode, and with fast access to insights, we think our human-curated method can be trusted.
If you have any questions, or want to test out our tools, drop us a line at hello@colectiv.tech