
Faculty Insights

Large Language Models Perform as Strong Collaborators and Insight Generators in AI-Human Hybrid Marketing Research Study

By Wisconsin School of Business

September 11, 2024

[Image: AI graphic of a human profile with an AI-patterned head and the UW crest]

New research from the Wisconsin School of Business suggests generative AI (GenAI) assistants can be valuable collaborators and help deliver significant efficiency and effectiveness gains in the marketing research process when placed in an AI-human hybrid (GenAI with human oversight) partnership.

The study, forthcoming in the Journal of Marketing, by Neeraj Arora, Arthur C. Nielsen, Jr. Chair in Marketing Research and Education and a professor of marketing; Ishita Chakraborty, Thomas and Charlene Landsberg Smith Faculty Fellow and an assistant professor of marketing; and Yohei Nishimura, a doctoral student in marketing, looks at how large language models (LLMs), a type of AI that can generate text and understand content, can serve as collaborators on a hybrid team carrying out marketing research tasks.

The authors created two end-to-end frameworks—one for qualitative research and one for quantitative—to test how LLMs could be incorporated at various stages of marketing research, such as design, sampling, generating synthetic respondents, and analysis.

Using these frameworks, the research team partnered with a Fortune 500 food company and used an LLM, GPT-4, to replicate two studies the company had conducted in 2019.

Study one

The first study was qualitative in nature and centered on questions about a Friendsgiving celebration, a gathering where friends, rather than family, celebrate Thanksgiving. LLMs were tasked with assisting with data generation and analysis, including creating sample characteristics, generating synthetic respondents, and conducting and moderating in-depth interviews.

Results: For qualitative research, the research team found that LLMs were excellent assistants for data generation and analysis. On the data generation front, LLMs effectively created desirable sample characteristics, generated synthetic respondents that matched those characteristics, and conducted and moderated in-depth interviews. The results suggested that LLM-generated responses were superior in depth and insightfulness. LLMs also performed well as analysts, matching human experts in identifying key ideas, grouping them into themes, and summarizing information. Interestingly, although LLMs missed some themes that humans detected, they also surfaced new ones that humans did not. Expert judges found that human-LLM hybrids outperformed their human-only or LLM-only counterparts.

“The upshot here is that LLMs and humans bring unique, complementary insights to the table that managers should leverage,” says Chakraborty. “Analysis, synthetic respondents, creating sample characteristics—these are all areas where LLMs shine.”

Study two

The second study, which was quantitative, focused on testing a new refrigerated dog food concept. The research team designed the system architecture and prompts to create personas, ask questions, and obtain responses from synthetic respondents.
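As an illustration of what such a setup can look like in practice, here is a minimal sketch, assuming the OpenAI Python SDK (v1+) and GPT-4 access, of prompting a single synthetic respondent built from a persona. The persona text, question wording, and rating scale are hypothetical illustrations, not the study's actual system architecture or prompts.

```python
# Minimal sketch: one synthetic respondent answering one survey question.
# The persona and question are illustrative assumptions, not the study's prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A persona drawn from target sample characteristics (hypothetical example)
persona = (
    "You are a 34-year-old dog owner in the Midwest who shops for premium "
    "pet food and values freshness and ingredient transparency. Answer as "
    "this person would in a market research survey."
)

question = (
    "On a scale of 1 (not at all likely) to 7 (extremely likely), how likely "
    "are you to try a new refrigerated dog food? Give the number and one "
    "sentence explaining why."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": question},
    ],
    temperature=1.0,  # some randomness so repeated respondents do not all answer identically
)

print(response.choices[0].message.content)
```

Repeating this kind of call across many generated personas yields a synthetic sample whose answers can then be compared against the human benchmark data.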

Results: The findings revealed that the LLM captured the direction of answers well: when the human average for a variable fell toward the lower end of the scale, the synthetic data average tended to be low as well, and vice versa.

For the base GPT-4 model, however, the variance in the synthetic data was consistently smaller than in the human data, and the correlations between variables in the human data, a measure of reliability, were not recovered well by the LLM. To correct for this limitation, the researchers tested two approaches to incorporating context: few-shot learning and retrieval-augmented generation (RAG). The former leverages previous answers the LLM gave to generate the next answer, and the latter leverages existing contextual information, in this case the results of a related qualitative study the company had conducted. Both approaches show promise for improving synthetic survey data quality, helping to increase the heterogeneity and reliability of LLM answers.
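To make the two context-injection ideas concrete, the sketch below, again using hypothetical prompts rather than the study's materials, contrasts how the conversation sent to the model differs: few-shot prompting replays a respondent's earlier answers as prior turns, while a RAG-style prompt prepends retrieved excerpts from related qualitative research.

```python
# Sketch of the two context strategies described above (illustrative only).
persona = "You are a 42-year-old suburban dog owner who usually buys premium pet food."
question = "How appealing is a refrigerated dog food sold in the fresh aisle? (1-7, plus one reason)"

# Few-shot: include this synthetic respondent's earlier answers as prior turns,
# so the next answer stays consistent with what the respondent already said.
few_shot_messages = [
    {"role": "system", "content": persona},
    {"role": "user", "content": "How often do you buy fresh or refrigerated food for yourself?"},
    {"role": "assistant", "content": "Several times a week; freshness matters a lot to me."},
    {"role": "user", "content": question},
]

# RAG: retrieve relevant excerpts from an earlier qualitative study and prepend
# them as background context for the persona.
retrieved = (
    "Interview excerpt: owners liked 'human-grade' and 'fresh' cues but worried "
    "about refrigerator space and shelf life."
)
rag_messages = [
    {"role": "system", "content": persona + "\n\nBackground from prior qualitative research:\n" + retrieved},
    {"role": "user", "content": question},
]
```

Either message list can be passed to the same chat-completion call as in the earlier sketch; the added context is what nudges the synthetic answers toward more human-like variance and between-variable correlations.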

Incorporating context into an LLM through RAG, and showing that doing so improves the quality of the generated data, is a contribution unique to this study, one that other studies in the field have not examined in detail.

“LLMs as an intelligent engine could prove to be a revolutionary generator of prior information for a wide variety of business questions at a low cost,” Arora says.

As GenAI disrupts the field, the impact on marketing research is likely to be transformational. The industry continues to evaluate potential opportunities and use cases while new companies emerge at the intersection of marketing research and GenAI.

While untold avenues for future research and practical applications remain, Arora notes two important considerations related to the group’s study: cost and margin for LLM error.

“A significant advantage of LLMs as an assistant is their low cost and scalability,” he says. “We believe that these factors will contribute toward rapid adoption of LLMs for insight generation.

“It is important to note that LLMs can be wrong, biased, or prone to hallucination when they have not been trained on the relevant data,” says Arora. “Therefore, the human supervisor is a necessary part of the marketing research knowledge production process.”

