A robot walks into a bar: Can artificial intelligence do natural wit?

Challenge
What makes something funny? A research team from Google DeepMind wanted to understand how well large language models (LLMs) serve as creativity support tools for comedy writing, but they faced several challenges:
- Finding professional comedians with experience using AI in their creative process
- Creating an environment for authentic evaluation of AI-generated comedy material
- Gathering detailed feedback about moderation, censorship, and cultural biases in LLMs
- Understanding the complex, subjective nature of humor and creative writing
Solution
Thanks to Prolific, the researchers were able to connect with exactly the right participants for their specialized study. Prolific made a difference by providing:
Outstanding professional recruitment. Prolific helped connect the researchers with comedy professionals from diverse backgrounds who were performing at the 2023 Edinburgh Festival Fringe and beyond, ensuring participants had genuine expertise in comedy performance.
This kind of specialized recruitment would have been difficult to achieve through traditional methods, but it meant the study could include genuine comedy experts with exactly the right blend of AI experience and performance credentials.
Seamless workshop organization. Through Prolific's streamlined participant management, the researchers effortlessly organized four workshops (one in-person, three online) with excellent attendance and engagement.
Participants took part in comedy writing sessions using popular LLMs, completed detailed evaluations, and contributed thoughtful perspectives in focus groups.
Comprehensive research ecosystem. Prolific perfectly supported a multi-method study, which allowed researchers to gather rich quantitative and qualitative data in one cohesive research environment. The platform's intuitive design made it easy to integrate creativity metrics assessments alongside in-depth discussions about AI ethics in comedy.
Execution
Participants spent approximately three hours in workshops that included a 45-minute comedy writing session using ChatGPT or Bard (now succeeded by Gemini). They were also asked to complete the standardized Creativity Support Index to quantitatively measure the tools' effectiveness.
Focus group discussions then explored the strengths and limitations of AI for comedy writing. Lastly, participants had detailed conversations about ethical issues including censorship, bias, and ownership.
Results
The study uncovered some eye-opening truths about where LLMs currently fall short when it comes to comedy writing:
- Most comedians found AI-generated content "bland" and lacking the personal voice essential for good comedy: "like cruise ship comedy material from the 1950s, but a bit less racist", as one participant put it.
- LLMs struggled with offensive humor that "punches up", a common technique comedians use to challenge power structures.
- Safety filters and moderation strategies in LLMs were seen as problematic forms of censorship that disproportionately affected minority perspectives.
- Comedians found LLMs particularly lacking in context awareness, personal experience, and ability to generate surprising content, all vital elements for effective comedy.
The research demonstrated that despite advancing capabilities, LLMs still lack fundamental human qualities necessary for high-quality comedy. The Creativity Support Index score for LLMs as comedy writing tools was mediocre (54.6 out of 100), highlighting significant room for improvement.
Conclusion
For researchers studying human-AI creative collaboration, Prolific provided the ideal solution to recruit specialized participants and facilitate complex workshops combining multiple research methodologies.
Citation: Mirowski, P.W., Love, J., Mathewson, K., & Mohamed, S. (2024). A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians. FAccT '24, June 3-6, 2024, Rio de Janeiro, Brazil.
Research institution: Google DeepMind