Poster Session 3

Friday, May 23rd 3:00 - 4:00 pm

ICM Lobby

1: The Role of Cognitive Inhibition in Creativity

Ilana Shinder, Technion
Yoed Kenett, Technion

This study investigates the relationship between cognitive inhibition and creativity by examining inhibitory control during semantic memory retrieval. We explore the efficacy of associative (spontaneous) and dissociative (controlled) retrieval processes and their correlation with creativity measures such as divergent thinking, intelligence, and personality traits, utilizing standardized stimuli. Forty participants engaged in creativity-related tasks, both inside and outside an fMRI scanner, focusing on divergent thinking, associative retrieval, and dissociative control. Functional connectivity analyses within the fMRI scanner assessed neural connectivity patterns linked to different creativity components. Preliminary findings indicate a positive correlation between cognitive inhibition and several creativity measures, suggesting that inhibition may enhance idea fluency, while associative fluency contributes to idea originality. The overlap in neural connectivity during divergent thinking and memory retrieval tasks points to shared neural pathways, providing insights into the cognitive and neural mechanisms that underpin creativity. This research offers a nuanced understanding of how executive functions interact with associative processes in creative thought.

2: The Dixit Study - A Novel Gamified Paradigm to Study Linguistic Creativity

Daniel Lennart Müller, University of Osnabrück

Operationalisation remains one of the main challenges in (neuro-)scientific research on creativity. With the experimental set-up presented here, I aim at a more ecologically valid paradigm that evaluates creativity more naturally. To this end, participants play the game Dixit in the following way: Each player starts with six cards depicting imaginative drawings, e.g., an anchor in the desert. One player, the storyteller, has to find a word or sentence that creatively describes a card from their hand without showing it. The other players select cards that match the description. The storyteller shuffles and reveals all selected cards, and players guess which card was the storyteller's. Points are awarded to the storyteller if more than one but not all players guess correctly. This reflects the two classical dimensions of creativity: novelty and sensibility. After each round, players draw new cards, and the next player becomes the storyteller. EEG is recorded during the phase in which the storyteller thinks about what word or sentence to use. The following conditions are compared: success, where some but not all players guess the card correctly; overshoot, where all players guess correctly; and undershoot, where no one guesses correctly. This is contrasted with a control group in which participants perform a simple image description task while EEG is recorded. The hypothesis is that in the Dixit condition there is a change in synchronicity of the upper alpha band corresponding to divergent and convergent phases in the creative process, as found by Eymann et al. (2024), which is missing in the control task. One might also compare the success condition with the over- and undershoot conditions.

3: Creativity Resources Network: An Expert Study

Dominik Golab, University of Louvain (Belgium)
Baptiste Barbot, University of Louvain (Belgium)/ Child Study Center at Yale University

The ontological status of creativity is still unclear in psychological science, and multiple efforts are being made to establish a more consistent conceptualization. Among different theoretical approaches, creativity can be conceptualized as a higher-order, multi-resource construct. Which resources define creativity best? This pre-registered study employed an expert elicitation approach to identify key creativity resources based on expert consensus. A total of 239 creativity psychology researchers from across the globe evaluated 20 creativity-relevant resources, among others. Two indicators were derived from the experts’ evaluations: importance and centrality. The top five resources rated highest in importance were Curiosity, Openness, Imagination, Divergent Thinking, and Associative Thinking, as identified through a Many-Facet Rasch Model, which controlled for potential biases in experts' scoring. Centrality within the creativity resources network, estimated using the Expected Influence indicator in Network Analysis, highlighted Tolerance of Ambiguity, Openness, Selective Combining, Evaluation, and Intrinsic Motivation as the most central resources. Differences in evaluations and definitions were explored across subfields of psychology, judge characteristics, and cultures. For example, clinical psychologists emphasized resources such as Risk Taking, and experts from Eastern collectivistic cultures (based in Asia) rated Agreeableness and Intelligence as more important resources for creativity. Together, this study constitutes an important contribution to the ontology of creativity, providing a foundation for future empirical validation and the design of creativity interventions.

4: From Theory to Canvas: Harnessing Nostalgia for Innovation in Art

Eirini Petratou, Visual Artist & Researcher (PhD in Creativity, Panteion University of Athens)

Nostalgia, a powerful emotional state that reconnects us to cherished memories of the past, has been shown to enhance creative thinking by stimulating imagination, memory, and emotion. My doctoral research explored the role of nostalgia in creativity, with a particular focus on how nostalgic stimuli, such as scents, can trigger vivid memories and unlock innovative thought processes. This presentation bridges academia and artistic practice, demonstrating how these findings informed my creative process as a visual artist. Drawing on my research, I developed a series of paintings that intentionally incorporate the emotional and cognitive activation triggered by nostalgia. By immersing myself in nostalgic experiences—especially through the use of evocative odors—I was able to access deeply personal memories and translate them into visual expression. Experimenting with various techniques, textures, and color palettes, I sought to reflect the multi-sensory and emotional depth of nostalgia on the canvas. Each painting became an exploration of memory, emotion, and creative innovation, revealing how past experiences can inspire new artistic directions. This presentation will delve into the interplay between nostalgia and artistic practice, showcasing how memory and emotion shaped my creative decisions. It will also discuss the broader implications of using nostalgia as a tool for innovation in visual art, offering insights for artists, researchers, and creative practitioners alike.

5: Improving Assessment Accuracy in Visual Creativity Tasks Through Feedback and Reflection

Marta Czerwonka, University of Wroclaw
Jakub Jędrusiak, University of Wroclaw
Aleksandra Zielińska, University of Wroclaw
Izabela Lebuda, University of Wroclaw
Maciej Karwowski, University of Wroclaw

During creativity evaluation, people struggle with uncertainty about how to understand the criteria and about the outcomes of their decisions. The ability to discern more from less creative ideas is a core component of metacognitive monitoring, with accuracy serving as an indicator of performance. Strengthening metacognitive skills by building a learning experience and providing task-relevant knowledge is key to improving assessment accuracy. The aim of this study was to examine the effectiveness of a metacognitive intervention in facilitating the process of learning to assess creative products more accurately. For fourteen days, university students (N = 164) assigned to experimental, active control, and passive control groups participated in daily tasks designed to stimulate creative thinking. Experimental group participants engaged in metacognitively oriented tasks and were trained in evaluating visual creativity products. The assessment-skills training was extended on certain days by integrating real-time feedback and prompts for reflection. Potential changes in accuracy were examined four times. Results showed that students who participated in metacognitive training demonstrated higher accuracy in assessing drawings compared to both the active and passive control groups. The learning effect on accuracy over time was strongest in the experimental group. We also demonstrated that days when students (N = 61; 755 student-day units) received feedback and were prompted to reflect on the external judges' perspective were associated with higher accuracy than days when only evaluation skills were trained. These findings open an avenue for developing metacognitive interventions.

6: Advanced Analysis Pipeline for Mobile EEG Recordings in Real-World Settings

Lianne Sánchez-Rodríguez, IUCRC BRAIN, University of Houston
Aime J. Aguilar-Herrera, IUCRC BRAIN, University of Houston
Maxine A. Pacheco-Ramírez, IUCRC BRAIN, University of Houston
Yoshua E. Lima-Carmona, IUCRC BRAIN, University of Houston
Esther A. Delgado-Jiménez, IUCRC BRAIN, Tecnológico de Monterrey
Derek Huber, IUCRC BRAIN, University of Houston
Mauricio A. Ramírez-Moreno, IUCRC BRAIN, Tecnológico de Monterrey
Jesús G. Cruz-Garza, IUCRC BRAIN, University of Houston
José L. Contreras-Vidal, IUCRC BRAIN, University of Houston

The need to study the creative brain in real-world scenarios has led to the deployment of Mobile Brain/Body Imaging (MoBI) technologies during artistic performances such as dance, music, or visual art making. However, this ecological approach encounters significant challenges due to physiological and non-physiological artifacts generated by freely behaving participants in real settings such as performance venues, theaters, or museums. Unfortunately, no standardized method exists for processing MoBI data in such scenarios. To address this critical gap, we propose best practices and a novel methodology for cleaning electroencephalography (EEG) signals using electrooculography (EOG), head motion, and video to identify and remove artifacts online or offline. The proposed pipeline comprises three stages: (1) a detailed visual inspection of EEG signals aligned to head motion, video, and synchronized recording triggers; (2) two sequential artifact removal stages using adaptive noise canceling (ANC) to eliminate eye blinks, motion artifacts, biases, and signal drifts; and (3) dipole source localization and clustering, complemented with analysis of the spectral content, bispectrum estimation, and functional connectivity. This comprehensive framework has been rigorously validated through brain synchronization studies in diverse artistic contexts such as dance, acting, and music, offering an unprecedented, nuanced view of neural activity in real-world creative scenarios and potentially revealing relationships between brain structure and cognitive-artistic function. Implementation of the proposed pipeline is discussed in companion abstracts.

7: The Cognitive-Nodal Model: A new conceptual model of innovation

Lourenço MD Amador, University of Melbourne

Innovation has been essential to human progress and the development of the 21st-century world. It has become a major buzzword in modern times, both in our daily lives and in research. Yet multidisciplinary research on innovation only gained momentum in the 1970s, when the field of “Innovation Studies” became established, and despite this popularity, interdisciplinary conceptualizations and models of innovation and its skills have not been properly investigated. Therefore, this research sought to address these gaps by answering two main questions: “What is innovation?” and “Is innovation a skill?”. The approaches used were multidisciplinary literature reviews, to understand how different perspectives conceptualize innovation and its skill; a Prototype Analysis, to investigate the modern layperson’s conceptualization of innovation, a perspective lacking from the literature; and a pilot study based on a novel methodology for conceptual analysis, used to investigate and produce expert-filtered definitions and models of innovation and the skill of innovation. From these theoretical and experimental approaches, a new model of innovation was developed: the cognitive-nodal model. Along with this conceptual model, the following definitions of innovation and its skill were also developed: Innovation is a novel development of a tool (be it through creation or alteration) which leads to a solution within one or more domains, that is communicable across one or more systems, and leads to a significant change in its context. The agent(s) involved in innovation are visionary, and capable of high-level ideation and knowledge implementation towards solving difficult problems.

8: Analysis of exploration and exploitation behaviors during a problem-solving task in Montessori and non-Montessori children

Nino Minzon, ISMM, Université Paris Cité
Manon Roha, ISMM, Université Paris Cité
Leila Dagneaux, ISMM, Université Paris Cité
Anaelle Camarda, ISMM, LaPEA, Université Paris Cité

This study examines the explore-exploit behaviours of children aged 5 to 7 years when solving a creative problem-solving task, and how these behaviours evolve with age and educational curriculum (Montessori vs. non-Montessori school). We analyzed videos of children trying to solve the "box task" (German & Defeyter, 2000) using a structured observation grid. We hypothesized that: 1) Montessori children should develop functional fixity earlier than non-Montessori children (from the age of 5), given that they evolve in an environment which fixes the specific use of an object earlier than in the traditional curriculum; and 2) they may nevertheless not be affected by this cognitive rigidity thanks to a compensatory phenomenon related to the increased effectiveness of certain cognitive functions essential to creative problem solving (Marshall et al., 2017), especially exploration mechanisms. The videos of 165 children completing the box task were analysed using a coding grid, helping to identify specific behaviours related to exploration and exploitation strategies. Initial analyses reveal that younger children (age 5) show more varied and less functionally fixed exploration, while older children (ages 6-7) show greater perseverance in their strategies although they are more sensitive to the effect of functional fixity, regardless of their schooling. This effect is especially marked for Montessori children, who match the creative abilities of the other children thanks to heightened exploratory behaviors. These results highlight the role of age and the educational environment in the development of creative abilities, and will be discussed with regard to cognitive theories of creativity.

9: Leveraging Shape-of-Thought to Train Creativity: A Pilot Study

Nathalia da Cruz Alves, Georgetown University
Oded Kleinmintz, Georgetown University
Kibum Moon, Georgetown University
Adam Green, Georgetown University

Creativity is a critical competency in the 21st century, necessitating tools that measure and enhance creative potential. This pilot study evaluates the effectiveness of the Shape-of-Thought (SoT) Training Tool, developed by the Georgetown Laboratory for Relational Cognition, in fostering creativity through interactive visualizations of verbal outputs. Seventy-five participants completed either Alternative Uses Tasks (AUT) or Divergent Association Tasks (DAT), with originality assessed pre- and post-intervention using the Open Creativity Scoring AI. Results revealed a significant increase in mean originality scores post-intervention (p = 0.027), although maximum originality scores remained stable, preliminarily suggesting the tool's primary effect on average creativity rather than peak performance. Mixed-design ANOVA demonstrated a more pronounced improvement in originality for DAT tasks compared to AUT tasks, suggesting task-dependent sensitivity to the intervention. These findings highlight the potential of the SoT-based feedback to elevate creative output, and underscore the importance of task design in evaluating creativity interventions. Future research will explore the impact of verbal versus visual feedback.

10: Shape-of-Thought: geometric features derived from LLM-based embeddings differentiate human and AI semantic content

Oded M. Kleinmintz, Georgetown University
Kibum Moon, Georgetown University
Nathalia Da Cruz Alves, Georgetown University
Yaxin Liu, Georgetown University
Mafalda Cardoso-Botelho, Georgetown University
Adam E. Green, Georgetown University

The scoring of creative performance remains a “black box”, generally affording little insight into the creative process. To address this issue, we leveraged state-of-the-art embedding models to develop a method for extracting geometric features from human-generated text. This “Shape-of-Thought” (SoT) approach tracks the progression of ideas through semantic space as a person (or other agent) “travels” from one idea to the next. This travel may carry individual information about personal thinking styles, strategies, preferences, and habits, which can be identified via characteristic geometric features such as the spacing of ideas, the volume of the performance in the embedding space, the number of clusters, the linearity of the performance, and the uniqueness of the features. We hypothesized that if SoT can pick up on individual features of semantic performance, we might be able to utilize these individual differences to differentiate the authorship of written content (human or AI). Humans and AI models performed an essay writing task and a creative short story writing task. To test whether SoT features can be used to classify new writings as human or AI, the features were used to train several classification models. The results demonstrated the merit of the SoT method, with several classification models reaching F1 scores above 0.95. We discuss the method as well as implications and future directions.
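To make the geometric-feature idea concrete, here is a minimal Python sketch of the kind of trajectory descriptors described above (spacing, spread, linearity, cluster count). It is illustrative only: the feature definitions, the toy data, and the library choices are assumptions, not the authors' implementation.

```python
# Illustrative sketch only: simple geometric descriptors of an "idea trajectory"
# in embedding space. Feature definitions are assumptions, not the SoT codebase.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def shape_of_thought_features(embeddings: np.ndarray, n_clusters: int = 3) -> dict:
    """embeddings: (n_ideas, dim) array, one row per idea, in generation order."""
    steps = np.diff(embeddings, axis=0)               # consecutive idea-to-idea moves
    step_lengths = np.linalg.norm(steps, axis=1)      # "spacing of ideas"

    pca = PCA(n_components=min(3, len(embeddings) - 1)).fit(embeddings)
    volume_proxy = float(np.prod(np.sqrt(pca.explained_variance_)))   # spread proxy
    linearity = float(pca.explained_variance_ratio_[0])               # 1.0 = ideas on a line

    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)

    return {
        "mean_step": float(step_lengths.mean()),
        "volume_proxy": volume_proxy,
        "linearity": linearity,
        "n_clusters": len(set(labels)),
    }

# Toy example: 12 "ideas" embedded in 384 dimensions (random stand-in data).
print(shape_of_thought_features(np.random.randn(12, 384)))
```

Per-person features of this kind could then feed standard classifiers (e.g., logistic regression or gradient boosting) to separate human from AI authorship, as described in the abstract.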

11: Neuroscience, the new frontier for the comprehension of the architectural lighting design function

Richard Caratti-Zarytkiewicz, Association des Concepteurs lumière et Eclairagistes - Association Française de l'Eclairage

In a 2014 interview, lighting designer Charles G. Stone described the perceptual purpose of lighting design by comparing the perceptual exercise in architecture and in the theatre: “in the theatre you sit in a chair and the scene changes in front of you or around you. In architecture you move through the ‘theatre’”. This comparison perfectly illustrates how lighting design, as a visual tool, addresses our mind. Through the inner idea of the environment delivered in the project we explore, we are stimulated either to go and check its message (in the architecture) or to engage in further imaginary elaboration (from our theatre seat, but also sometimes in the architecture). Following William M. C. Lam’s cross-checking reflection in “Perception and Lighting as Formgivers for Architecture”, lighting addresses ‘the attributive stage of perception’ and the ‘attributive classification of visual stimuli’, i.e. the way we anticipate our movement through the architecture. ‘Affective components of perception’ and the ‘attributive classification of visual stimuli’ through lighting assign a functional or emotional ‘meaning’ to the rooms or objects we meet, so that we determine why and how we will interact with them for functional or emotional reasons. These considerations, associated with work on neuroaesthetics (Zeki), place cells and spatial memory (O’Keefe, Moser & Moser), and mind-body interaction (Gallese), will build up a mainstay for the theoretical thinking of the architectural lighting designer. Architectural lighting designers, architects, and neuroscientists must elaborate a common protocol that will bring knowledge of the human brain into architectural reflection.

12: The Art of Positivity in Drawing: Unveiling the Impact of Positive Mood States on Visual Creativity via Deep Learning

Sam Cong, University of Chicago

This study examines the mood-creativity linkage by integrating visual creative tasks with state-of-the-art computational techniques. Following the flexibility pathway in the dual pathway to creativity model, it hypothesizes that positive activating moods enhance cognitive flexibility, thereby increasing originality in creative output. Diverging from the use of verbal tasks to measure creativity, this study employs an incompleteness drawing task to track the dynamics of the creative process. An online experiment induces mood using validated film clips and presents three qualitatively distinct incomplete shapes, collecting narrative accounts from 90 participants recruited through MTurk. The study’s main contribution lies in its integration of drawing tasks with AI techniques to quantitatively assess flexibility and originality in creativity. Flexibility is measured using the Compositional Stroke Embedding model, which applies a Gaussian Mixture Model to predict potential strokes in the incompleteness drawing task. This approach evaluates the uncertainty and variability between possible strokes using entropy-related metrics to capture the dynamic range of creative options. Additionally, flexibility is assessed through Divergent Semantic Integration, which leverages BERT-generated embeddings to analyze the integration of different semantic categories. Meanwhile, originality is assessed with the Automated Drawing Assessment model trained on human ratings. By linking mood states to creativity through AI techniques, this study advances understanding of the flexibility pathway to creativity and highlights AI’s transformative potential in studying human cognition.

13: CrossModal Correspondence-based Multisensory Integration from Haptic-Auditory-Visual (HAV) Cues: A Pilot Study

Swati Banerjee et al.

We live in a multisensory world, where all our senses work together to give us a fulfilling experience of the environment we are in, including during our use of immersive technologies. Immersive technologies such as audiovisual or tactile-based virtual environments are entering every aspect of our lives, and it is becoming very important to study them. To gain more insight into the temporal dynamics of the integration phenomenon, EEG-based BCI can reveal transient changes in the brain. In this study, we investigated the potential of incorporating haptics into crossmodal-correspondence-based research to induce a Multisensory Integration (MSI) effect, either through active-touch user feedback or through crossmodal correspondences with the visual and auditory modalities, such as the Kiki-Bouba effect. Results suggest that it is indeed possible to achieve this type of integration without relying on complex haptic devices. Introducing haptics into BCI technologies through feedback touch or crossmodal correspondences holds potential to improve the user experience and the information transfer rate (ITR). Participants, as expected, showed the lowest reaction times in the congruent sequential test and the highest in the incongruent HAV-cue test. This indicates a preference for sequential cue presentation over simultaneous presentation. Reaction times were significantly higher in the case of incongruent haptic cues.

14: Studying fixation effects regarding food systems: A pathway to foster consumers’ creativity for food transition

Victor Lasquellec ¹, Elsa T. Berthet ¹,², Anaëlle Camarda ³, Emilie Salvia ⁴, Pascal Le Masson ⁵ and Mathieu Cassotti ⁴

¹ UMR 7372 Centre d’Études Biologiques de Chizé (CEBC), LTSER Zone Atelier Plaine & Val de Sèvres, La Rochelle
Université, 79360 Villiers-en-Bois
² USC 1339 Résilience, Centre d’Études Biologiques de Chizé (CEBC), LTSER Zone Atelier Plaine & Val
de Sèvres, La Rochelle Université, 79360 Villiers-en-Bois
³ UMR 7708T LaPEA, Université Paris Cité, Université Gustave Eiffel ; Institut Supérieur Maria Montessori, 94130
Nogent-sur-Marne
⁴ UMR 8240 Laboratoire de psychologie du développement et de l'éducation de l'enfant (LaPsyDÉ), CNRS,
Université Paris Cité, 75005 Paris
⁵ UMR 9217, Mines ParisTech, PSL Université, CGS, i3, CNRS 75272 Paris

As industrialized food systems have major impacts on health and the environment, food system transformation has become crucial. Our ongoing research examines consumers' ability to generate innovative solutions with the goal of changing food practices. We investigate obstacles to consumers’ innovation, particularly cognitive biases such as fixation effects: the triadic creativity model suggests that individuals often follow the "path of least resistance" in problem-solving, hindering original solutions. Subsequent research demonstrates that strategies tailored to specific fixation patterns can help overcome fixation effects and stimulate creativity. Identifying fixation effects is thus crucial for enhancing consumers' capacity to propose innovative solutions for the food transition. In this empirical research, we apply an approach combining cognitive psychology and design science. We conducted creativity tasks in a rural area of western France with 865 children and teenagers. We elaborated creativity briefs related to food transition, in particular on local food, which is the focus here: How can I ensure that everything I eat comes from my village? To analyze the data collected, we used C-K (Concept-Knowledge) design theory to map idea exploration pathways, categorized the participants’ ideas, and then calculated their frequency of occurrence. We observe fixation effects such as “growing individual vegetable gardens”, and we highlight differences between age groups. We further detail fixation effects within subcategories of ideas. This empirical study, focused on crucial environmental issues, thereby offers new methodological developments.

15: Creativity and Beauty: Is symmetry a good strategy for visual composition?

Yejeong Mutter, University of Konstanz
Ronald Huebner, University of Konstanz

This study explores symmetry’s influence on image creation and evaluation. Our previous study using visual composition revealed distinct patterns in beauty and creativity perception across expertise levels, with experts showing discriminating ability in creativity assessment. Among four picture types (lined-up, dispersed, clustered, and semantic), lined-up compositions received high beauty ratings, while semantic compositions were preferred in creativity assessment. However, the previous experiment used seven different-sized circles, making symmetrical composition impossible. Since symmetry is a primary compositional element evoking aesthetic preference and could aid in creating semantic images, the present study provided five pairs of differently sized circles. The two-stage experimental procedure involved participants first creating the most beautiful and most creative compositions, followed by a second stage in which participants evaluated the beauty and creativity of the created images. Participants were art school graduates (expected experts) and psychology students (expected non-experts), with their expertise and creativity level verified through self-report. This time, images were assigned to multiple categories if necessary. Preliminary results showed that the lined-up type and symmetrical forms influence beauty ratings. For creativity, semantic compositions scored highly, while symmetry showed no clear effect. Notably, compositions with multiple interpretations received high creativity ratings from experts, and beauty and creativity ratings did not always match the creators' compositional goals, suggesting an unclear boundary between beauty and creativity during creation.

16: The Processes Underlying Visual Creative Exploration

Yogev Hendel, Hebrew University of Jerusalem

To achieve diverse solutions in creative search, humans tend to explore the space of possible solutions in search of novel and appropriate ones. A common view of the exploration process is that it is random in nature, until one stumbles upon a solution that ignites a phase of directed movement (exploitation) toward solutions from a similar theme. Yet this view has not been tested or quantified in many explore-exploit tasks, since the search trajectory is not accessible and only the products of the search are recorded. Here, we use data from the Creative Foraging Game (CFG) to study the processes of creative exploration. An advantage of the CFG is that it tracks the entire trajectory of the search process, allowing a peek into the clockwork of explorative behavior. We found that players' exploration is not random: players tend to move towards specific shapes that serve as attractors (e.g., high-symmetry shapes) even during exploration, resulting in a sublinear rate of coverage of the shape network. Through simulations of agents on the network of shapes, we further show that players' exploration statistics follow from a micro-scale process with a power-law distribution over the ranks of chosen moves, indicating highly skewed preferences in the search. Our findings suggest that players' exploration is highly directed in nature, calling for a paradigm shift in how we describe and model explore-exploit behavior.

17: Are LLMs tools to understand human neurocognition during abstract reasoning?

Christopher Pinier, University of Amsterdam
Claire Stevenson, University of Amsterdam
Michael Nunez, University of Amsterdam

Abstract reasoning, a key component of human intelligence, seems to have recently emerged in large language models (LLMs). If so, LLMs could help us provide a mechanistic explanation for the brain processes behind the abstract reasoning abilities of humans. In this study, we compared the performance of multiple LLMs to human performance in a visual abstract reasoning task. We found that while most LLMs cannot perform this task as well as human participants, some LLMs are competent enough for use as potential descriptive models. We propose that the best performing LLMs can be used as models to understand human performance, response times, and the timing of Event-Related Potentials (ERPs) as recorded by electroencephalography (EEG) during the task. We show initial behavioral and ERP results, where we use LLM embeddings and surprisal measures to predict cortical activity patterns. This is the first step in a larger project to create neurally-informed artificial networks as tools to understand human neurocognition.

18: Forma mentis networks predict creativity ratings of short texts via interpretable AI in human and GPT-simulated raters

Edith Haim, University of Trento, Rovereto
Natalie Fischer, University of Trento, Rovereto
Salvatore Citraro, National Research Council (CNR), Pisa
Giulio Rossetti, National Research Council (CNR), Pisa
Massimo Stella, University of Trento, Rovereto

Creativity is a fundamental skill of human cognition. We use textual forma mentis networks (TFMN) to extract network (semantic/syntactic associations) and emotional features from approximately one thousand human- and GPT-3.5-generated stories. Using Explainable Artificial Intelligence (XAI), we test whether features related to Mednick’s associative theory of creativity can explain creativity ratings assigned by humans and GPT-3.5. Using XGBoost, we examine three scenarios: (i) human ratings of human stories, (ii) GPT-3.5 ratings of human stories, and (iii) GPT-3.5 ratings of GPT-generated stories. Our findings reveal that GPT-3.5 ratings differ significantly from human ratings not only in terms of correlations but also because of feature patterns identified with XAI methods. GPT-3.5 favours “its own” stories and rates human stories differently from humans. Feature importance analysis with SHAP scores shows that: (i) network features are more predictive for human creativity ratings but also for GPT-3.5’s ratings of human stories; (ii) emotional features played a greater role than semantic/syntactic network structure in GPT-3.5 rating its own stories. These quantitative results underscore key limitations in GPT-3.5’s ability to align with human assessments of creativity. We emphasise the need for caution when using GPT-3.5 to assess and generate creative content, as it does not yet capture the nuanced complexity that characterises human creative processes.
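For readers unfamiliar with the XGBoost-plus-SHAP workflow mentioned above, the following is a generic minimal sketch under assumed file and column names (e.g., "story_features.csv", "semantic_degree"); it is not the authors' pipeline.

```python
# Illustrative sketch: predict creativity ratings from TFMN-style features with
# XGBoost and inspect feature importance with SHAP. Names are assumptions.
import pandas as pd
import shap
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split

# Hypothetical table: one row per story, network/emotion features plus a rating.
df = pd.read_csv("story_features.csv")
feature_cols = ["semantic_degree", "syntactic_degree", "valence", "arousal"]
X_train, X_test, y_train, y_test = train_test_split(
    df[feature_cols], df["creativity_rating"], test_size=0.2, random_state=0
)

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

# SHAP values: per-story, per-feature contributions to the predicted rating.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, feature_names=feature_cols)
```

Comparing such SHAP summaries across the three rating scenarios is one way to surface the diverging feature patterns the abstract reports.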

19: Promoting Creativity through AI-Generated Visual Concept Blending: An Experimental Study on Divergent Thinking

Fabrizio Serrao, University of Milano Bicocca
Rafael Penaloza Nyssen, University of Milano Bicocca
Lorenzo Olearo, University of Milano Bicocca
Giorgio Longari, University of Milano Bicocca
Simone Melzi, University of Milano Bicocca
Alessandro Gabbiadini, University of Milano Bicocca

Hyper-associative thinking can enhance creativity. Indeed, experimental evidence has shown that combining unrelated concepts can promote creative ideas. While previous studies used text to prompt such combinations, presenting them through images may be even more effective. Artificial Intelligence diffusion models now allow for this by generating blended concepts. The aim of the present study is to test the effects of visual concept blending on creativity. Using a within-between experimental design, we will examine whether AI-generated images improve divergent thinking and which of the employed models, compared to the control condition, elicits the strongest effects. A power analysis indicated that 70 participants are needed to test the hypotheses, assuming a small effect size (f = 0.2). Two AI-based diffusion models will be adopted to generate visual blends of unrelated concepts. Participants will be divided into three groups. Two groups will view images created through the diffusion models by blending pairs of unrelated objects (e.g., “chair” + “apple”) and will write a short description of each image. The control group will view and describe images showing the same objects unblended (e.g., a chair and an apple side by side). Time and word limits will be standardized across all groups. Divergent thinking will be assessed before and after exposure to the AI-generated images using the Alternate Uses Test. We hypothesize that participants exposed to blended images will show greater improvements in creative thinking than those in the control condition. These findings could offer strategies for using AI to enhance, rather than replace, human creativity.

20: Thinking Outside of the (Black) Box for Scoring Creative Thinking Assessments

Kristin Lansing, NeoWise
Garrett Jaeger, winded.vertigo

Scoring for creative thinking assessments often lacks transparency and standardization. We propose a creative thinking task type and scoring method that provides both. The method combines skill-specific task types, designed around explicit skill operationalizations, with scoring that uses stimulus-specific thematic frequencies (i.e., flexibility categories) across a spectrum of conventionality. We discuss methods for skill operationalization and task design, as well as the procedure for identifying themes (using both humans and AI), establishing thematic frequencies, and coding responses. We illustrate the advantages of the procedure for improving the transparency and standardization of creative thinking assessments.

21: Evidence of Homogenization in Human Writing After ChatGPT

Kibum Moon, Georgetown University
Andrew Bank, Georgetown University
Kostadin Kushlev, Georgetown University
Adam Green, Georgetown University

Despite the rapid advancement of Large Language Models (LLMs), there are growing concerns that LLMs may limit the collective diversity of creative outputs. We examined this homogenizing effect of LLMs by comparing the collective diversity of human writings before and after the public release of ChatGPT in 2022. For this, we analyzed 1,400 college admissions essays, consisting of 200 randomly selected essays from each admissions cycle between 2019 and 2025. Then, we assessed the collective diversity of essay groups submitted each year by measuring the cosine distances between the sentence-level embeddings in each essay corpus. We found that the collective diversity was significantly lower for post-ChatGPT groups than for pre-ChatGPT groups. This homogenization effect was increasingly evident when more essays were combined into the groups. Specifically, each additional post-ChatGPT essay contributed significantly fewer novel ideas compared to each additional pre-ChatGPT essay. Importantly, when comparing the average semantic diversity at the individual level, we found no significant difference between pre-ChatGPT and post-ChatGPT groups. Together, our results show that individual post-ChatGPT essays are equally creative compared to pre-ChatGPT essays; however, post-ChatGPT essays lack creative diversity when evaluated at the collective level. Our findings offer empirical evidence supporting the LLM-induced homogenization in the real world, highlighting the potential risk of LLMs reducing creative and cultural diversity within society. We plan to replicate our findings using our full dataset of 135,000+ essays and other writings.
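A minimal sketch of the collective-diversity metric described above (mean pairwise cosine distance between sentence embeddings pooled over an essay group) is given below. The embedding model, the naive sentence splitting, and the toy inputs are assumptions for illustration, not the study's exact procedure.

```python
# Illustrative sketch: collective diversity as mean pairwise cosine distance
# between sentence-level embeddings pooled across an essay group.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_distances

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def collective_diversity(essays: list[str]) -> float:
    # Naive sentence split on periods, pooled across all essays in the group.
    sentences = [s.strip() for e in essays for s in e.split(".") if s.strip()]
    emb = model.encode(sentences)
    dist = cosine_distances(emb)
    iu = np.triu_indices_from(dist, k=1)   # upper triangle = unique pairs
    return float(dist[iu].mean())

# Toy usage with stand-in text; the study used 200 essays per admissions cycle.
pre = collective_diversity(["Essay text from 2019.", "Another 2021 essay."])
post = collective_diversity(["Essay text from 2023.", "Another 2024 essay."])
print(f"pre-ChatGPT diversity: {pre:.3f}, post-ChatGPT diversity: {post:.3f}")
```

Lower mean pairwise distance for a group indicates more homogeneous content, which is the pattern the abstract reports for post-ChatGPT cohorts.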

22: Evaluating Creative Short Story Generation in Humans and Large Language Models

Mete Ismayilzada, Idiap Research Institute, EPFL
Claire Stevenson, University of Amsterdam
Lonneke van der Plas, Università della Svizzera Italiana

Storytelling is a fundamental aspect of human communication, relying heavily on creativity to produce narratives that are novel, appropriate, and surprising. While large language models (LLMs) have demonstrated the ability to generate high-quality stories, their creative story-telling capabilities remain under-explored. In this work, we conduct a systematic analysis of creativity in short story generation across 60 LLMs and 60 people using a five-sentence creative story task. We used measures to automatically evaluate model- and human-generated stories across several dimensions of creativity, including novelty, surprise, diversity, elaboration and linguistic complexity. We also collected creativity ratings and Turing Test classifications from both novices and experts. Automated metrics show that LLMs generate stylistically complex stories but tend to fall short in terms of novelty, surprise and diversity when compared to average human writers. Expert ratings generally coincide with automated metrics. However, novices rate LLM stories to be more creative than human-generated stories. We discuss why and how these differences in ratings occur and their implications for both human and artificial creativity.

23: Automated scoring of creative achievements

Noah Meinzer, University of Graz
Janika Saretzki, University of Graz
Mathias Benedek, University of Graz

Creative achievement is a key criterion for validating measures of creative potential. Although existing inventories of creative achievement (e.g., CAQ, ICAA) are comprehensive, they risk overlooking relevant achievements due to the diversity of creative accomplishments across domains. As an alternative, a bottom-up approach invites participants to openly report their three most creative achievements, potentially offering a more inclusive and efficient process. However, this method requires scoring participants' descriptions, typically through human ratings. Here, we explore the potential for Large Language Models (LLMs) to effectively score these creative achievements. While data analysis is ongoing, initial findings reveal strong correlations between human and LLM-based ratings, even with out-of-the-box models (zero-shot prompting). We present comparative analyses of different LLMs (e.g., GPT-4, Llama), with results that appear robust across two datasets. Additionally, we explore zero-shot classification for automatically extracting the creative domains represented in each achievement. Advantages and limitations of these methods are discussed. In sum, automated scoring of self-reported creative achievements shows promise as an efficient alternative to traditional creative achievement inventories.

24: The Creative Metaphor Task: A Psychometric Assessment of Metaphor Production in English and Spanish

Hannah Merseal, University of Pennsylvania
Benjamin Goecke, University of Tübingen
Paul DiStefano, Penn State University
Janet van Hell, Penn State University
Roger Beaty, Penn State University

Metaphors are essential for communication, offering insight into creativity and cognition through figurative language. In bilingual individuals, metaphor production is influenced by language proficiency and cultural context. Currently, there is no reliable test measuring metaphor generation ability available beyond English. This study aims to develop a psychometric tool to assess bilingual metaphor production using simple prompts (e.g., “The small dog is…”) that require original metaphors (e.g., “a tiny tornado”). Using automatic item generation with GPT-4, we generated 100 metaphor prompts, evenly split between English and Spanish. A sample of 160 Spanish-English bilinguals on Prolific provided responses, averaging 8,220 ms per item (SD = 27,440 ms). To evaluate metaphor originality, 100 Spanish-English bilingual raters scored responses using a planned missing data design, improving efficiency while reducing participant burden. We applied a generalized partial credit model (GPCM) to compute factor scores for each item and person. Preliminary analysis indicated sufficient rater agreement and measurement precision across both languages (Factor Determinacy Index = .77, ranging from .58 to .87). To further develop the assessment tool, additional factor analysis will identify the most effective items, forming a short test of 15-20 items per language. Finally, large language models will be used to automate originality scoring. This research aims to provide a foundation for developing reliable and valid assessments of bilingual metaphor production, advancing the understanding of creative language use in multilingual populations and supporting linguistic assessment applications.

25: Neural signatures of linguistic creativity in internet memes

Marco Steinhauser, Catholic University of Eichstätt-Ingolstadt
Madita Bach, Catholic University of Eichstätt-Ingolstadt
Luisa Peter, Catholic University of Eichstätt-Ingolstadt
Amelie Weeger, Catholic University of Eichstätt-Ingolstadt
Thomas Hoffmann, Catholic University of Eichstätt-Ingolstadt

One domain of human cognition that has recently received considerable attention in cognitive linguistics and cognitive psychology is linguistic creativity. Here, we present preliminary data from an ongoing study which seeks to (1) identify neural responses to creative vs. uncreative memes in electrophysiological data and (2) track learning curves associated with these memes. In our paradigm, memes are displayed in two steps. First, a picture and the first part of the text are provided until participants press a key. Then, the second part of the text is provided, which reveals the meaning of the meme and varies in creativity. Participants are required to detect letter transpositions in the second text part by providing a speeded manual response. Participants work through four blocks, and each of the 60 memes is presented once per block. Half of the memes are constructed as high-creativity stimuli and the other half are low-creativity stimuli, and participants additionally rate the creativity associated with each meme at the end of the experiment. We predict that high-creativity stimuli show enhanced event-related potentials related to semantic expectation violation (e.g., N400) as well as related phenomena in oscillatory dynamics (e.g., alpha/beta suppression). Moreover, we expect that high-creativity stimuli exhibit a steeper learning curve across the four blocks in the letter transposition task. If confirmed, these findings would be in line with the idea that neural signals to creative content serve as learning signals that facilitate their encoding.

26: CXN-DT: Using constructional contexts to measure linguistic creativity

Marco Steinhauser, KU Eichstätt-Ingolstadt
Thomas Hoffmann, KU Eichstätt-Ingolstadt

Language is not just words that are used out of context. Most verbal creativity tasks, however, treat language as a bag-of-words phenomenon (an exception being Haim et al. 2024). Recent advances in cognitive linguistics argue, instead, that language is better analyzed as a network of symbolic associations (‘constructions’, cxns) that include schemas bigger than words (Diessel 2019). These include idioms such as ‘X spills the beans’ (He spills the tea = He shares gossip) but also meaningful syntactic schemas such as ‘The Xer, the Yer’ (e.g. the more you read, the more you know). We now present a novel constructional approach to divergent thinking tasks. As part of this, we are currently running a first experiment that compares the results from a lexical naming task (Olson et al. 2021) with one that requires subjects to name words in a constructional context (such as the Transitive construction Yesterday, the man ––– the woman. or Yesterday, the woman ––– the hell out of the man). Using semantic vector spaces and productivity measures (LNRE models), we aim to show that constructional divergent thinking (CXN-DT) tasks allow for a better assessment of the appropriateness of divergent responses (and hence are a better proxy for creativity). References: Diessel, H. (2019). The Grammar Network: How Linguistic Structure is Shaped by Language Use. CUP. Haim, E., et al. (2024). The Word-sentence construction (Woseco) task versus the verbal fluency task: Capturing features of scientific creativity via semantic networks. https://doi.org/10.31234/osf.io/f2z9a. Olson, J. A., et al. (2021). Naming unrelated words predicts creativity. Proc. Natl. Acad. Sci. U.S.A. 118(25), e2022340118.

27: The effects of immersion and contextual cue richness in enhancing divergent thinking performance in virtual environments

Jiayin Liu, Université Paris Cité, LaPEA
Jean-Marie Burkhardt, Univ Gustave Eiffel and Université Paris Cité, LaPEA
Todd Lubart, Université Paris Cité, LaPEA

The richness of contextual cues is a key environmental factor influencing divergent thinking performance in virtual environments (VEs). Immersion level, which affects user experience and performance in VEs, varies depending on the type of Virtual Reality (VR) equipment used (e.g., computer screens vs. Head-Mounted Displays (HMDs)). Additionally, individuals' interactions with the VE, as well as their user experience, such as affect, presence, and cybersickness, may also impact divergent thinking performance. To investigate how environmental factors (i.e., contextual cue richness, immersion), user behaviors (i.e., interaction with the VE), and user experience (i.e., affect, presence, cybersickness) influence divergent thinking performance, we conducted a mixed-design lab experiment with 38 undergraduate students. The study manipulated two levels of contextual cue richness (rich/poor) and two levels of immersion (immersive/non-immersive). Participants' divergent thinking performance was assessed using the Alternative Uses Task (AUT) before and after a 10-minute exploration of the given VE. Additionally, interaction duration and user experience were measured. Three hierarchical regression analyses were performed on three divergent thinking performance criteria: fluency, originality, and elaboration. Results indicated that the combination of an immersive environment and rich contextual cues had the strongest positive effect on divergent thinking performance. Furthermore, participants' behaviors and user experience, as well as the interaction between presence, cybersickness, and immersion, influenced divergent thinking performance in different ways.

28: How do Humans and Language Models Reason About Creativity? A Comparative Analysis

Antonio Laverghetta Jr., Pennsylvania State University
Jimmy Pronchick, Pennsylvania State University
Krupa Bhawsar, Pennsylvania State University
Roger E. Beaty, Pennsylvania State University

Creativity assessment in science and engineering increasingly relies on both human and AI judgment, but the cognitive processes and biases behind these evaluations remain poorly understood. We conducted two experiments examining how contextual information (example solutions with expert ratings) impacts creativity evaluation. In Study 1, we analyzed creativity ratings from 72 experts with formal science or engineering training, comparing those who received example solutions with ratings (oracle) to those who did not (no oracle). Computational text analysis revealed that no-oracle experts used more comparative language (e.g., "better/worse") and emphasized solution uncommonness, suggesting they may have relied more on memory retrieval for comparisons. In contrast, oracle experts used less comparative language and focused more on direct assessments of cleverness. In Study 2, parallel analyses with state-of-the-art large language models (LLMs) revealed that AI instead prioritized the uncommonness and remoteness of ideas when rating originality, suggesting an evaluative process rooted in the semantic similarity of ideas. In the oracle condition, while LLM accuracy in predicting the true originality scores improved, the correlations of remoteness, uncommonness, and cleverness with originality also increased substantially (to upwards of 0.99), suggesting a homogenization in the LLMs' evaluation of the individual facets. These findings highlight important implications for how humans and AI reason about creativity and suggest diverging preferences in what different populations prioritize when rating.

29: All Eyes On The Smartphone Canvas: Expert and Non-Expert Viewing Patterns, Preferences and Memory of Online AI and Human Art

Bernard Vaernes, University of Oslo

Digitization has given artists and non-artists easy access to a variety of art forms, from abstract to figurative, and even to works created by artificial intelligence. Such digitized versions of artworks are now accessible on many platforms, often viewed on smartphones. This study aims to assess how visual art expertise affects viewing patterns, decision making, and memory retention of different works when viewed online; in other words, it explores differences in the consumption of art in a modern real-life setting. To achieve this, 116 images of works of art from the collection of the Norwegian National Gallery and 116 equivalent AI-generated works were presented online, on their smartphone screens, to 24 graphic art, fine art, and new media art students from fine art academies and university art programs, and to 23 non-art university students in Poland, while the integrated selfie camera recorded their eye movements. Participants rated the aesthetic appeal of the paintings and, in a second experiment, completed a memory test of previously viewed and new, human- and AI-generated works. Art students used more global (ambient) saccades and had smaller fixation spreads than non-art students, especially for abstract paintings. No convincing evidence was found for differences in aesthetic evaluations between art students and non-art students. Art students showed evidence of higher memory scores overall, as well as for AI-generated and abstract paintings. Future studies should use other methods as well as AI stimuli to study differences related to visual art expertise and to test the validity of existing expertise theories.

30: Automated Utility Scoring for the AUT

Rebeka Privoznikova*, University of Amsterdam
Surabhi Nath*, Max Planck Institute, Tübingen
Raoul Grasman, University of Amsterdam
Luke Korthals, University of Amsterdam
Claire Stevenson, University of Amsterdam
* shared first authorship

Of the two components of creativity - originality and utility - automated scoring of the former has received much more attention in the literature than the latter. However, the utility of creative responses is nearly as important when assessing creativity, especially when evaluating responses generated by AI (and adolescents). For example, using a pen to "build a house" is humorous, but not very effective. In this project, we create a series of machine learning (ML) models to automatically predict expert ratings of response utility on the Alternative Uses Test (AUT). Evaluating utility differs from originality scoring and requires greater understanding of object characteristics and common-sense knowledge of how they function in the real world. Therefore, besides traditional predictors related to response content and its uniqueness, we also include predictors relating responses to real-world knowledge (e.g., semantic distance to uses generated from a knowledge graph's 'UsedFor' relation). After finding the best predictors of AUT response utility, we examine the effects of training data on ML model performance by comparing training regimes using different subsets of human, LLM, and curated out-of-distribution responses. We compare our best-performing ML algorithm's predictions to out-of-the-box and prompt-engineered LLM performance on both standard and novel challenge benchmarks. Preliminary results show that our ML model outperforms LLMs not trained for this task. We discuss the importance of curated training data and of evaluating models' generalization capabilities on challenging datasets.
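To make the knowledge-graph predictor concrete, here is a minimal sketch of one such feature: the maximum embedding similarity between an AUT response and uses drawn from a 'UsedFor'-style relation. The hard-coded use list and the embedding model are illustrative assumptions, not the project's actual pipeline.

```python
# Illustrative sketch: one candidate utility predictor, the similarity of an AUT
# response to known 'UsedFor' uses of the object. Uses list and model are assumed.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical uses for "pen", as might be harvested from a knowledge graph.
used_for = ["writing", "signing documents", "drawing", "taking notes"]

def usedfor_similarity(response: str) -> float:
    resp_emb = model.encode(response, convert_to_tensor=True)
    use_embs = model.encode(used_for, convert_to_tensor=True)
    sims = util.cos_sim(resp_emb, use_embs)   # 1 x len(used_for) similarity matrix
    return float(sims.max())                  # closeness to any known use

print(usedfor_similarity("build a house"))   # far from known uses -> low utility cue
print(usedfor_similarity("write a letter"))  # close to known uses -> high utility cue
```

A feature of this kind would enter the ML models alongside the content and uniqueness predictors described above.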

31: Enhancing Future-Oriented Thinking in Engineering Design Through Cognitive Interventions and AI Assistance

Avinash Aruon, Virginia Tech
Tripp Shealy, Virginia Tech
John Gero, University of North Carolina, Charlotte

Creativity in engineering design requires the ability to anticipate future challenges and opportunities, yet designers often focus on immediate constraints rather than long-term resilience. This study explores how structured interventions—future thinking prompts and generative AI assistance—shape creative problem-solving and neurocognitive engagement. Engineering students (n = 90) developed conceptual designs for a development project under three conditions: a control group without future-oriented prompts, a future-thinking group prompted to describe future site conditions, and an AI-assisted group using a generative AI tool to support future scenario envisioning. Functional near-infrared spectroscopy (fNIRS) captured neurocognitive activation patterns in the prefrontal cortex during the design process. Findings indicate that both future thinking prompts and AI assistance enhance creative ideation by encouraging designers to engage with long-term possibilities while reducing cognitive load. The group that used a generative AI tool developed more ideas and used significantly lower cognitive resources in their prefrontal cortex, specifically in the left ventrolateral PFC and left dorsolateral PFC regions. This research highlights how structured cognitive interventions can reshape neural activation patterns and foster more unique, creative, and resilient design solutions.

32: Human versus Large Language Models’ Associative and Meaning-making Processes

Liane Gabora, University of British Columbia
Linxuan Wang, University of British Columbia

To be helpful co-creators of textual content, large language models (LLMs) should be able to form and use associations in a human-like manner. We asked participants, ChatGPT (an LLM), and Seq2Seq (an LLM precursor) to invent new words along with corresponding meanings such that each word feels well-suited to its meaning. (As an example, ChatGPT invented ‘sproinkle,’ with the corresponding meaning ‘To energize or invigorate something in a lively and bouncy manner.’) Although all the words were actually meaningless, participants were able to guess which word goes with which meaning for both human-generated and ChatGPT-generated word-meaning pairs, but not for Seq2Seq-generated word-meaning pairs. When ChatGPT was asked to explain how it came up with word-meaning pairs, the process it recounted was very similar to how humans generate new word-meaning pairs. In Study Two, to assess whether ChatGPT’s abilities extend across languages, this was repeated for Mandarin words. In Study Three, to assess whether a word-meaning pair feels right because the word shares phonetic elements with other semantically related words in the language, or due to universal properties inherent in the sounds themselves, we compare the ability of bilinguals and non-Mandarin speakers to identify which Chinese words correspond to each meaning. The relatability of ChatGPT’s word-meaning pairs suggests that ChatGPT mimics the aesthetic, associative processes underlying human sound symbolism and word generation. We analyze what features of both human brains and LLMs enable them to outperform Seq2Seq on the word-meaning pair generation task, and discuss the significance for machine creativity.

33: Training AI to Assess Human Creativity across Tasks, Modalities, and Languages

Roger E. Beaty, Pennsylvania State University
Simone A. Luchini, Pennsylvania State University
Benjamin Goecke, University of Tübingen
Mete Ismayilzada, École Polytechnique Fédérale de Lausanne (EPFL)
Antonio Laverghetta Jr., Pennsylvania State University
Peter Organisciak, University of Denver
John D. Patterson, Pennsylvania State University
Claire Stevenson, University of Amsterdam
Roni Reiter-Palmon, University of Nebraska Omaha

Large language models (LLMs) are increasingly used to automate creativity assessments, reducing reliance on onerous human scoring. However, current AI-based approaches to creativity scoring remain narrowly focused—limited to specific tasks (e.g., AUT), single modalities (e.g., text), or English-language contexts. We introduce ORACL—Originality Assessment Across Languages—a multimodal LLM capable of handling both images and texts across many languages. Fine-tuned on a novel dataset of 280,000 human-rated creative responses collected from the global creativity research community, ORACL spans 30 tasks (text and visual) in 10+ languages, from laboratory tasks like the AUT to naturalistic tasks like problem solving and story writing. Computational experiments demonstrate ORACL's ability to reliably predict human creativity ratings on unseen responses, indicating it captures consistent cross-cultural patterns in creativity evaluation. Importantly, ORACL shows evidence of generalization, predicting human ratings from languages and creativity tasks it was not trained on. Our results establish the first multilingual, multimodal AI system for creativity evaluation, with potential to assess creativity on other tasks and languages—pending further validation to understand the model’s limits and potential biases. We will release both the training dataset of 280,000 creative responses with human ratings and the ORACL model, to enable automated creativity assessment globally and advance understanding of how humans and AI models evaluate creativity.
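
The abstract does not state how predictive accuracy is quantified; a common choice in automated creativity scoring is to correlate model scores with held-out human ratings. The sketch below shows that evaluation step under this assumption; the numbers are made-up placeholders, not ORACL outputs.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def score_agreement(model_scores: np.ndarray, human_ratings: np.ndarray) -> dict:
    """Agreement between automated originality scores and mean human ratings on held-out responses."""
    r, _ = pearsonr(model_scores, human_ratings)
    rho, _ = spearmanr(model_scores, human_ratings)
    return {"pearson_r": float(r), "spearman_rho": float(rho)}

# Hypothetical held-out items: model predictions vs. averaged judge ratings on a 1-5 scale
model_scores = np.array([2.1, 4.3, 3.0, 1.5, 4.8])
human_ratings = np.array([2.0, 4.5, 2.8, 1.7, 4.6])
print(score_agreement(model_scores, human_ratings))
```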

34: Pencils to Pixels: Studying Drawing Creativity in Children, Adults and AI

Surabhi S Nath, Max Planck Institute for Biological Cybernetics; Max Planck School of Cognition; University of Tübingen
Guiomar Del Cuvillo Y Schröder, University of Amsterdam
Claire Stevenson, University of Amsterdam

Visual creativity has received far less attention than verbal creativity, with only a handful of empirical investigations, largely because of the greater complexities involved in producing and evaluating visual outputs. To tackle this, we turn to drawings, a medium that offers sufficient control without compromising creative potential. Using a popular creative drawing task, we curate a novel dataset comprising ~1500 drawings by children (n=148, age groups 4-6 and 7-9), adults (n=148), and AI (Dall-e, generated using three different prompts), and devise methods to systematically investigate visual creativity. We use computational measures to characterize two aspects of the drawings, (1) style and (2) content, at both product and process levels. For style, we define measures based on ink density, ink distribution, and number of visual elements. For content, we use manually annotated categories to study conceptual diversity, and use embeddings of images and captions to compute distance measures. We compare the style, content, and creativity of children's, adults', and AI drawings and build simple models to predict expert and automated creativity scores. We find significant differences in style and content across groups: children's drawings had more components, AI drawings had greater ink density and more lines, and adult drawings showed the greatest conceptual diversity. Notably, we highlight a misalignment between creativity judgments obtained through expert and automated ratings and discuss its implications. Through these efforts, our work provides a framework for systematically studying human and artificial creativity beyond the textual modality.
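
To make the style and content measures concrete, the sketch below shows one plausible way to compute ink density (fraction of non-white pixels in a grayscale drawing) and a conceptual-spread score from precomputed caption or image embeddings. The threshold, function names, and embedding source are illustrative assumptions rather than the authors' exact pipeline.

```python
import numpy as np
from PIL import Image
from sklearn.metrics.pairwise import cosine_distances

def ink_density(path: str, threshold: int = 250) -> float:
    """Fraction of pixels darker than a near-white threshold in a grayscale drawing."""
    pixels = np.asarray(Image.open(path).convert("L"))
    return float((pixels < threshold).mean())

def conceptual_spread(embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance between caption (or image) embeddings from one group."""
    dists = cosine_distances(embeddings)
    n = len(embeddings)
    return float(dists[np.triu_indices(n, k=1)].mean())

# embeddings: an (n_drawings x d) array of precomputed caption embeddings, loaded elsewhere
# print(ink_density("drawing_001.png"), conceptual_spread(embeddings))
```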

35: Generative AI vs. Creative Brains: AI could beat us in Art… but also in Science?

Vera Eymann, Center for Cognitive Science, University of Kaiserslautern-Landau (RPTU), Kaiserslautern, Germany
Thomas Lachmann, Center for Cognitive Science, University of Kaiserslautern-Landau (RPTU), Kaiserslautern, Germany; Centro de Investigación Nebrija en Cognición (CINC), Universidad Nebrija, Madrid, Spain; Brain and Cognition Research Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
Daniela Czernochowski, Center for Cognitive Science, University of Kaiserslautern-Landau (RPTU), Kaiserslautern, Germany

Scientific creativity encompasses the ability to conduct creative science experiments and to develop creative approaches to solving science problems. Today, our world is in desperate need of creative minds to master the many challenges of our time, such as pollution, socioeconomic inequality, and disinformation. At the same time, we are witnessing an upsurge of generative artificial intelligence (AI), which has been proclaimed to permanently end human creativity (e.g., Sternberg, 2024). In fact, AI already seems to be challenging the fields of visual arts and music. But what about science? To assess scientific creativity, we developed a task that requires generating scientific hypotheses, designing experiments, and justifying them in terms of their usefulness and originality. Using a fictitious scientific scenario, we asked students (enrolled in a Cognitive Science study program) as well as ChatGPT to generate an abbreviated version of a research proposal. Using a structured (blinded) rating, an expert from the respective field evaluated the students' research proposals and the proposals generated by ChatGPT in terms of their scientific quality and originality. Our results indicate that ChatGPT reached significantly higher overall scores on the task, associated with overall longer and more detailed responses. However, the subscale for scientific originality revealed that students’ ideas were rated as more original and creative. We will discuss further implications of our findings along with future directions for research on scientific creativity, and whether the writing of grant proposals should be placed in artificial hands in the future.

36: Agentic Perspective on Human-AI Collaboration for Image Generation and Creative Writing: Insights from Think-Aloud Protocols

Janet Rafner, Aarhus University
Blanka Zana, Aarhus University
Ida Bang Hansen, Aarhus University
Simon Ceh, University of Graz
Jacob Sherson, Aarhus University
Mathias Benedek, University of Graz
Izabela Lebuda, University of Wrocław

As generative AI becomes increasingly prevalent in creative domains, questions arise regarding its impact on human agency—specifically autonomy, control, efficacy, and ownership—throughout the creative process. This study investigates how individuals negotiate their creative agency when collaborating with AI tools for image generation and creative writing. Using think-aloud protocols and post-task interviews, we analyze the experiences of 13 participants across two co-creative tasks (image generation, 6; writing, 7). Through qualitative coding, we systematically identify patterns in how participants express and regulate agency throughout their interactions with AI. Our findings reveal that agency is dynamic rather than static, fluctuating at different stages of the creative process. Participants experienced high autonomy in idea generation but struggled with control in execution, particularly in image generation tasks. Creative efficacy was enhanced when AI suggestions were used strategically—either by aligning with user intent or serving as counter-inspiration. Ownership perceptions varied, with some participants maintaining a strong sense of authorship, while others felt distanced from their outputs due to the AI’s unpredictable influence. These insights suggest that AI-mediated creativity is not a simple augmentation of human ability but a negotiation of agency that depends on the tools’ interaction styles and affordances. Our study highlights the need for co-creative AI systems that adapt to users’ shifting agency needs, fostering meaningful collaboration rather than passive reliance on automation.

37: The Art of Prompting is a Creative Skill! Replicating the Be-Creative Effect in Human-AI Co-Creativity

Simon Ceh, University of Graz
Roger Beaty, Pennsylvania State University

Generative AI is spearheading the digital transformation of creative work (e.g., Rafner et al., 2024), but humans still need to prompt AI systems (e.g., Oppenlaender, 2023). So far, little research has investigated how individual differences impact the successful creative use of generative AI (e.g., Orwig et al., 2024). We present data from the Alternative Prompting Task (APT), where the goal is to generate image prompts for open-ended themes (e.g., "Silence") under time constraints (3 minutes). Using a within-subjects design, 195 participants completed the task under two conditions: be-creative (generate creative prompts) versus be-fluent (generate many prompts). We measured intelligence (e.g., gf, gc), divergent thinking, AI experience, and other individual differences. We replicated the classic be-creative effect (e.g., Nusbaum et al., 2014) in human-AI co-creativity: when instructed to be creative, participants generated fewer (M = 7.47 vs. 12.22) but more creative (M = 2.11 vs. 1.78) prompts compared to the be-fluent condition. Notably, participants with higher fluid intelligence showed greater creative gains under be-creative instructions (β = 0.05, p < .001), exemplifying the relevance of cognitive ability even in the age of generative AI. We will present additional findings on the role of individual differences and whether creative prompting translates into creative results.
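
The reported fluid-intelligence interaction could be estimated with a mixed-effects model that accounts for the within-subjects design. The sketch below shows one plausible specification, assuming hypothetical column and file names; the abstract does not state the exact model used.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data, one row per participant x condition:
#   creativity  - mean rated creativity of that participant's prompts in that condition
#   condition   - "be_creative" or "be_fluent"
#   gf          - standardized fluid intelligence score
#   participant - participant identifier (random intercept)
df = pd.read_csv("apt_scores.csv")  # assumed file name

model = smf.mixedlm("creativity ~ condition * gf", data=df, groups=df["participant"])
result = model.fit()
print(result.summary())  # the condition:gf term corresponds to the reported interaction
```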

38: Creative Partner: Comparing Human and Generative AI as Collaborators in Creative Tasks

Clin KY Lai, Pennsylvania State University
Simone Luchini, Pennsylvania State University
Roger Beaty, Pennsylvania State University

Creativity is often a collaborative process involving both the generation of novel ideas and the evaluation of those ideas to identify the best solutions. While the ability of generative AI to mimic naturalistic language has led to its increasing integration into various creative tasks, studies of human-AI co-creativity have shown mixed results. Few studies to date have directly contrasted human-human and human-AI creative collaboration—a crucial step in understanding creativity in an increasingly AI-present world. Thus, it remains unclear whether ideas generated with AI tend to be more or less creative than ideas generated when collaborating with another human. This study addresses this gap by examining the dynamics of human-human and human-AI collaboration across two creative tasks: the Alternative Uses Task (AUT) and a creative short story writing task. Participants were paired to complete the two tasks, which were designed to assess divergent thinking and collaborative creativity. They were randomly assigned either Role A (responding first) or Role B (responding after A) and took turns generating responses with either a human or an AI partner; the human-AI condition followed the same turn-taking format, with the AI serving as the collaborative partner. This design allowed us to explore the effects of human-AI versus human-human collaboration on creative outcomes, and how individuals' creative ability and perceptions of their partner (human or AI) influenced collaborative outcomes. Insights from this research will provide a deeper understanding of the potential benefits and limitations of integrating generative AI into creative workflows.

39: AI-Driven Transformation of Creative Learning Environments in STEAM

Estelle Linjun Wu, University of Cambridge

STEAM education fosters interdisciplinary learning and real-world problem-solving, aligning with the evolving demands of a globally connected world. The synergy between STEAM and emerging technologies, particularly artificial intelligence (AI), presents new opportunities for cultivating creativity in learners. With AI increasingly reshaping the social and material dimensions of learning environments, this study explores how AI-powered tools, particularly large language models (e.g., ChatGPT, ERNIE Bot), complement traditional educational technologies and influence the construction of creative learning environments and student engagement in STEAM programs. Using ethnographic methods, data were collected through participant observation, in-depth interviews with 32 students and 6 instructors, and artifact analysis within three Beijing secondary schools renowned for their innovative pedagogical practices. Five core themes were identified that shape students' creative engagement: attitude, collaboration, climate, conflict, and material components. Findings suggest that generative AI enhances creative practices by supporting prototype visualisation, ideational fluency, and design iteration. However, students also face tensions between human originality and AI-generated contributions, reflecting broader challenges in human-AI co-creation. Notably, the timing, context, and manner of AI adoption critically influence social and material interactions, as well as the overall creative process and learning experience. This research offers practical guidance for educators to effectively mentor students in the use of AI within STEAM, ultimately fostering creativity-supportive learning environments in an AI-driven world.

40: Evaluating AI’s Ideas: The Roles of Individual Creativity and Expertise in Human-AI Co-Creativity

Paul V. DiStefano, Pennsylvania State University, United States
Daniel C. Zeitlen, Pennsylvania State University, United States
Janet Rafner, Aarhus University, Denmark
Pier-Luc de Chantal, University of Quebec, Canada
Aoran Peng, Pennsylvania State University, United States
Scarlett Miller, Pennsylvania State University, United States
Roger E. Beaty, Pennsylvania State University, United States

As generative artificial intelligence (AI) increasingly integrates into education and work, it is crucial to understand who benefits most from human-AI collaboration. This study examines how domain expertise, creative self-efficacy, and baseline creative ability influence human-AI co-creativity in a real-world engineering design task. We simulated co-creativity using pre-generated ideas from GPT-3.5-turbo, ensuring consistent AI suggestions to assess idea generation and evaluation. Engineering (N = 99) and psychology students (N = 212) first generated an initial solution (Idea 1), evaluated AI-generated solutions, and then revised their response (Idea 2). Linear mixed-effects models revealed that expertise, baseline generation ability, and evaluation ability predicted Idea 2 quality. Engineering students consistently produced more novel and effective solutions, highlighting the role of expertise. However, both groups improved comparably after evaluating ChatGPT’s ideas, supporting the “rising tide lifts all boats” hypothesis—AI benefits individuals equally across expertise levels. Additionally, using a novel categorization scheme comparing Idea 1, the ChatGPT ideas, and Idea 2, we found significant group differences in what inspired participants’ Idea 2. These findings underscore the importance of domain expertise and evaluation skills in human-AI co-creativity. While AI can enhance creative output, human expertise remains essential for grounding AI-generated ideas in practical reality, emphasizing the need to develop domain-specific knowledge and evaluation skills in education, work, and professional settings.