When artificial minds negotiate: dark personality and the ultimatum game in large language models
| Main Authors: | Vinícius Ferraz, Tamas Olah, Ratin Sazedul, Robert Schmidt, Christiane Schwieren |
|---|---|
| Format: | Book/Monograph Working Paper |
| Language: | English |
| Published: | Heidelberg: Heidelberg University, Department of Economics, 16 Dec. 2025 |
| Series: | AWI discussion paper series, no. 768 (November 2025) |
| In: | AWI discussion paper series (no. 768 (November 2025)) |
| DOI: | 10.11588/heidok.00037813 |
| Online Access: | Publisher, free access: https://archiv.ub.uni-heidelberg.de/volltextserver/37813/1/Ferraz_Olah_Sazedul_et._al._2025_dp768%20A.pdf; Resolving system, free access: https://nbn-resolving.org/urn:nbn:de:bsz:16-heidok-378137; Resolving system, free access: https://doi.org/10.11588/heidok.00037813 |
| Author Notes: | Vinícius Ferraz, Tamas Olah, Ratin Sazedul, Robert Schmidt, Christiane Schwieren |
| Summary: | We investigate whether Large Language Models (LLMs) exhibit personality-driven strategic behavior in the Ultimatum Game by manipulating Dark Factor of Personality (D-Factor) profiles via standardized prompts. Across 400k decisions from 17 open-source models and 4,166 human benchmarks, we test whether LLMs playing the proposer and responder roles exhibit systematic behavioral shifts across five D-Factor levels (from least to most selfish). In the proposer role, fair offers declined monotonically from 91% (D1) to 17% (D5), mirroring human patterns but with 34% steeper gradients, indicating hypersensitivity to personality prompts. Responders diverged sharply: whereas humans became more punitive at higher D-levels, LLMs maintained high acceptance rates (75-92%) with weak or reversed D-Factor sensitivity, failing to reproduce reciprocity-punishment dynamics. These role-specific patterns align with strong-weak situation accounts: personality matters when incentives are ambiguous (proposers) but is muted when they are contingent (responders). Cross-model heterogeneity was substantial: the models most closely aligned with human behavior, according to composite similarity scores (integrating prosocial rates, D-Factor correlations, and odds ratios), were dolphin3, deepseek_1.5b, and llama3.2 (0.74-0.85), while others exhibited extreme or non-variable behavior. Temperature settings (0.2 vs. 0.8) exerted minimal influence. We interpret these patterns as prompt-driven regularities rather than genuine motivational processes, suggesting LLMs can approximate but not fully replicate human strategic behavior in social dilemmas. |
| Physical Description: | Online Resource |
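The Ultimatum Game structure described in the summary can be sketched minimally: a proposer splits a fixed pie, a responder accepts or rejects, and rejection leaves both players with nothing. The pie size, threshold, and offer values below are illustrative assumptions, not parameters from the paper.

```python
# Minimal Ultimatum Game round (illustrative sketch, not the paper's setup).
# A proposer offers a share of a fixed pie; the responder accepts any offer
# at or above a minimum acceptable share, otherwise both payoffs are zero.

def play_round(offer_share: float, min_acceptable_share: float, pie: int = 10):
    """Return (proposer_payoff, responder_payoff) for one round."""
    offer = round(offer_share * pie)
    if offer >= min_acceptable_share * pie:  # responder accepts
        return pie - offer, offer
    return 0, 0                              # rejection: both get nothing

# A "fair" offer is commonly an equal split; the paper reports the share of
# fair proposer offers falling from 91% (D1) to 17% (D5).
print(play_round(0.5, 0.3))  # fair offer accepted -> (5, 5)
print(play_round(0.1, 0.3))  # low offer rejected -> (0, 0)
```

The paper's responder finding corresponds to the rejection branch: human responders reject low offers more often at higher D-levels, while the LLMs studied mostly accepted regardless.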