A closer look at how large language models ‘trust’ humans: patterns and biases
Valeria Lerman, Yaniv Dover
Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences · 2026
Abstract
As large language models (LLMs) and LLM-based agents increasingly interact with humans in decision-making contexts, understanding trust dynamics between humans and AI agents becomes crucial. While human trust in AI is well-studied, how LLMs develop emulated trust in humans remains far less understood. We compare five LLMs with human participants across five scenarios and 43 200 simulations where subjects vary in competence, benevolence, integrity, and demographics. LLMs and humans show partial convergence: both reliably distinguish high- from low-trait levels, and higher competence and integrity yield favourable decisions, suggesting a shared basic structure of trust. However, LLMs are significantly more extreme and internally consistent, and the basic trust model explains much more variance in their decisions than in humans’. LLMs also treat the three dimensions as more independent, whereas humans collapse them into a global ‘good person’ impression. Within the models themselves, we observe substantial heterogeneity in how trust is emulated. LLMs also exhibit stronger, more systematic demographic biases than humans. Overall, LLMs seem to implement a coherent but rigid and sometimes biased model of interpersonal trust that only partially aligns with human judgment, highlighting both their potential and their limitations as decision-support tools and as models of human social cognition.