A Comparison of How Four Large Language Models Resolve Trolley-Problem Moral Dilemmas
DOI:
https://doi.org/10.58445/rars.3925
Keywords:
Large language models (LLMs), Moral reasoning, AI ethics, Trolley problem
Abstract
As large language models (LLMs) are increasingly consulted on questions that carry moral weight, whether different models resolve those questions differently has become an empirical matter. This study used a fully crossed factorial design (8 scenario variants × 5 prompt framings × 4 models [GPT-5, Claude Sonnet 4.6, Gemini 3.5 Flash, Grok 4.3 Fast] × 3 replicates = 480 trials) to measure how each model resolves trolley-style dilemmas and how its choice shifts with the morally relevant feature of the scenario and with the framing of the question. Each trial was coded as utilitarian (choosing the option that maximizes total lives saved) or not. The four models differed markedly and fell in a clear order: Claude chose the life-maximizing option in 40.8% of trials versus 75.8% for Grok, a 35-percentage-point spread whose Wilson confidence intervals do not overlap, with Gemini (56.7%) and GPT-5 (66.7%) between them. Scenario features produced the largest and most interpretable pattern: the consent scenario (8%) and the footbridge personal-force case (35%) yielded the lowest rates of utilitarian choice, mirroring established human moral psychology, while the clean baseline produced unanimous life-maximizing choices across every model. Notably, no model refused or returned an unclear answer in any of the 480 trials. Requesting step-by-step reasoning was associated with a higher rate of utilitarian choice, while reframing the dilemma from harm to rescue was associated with a lower rate. A mixed-effects logistic regression that accounts for the dependence among replicates is consistent with the model and scenario effects persisting once the repeated-measures structure is modeled (Supplement S1). Together, the results suggest that LLM moral verdicts are model-distinctive and framing-sensitive, consistent with the view that they reflect learned, deployment-shaped dispositions rather than stable ethical commitments.
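The trial count and the interval comparison in the abstract can be checked with a short Python sketch. It is an illustration under stated assumptions, not the authors' analysis code: the per-model counts are back-computed from the reported percentages assuming 120 trials per model (480 / 4), the 95% level (z = 1.96) is assumed because the abstract does not state one, and all names such as `wilson_interval` and `UTILITARIAN_COUNTS` are hypothetical.

```python
import math
from itertools import product

# Design sizes as stated in the abstract.
N_SCENARIOS, N_FRAMINGS, N_REPLICATES = 8, 5, 3
MODELS = ["Claude", "Gemini", "GPT-5", "Grok"]

# ASSUMPTION: utilitarian counts back-computed from the reported percentages
# (40.8%, 56.7%, 66.7%, 75.8%) with 120 trials per model. Not taken from the paper.
UTILITARIAN_COUNTS = {"Claude": 49, "Gemini": 68, "GPT-5": 80, "Grok": 91}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is an assumed 95% level)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Enumerate every cell of the fully crossed design.
cells = list(product(range(N_SCENARIOS), range(N_FRAMINGS), MODELS, range(N_REPLICATES)))
total_trials = len(cells)
trials_per_model = total_trials // len(MODELS)

intervals = {
    model: wilson_interval(count, trials_per_model)
    for model, count in UTILITARIAN_COUNTS.items()
}

for model, (lo, hi) in intervals.items():
    rate = UTILITARIAN_COUNTS[model] / trials_per_model
    print(f"{model:7s} rate={rate:.3f}  Wilson interval=[{lo:.3f}, {hi:.3f}]")

# The abstract's claim concerns the two extremes: Claude's upper bound
# should fall below Grok's lower bound.
intervals_disjoint = intervals["Claude"][1] < intervals["Grok"][0]
print("Claude and Grok intervals disjoint:", intervals_disjoint)
```

With these assumed inputs, the upper bound for Claude falls below the lower bound for Grok, which matches the non-overlap statement in the abstract. The comparison is only as good as the back-computed counts, so the reported figures in the paper and Supplement S1 take precedence.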
License
Copyright (c) 2026 Research Archive of Rising Scholars

This work is licensed under a Creative Commons Attribution 4.0 International License.