Using Large Language Models to Transform Estate Planning
Richard Yeaw Chong Seow
Discipline: Artificial Intelligence
Abstract:
Background: Advancements in generative artificial intelligence (AI)
and large language models (LLMs) have introduced new possibilities in
democratizing access to professional services. Non-Muslims in Malaysia
can use AI-driven tools to draft their wills without legal assistance, but
the reliability of LLM chatbots for this purpose has not been empirically evaluated.
Methods: Grounded in the Technology Acceptance Model, this study
assesses the accuracy, legal validity, comprehensibility, and reliability of five
prominent LLM chatbots, namely ChatGPT 3.5, ChatGPT 4.0, Claude Sonnet,
Gemini Pro, and Microsoft Copilot.
Results: ChatGPT 4.0 consistently outperformed the other models across
all complexity levels in both succession-related and drafting-related tasks,
showing the highest reliability and accuracy. Gemini Pro performed well
for introductory and intermediate queries, particularly in drafting simple
wills. In contrast, Copilot and Claude Sonnet exhibited high variability and
struggled with complex queries. Across all chatbots, performance declined
significantly with increased query complexity. Qualitative assessment revealed
inconsistencies, misinterpretations, and occasional legal inaccuracies,
particularly when prompts contained incomplete information.
Conclusion: While specific LLM chatbots, particularly ChatGPT 4.0,
demonstrate potential as reliable tools for basic estate planning, their
limitations in handling complex legal instructions underscore the need for
caution. By shedding light on the role of AI in legal contexts, this research
contributes to both scholarly and practical dialogue and improves
our understanding of AI’s potential to reshape the legal landscape,
particularly in estate planning.