Using Large Language Models to Transform Estate Planning

Richard Yeaw Chong Seow

Abstract:

Background: Advances in generative artificial intelligence (AI) and large language models (LLMs) have opened new possibilities for democratizing access to professional services. Non-Muslims in Malaysia can use AI-driven tools to draft their wills without legal assistance, but empirical evaluations of the reliability of LLM chatbots for this purpose are lacking. Methods: Grounded in the Technology Acceptance Model, this study assesses the accuracy, legal validity, comprehensibility, and reliability of five prominent LLM chatbots: ChatGPT 3.5, ChatGPT 4.0, Claude Sonnet, Gemini Pro, and Microsoft Copilot. Results: ChatGPT 4.0 consistently outperformed the other models across all complexity levels in both succession-related and drafting-related tasks, showing the highest reliability and accuracy. Gemini Pro performed well on introductory and intermediate queries, particularly in drafting simple wills. In contrast, Copilot and Claude Sonnet exhibited high variability and struggled with complex queries. Across all chatbots, performance declined significantly as query complexity increased. Qualitative assessment revealed inconsistencies, misinterpretations, and occasional legal inaccuracies, particularly when prompts contained incomplete information. Conclusion: While specific LLM chatbots, particularly ChatGPT 4.0, demonstrate potential as reliable tools for basic estate planning, their limitations in handling complex legal instructions underscore the need for caution. By shedding light on the role of AI in legal contexts, this research enriches both scholarly and practical dialogue and improves understanding of AI's potential to revolutionize the legal landscape, particularly in estate planning.
