I have had a look at an interesting new paper – “Financial Statement Analysis with Large Language Models” – where the authors investigate whether a Large Language Model (LLM) can predict earnings as effectively as human analysts.
Summary of the Study
In the study, researchers Alex G. Kim, Maximilian Muhn, and Valeri V. Nikolaev from the University of Chicago examine whether LLMs can perform financial statement analysis as well as professional analysts do. They provide standardized, anonymized financial statements to GPT-4 and ask it to predict the direction of future earnings.
Methodology:
- Data Preparation: The financial statements are anonymized and standardized, removing any identifying information to ensure the model can’t rely on prior knowledge of specific companies.
- Prompts: Two types of prompts are used – a simple prompt and a Chain-of-Thought (CoT) prompt. The CoT prompt guides the model through a detailed analytical process, similar to a human analyst’s workflow.
- Comparative Analysis: The LLM’s predictions are compared to those of human analysts and other machine learning models, such as logistic regression and artificial neural networks (ANNs).
- Evaluation Metrics: The accuracy and F1-score are used to measure prediction performance. The study also examines the economic usefulness of trading strategies based on LLM predictions.
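To make the two evaluation metrics concrete, here is a minimal sketch of how accuracy and F1-score can be computed for a binary "earnings up / earnings down" prediction. The function name `accuracy_and_f1`, the 1/0 encoding and the sample labels are my own illustrative assumptions, not the authors' code or data.

```python
def accuracy_and_f1(actual, predicted):
    """Return (accuracy, F1) for binary labels.

    Assumed encoding (illustrative): 1 = earnings increase, 0 = earnings decrease.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")

    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs)
    # Precision: share of predicted increases that were real increases.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of real increases that the model caught.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, f1


# Made-up labels, purely for illustration.
actual_direction = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_direction = [1, 0, 0, 1, 1, 0, 1, 0]
acc, f1 = accuracy_and_f1(actual_direction, predicted_direction)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

Accuracy alone can look good when one direction dominates the sample, which is presumably why F1 is reported alongside it.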
Main Takeaways
- Prediction Accuracy: The study finds that GPT-4, especially when using the CoT prompt, can predict earnings changes with an accuracy that surpasses human analysts. This performance is on par with, and sometimes exceeds, state-of-the-art specialized machine learning models.
- Economic Insights: GPT-4 doesn’t just rely on training memory but generates useful narrative insights about a company’s future performance. This capability allows it to form predictions that are often more accurate than those made by human analysts.
- Market Performance: Trading strategies based on GPT-4’s predictions yield higher Sharpe ratios and alphas compared to those based on other models, suggesting potential financial benefits.
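For readers less familiar with the Sharpe ratio mentioned above, here is a rough sketch of the standard calculation: mean excess return divided by the standard deviation of excess returns, scaled to a yearly figure. The function name `annualized_sharpe`, the monthly frequency, the zero risk-free rate and the sample returns are all assumptions of mine for illustration and are not taken from the paper.

```python
import math
import statistics


def annualized_sharpe(period_returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from a list of periodic returns.

    Assumes simple returns per period (monthly by default) and a constant
    risk-free rate per period; both are illustrative simplifications.
    """
    excess = [r - risk_free_per_period for r in period_returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns to estimate volatility")

    mean_excess = statistics.mean(excess)
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")

    # Scale the per-period ratio by sqrt(periods per year) to annualize it.
    return mean_excess / volatility * math.sqrt(periods_per_year)


# Made-up monthly strategy returns, purely for illustration.
monthly_returns = [0.01, 0.03, -0.01, 0.02, 0.00]
print(f"annualized Sharpe: {annualized_sharpe(monthly_returns):.2f}")
```

A higher Sharpe ratio means more return per unit of risk taken, but it says nothing about whether the result survives transaction costs or competition from other traders.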
My Perspective: LLMs and Market Predictions
While the study convincingly demonstrates that LLMs like GPT-4 can predict earnings changes effectively, it’s important to distinguish between predicting earnings and predicting market prices.
Earnings predictions do not directly translate to market predictions, especially in highly liquid markets like the US stock market.
The reality is that in liquid markets, where numerous participants are using advanced machine learning techniques, beating the market consistently is incredibly challenging. These methods have become widespread, and any potential edge from using LLMs would likely be arbitraged away quickly. Thirty years ago, there might have been more opportunities to exploit such models, but in today’s highly efficient markets, the competition is fierce.
Where LLMs Can Make a Difference
If one aims to leverage LLMs and machine learning to “beat the market,” the focus should shift to less liquid markets. Illiquid markets, such as certain niche commodities or retail prices (or, for example, the price of soccer players), may offer more opportunities for these models to provide a competitive edge. In these markets, the relative lack of sophisticated competition and slower dissemination of information could allow LLMs to identify and exploit inefficiencies more effectively. But with the explosion in the use of LLMs and machine learning models, we will likely see these markets become as efficient as financial markets in how they are priced.
Conclusion
While LLMs show great promise in performing financial analysis and predicting earnings, their ability to outperform in highly liquid markets remains questionable. For those looking to leverage these advanced models, the key may lie in targeting illiquid markets where their predictive power can provide a true advantage.
Contact:
+45 52 50 25 06
LC@paice.io