
Quantifying a firm's AI engagement: Constructing objective, data-driven, AI stock indices using 10-K filings

DOI: https://doi.org/10.1016/j.t...
Article publication date: January 2, 2025
Post publication date: April 2, 2026

Can an algorithm reading corporate filings build a better investment index than professional managers?

Lennart Ante and Aman Saggu study AI stock classification and index construction in their paper "Quantifying a firm's AI engagement: Constructing objective, data-driven, AI stock indices using 10-K filings".

They apply natural language processing to annual 10-K filings from 3,395 NASDAQ-listed firms between 2010 and 2022, deriving binary and weighted AI engagement scores for each firm.
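
To make the idea of binary and weighted scores concrete, here is a minimal sketch of keyword-based scoring. It is not the authors' pipeline: the keyword list `AI_TERMS`, the helper names, and the choice to define a firm's weight as its share of all AI mentions in the sample are assumptions made purely for illustration.

```python
import re

# Hypothetical keyword list (assumption); the paper's actual dictionary is not reproduced here.
AI_TERMS = ["artificial intelligence", "machine learning", "deep learning", "neural network"]


def ai_mentions(text, terms=AI_TERMS):
    """Count case-insensitive whole-phrase matches, allowing a plural 's'."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(term) + r"s?\b", lowered))
        for term in terms
    )


def binary_score(text):
    """1 if the filing mentions any AI term at least once, else 0."""
    return 1 if ai_mentions(text) > 0 else 0


def weighted_scores(filings):
    """Map each firm to its share of all AI mentions (assumed weighting scheme).

    filings: dict of firm name -> filing text.
    """
    counts = {firm: ai_mentions(text) for firm, text in filings.items()}
    total = sum(counts.values())
    if total == 0:
        return {firm: 0.0 for firm in counts}
    return {firm: count / total for firm, count in counts.items()}


filings = {
    "FirmA": "We use machine learning and artificial intelligence. Machine learning models drive pricing.",
    "FirmB": "We manufacture and sell office furniture.",
}
binary = {firm: binary_score(text) for firm, text in filings.items()}
weights = weighted_scores(filings)
```

In this toy sample, only `FirmA` would count as an AI stock under the binary score, and it would carry the entire index weight.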

Their main conclusions include:

AI mentions in corporate 10-K filings grew dramatically over the sample period, rising from 7-11 filings per year between 2010 and 2015 to 527 in 2022.
Companies classified as AI stocks earned cumulative average abnormal returns of 17.25% in the three months following ChatGPT's launch in November 2022, compared with 11.59% for non-AI stocks.
AI index weights are significant positive predictors of abnormal returns, and indices based on more recent disclosures show stronger predictive power than those placing greater weight on historical AI communications.
The four NLP-based indices outperform existing AI-themed ETFs, delivering a mean daily return of 0.076% versus 0.056% without exhibiting higher volatility.
The NLP indices achieved mean Sharpe and Sortino ratios of 0.039 and 0.037, compared with 0.029 and 0.028 for existing AI ETFs, indicating more return per unit of risk.
Index providers may be overcharging investors for products that an NLP-based approach can match or exceed at a fraction of the cost, as there is no positive correlation between expense ratios and daily returns among 14 AI-themed ETFs.
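
For readers unfamiliar with the Sharpe and Sortino ratios cited above, the sketch below shows one common way to compute them from a series of periodic returns. The summary does not say which risk-free rate, target return, or estimator the authors used, so the zero defaults, the sample standard deviation, and the function names are assumptions.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by downside deviation (only shortfalls below target count)."""
    excess = [r - target for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    return mean / downside


# Made-up daily returns, for illustration only.
daily_returns = [0.01, -0.01, 0.02, -0.02, 0.01]
sharpe = sharpe_ratio(daily_returns)
sortino = sortino_ratio(daily_returns)
```

The Sharpe ratio penalises all variability, while the Sortino ratio penalises only returns below the target. Both functions fail with a division by zero if the relevant deviation is zero, which a production version would need to handle.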

For index providers and ETF sponsors, these findings make a compelling case for replacing subjective asset-selection criteria with transparent, disclosure-based metrics.

As a limitation, the methodology relies on keyword frequency in 10-K filings, which cannot distinguish substantive AI integration from superficial or aspirational mentions.

The study is also confined to NASDAQ-listed U.S. companies, limiting generalisability to other markets and reporting regimes.
