
Quantifying a firm's AI engagement: Constructing objective, data-driven, AI stock indices using 10-K filings

Can an algorithm reading corporate filings build a better investment index than professional managers?

Lennart Ante and Aman Saggu study AI stock classification and index construction in their paper "Quantifying a firm's AI engagement: Constructing objective, data-driven, AI stock indices using 10-K filings".

They apply natural language processing to annual 10-K filings from 3,395 NASDAQ-listed firms between 2010 and 2022, deriving binary and weighted AI engagement scores for each firm.
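
As a rough illustration of how binary and weighted scores could be derived from filing text, the sketch below counts keyword matches per filing. The keyword list, the function names, and the choice to normalise raw counts into weights are all assumptions made for this example; the paper's actual dictionary and weighting schemes are not reproduced here.

```python
import re

# Hypothetical keyword list (assumption): not the paper's actual dictionary.
AI_TERMS = [
    "artificial intelligence",
    "machine learning",
    "deep learning",
    "neural network",
]

# One case-insensitive pattern that matches any term at a word start.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in AI_TERMS) + r")",
    re.IGNORECASE,
)


def count_ai_mentions(text: str) -> int:
    """Number of AI keyword matches in one filing's text."""
    return len(_PATTERN.findall(text))


def binary_score(text: str) -> int:
    """1 if the filing mentions any AI term at least once, else 0."""
    return 1 if count_ai_mentions(text) > 0 else 0


def weighted_scores(filings: dict[str, str]) -> dict[str, float]:
    """Each firm's share of all AI mentions (assumed weighting scheme)."""
    counts = {firm: count_ai_mentions(text) for firm, text in filings.items()}
    total = sum(counts.values())
    if total == 0:
        return {firm: 0.0 for firm in filings}
    return {firm: n / total for firm, n in counts.items()}


# Toy filings, invented for illustration only.
filings = {
    "A": "We use machine learning and Artificial Intelligence.",
    "B": "We sell shoes.",
    "C": "Deep learning drives our neural networks; machine learning too.",
}
binary = {firm: binary_score(text) for firm, text in filings.items()}
weights = weighted_scores(filings)
```

A pure frequency count like this treats every mention equally, whether it describes a deployed system or a passing aspiration.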

Their main conclusions include:

  • AI mentions in corporate 10-K filings grew dramatically over the sample period, rising from 7-11 filings per year between 2010 and 2015 to 527 in 2022.
  • Companies classified as AI stocks earned cumulative average abnormal returns of 17.25% in the three months following ChatGPT's launch in November 2022, compared with 11.59% for non-AI stocks.
  • AI index weights are significant positive predictors of abnormal returns, and indices based on more recent disclosures show stronger predictive power than those placing greater weight on historical AI communications.
  • The four NLP-based indices outperform existing AI-themed ETFs, delivering a mean daily return of 0.076% versus 0.056% without exhibiting higher volatility.
  • The NLP indices achieved mean Sharpe and Sortino ratios of 0.039 and 0.037, compared with 0.029 and 0.028 for existing AI ETFs, indicating more efficient return generation per unit of risk.
  • Index providers may be overcharging investors for products that an NLP-based approach can match or exceed at a fraction of the cost, as there is no positive correlation between expense ratios and daily returns among 14 AI-themed ETFs.
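
For readers unfamiliar with the two risk-adjusted measures cited above, here is a minimal sketch of how Sharpe and Sortino ratios can be computed from a series of daily returns. It assumes a zero risk-free rate and target return, no annualisation, and a downside deviation taken over all observations; the paper's exact conventions may differ, and the return series is invented for illustration.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns: list[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    if sd == 0:
        raise ValueError("returns have zero volatility")
    return mean(excess) / sd


def sortino_ratio(returns: list[float], target: float = 0.0) -> float:
    """Mean excess return divided by downside deviation below the target."""
    excess = [r - target for r in returns]
    # Only shortfalls below the target contribute to downside risk.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0:
        raise ValueError("no returns fall below the target")
    return mean(excess) / downside_dev


# Made-up daily returns (assumption), not data from the study.
daily_returns = [0.01, -0.02, 0.03, 0.00, -0.01]
sharpe = sharpe_ratio(daily_returns)
sortino = sortino_ratio(daily_returns)
```

The Sharpe ratio penalises all volatility, while the Sortino ratio penalises only returns below the target, so the two can rank the same indices differently.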

For index providers and ETF sponsors, these findings make a compelling case for replacing subjective asset-selection criteria with transparent, disclosure-based metrics.

One limitation is that the methodology relies on keyword frequency in 10-K filings, which cannot distinguish substantive AI integration from superficial or aspirational mentions.

The study is also confined to NASDAQ-listed U.S. companies, limiting generalisability to other markets and reporting regimes.