
Behind ESG ratings: unpacking sustainability metrics

Do ESG ratings really tell investors how sustainable a company is?

In their OECD report "Behind ESG ratings: unpacking sustainability metrics", Barbara Bijelic, Benjamin Michel and Konstantin Mann examine the scope and characteristics of the ESG metrics used by leading rating providers.

They assemble a dataset of over 2,000 individual metrics, classify each one across 23 sustainability topics, and conduct structured interviews with rating providers and reporting-framework bodies.

Their main conclusions include:

  • Coverage varies sharply across topics: corporate governance is assessed using over 20 metrics on average per product, while biodiversity, business resilience, and community relations rely on fewer than 5 metrics.
  • Methodological divergence is striking, with one rating product using 28 times more metrics than another to measure corporate governance, and metric counts for GHG emissions ranging from 1 to 47.
  • 68% of metrics capture self-reported policies and activities, while only 30% measure actual outputs such as emissions, accidents, or pay gaps.
  • Qualitative data dominates at 72%, and only 17% of all metrics are quantitative output measures, raising the risk that ratings reward disclosure and process rather than measurable progress.
  • Fewer than 5% of metrics are explicitly forward-looking or dynamic, and only 7% capture supply chain risks.
  • Controversy-based metrics account for 15% of all metrics and tend to flag past incidents rather than measure the quality of ongoing due diligence.

The paper shows that sustainability performance is overwhelmingly assessed through inputs rather than outcomes, and that most ratings are silent on transition pathways and value-chain accountability.

These findings reinforce the case for the ESRS and ISSB shift toward process-based disclosure on due diligence, and suggest that current rating products lag behind regulatory expectations rather than support them.

The study focuses on metric design rather than scoring methodology: it leaves aside how metrics are weighted and aggregated into a score, which is itself a source of divergence.

A follow-up analysis covering scoring weights, sector-level overrides, and methodology updates following the CSRD and the EU's regulation of ESG rating agencies would clarify whether convergence improves as standardised reporting is adopted.