
Impact measurement frameworks: IMP, IRIS+, and beyond

From input metrics to genuine outcomes measurement.

Editorial Team

Key takeaways

  • The Impact Management Project's five dimensions provide a structured ontology for impact claims, but the framework only works when applied before capital deployment, not after
  • IRIS+ offers over 600 standardised metrics; most family offices use fewer than 20, and a significant subset of those are input or activity metrics rather than genuine outcomes
  • Theory of change is the connective tissue between an investment thesis and measurable impact—without it, even well-collected data tells no coherent story
  • Regulatory pressure from SFDR Article 9, the EU Taxonomy, and emerging SEC climate disclosure rules is forcing disclosure discipline, but compliance metrics and decision-useful metrics are rarely the same thing
  • A practical impact measurement stack for a family office should distinguish between three tiers: operational indicators, portfolio-level KPIs, and strategic outcomes—each with different update frequencies and decision triggers
  • Independent verification of impact claims remains structurally underused; fewer than 15% of impact reports from private market funds include third-party data validation, according to the GIIN's 2023 survey
  • The honest test for any metric is whether it would change an allocation decision if it moved adversely—if not, it is reporting decoration, not measurement

Why most impact data sits in reports rather than decision rooms

The Global Impact Investing Network's 2023 State of the Market report surveyed 313 impact investors managing approximately $1.164 trillion in assets. Among private market allocators, 98% reported collecting impact data. Among the same group, fewer than 40% said that data materially influenced portfolio construction or manager selection. The gap is not a data collection problem. It is a framework design problem—and family offices, which often lack the dedicated impact teams of large institutional allocators, are particularly exposed to it.

The proliferation of impact frameworks over the past decade has compounded the issue. IMP, IRIS+, the UN SDGs, B Impact Assessment, the EU Taxonomy's Do No Significant Harm criteria, SFDR Principal Adverse Impact (PAI) indicators: each framework serves a distinct purpose, and each was designed by a different constituency with different incentives. A family office principal seeking genuine feedback on whether their capital is generating real-world change needs to impose their own architecture on this landscape rather than accept any single framework wholesale.

The IMP framework: a useful ontology, not a measurement system

The Impact Management Project, a time-bound forum that ran from 2016 to 2021 before its work passed to successor bodies including the Impact Management Platform and Impact Frontiers, produced what remains the most coherent conceptual architecture in impact investing. Its five dimensions (What, Who, How Much, Contribution, and Risk) provide a shared language for describing impact claims with enough precision to enable comparison across asset classes and geographies.

The 'What' dimension asks what outcomes an investment generates, and whether those outcomes represent improvements or harms. 'Who' asks whose lives are affected, with particular attention to whether beneficiaries are underserved relative to the counterfactual. 'How Much' addresses depth, breadth, and duration of change—acknowledging that a shallow improvement affecting one million people may or may not be preferable to a transformative improvement affecting ten thousand. 'Contribution' is arguably the most demanding dimension, requiring honest assessment of whether the investment caused the impact or merely accompanied it. 'Risk' addresses the probability that outcomes will differ from expectations.

The IMP's contribution dimension is where most impact claims quietly collapse. An investment in a profitable solar developer in OECD markets may generate genuine environmental benefit, but if that capital would have been raised regardless, the investor's contribution to the impact is minimal. Honest application of the framework forces this question before, not after, the commitment is made.

The practical limitation of IMP for family offices is precisely its strength: it is a classification and communication framework, not a data collection or aggregation system. It tells you what questions to ask, not how to answer them with numbers. For a single-family office running a concentrated impact portfolio of ten to fifteen direct investments, IMP provides the qualitative scaffolding for investment memos and annual impact reviews. For a multi-family office aggregating data across dozens of fund managers, IMP alone cannot produce the standardised, comparable outputs that governance committees need.

IRIS+ and the standardisation trade-off

What IRIS+ does well

IRIS+, maintained by the GIIN, addresses the data collection problem that IMP leaves open. Its catalogue of over 600 standardised metrics, organised into thematic taxonomies aligned with the SDGs, allows investors to request comparable data from fund managers across a portfolio. The system's Core Metric Sets—curated shortlists of 10 to 20 metrics for specific impact themes such as financial inclusion, food security, or clean energy—are particularly useful for family offices that want consistency without building a bespoke data architecture from scratch.

For a family office with concentrated exposure to, say, climate infrastructure and smallholder agriculture, using IRIS+ Core Metric Sets for those two themes provides reasonable cross-manager comparability with a manageable reporting burden. Metrics such as PI9468 (number of low-income clients served), OI7301 (total greenhouse gas emissions reduced or avoided in metric tonnes of CO2 equivalent), and FP5271 (number of smallholder farmers with increased incomes) are defined tightly enough to resist creative interpretation by managers seeking to present favourable data.

The metric selection problem

The deeper problem with IRIS+ is not its catalogue but how practitioners use it. Across a sample of 47 impact fund reports reviewed by BlueMark in its 2022 Making the Mark study, the most commonly reported IRIS+ metrics were overwhelmingly inputs and activities rather than outcomes: number of investments made, capital deployed, number of companies in portfolio, jobs supported. These metrics are easy to collect, difficult to dispute, and tell almost nothing about whether lives improved or ecosystems recovered.

The distinction matters operationally. An input metric measures what you put in—dollars committed, hectares financed, loans extended. An activity metric measures what you did—number of training sessions delivered, tonnes of waste processed. An output metric measures the immediate, observable result—number of participants trained, megawatt-hours of clean energy produced. An outcome metric measures the change in the state of the world that the activity was intended to produce—whether participants found better employment, whether local air quality improved. Impact metrics, strictly defined, add counterfactual discipline to outcomes—what would have happened anyway versus what happened because of the intervention.

A family office that reports 'jobs supported' without specifying whether those jobs represent net employment creation, whether they are formal versus informal, and whether they would have existed without the investment has not measured impact. It has decorated a report with a number.
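The distinction between stages can be made mechanical in a reporting template. The following is a minimal Python sketch, not a standard implementation: the metric names are hypothetical labels rather than IRIS+ identifiers, and impact is assumed to be the simple difference between the observed change and an estimated counterfactual.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    INPUT = 1      # what was put in
    ACTIVITY = 2   # what was done
    OUTPUT = 3     # immediate, observable result
    OUTCOME = 4    # change in the state of the world


@dataclass(frozen=True)
class Metric:
    name: str      # hypothetical label, not an IRIS+ identifier
    stage: Stage


def outcome_share(metrics):
    """Fraction of reported metrics that sit at the outcome stage."""
    if not metrics:
        return 0.0
    outcomes = [m for m in metrics if m.stage is Stage.OUTCOME]
    return len(outcomes) / len(metrics)


def net_impact(observed_change, counterfactual_change):
    """Outcome change attributable to the intervention.

    Assumes impact is the observed change minus the change that
    would have happened anyway (the counterfactual estimate).
    """
    return observed_change - counterfactual_change


# Illustrative report: one metric at each stage of the chain.
report = [
    Metric("capital_deployed_usd", Stage.INPUT),
    Metric("training_sessions_delivered", Stage.ACTIVITY),
    Metric("participants_trained", Stage.OUTPUT),
    Metric("participants_in_better_employment", Stage.OUTCOME),
]

print(outcome_share(report))   # 0.25
print(net_impact(400, 250))    # 150
```

A report whose outcome share is close to zero is describing effort, not results. The counterfactual estimate is the hard part in practice, and the subtraction above only shows where it enters the calculation.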

Theory of change as the connective tissue

Theory of change—the explicit causal chain linking an investment's activities to the outcomes it intends to produce—is the analytical tool that converts both IMP and IRIS+ from reporting exercises into genuine measurement systems. Without a documented theory of change, metric selection is arbitrary: you measure what is available, not what matters.

A rigorous theory of change for a workforce development fund, for instance, would specify the population served (low-income adults in OECD cities with post-secondary education debt), the intervention (subsidised vocational training in high-demand technical trades), the immediate outputs (completion certificates, placement rates), the intermediate outcomes (wage levels 12 months post-placement, employer retention rates), and the long-term outcomes (household income trajectories, intergenerational education outcomes). Each link in this chain is a testable hypothesis. The metrics a family office should demand from that fund manager are precisely the data points that can confirm or refute each hypothesis—not the data points that are easiest to collect.
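The workforce development example can be written down as a chain of testable links. The sketch below is one possible representation in Python, assuming hypothetical metric names and purely illustrative targets and observations. It returns the earliest link in the chain that is either untested or falling short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    stage: str                        # position in the causal chain
    hypothesis: str                   # what must hold for the chain to work
    metric: str                       # hypothetical data point requested from the manager
    target: float                     # illustrative threshold (an assumption)
    observed: Optional[float] = None  # None until data has been reported


def first_unsupported_link(chain):
    """Return the earliest link that is untested or misses its target."""
    for link in chain:
        if link.observed is None or link.observed < link.target:
            return link
    return None


# All numbers below are made up for illustration.
chain = [
    Link("output", "Trainees complete the programme",
         "completion_rate", target=0.70, observed=0.82),
    Link("output", "Graduates are placed in jobs",
         "placement_rate", target=0.60, observed=0.65),
    Link("intermediate outcome", "Placed graduates earn more after 12 months",
         "median_wage_gain_12m", target=0.10, observed=0.04),
    Link("long-term outcome", "Household income rises over time",
         "household_income_change", target=0.05),
]

weak = first_unsupported_link(chain)
print(weak.metric)   # median_wage_gain_12m
```

In this made-up case the outputs look healthy while the first outcome hypothesis fails, which is exactly the pattern that output-only reporting would hide.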

Theory of change also clarifies the timeline problem that afflicts most impact measurement. Genuine outcomes—reduced child mortality, improved soil carbon, narrowed gender wage gaps—manifest over years or decades. Portfolio reporting cycles run quarterly or annually. Family offices need to decide explicitly which metrics belong to which time horizon, and resist the pressure to substitute short-term proxies for long-term outcomes in ways that obscure rather than illuminate.

Regulatory pressure and the compliance-versus-decision-useful distinction

European family offices, and non-European family offices holding UCITS or AIFMD-regulated fund interests, are now operating under SFDR disclosure requirements that mandate Principal Adverse Impact statements for Article 8 and Article 9 products. The EU Taxonomy adds a further layer, requiring reporting on the proportion of investments that qualify as environmentally sustainable under its six environmental objectives and Do No Significant Harm criteria. The SEC's climate disclosure rules, adopted in March 2024 with compliance phased in from the largest registrants first and since stayed pending legal challenge, point the companies and funds held by US-domiciled family offices in a similar direction.

The practical risk is that compliance-driven reporting creates an illusion of measurement rigour while consuming the organisational capacity that would otherwise support genuine decision-useful analysis. SFDR's 18 mandatory PAI indicators include metrics such as GHG intensity of investee companies, exposure to fossil fuels, and violations of UN Global Compact (UNGC) principles. These are useful data points for risk screening. They are not impact metrics in the IMP sense: they describe characteristics of investments, not changes in the state of the world attributable to capital deployment.

Family offices should maintain a deliberate separation between their compliance reporting stack and their impact measurement stack. The former is driven by regulatory obligation, jurisdiction, and legal structure. The latter should be driven entirely by the investment thesis and theory of change. Conflating them produces portfolios that score well on PAI indicators while generating opaque or unmeasured real-world outcomes.

Building a decision-useful impact measurement architecture

The three-tier model

A practical architecture for a family office managing between $200 million and $1 billion in impact-oriented assets should operate on three tiers, each with distinct metrics, update frequencies, and decision triggers:

  • Operational tier: inputs, activities, and outputs that fund managers report quarterly, such as capital deployed, units produced, and customers served. These metrics are used to monitor whether portfolio companies are executing their business plans, not to assess impact.
  • Portfolio tier: intermediate outcomes reported annually, such as employment quality metrics, client income changes, and environmental performance verified against baseline. These metrics feed the annual impact review and inform decisions about follow-on capital or manager re-up.
  • Strategic tier: long-term outcomes assessed every three to five years, meaning population-level changes in the geographies or sectors where the family office concentrates its impact mandate. These metrics require external data sources and independent verification, and this is the only tier that genuinely tests whether the portfolio's theory of change is correct.
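The tiering can be captured in a small configuration so that each metric carries its own review cadence and decision trigger. This is a hedged sketch in Python: the metric names are hypothetical, and the 48-month cadence for the strategic tier is an assumed midpoint of the three-to-five-year range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    review_every_months: int   # update frequency
    decision: str              # what this tier is allowed to trigger


# Cadences follow the text; 48 months is an assumed midpoint
# of "every three to five years".
TIERS = {
    "operational": Tier("operational", 3, "monitor business-plan execution"),
    "portfolio": Tier("portfolio", 12, "follow-on capital or manager re-up"),
    "strategic": Tier("strategic", 48, "test the theory of change"),
}

# Hypothetical metric names mapped to tiers.
METRIC_TIER = {
    "capital_deployed": "operational",
    "customers_served": "operational",
    "client_income_change": "portfolio",
    "regional_poverty_rate": "strategic",
}


def metrics_due(months_elapsed):
    """Metrics whose tier review falls on this month count."""
    return sorted(
        name for name, tier in METRIC_TIER.items()
        if months_elapsed % TIERS[tier].review_every_months == 0
    )


print(metrics_due(3))    # ['capital_deployed', 'customers_served']
print(metrics_due(12))   # ['capital_deployed', 'client_income_change', 'customers_served']
```

Keeping the decision trigger next to the cadence makes it harder for a quarterly operational figure to be presented as evidence for a strategic conclusion.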

Independent verification and the integrity gap

The GIIN's 2023 survey finding that fewer than 15% of impact reports include third-party data validation reflects a structural problem in private markets impact investing: the entity with the strongest interest in favourable impact data is the same entity collecting and reporting it. For family offices making direct investments, commissioning independent social or environmental assessments at the mid-point and exit of each investment is operationally feasible and considerably cheaper than the cost of a failed investment. For fund investments, requesting third-party verification as a condition of re-up is a negotiating lever that larger, more sophisticated LPs are beginning to use.

Verification need not mean full impact audits—a term that has no established professional standard. It means requiring that at least the key outcome metrics in a fund's impact report be traced to primary data sources (beneficiary surveys, administrative records, satellite imagery, third-party databases) rather than manager estimates, and that a qualified external reviewer has confirmed the data collection methodology is fit for purpose. This is achievable, affordable, and the single most effective intervention available to family offices seeking to move from reporting decoration to genuine measurement.

The adversarial metric test

A final discipline worth institutionalising in any family office impact governance process is what might be called the adversarial metric test. For every metric in the portfolio impact report, the investment committee should ask: if this number moved adversely by 20% next year, would we change our allocation to this manager or this theme? If the honest answer is no, the metric is reporting decoration and should be deprioritised in favour of metrics that would actually inform a decision. This test tends to eliminate most input metrics, most activity counts, and most SDG-alignment scores, while preserving outcome metrics tied directly to the theory of change. The resulting shortlist will be shorter and harder to fill with favourable data—which is precisely the point.
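One way to make the test answerable is to ask the committee to record, for each metric, the level at which it would actually change an allocation. The Python sketch below assumes that convention: metric names, values, and thresholds are all hypothetical, and a metric with no recorded threshold is treated as decoration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricCheck:
    name: str                          # hypothetical metric label
    value: float                       # latest reported value
    higher_is_better: bool
    action_threshold: Optional[float]  # level that would change the allocation, if any


def stressed(value, higher_is_better, shock=0.20):
    """Value after an adverse move of `shock` (20% by default)."""
    return value * (1 - shock) if higher_is_better else value * (1 + shock)


def is_decision_useful(m, shock=0.20):
    """True if the adverse move would cross the committee's action threshold."""
    if m.action_threshold is None:
        return False   # nobody has said what level would change the allocation
    v = stressed(m.value, m.higher_is_better, shock)
    if m.higher_is_better:
        return v < m.action_threshold
    return v > m.action_threshold


# Illustrative values only.
metrics = [
    MetricCheck("capital_deployed_usd_m", 50.0, True, None),
    MetricCheck("median_wage_gain_12m", 0.12, True, 0.10),
    MetricCheck("sdg_alignment_score", 0.9, True, None),
    MetricCheck("tco2e_per_unit", 1.0, False, 1.1),
]

shortlist = [m.name for m in metrics if is_decision_useful(m)]
print(shortlist)   # ['median_wage_gain_12m', 'tco2e_per_unit']
```

The threshold is a simplification of what is ultimately a judgement call, but forcing it to be written down in advance is what keeps the answer honest.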
