
What Is NIST AI RMF MEASURE?

Author: Derrick D. Jackson
Title: Founder & Senior Director of Cloud Security Architecture & Risk
Credentials: CISSP, CRISC, CCSP
Last updated: January 4th, 2026

What is NIST AI RMF Measure?

It’s the function that answers a critical question: can you prove your AI system actually works as intended?

MEASURE in 60 Seconds: You’ve mapped your AI system’s context. Now what? MEASURE is where you actually test whether things work as intended. It’s the third function of NIST’s AI Risk Management Framework, and it answers a critical question: can you prove your AI system is trustworthy? UnitedHealth learned this lesson publicly when their nH Predict algorithm showed a 90% error rate on appeals. MEASURE won’t catch every problem, but it creates the evidence trail that separates “we thought it worked” from “we verified it works.”

Who This Article Is For

  • New to AI governance? Start here for foundational concepts in plain language.
  • Compliance or security professional? You’ll find RACI matrices and framework crosswalks for implementation.
  • Executive? Focus on the Business Impact section and the 60-second summary above.

This article covers MEASURE at an introductory level. For context-setting, see the NIST AI RMF MAP guide. For organizational foundations, review the GOVERN function.


Executive Summary

MEASURE is the analytical engine of NIST’s AI Risk Management Framework. It employs quantitative, qualitative, and mixed-method tools to analyze, assess, benchmark, and monitor AI risk. The function includes 22 subcategories across four categories.

Where MAP identifies what could go wrong, MEASURE determines whether it actually is going wrong. The function evaluates trustworthiness characteristics including validity, reliability, safety, security, privacy, fairness, and explainability. Skip this step and your risk decisions lack evidence.


Why NIST AI RMF MEASURE Matters: A Real Example

The UnitedHealth nH Predict case demonstrates what happens when organizations deploy AI without adequate measurement.

UnitedHealth’s subsidiary NaviHealth developed an algorithm called nH Predict to evaluate post-acute care claims for Medicare Advantage beneficiaries. The system predicted how long patients needed rehabilitation care, then flagged claims extending beyond that prediction for denial. According to court documents in Estate of Gene B. Lokken v. UnitedHealth Group, the algorithm had a 90% error rate (nine of ten appealed denials were reversed).

Specific MEASURE failures in this case:

  • MEASURE 2.5 violated: System validity and reliability were not demonstrated before deployment
  • MEASURE 2.6 absent: Safety risks weren’t evaluated (patients were discharged prematurely)
  • MEASURE 3.1 missing: No mechanisms tracked the 90% appeal reversal pattern
  • MEASURE 4.2 gap: No validation that system performed consistently as intended

In February 2025, a federal court allowed breach of contract claims to proceed. By September 2025, the judge denied UnitedHealth’s attempt to limit discovery. Had MEASURE activities been implemented, the error patterns would have surfaced before families faced $14,000 in monthly care costs.
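
Tracking like this doesn’t require sophisticated tooling. Here’s a minimal sketch of the kind of appeal-outcome monitoring MEASURE 3.1 calls for; the 20% alert threshold and the function name are assumptions for illustration, not values from NIST or the case record:

```python
from collections import Counter

def appeal_reversal_rate(outcomes):
    """Fraction of appealed denials that were overturned.

    outcomes: iterable of strings, each "upheld" or "reversed".
    """
    counts = Counter(outcomes)
    total = counts["upheld"] + counts["reversed"]
    return counts["reversed"] / total if total else None

# Hypothetical alert threshold -- nH Predict reportedly hit ~90%.
REVERSAL_ALERT_THRESHOLD = 0.20

outcomes = ["reversed"] * 9 + ["upheld"]  # nine of ten appeals reversed
rate = appeal_reversal_rate(outcomes)
if rate is not None and rate > REVERSAL_ALERT_THRESHOLD:
    print(f"ALERT: appeal reversal rate {rate:.0%} exceeds threshold")
```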


Business Impact: Why Executives Should Care

The UnitedHealth case isn’t isolated. Workday faces a certified collective action over alleged AI hiring discrimination. A 2024 University of Washington study (Wilson & Caliskan) found that AI resume-screening systems favored white-associated names 85% of the time, compared with just 9% for Black-associated names.

Organizations deploying AI without proper measurement face legal exposure (class actions, regulatory investigations), operational costs (system shutdowns, mandatory audits), and reputational damage. MEASURE creates documentation demonstrating due diligence. When regulators ask “how do you know this works?”, MEASURE provides answers.


Understanding NIST AI RMF MEASURE’s Role

Four functions comprise the framework: Govern, Map, Measure, Manage. MAP establishes context for individual AI systems. MEASURE takes that context and applies rigorous analysis through Test, Evaluation, Verification, and Validation (TEVV) processes. The outputs feed directly into MANAGE for risk response decisions.

One critical distinction: MEASURE isn’t a one-time gate. AI systems drift over time as real-world data diverges from training data. Continuous measurement catches problems before they become lawsuits.
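
What does continuous measurement look like in practice? One widely used drift check (an illustrative choice, not one NIST mandates) is the population stability index, which compares the distribution a model was trained on against the data it now sees:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a feature's training-era distribution to live data.

    PSI > 0.25 is a common rule of thumb for significant drift.
    """
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_counts, _ = np.histogram(expected, bins=edges)
    act_counts, _ = np.histogram(actual, bins=edges)
    # Convert counts to proportions; floor at 1e-6 to avoid log(0).
    exp_pct = np.clip(exp_counts / exp_counts.sum(), 1e-6, None)
    act_pct = np.clip(act_counts / act_counts.sum(), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

rng = np.random.default_rng(0)
train = rng.normal(60, 10, 5000)  # e.g., patient ages at training time
live = rng.normal(70, 10, 5000)   # live population has shifted older
psi = population_stability_index(train, live)
print(f"PSI = {psi:.2f}", "-> drift" if psi > 0.25 else "-> stable")
```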


The Four Categories of NIST AI RMF MEASURE

Each category addresses different measurement dimensions per NIST AI 100-1:

| Category | Focus | Subcategories |
| --- | --- | --- |
| MEASURE 1 | Methods and metrics selection | 3 |
| MEASURE 2 | Trustworthiness evaluation | 13 |
| MEASURE 3 | Risk tracking over time | 3 |
| MEASURE 4 | Measurement feedback | 3 |

MEASURE 1 selects appropriate metrics starting with highest-priority risks. MEASURE 2 evaluates trustworthiness characteristics (safety, security, fairness, privacy, explainability). MEASURE 3 implements continuous monitoring. MEASURE 4 validates measurement approaches in deployment contexts.
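
As a concrete example of MEASURE 2.5-style validity and reliability testing, the sketch below scores a model’s predictions on held-out data, then checks how stable that score is under resampling. It’s a minimal illustration under assumed inputs; a real TEVV plan covers far more than accuracy:

```python
import numpy as np

def holdout_accuracy(y_true, y_pred):
    """Validity: does the system get held-out cases right?"""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float((y_true == y_pred).mean())

def accuracy_bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Reliability: how stable is the accuracy score under resampling?"""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    scores = [
        (y_true[idx] == y_pred[idx]).mean()
        for idx in (rng.integers(0, n, n) for _ in range(n_boot))
    ]
    lo, hi = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # ground-truth outcomes
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]  # model decisions
print(holdout_accuracy(y_true, y_pred))       # 0.8
print(accuracy_bootstrap_ci(y_true, y_pred))  # e.g., (0.5, 1.0)
```

A wide confidence interval on a small evaluation set is itself a finding: it means you cannot yet claim the system performs reliably.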


How to Implement NIST AI RMF MEASURE: Top 5 Starting Points

Not all 22 subcategories require equal effort. Start here:

| Priority | Subcategory | Why Start Here |
| --- | --- | --- |
| 1 | MEASURE 2.5 (Validity & Reliability) | Foundation for all trustworthiness claims |
| 2 | MEASURE 2.11 (Fairness & Bias) | Highest litigation risk area currently |
| 3 | MEASURE 1.3 (Independent Assessment) | Mitigates confirmation bias from developers |
| 4 | MEASURE 3.1 (Risk Tracking) | Catches drift and emergent risks post-deployment |
| 5 | MEASURE 2.1 (TEVV Documentation) | Creates audit trail for all testing activities |
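
For MEASURE 2.11, one standard bias metric (an illustrative choice; NIST doesn’t prescribe a specific one) is the disparate impact ratio behind the EEOC four-fifths rule. The input numbers below are constructed to echo the resume-screening study cited earlier:

```python
def disparate_impact_ratio(selected, group):
    """Ratio of the lowest group selection rate to the highest.

    selected: parallel list of 0/1 outcomes (1 = advanced/hired)
    group: parallel list of group labels
    Values below 0.8 fail the four-fifths rule of thumb.
    """
    rates = {}
    for g in set(group):
        picks = [s for s, gg in zip(selected, group) if gg == g]
        rates[g] = sum(picks) / len(picks)
    return min(rates.values()) / max(rates.values())

# Illustrative inputs echoing the study's 85% vs. 9% preference rates.
selected = [1] * 85 + [0] * 15 + [1] * 9 + [0] * 91
group = ["W"] * 100 + ["B"] * 100
print(f"{disparate_impact_ratio(selected, group):.2f}")  # ~0.11, far below 0.8
```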

RACI Matrix for MEASURE Implementation

Who owns what? RACI clarifies accountability:

| Activity | Accountable | Responsible | Consulted | Informed |
| --- | --- | --- | --- | --- |
| Metric Selection | CRO / AI Governance Lead | AI Risk / TEVV Teams | Domain Experts | Product Owners |
| System Evaluation | CTO | Independent Assessors | AI Developers | Legal |
| Safety & Security Testing | CISO | Red Teams / Security Engineers | Safety Engineers | Incident Response |
| Bias Audits | Ethics Committee | Algorithmic Auditors | Impacted Communities | HR / Legal |
| Continuous Monitoring | AI Governance Committee | Operations Teams | End Users | Executive Leadership |

NIST emphasizes that assessors should not be the same people who developed the system.


Framework Crosswalk

MEASURE aligns with existing frameworks:

| MEASURE Category | ISO 27001:2022 | ISO 42001:2023 | CIS Controls v8 |
| --- | --- | --- | --- |
| MEASURE 1 | Clause 9.1 (Monitoring) | Clause 9 (Performance Evaluation) | CIS 8 (Audit Log) |
| MEASURE 2 | A.8.16 (Monitoring Activities) | Clause 9.1 (Monitoring, Measurement) | CIS 18 (Pen Testing) |
| MEASURE 3 | A.5.7 (Threat Intelligence) | Clause 9.1, 10.1 (Monitoring, Improvement) | CIS 17 (Incident Response) |
| MEASURE 4 | Clause 9.3 (Management Review) | Clause 10 (Improvement) | CIS 14 (Security Awareness) |

ISO 42001 provides the closest alignment because it addresses AI-specific concerns that ISO 27001 doesn’t cover.


Warning Signs You Need NIST AI RMF MEASURE

These symptoms indicate MEASURE gaps:

  • No documented test results for deployed AI systems (missing MEASURE 2.1)
  • Developers test their own systems without independent review (missing MEASURE 1.3)
  • Appeal or complaint rates aren’t tracked systematically (missing MEASURE 3.1)
  • Bias testing happens once at launch, never again (missing MEASURE 3.3)

Most organizations discover these gaps after problems surface publicly.


Frequently Asked Questions

Q: What’s the difference between MAP and MEASURE? MAP identifies risks and establishes context. MEASURE quantifies and tracks those risks through testing and monitoring.

Q: How often should MEASURE activities occur? Initial TEVV before deployment, then continuously during operation. Frequency depends on risk level.

Q: Can MEASURE be outsourced? Testing can be. Accountability cannot. You remain responsible for AI systems you deploy.

Q: What if we can’t measure certain risks? Document them explicitly. MEASURE 1.1 requires documenting risks that cannot be measured.
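
What does “document them explicitly” look like? A minimal sketch of a risk-register entry, with hypothetical field names chosen for this example, might capture both the gap and a date to revisit it:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RiskRegisterEntry:
    """One entry in a hypothetical AI risk register (fields illustrative)."""
    risk_id: str
    description: str
    measurable: bool
    metric: Optional[str] = None  # filled in once a metric exists
    rationale: str = ""           # why it can't be measured yet
    review_date: str = ""         # when to revisit the gap

unmeasured = RiskRegisterEntry(
    risk_id="R-017",
    description="Downstream misuse of model outputs by third parties",
    measurable=False,
    rationale="No telemetry available from third-party integrations",
    review_date="2026-07-01",
)
```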


Next Steps

New to AI governance? Start with the GOVERN function before tackling MEASURE activities.

Compliance professional? Use the RACI matrix to identify stakeholders, then document your first AI system using MEASURE 2.5 (validity testing) as your starting point.

Executive? Ask your teams: “What’s our error rate on AI-driven decisions, and how do we know?” If they can’t answer, you have MEASURE gaps creating liability exposure.




Article based on NIST AI 100-1 (Artificial Intelligence Risk Management Framework) and supporting documentation from the NIST Trustworthy and Responsible AI Resource Center. Case references: Estate of Gene B. Lokken v. UnitedHealth Group, No. 0:23-cv-03514-JRT-SGE (D. Minn.); Mobley v. Workday, Inc., No. 23-cv-770 (N.D. Cal.).


Author

Derrick Jackson

I’m the Founder of Tech Jacks Solutions and a Senior Director of Cloud Security Architecture & Risk (CISSP, CRISC, CCSP), with 20+ years helping organizations (from SMBs to Fortune 500) secure their IT, navigate compliance frameworks, and build responsible AI programs.
