
AI vendor risk scorecard

Fifteen questions across five categories (Data Handling, Model Governance, Security Controls, Vendor Maturity, and Legal/Contract) score any AI vendor against NIST AI RMF, SOC 2, and the procurement baselines that regulated US organizations need. Each answer is worth 0–3 points. You get a score out of 45, a per-category breakdown, and a risk tier with concrete next steps. Use it for OpenAI, Anthropic, Microsoft Copilot, custom GPTs, or any embedded-AI SaaS your team is evaluating.

By Stefan Efros, CEO & Founder, EFROS

Updated · June 17, 2026

Vendor being scored (optional)

Used only to label your result on-screen. Stays in your browser.


Data Handling

1. Does the vendor process Protected Health Information (PHI) or other regulated PII?
2. Where is customer data stored and processed?
3. Is the data retention and deletion policy clearly documented?

Model Governance

4. Has the vendor published a model card or system card for the model(s) you'll use?
5. Is bias, fairness, and harm testing documented and externally verifiable?
6. Is there a formal model-update notification process, including breaking-change advance notice?

Security Controls

7. Does the vendor maintain a current SOC 2 Type II report?
8. Does the vendor commission an independent penetration test at least annually?
9. Is encryption enforced in transit AND at rest, with documented key management?

Vendor Maturity

10. How long has the vendor been in business operating this product line?
11. Is the vendor's funding or financial position stable and publicly verifiable?
12. Can the vendor provide reference customers in YOUR industry at YOUR scale?

Legal / Contract

13. Is a Data Processing Agreement (DPA) available, including AI sub-processor disclosure?
14. What is the vendor's liability cap in the master agreement?
15. Are termination rights and data-return obligations clearly contracted?

Your answers stay in your browser. Nothing is submitted unless you ask for the PDF benchmark.

Why a vendor scorecard, not a vendor questionnaire

Standard vendor questionnaires (SIG Lite, CAIQ, NIST SP 800-171 assessor packets) are calibrated for traditional SaaS. They ask about hosting, access control, and incident response, but they don't catch the model-governance, training-data-lineage, and update-cadence questions that decide whether an AI vendor is safe to deploy in a regulated environment. This scorecard layers the AI-specific dimensions on top of the procurement basics so you end up with a single defensible score you can take to your CISO, GC, and executive sponsor.

It is anchored to NIST AI RMF (GOVERN, MAP, MEASURE, MANAGE), the NIST AI RMF Generative AI Profile (NIST AI 600-1, 2024) for foundation models, SOC 2 Trust Services Criteria, and ISO/IEC 42001:2023. For the contract terms typical of US enterprise AI procurement (DPA, BAA, sub-processor disclosure, liability caps, termination rights), the scoring reflects the negotiating ranges we see in client engagements.

How the five categories are weighted

All five categories carry equal weight (3 questions × 3 max points = 9 points per category, 45 points total). That is deliberate. Over-weighting any single dimension (e.g. security controls) creates blind spots in the others. A vendor with a flawless SOC 2 Type II report but no model card, no BAA, and a $50K liability cap is not actually a safe procurement, and the equal-weight scoring catches that pattern.
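The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `score_vendor` and the sample answers are hypothetical, and the only facts taken from the scorecard are the five categories, three questions per category, and 0–3 points per answer.

```python
# Hypothetical sketch of the equal-weight scoring described above.
CATEGORIES = [
    "Data Handling",
    "Model Governance",
    "Security Controls",
    "Vendor Maturity",
    "Legal / Contract",
]
QUESTIONS_PER_CATEGORY = 3
MAX_POINTS = 3  # each answer is worth 0-3 points


def score_vendor(answers):
    """Return (total out of 45, per-category breakdown out of 9).

    `answers` is a list of 15 integers in question order (1 through 15).
    """
    expected = len(CATEGORIES) * QUESTIONS_PER_CATEGORY
    if len(answers) != expected:
        raise ValueError(f"expected {expected} answers, got {len(answers)}")
    if any(not isinstance(a, int) or not 0 <= a <= MAX_POINTS for a in answers):
        raise ValueError("each answer must be an integer from 0 to 3")

    breakdown = {}
    for index, category in enumerate(CATEGORIES):
        start = index * QUESTIONS_PER_CATEGORY
        breakdown[category] = sum(answers[start:start + QUESTIONS_PER_CATEGORY])
    return sum(answers), breakdown


# Illustrative answers only: strong security controls, weak model
# governance and contract terms, as in the pattern described above.
sample = [2, 2, 2, 0, 1, 1, 3, 3, 3, 2, 2, 2, 1, 0, 2]
total, breakdown = score_vendor(sample)
print(total)      # 26
print(breakdown)  # Security Controls scores 9, Model Governance only 2
```

With these made-up answers, the perfect 9/9 in Security Controls cannot lift the total past 26 of 45, which is the blind spot the equal weighting is meant to expose.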

If your specific risk environment justifies different weighting (e.g. a clinical AI deployment should over-weight Data Handling and Model Governance), the per-category breakdown lets you re-prioritize the next steps without redoing the scorecard.
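One way to re-prioritize from the breakdown is to rank categories by a weighted shortfall. The sketch below is an assumption about how that could be done, not part of the scorecard itself: the `prioritize` function and the weight values are hypothetical, and the overall score and risk tier still come from the equal-weight total.

```python
# Hypothetical sketch: rank categories for follow-up using custom weights.
MAX_PER_CATEGORY = 9  # 3 questions x 3 points


def prioritize(breakdown, weights):
    """Return (category, weighted shortfall) pairs, largest shortfall first.

    `breakdown` maps category name to points earned (0-9).
    `weights` maps category name to a multiplier; missing entries default to 1.0.
    """
    ranked = []
    for category, points in breakdown.items():
        shortfall = MAX_PER_CATEGORY - points
        ranked.append((category, weights.get(category, 1.0) * shortfall))
    # sorted() is stable, so ties keep the original category order.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Illustrative breakdown and weights for a clinical deployment (assumed values).
breakdown = {
    "Data Handling": 6,
    "Model Governance": 2,
    "Security Controls": 9,
    "Vendor Maturity": 6,
    "Legal / Contract": 3,
}
clinical_weights = {"Data Handling": 2.0, "Model Governance": 2.0}

for category, gap in prioritize(breakdown, clinical_weights):
    print(f"{category}: {gap}")
```

With these example numbers, Model Governance comes first (14.0), followed by Data Handling and Legal / Contract (6.0 each), so remediation effort goes where the weighted gap is largest while the scorecard answers stay unchanged.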

What the risk tiers actually mean

Low risk (36–45) means standard procurement controls apply: add to AI inventory, document the use case in your AUP, set a 12-month review cadence.

Medium risk (25–35) means additional safeguards are required: close every gap (any question scored 0 or 1) with a contractual rider, layer human-in-the-loop on high-stakes outputs, and time-box the pilot.

High risk (15–24) means a formal AI risk assessment is required before contract execution: engage legal counsel, commission an independent vendor security assessment, and document an exit plan with data-portability proof BEFORE you deploy.

Critical risk (0–14) means do not proceed without major remediation or substitution: pause procurement, evaluate alternatives, and if the business need is real, require a formal risk-acceptance memo signed by GC and CISO.
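The tier boundaries can be expressed as a small lookup. This is a minimal sketch under the thresholds stated above; the function names `risk_tier` and `low_scoring_questions` are hypothetical, and treating a score of 0 or 1 as a gap is an interpretation of the Medium-tier guidance.

```python
# Hypothetical sketch of the tier thresholds described above.
def risk_tier(total):
    """Map a total score (0-45) to its risk tier."""
    if not 0 <= total <= 45:
        raise ValueError("total must be between 0 and 45")
    if total >= 36:
        return "Low"
    if total >= 25:
        return "Medium"
    if total >= 15:
        return "High"
    return "Critical"


def low_scoring_questions(answers):
    """Return the 1-based numbers of questions scored 0 or 1."""
    return [number for number, points in enumerate(answers, start=1) if points <= 1]


# Illustrative answers only (question order 1 through 15).
sample = [2, 2, 2, 0, 1, 1, 3, 3, 3, 2, 2, 2, 1, 0, 2]
print(risk_tier(sum(sample)))         # Medium
print(low_scoring_questions(sample))  # [4, 5, 6, 13, 14]
```

In this made-up example the vendor lands in the Medium tier with 26 points, and questions 4, 5, 6, 13, and 14 are the ones that would need a contractual rider or other safeguard.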

What this scorecard does not cover

This is a 15-question self-assessment. It is not a substitute for a formal vendor security assessment, a SOC 2 deep-dive, a training-data lineage review, or counsel-led DPA negotiation. If you score a vendor in the High or Critical tier and the business still needs the relationship, the next step is a paid assessment, not a re-scored questionnaire. The scorecard also does not evaluate your own organization's readiness to use the vendor (your AUP, your AI inventory, your monitoring); for that, run the Free AI Risk Score.
