Tool

AI vendor risk scorecard

Fifteen questions across five categories — Data Handling, Model Governance, Security Controls, Vendor Maturity, and Legal/Contract — that score any AI vendor against NIST AI RMF, SOC 2, and the procurement baselines US regulated organizations need. Each answer is worth 0–3 points. You get a /45 score, a per-category breakdown, and a risk tier with concrete next steps. Use it for OpenAI, Anthropic, Microsoft Copilot, custom GPTs, or any embedded-AI SaaS your team is evaluating.

By Stefan Efros, CEO & Founder, EFROS. Reviewed by Stefan Efros, CEO & Founder, EFROS.

  1. Data Handling: Does the vendor process Protected Health Information (PHI) or other regulated PII?
  2. Data Handling: Where is customer data stored and processed?
  3. Data Handling: Is the data retention and deletion policy clearly documented?
  4. Model Governance: Has the vendor published a model card or system card for the model(s) you'll use?
  5. Model Governance: Is bias, fairness, and harm testing documented and externally verifiable?
  6. Model Governance: Is there a formal model-update notification process, including advance notice of breaking changes?
  7. Security Controls: Does the vendor maintain a current SOC 2 Type II report?
  8. Security Controls: Does the vendor commission an independent penetration test at least annually?
  9. Security Controls: Is encryption enforced in transit AND at rest, with documented key management?
  10. Vendor Maturity: How long has the vendor been operating this product line?
  11. Vendor Maturity: Is the vendor's funding or financial position stable and publicly verifiable?
  12. Vendor Maturity: Can the vendor provide reference customers in YOUR industry at YOUR scale?
  13. Legal / Contract: Is a Data Processing Agreement (DPA) available, including AI sub-processor disclosure?
  14. Legal / Contract: What is the vendor's liability cap in the master agreement?
  15. Legal / Contract: Are termination rights and data-return obligations clearly contracted?

Your answers stay in your browser. Nothing is submitted unless you ask for the PDF benchmark.

Why a vendor scorecard, not a vendor questionnaire

Standard vendor questionnaires (SIG Lite, CAIQ, NIST SP 800-171 assessor packets) are calibrated for traditional SaaS — they ask about hosting, access control, and incident response, but they don't catch the model-governance, training-data-lineage, and update-cadence questions that decide whether an AI vendor is safe to deploy in a regulated environment. This scorecard layers the AI-specific dimensions on top of the procurement basics so you end up with a single defensible score you can take to your CISO, GC, and executive sponsor.

It is anchored to the NIST AI RMF functions (GOVERN, MAP, MEASURE, MANAGE), the NIST Generative AI Profile (NIST AI 600-1, 2024) for foundation models, the SOC 2 Trust Services Criteria, and ISO/IEC 42001:2023. For the contract terms that are standard in US enterprise AI procurement (DPA, BAA, sub-processor disclosure, liability caps, termination rights), the scoring reflects the negotiating ranges we see in client engagements.

How the five categories are weighted

All five categories carry equal weight (3 questions × 3 max points = 9 points per category, 45 points total). That is deliberate — over-weighting any single dimension (e.g. security controls) creates blind spots in the others. A vendor with a flawless SOC 2 Type II report but no model card, no BAA, and a $50K liability cap is not actually a safe procurement, and the equal-weight scoring catches that pattern.
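The equal-weight arithmetic above can be sketched in a few lines of Python. This is an illustration only, not the code behind the on-page tool: the function name `score_vendor`, the `CATEGORIES` list, and the sample answers are all hypothetical, and the sketch assumes answers are supplied in the same order as the fifteen questions.

```python
from collections import defaultdict

# Category of each of the 15 questions, in page order (three per category).
CATEGORIES = (
    ["Data Handling"] * 3
    + ["Model Governance"] * 3
    + ["Security Controls"] * 3
    + ["Vendor Maturity"] * 3
    + ["Legal / Contract"] * 3
)


def score_vendor(answers):
    """Return (total, per_category) for 15 answers, each worth 0-3 points."""
    if len(answers) != len(CATEGORIES):
        raise ValueError("expected 15 answers")
    if any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("each answer must be an integer from 0 to 3")
    per_category = defaultdict(int)
    for category, points in zip(CATEGORIES, answers):
        per_category[category] += points
    return sum(answers), dict(per_category)


# Hypothetical vendor: flawless security controls, weak governance and contract terms.
answers = [2, 2, 1,  0, 1, 1,  3, 3, 3,  2, 2, 1,  0, 0, 1]
total, breakdown = score_vendor(answers)
print(total, breakdown)
```

For this made-up vendor the total is 22 out of 45 even though Security Controls scores a perfect 9, which is the pattern the equal weighting is meant to expose.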

If your specific risk environment justifies different weighting (e.g. a clinical AI deployment should over-weight Data Handling and Model Governance), the per-category breakdown lets you re-prioritize the next steps without redoing the scorecard.
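One way to re-prioritize from the per-category breakdown, sketched under stated assumptions: rank each category by its shortfall from the 9-point maximum, multiplied by a weight you choose. The function `prioritize_gaps`, the weight values, and the sample breakdown are hypothetical; the scorecard itself does not prescribe any particular weights.

```python
def prioritize_gaps(breakdown, weights, max_per_category=9):
    """Rank categories by weighted shortfall from the per-category maximum."""
    gaps = {
        category: weights.get(category, 1.0) * (max_per_category - points)
        for category, points in breakdown.items()
    }
    # Largest weighted gap first; ties keep their original order.
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-category scores for one vendor (each out of 9).
breakdown = {
    "Data Handling": 5,
    "Model Governance": 2,
    "Security Controls": 9,
    "Vendor Maturity": 5,
    "Legal / Contract": 1,
}

# Assumed weights for a clinical deployment; unlisted categories default to 1.0.
clinical_weights = {"Data Handling": 2.0, "Model Governance": 2.0}

ranked = prioritize_gaps(breakdown, clinical_weights)
for category, gap in ranked:
    print(f"{category}: weighted gap {gap}")
```

The underlying answers are unchanged; only the order in which you work through the gaps shifts, so the /45 score stays comparable across vendors.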

What the risk tiers actually mean

Low risk (36–45) means standard procurement controls apply: add the vendor to your AI inventory, document the use case in your AUP, and set a 12-month review cadence. Medium risk (25–35) means additional safeguards are required: close every question scored 0 or 1 with a contractual rider, layer human-in-the-loop review on high-stakes outputs, and time-box the pilot. High risk (15–24) means a formal AI risk assessment is required before contract execution: engage legal counsel, commission an independent vendor security assessment, and document an exit plan with data-portability proof BEFORE you deploy. Critical risk (0–14) means do not proceed without major remediation or substitution: pause procurement, evaluate alternatives, and if the business need is genuine, require a formal risk-acceptance memo signed by GC and CISO.
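The tier boundaries above map directly onto a small lookup. The following is a minimal sketch, assuming the total is an integer between 0 and 45; the function name `risk_tier` is hypothetical and is not taken from the tool itself.

```python
def risk_tier(total):
    """Map a /45 scorecard total to the risk tier described above."""
    if not 0 <= total <= 45:
        raise ValueError("total must be between 0 and 45")
    if total >= 36:
        return "Low"
    if total >= 25:
        return "Medium"
    if total >= 15:
        return "High"
    return "Critical"


# Check the boundary values on each side of every tier.
for total in (45, 36, 35, 25, 24, 15, 14, 0):
    print(total, risk_tier(total))
```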

What this scorecard does not cover

This is a 15-question self-assessment. It is not a substitute for a formal vendor security assessment, a SOC 2 deep-dive, a training-data lineage review, or counsel-led DPA negotiation. If you score a vendor in the High or Critical tier and the business still needs the relationship, the next step is a paid assessment — not a re-scored questionnaire. The scorecard also does not evaluate your own organization's readiness to use the vendor (your AUP, your AI inventory, your monitoring); for that, run the Free AI Risk Score.