Autonomous Vendor Risk Assessment & Security Questionnaire Agent
Problem Statement
For high-growth B2B startups, the vendor procurement process is a significant bottleneck. Security teams and IT managers are frequently overwhelmed by the manual labor required to vet third-party software. Currently, when a department wants to buy a new tool, the security team must manually send out a Standardized Information Gathering (SIG) or CAIQ questionnaire, wait weeks for a response, and then spend 4–6 hours manually reviewing the vendor's SOC2 Type II reports, penetration test summaries, and privacy policies to ensure compliance with internal risk frameworks.
The problem is compounded by "Security Questionnaire Fatigue." Vendors often provide incomplete answers or link to massive document repositories, forcing the buyer's security team to hunt for specific controls (e.g., "Do you use AES-256 encryption at rest?"). This manual review stretches procurement cycles from 2 weeks to 3 months, directly delaying product launches and operational improvements. Furthermore, human reviewers often miss "red flags" buried in 100-page audit reports, such as a vendor's lack of business continuity testing or outdated encryption protocols, creating silent liabilities for the startup. There is no centralized system that automatically cross-references a vendor's self-reported answers against their actual uploaded audit documentation to find discrepancies. The evidence-driven document analysis used by an Autonomous Engineering Post-Mortem & RCA Agent can be adapted here for risk mitigation.
What the Agent Does/Doesn't Do
- Does: Automatically ingests vendor security documents (SOC2, ISO 27001, Pentests); Extracts answers to internal security questionnaires; Flags discrepancies between vendor claims and audit evidence; Assigns a risk score based on custom weighting.
- Doesn't: Make the final "Go/No-go" decision; Negotiate legal terms or MSAs; Conduct live penetration testing on the vendor's infrastructure; Verify the authenticity of the uploaded PDF documents (assumes they are valid).
Workflow
- Ingestion: The agent triggers when a "Vendor Assessment" request is created in Jira/Slack. It pulls the vendor's name and a link to their security portal or uploaded documents.
- Document Parsing: Using OCR and Layout Analysis, the agent parses SOC2 Type II reports, Bridge Letters, and Privacy Policies, converting them into structured vector embeddings. This process is similar to how a Document Q&A Agent handles large-scale text ingestion.
- Questionnaire Mapping: The agent maps the startup's internal security requirements (e.g., "Must have MFA," "Data must be in US-East-1") against the extracted data.
- Discrepancy Analysis: The agent performs a "Truth Check," comparing the vendor's questionnaire responses against the evidence found in the audit reports.
- Risk Scoring & Reporting: The agent generates a summary report highlighting "High Risk" gaps and posts the findings back to the procurement ticket.
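The Discrepancy Analysis and Risk Scoring steps above can be sketched in plain Python. The requirement IDs, weights, and three-way status labels below are illustrative assumptions for this write-up, not part of SIG, CAIQ, or any specific framework:

```python
# Minimal sketch of the "Truth Check" and weighted risk scoring steps.
# Requirement IDs, weights, and status labels are illustrative only.

# Internal requirements with custom risk weights (higher = more critical)
REQUIREMENTS = {
    "mfa_enforced":        {"weight": 5},
    "aes256_at_rest":      {"weight": 4},
    "bcp_tested_annually": {"weight": 3},
}

def truth_check(claimed: dict, evidence: dict) -> dict:
    """Compare vendor questionnaire answers against extracted audit evidence."""
    findings = {}
    for req in REQUIREMENTS:
        if claimed.get(req) and evidence.get(req):
            findings[req] = "verified"
        elif claimed.get(req):
            findings[req] = "discrepancy"  # claimed, but no supporting evidence
        else:
            findings[req] = "gap"          # not even claimed
    return findings

def risk_score(findings: dict) -> float:
    """Weighted share of requirements that are not evidence-backed (0-100)."""
    total = sum(r["weight"] for r in REQUIREMENTS.values())
    at_risk = sum(
        REQUIREMENTS[req]["weight"]
        for req, status in findings.items()
        if status != "verified"
    )
    return round(100 * at_risk / total, 1)

claimed  = {"mfa_enforced": True, "aes256_at_rest": True, "bcp_tested_annually": True}
evidence = {"mfa_enforced": True, "aes256_at_rest": True}  # no BCP test evidence found

findings = truth_check(claimed, evidence)
print(findings)               # bcp_tested_annually -> "discrepancy"
print(risk_score(findings))   # 25.0
```

The weights let critical controls (e.g., MFA) dominate the score; a "discrepancy" (claimed but unevidenced) could also be weighted more heavily than a plain gap, since it suggests an inaccurate questionnaire rather than a simple omission.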
Tool Stack
- Pinecone - Vector database for storing document embeddings.
- Pricing: [Unverified] (Serverless: $0.00/mo base + usage)
- Documentation
- LangChain - Open-source framework for LLM orchestration, paired with LangSmith for monitoring.
- Pricing: LangChain library is free (MIT license); LangSmith Developer Plan: $0/mo; Plus Plan: $39/seat/mo (Pricing) ✓ Verified 2026-01-11
- Documentation
- Haystack - Open-source orchestration for RAG pipelines and document search.
- Pricing: Framework is free (Apache 2.0); managed deepset Cloud: 14-day free trial; Growth: $20/member/mo (Pricing) ✓ Verified 2026-01-11
- Documentation | Quickstart
- GPT-4o & GPT-4o-mini (OpenAI) - LLMs for reasoning and high-speed extraction.
- Pricing: $1.00/1M input tokens, $4.00/1M output tokens for 4o-mini (Pricing) ✓ Verified 2026-01-08
- Documentation | Quickstart
- Unstructured.io - Document ingestion and layout analysis.
- Pricing: [Unverified] (Free tier or $25/mo)
- Documentation
- Zapier - Automation for triggering workflows via webhooks.
- Pricing: Free tier available; Professional: ~$19.99/mo (Pricing) ✓ Verified 2026-01-11
- Documentation
- Slack - Communication interface for risk alerts.
- Pricing: Free tier available; Pro: $7.25/user/month (Pricing) ✓ Verified 2026-01-08
- Documentation
- Jira - Project management for procurement tickets.
- Pricing: [Unverified]
Quick Integration
Haystack 2.0 Pipeline for Security Extraction
import os
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
# Read the OpenAI API key from the environment; avoid hardcoding secrets in source.
# The placeholder below is for local testing only.
os.environ.setdefault("OPENAI_API_KEY", "your_api_key_here")
# Define a template for analyzing security documentation
template = """
Analyze the following security document snippet for a Vendor Risk Assessment.
Context: {{ document_text }}
Question: {{ question }}
Answer based on the document:
"""
# Initialize components
prompt_builder = PromptBuilder(template=template)
llm = OpenAIGenerator(model="gpt-4o-mini")
# Build the pipeline
pipeline = Pipeline()
pipeline.add_component("prompt_builder", prompt_builder)
pipeline.add_component("llm", llm)
pipeline.connect("prompt_builder", "llm")
# Example execution: Checking for encryption standards in a SOC2 summary
result = pipeline.run(
    data={
        "prompt_builder": {
            "document_text": "The platform ensures all data is encrypted at rest using AES-256 and in transit via TLS 1.2.",
            "question": "Does the vendor use AES-256 encryption at rest?"
        }
    }
)
print(result["llm"]["replies"][0])
Source: Docs
Triggering Assessment via Zapier Webhook
import requests
# Zapier Webhook URL (Generated when creating a 'Webhooks by Zapier' Trigger)
ZAPIER_WEBHOOK_URL = 'https://hooks.zapier.com/hooks/catch/1234567/abcde/'
# Data representing a new vendor risk assessment request
vendor_data = {
"vendor_name": "CloudSafe AI",
"contact_email": "security@cloudsafe.ai",
"document_url": "https://example.com/soc2_report.pdf",
"assessment_type": "SOC2 Type II Review",
"priority": "High"
}
def trigger_risk_assessment_workflow(data):
    try:
        # A timeout prevents the script from hanging if Zapier is unreachable
        response = requests.post(ZAPIER_WEBHOOK_URL, json=data, timeout=10)
        response.raise_for_status()
        print(f"Success: Workflow triggered. Status Code: {response.status_code}")
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Error triggering Zapier: {e}")
        return None

if __name__ == "__main__":
    trigger_risk_assessment_workflow(vendor_data)
Source: Docs
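Since the Zap only sees whatever JSON it receives, it can help to validate the payload locally before posting. The required-field list below mirrors the example payload above and is an assumption about what the Zap expects, so adjust it to your own trigger:

```python
# Local payload validation before posting to the Zapier webhook.
# The required-field list is an assumption based on the example payload.
REQUIRED_FIELDS = ("vendor_name", "contact_email", "document_url", "assessment_type")

def validate_vendor_payload(data: dict) -> list:
    """Return a list of problems; an empty list means the payload is safe to send."""
    problems = [f"missing field: {field}"
                for field in REQUIRED_FIELDS
                if not data.get(field)]
    if data.get("document_url", "").startswith("http://"):
        problems.append("document_url should use HTTPS")
    return problems

print(validate_vendor_payload({"vendor_name": "CloudSafe AI"}))
print(validate_vendor_payload({
    "vendor_name": "CloudSafe AI",
    "contact_email": "security@cloudsafe.ai",
    "document_url": "https://example.com/soc2_report.pdf",
    "assessment_type": "SOC2 Type II Review",
}))  # -> [] (payload is valid)
```

Calling this before `trigger_risk_assessment_workflow` keeps malformed requests from silently creating half-filled procurement tickets downstream.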
Prompt Skeletons
(Existing prompt skeletons would be placed here)