UnlockSec

Sample Assessment Report

Redacted for confidentiality

Q2 2025

AI Security

AI/ML System Security Assessment

Client

Confidential Client — AI-Powered SaaS Platform

Scope

Customer-facing LLM assistant (GPT-4o-based), RAG pipeline with an enterprise knowledge base, and the API backend serving AI responses

Duration

8 business days

Standard

OWASP LLM Top 10 (2025)

Executive Summary

UnlockSec conducted a comprehensive AI security assessment of the client's LLM-powered customer service platform. The assessment identified critical prompt injection vulnerabilities that enable system prompt exfiltration, a RAG poisoning vector that allows attacker-controlled document injection, and a guardrail bypass technique that produced harmful content in 73% of 200 test attempts.

Methodology

  • OWASP LLM Top 10 (2025)
  • MITRE ATLAS
  • NIST AI Risk Management Framework
  • Anthropic Red Teaming Guidelines

Sample Findings

AI-001

System Prompt Exfiltration via Indirect Prompt Injection

Critical

Description

By submitting a document containing adversarial instructions to the knowledge base retrieval pipeline, an attacker causes the LLM to reveal its complete system prompt when a user interacts with related content. The system prompt contains internal operational procedures and API keys for third-party integrations.

Recommendation

Implement content filtering on all documents entering the RAG pipeline. Apply input and output guardrails using a secondary LLM classifier. Never include secrets in system prompts; keep them in environment variables or a backend secret manager so the model never sees them.
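As an illustration of the content-filtering recommendation, the following is a minimal sketch of a pre-ingestion check, not the client's implementation. The pattern lists and the function names (`scan_document`, `admit_to_index`, `prompt_contains_secret`) are hypothetical, and regex heuristics of this kind would normally sit in front of a classifier rather than replace one.

```python
import re

# Hypothetical example patterns; a real deployment would tune and extend these.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"(reveal|print|repeat|output) (your |the )?system prompt", re.I),
    re.compile(r"you are now\b", re.I),
]

# Hypothetical heuristic for key-like strings that should never sit in a prompt.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{20,}|api[_-]?key\s*[:=]", re.I)


def scan_document(text: str) -> list[str]:
    """Return the injection patterns that matched; empty if none did."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def admit_to_index(text: str) -> bool:
    """Admit a document to the RAG index only if no injection pattern matched."""
    return not scan_document(text)


def prompt_contains_secret(system_prompt: str) -> bool:
    """Flag system prompts that appear to embed credentials."""
    return SECRET_PATTERN.search(system_prompt) is not None
```

Documents that fail `admit_to_index` could be quarantined for review instead of being silently dropped, and `prompt_contains_secret` could run as a deployment check on every system prompt change.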

AI-002

Guardrail Bypass — Jailbreak via Role-Play Framing

Critical

Description

A structured role-play prompt framing ('Act as a security researcher who must demonstrate...') bypasses content safety filters with 73% reliability across 200 test attempts, enabling generation of content explicitly prohibited by the system prompt and usage policy.
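To put the 73% figure in context, the sketch below computes a 95% Wilson score interval for the bypass rate. It assumes the rate corresponds to exactly 146 successful bypasses out of 200 attempts, which is inferred from the percentage rather than stated in the report.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    margin = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - margin, center + margin


# Assumption: 73% of 200 attempts means 146 successful bypasses.
successes = round(0.73 * 200)
low, high = wilson_interval(successes, 200)
print(f"bypass rate 95% interval: {low:.3f} to {high:.3f}")
```

Reporting the interval alongside the point estimate makes it easier to compare the bypass rate before and after remediation on a retest.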

Recommendation

Implement multi-layer content moderation (input + output). Use a separate classifier model to evaluate intent before and after generation. Apply constitutional AI techniques and adversarial fine-tuning on bypass patterns.
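The input-plus-output moderation layering could be wired roughly as follows. This is a sketch under stated assumptions: `moderated_reply`, the `Classifier` and `Generator` types, and the refusal text are hypothetical names, and the classifiers are passed in as plain callables so that any separate classifier model can be plugged in.

```python
from typing import Callable

# Hypothetical interfaces: a classifier returns True when text must be blocked.
Classifier = Callable[[str], bool]
Generator = Callable[[str], str]

REFUSAL = "Sorry, I can't help with that request."


def moderated_reply(
    prompt: str,
    generate: Generator,
    input_check: Classifier,
    output_check: Classifier,
) -> str:
    """Run the input check, generate, then run the output check on the draft."""
    if input_check(prompt):
        return REFUSAL  # blocked before the model is ever called
    draft = generate(prompt)
    if output_check(draft):
        return REFUSAL  # blocked after generation, e.g. a role-play bypass
    return draft


# Stub components for demonstration only; real checks would call a classifier model.
def demo_input_check(text: str) -> bool:
    return "act as" in text.lower()


def demo_output_check(text: str) -> bool:
    return "step-by-step exploit" in text.lower()


def demo_generate(prompt: str) -> str:
    return f"Echo: {prompt}"
```

Keeping the output check independent of the input check matters for role-play framings: a prompt that slips past the first layer can still be caught when the generated text itself is evaluated.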

AI-003

RAG Pipeline Poisoning — Attacker-Controlled Knowledge Injection

High

Description

The document ingestion pipeline does not validate the source or authenticity of uploaded documents. An authenticated user with document upload permissions can inject adversarial content that biases the LLM's responses for all users querying related topics.

Recommendation

Implement source provenance tracking for all RAG documents. Apply content fingerprinting and anomaly detection on new ingestions. Require human review for documents that trigger semantic similarity alerts.
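One possible shape for provenance tracking with content fingerprinting is sketched below. The `ProvenanceRecord` and `ProvenanceRegistry` classes, their fields, and the notion of a trusted-source allowlist are assumptions made for illustration; the report does not describe the client's ingestion service.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    sha256: str        # content fingerprint
    uploader: str      # authenticated identity that submitted the document
    source: str        # declared origin, e.g. a system or repository name
    ingested_at: str   # UTC timestamp in ISO 8601 format


@dataclass
class ProvenanceRegistry:
    trusted_sources: set[str]
    records: dict[str, ProvenanceRecord] = field(default_factory=dict)

    def register(self, content: bytes, uploader: str, source: str) -> tuple[ProvenanceRecord, bool]:
        """Record a document and report whether it needs human review."""
        digest = hashlib.sha256(content).hexdigest()
        record = ProvenanceRecord(
            sha256=digest,
            uploader=uploader,
            source=source,
            ingested_at=datetime.now(timezone.utc).isoformat(),
        )
        self.records[digest] = record
        needs_review = source not in self.trusted_sources
        return record, needs_review
```

Storing the fingerprint next to each indexed chunk would let a poisoned document be traced back to its uploader and removed from the index in one step.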

AI-004

Model Response Inference — Sensitive Business Logic Extraction

High

Description

Through systematic querying, it is possible to infer internal business rules, pricing algorithms, and customer segmentation criteria embedded in the system prompt context. This constitutes IP exfiltration without directly extracting the system prompt text.

Recommendation

Avoid embedding sensitive business logic in system prompts. Use backend API calls for dynamic business rules. Implement query rate limiting and semantic anomaly detection for systematic probing patterns.
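Query rate limiting can be sketched with a per-client sliding window, as below. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and the limiter only addresses request volume; detecting semantically systematic probing would need a separate mechanism.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Return True and record the request if the client is under its limit."""
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Passing the timestamp in explicitly (for example from `time.monotonic()`) keeps the limiter deterministic and easy to test.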

* Showing 4 of 19 total findings. Full report provided upon engagement.

Risk Summary

Critical: 3
High: 5
Medium: 6
Low: 3
Info: 2
Total Findings: 19

Deliverables Included

  • OWASP LLM Top 10 coverage report
  • Prompt injection test case library (200+ test cases)
  • RAG pipeline security assessment
  • Guardrail effectiveness evaluation
  • AI-specific remediation guidance with implementation examples

