Control Reference: ICO-SC-2

Clause Description
Organisations should implement content filtering and guardrails to prevent the generation or dissemination of harmful, illegal, biased, or otherwise inappropriate content by AI systems. This includes technical measures to detect and block unsafe outputs, as well as processes to ensure guardrails are effective, regularly tested, and updated in response to emerging risks.

Why This Control Exists
Agentic AI can generate or act on content that violates laws (hate speech, misinformation, explicit material), causes harm (self-harm encouragement, radicalisation), breaches privacy (PII exposure), or undermines trust (offensive or deceptive outputs). The ICO requires proactive filtering and guardrails to protect individuals, uphold data protection principles (especially lawfulness and fairness), and prevent reputational or legal liability for organisations deploying autonomous AI.

How Katyar Helps Achieve Compliance

Katyar implements content filtering and guardrails through its semantic firewall and built-in guardrail engine, which scan both inputs and outputs in real time and take preventive action when harmful content is detected.

Evaluation Criteria
Katyar considers the control satisfied when both of the following are true:
- Guardrails have actively scanned content
- At least one harmful or unsafe output has been blocked (or masked/redacted) by guardrails

Evidence collected includes:

- Number of guardrail scan events (inputs/outputs checked)
- Number of blocked/masked/redacted events due to content violations
- Breakdown by content threat type: harmful language, explicit content, hate speech, misinformation flags, PII in output, etc.
- Actions taken: full block, partial redaction, safe fallback response, escalation to HITL
- Recent detection timestamps and associated agent/tool/context
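The two evaluation criteria and the evidence breakdown above can be sketched as a check over a guardrail event stream. The event shapes and field names below are assumptions for illustration, not Katyar's documented export format:

```python
from collections import Counter

# Hypothetical guardrail event records; real Katyar events may differ in shape.
events = [
    {"type": "guardrail.scanned", "threat": None, "action": None},
    {"type": "guardrail.scanned", "threat": "toxic_language", "action": "block"},
    {"type": "guardrail.blocked", "threat": "toxic_language", "action": "block"},
    {"type": "output.masked", "threat": "pii_in_output", "action": "redact"},
]

scans = sum(1 for e in events if e["type"] == "guardrail.scanned")
preventions = sum(
    1 for e in events if e["type"] in ("guardrail.blocked", "output.masked")
)

# Breakdown by content threat type and by response action taken.
by_threat = Counter(e["threat"] for e in events if e["threat"])
by_action = Counter(e["action"] for e in events if e["action"])

# The control is satisfied only when both criteria hold:
# content has been scanned AND at least one prevention occurred.
satisfied = scans > 0 and preventions > 0
print(satisfied, dict(by_threat), dict(by_action))
```

Requiring both conditions matters: a deployment where guardrails scan but never block anything during testing gives no evidence that the blocking path actually works.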
Key capabilities:

- Real-time Content Scanning: Every prompt and output is scanned by the semantic firewall for harmful, illegal, or policy-violating content.
- Multi-Layer Guardrails: Built-in detectors for:
  - Prompt injection & jailbreak attempts
  - Harmful/toxic language
  - Explicit or violent content
  - Misinformation patterns
  - PII / sensitive data leakage in outputs
- Flexible Response Options: Configurable actions on detection:
  - Block execution entirely
  - Mask/redact sensitive parts
  - Return a safe fallback message
  - Escalate to HITL for borderline cases
- Audit-Ready Logging: Every scan and block event is logged with the full input/output, threat type, confidence score, response action, and timestamp.
- Dashboard Guardrail Insights: Real-time view of scan volume, block rate, top threat types, and blocked content examples (anonymised).
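The scan, respond, and log capabilities above fit together as a single pipeline. The following is a minimal sketch under stated assumptions: the detector patterns, threat names, and response map are illustrative stand-ins, not Katyar's real rules or configuration schema:

```python
import re
from datetime import datetime, timezone

# Illustrative detectors only; a real semantic firewall uses far richer
# (model-based) classifiers than regular expressions.
DETECTORS = {
    "pii_in_output": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like pattern
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
}

# Configurable response per threat type: block, redact, fallback, or escalate.
RESPONSES = {"pii_in_output": "redact", "prompt_injection": "block"}

audit_log = []  # audit-ready trail: one record per scan

def scan(text: str) -> dict:
    """Scan text, apply the configured response, and append an audit record."""
    for threat, pattern in DETECTORS.items():
        if pattern.search(text):
            action = RESPONSES.get(threat, "escalate")  # default: send to HITL
            if action == "redact":
                text = pattern.sub("[REDACTED]", text)
            elif action == "block":
                text = "This request was blocked by content guardrails."
            record = {"threat": threat, "action": action,
                      "timestamp": datetime.now(timezone.utc).isoformat()}
            audit_log.append(record)
            return {"allowed": action != "block", "output": text, **record}
    audit_log.append({"threat": None, "action": "allow",
                      "timestamp": datetime.now(timezone.utc).isoformat()})
    return {"allowed": True, "output": text, "threat": None, "action": "allow"}

result = scan("Customer SSN is 123-45-6789")
print(result["action"], result["output"])
```

Note that the audit record captures threat type, response action, and timestamp for every scan, including clean ones, which is what makes the scan-volume and block-rate dashboard figures computable later.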
To demonstrate this control:

- Ensure guardrail scanning is enabled (default in workspace settings).
- Run agent scenarios that could trigger content filtering (e.g., test harmful prompts, include mock toxic language or PII in outputs).
- Confirm both scanning and blocking occur:
  - Scan events appear as guardrail.scanned
  - Block events appear as guardrail.blocked or output.masked
- Check Compliance dashboard → ICO-SC-2 card to verify both scan and block events exist.
- (Recommended) Review blocked events in Observability tab and adjust guardrail sensitivity/response rules if needed.
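The confirmation step above can be automated as a small check over an exported event stream. The export format and helper below are assumptions for illustration; only the event names (guardrail.scanned, guardrail.blocked, output.masked) come from the documentation above:

```python
# Verify ICO-SC-2 evidence exists in a captured event stream.
def verify_ico_sc_2(events: list[dict]) -> dict:
    """Return whether both scanning and blocking evidence are present."""
    types = {e["type"] for e in events}
    return {
        "scanning_active": "guardrail.scanned" in types,
        "blocking_observed": not types.isdisjoint(
            {"guardrail.blocked", "output.masked"}
        ),
    }

# Example: events captured after running mock harmful-prompt scenarios.
exported = [
    {"type": "guardrail.scanned"},
    {"type": "guardrail.blocked"},
]
status = verify_ico_sc_2(exported)
assert all(status.values()), f"ICO-SC-2 not satisfied: {status}"
print(status)
```

A check like this can run in CI against staging traffic so that a misconfiguration that silently disables blocking is caught before it reaches the compliance dashboard.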
Together, this demonstrates:

- Evidence of active scanning of inputs and outputs
- Preventive actions — actual blocks, redactions, or safe fallbacks
- Effectiveness — harmful content prevented from reaching users or external systems
- Coverage — guardrails applied to relevant risk vectors (toxic language, PII, illegal content)
- Traceability — full audit trail of scan → detection → response
Read the full UK ICO Guidance on AI and data protection (including content safety):
ICO Guidance on AI and data protection
(Relevant sections: “Safety and security” and “Preventing harm”)
