Control Reference: ICO-SC-1 Clause Description
Organisations should implement appropriate safety controls to prevent or mitigate harm from AI systems. This includes technical and organisational measures to ensure the system operates safely, avoids harmful outputs, protects against misuse, and minimises risks to individuals and society — particularly where AI decisions or actions could have significant consequences. Why This Control Exists
Agentic AI systems can take autonomous actions that directly affect people, data, finances, or infrastructure. Without safety controls, risks include harmful content generation, unauthorized actions, privacy breaches, discrimination, or physical/digital harm. ICO requires proactive safety measures to uphold data protection principles (lawfulness, fairness, transparency) and prevent adverse outcomes — especially in high-risk or high-impact use cases. How Katyar Helps Achieve Compliance Katyar provides a multi-layered safety control system through its semantic firewall, guardrails, policy engine, and real-time blocking/escalation mechanisms — automatically detecting and responding to unsafe behaviors. Evaluation Criteria
Katyar considers the control satisfied when:
- Guardrail detection events (safety-related threats blocked or flagged) exist in the system logs.
- Number of guardrail detection events (last 30 days)
- Breakdown by safety threat type: prompt injection, jailbreak attempt, harmful content, PII leakage, secrets exfiltration, unsafe tool use, etc.
- Actions taken: blocked execution, output masked, request escalated to HITL, custom safety response
- Detection latency (typically < 100 ms)
- Recent detection timestamps and associated agent/tool/context
-
Semantic Firewall & Guardrail Engine
Real-time scanning of prompts, tool calls, and outputs for safety violations (injection, jailbreak, harmful intent, sensitive data exposure). -
Configurable Safety Responses
Immediate actions: block, mask/redact, flag for HITL, return safe fallback message, or log/alert. -
Policy-Level Safety Gates
Deny rules for unsafe patterns (e.g., destructive commands, unapproved external calls). -
HITL for Borderline Cases
High-confidence unsafe detections can trigger human review before proceeding. -
Audit & Effectiveness Monitoring
Every detection logged with full context, response action, and outcome — enabling review of safety control performance. -
Dashboard Safety Insights
Real-time view of detection trends, top threats blocked, and safety coverage metrics.
- Ensure guardrail scanning is enabled (default in workspace settings).
- Run agent scenarios designed to trigger safety detections (e.g., prompt injection tests, PII in mock data, harmful content attempts).
- Confirm detections appear in the Observability / Events tab (look for threat.detected, firewall.blocked, or guardrail.flagged events).
- Check Compliance dashboard → ICO-SC-1 card to verify guardrail events exist.
- (Recommended) Review detection logs and fine-tune guardrail sensitivity or response actions for your use case.
- Evidence of active safety detections leading to preventive actions
- Timeliness — near-instant response to unsafe inputs/outputs
- Coverage — safety controls applied across key risk vectors
- Effectiveness — blocked/flagged events preventing harm
- Traceability — full audit trail of detection → response → outcome
Read the full UK ICO Guidance on AI and data protection (including safety controls):
ICO Guidance on AI and data protection
(Relevant sections: “Safety and security” and “Risk assessment”)
