Audit Frameworks
Structured assessment frameworks with hierarchical organisation, weighted scoring, and LLM-assisted workflows for auditing agentic AI.
RiskNodes audit frameworks support rigorous business processes where accuracy, control, and analytical depth are paramount. We provide the flexibility to handle everything from simple code compliance checklists to complex reviews of autonomous agent behaviour.
Question Building
RiskNodes supports all the standard input types:
- Rich Text Fields - Multi-line responses with LLM reasoning and validation rules
- Multiple Choice Options - Radio buttons and dropdown selections with automated scoring
- Checkboxes - Boolean responses for compliance and security verification requirements
- File Attachments - Supporting code snippets, diffs, and evidence files
- Embedded Media - Links to external repositories, architectural specs, and references
Intelligent Question Organisation
RiskNodes’s hierarchical framework system allows unlimited nesting of topics and subtopics, enabling you to build safety boundaries that perfectly mirror your software architecture and regulatory requirements.
Weighting and Scoring
Multi-Dimensional Scoring System
Unlike basic CI/CD gates that treat all checks equally, RiskNodes provides flexible scoring capabilities:
Flexible Weighting Strategies
- Check-Level Weights - Assign importance to individual security or compliance checks
- Section-Level Weights - Weight entire architectures or compliance pillars
- Multiple Weighting Sets - Different scoring perspectives for the same AI deployment
- Normalised Calculations - Automatically adjust scores when the assessment framework structure changes
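As a rough illustration of how normalised calculations might behave, the sketch below (plain Python, with hypothetical check names) rescales raw check weights so they always sum to 1.0, meaning that adding or removing a check rebalances the whole section automatically:

```python
def normalise(weights: dict[str, float]) -> dict[str, float]:
    """Rescale raw weights so they sum to 1.0, preserving relative importance."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one check must carry weight")
    return {check: w / total for check, w in weights.items()}

# A section's checks with raw importance values.
weights = {"authentication": 3.0, "authorisation": 2.0}
print(normalise(weights))            # relative split 0.6 / 0.4

# Adding a new check automatically rebalances the section.
weights["session_handling"] = 1.0
print(normalise(weights))            # 0.5 / 0.333... / 0.166...
```

The same rebalancing applies at the section level, so section weights stay coherent when the framework structure changes.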
Efficient, Objective Assessment
For criteria that demand nuanced human judgment, such as code changes where answers are not simply right or wrong, RiskNodes introduces Pairwise Comparison Scoring. Rather than forcing evaluators to assign arbitrary scores, an approach prone to inconsistency and personal bias, the feature applies the mathematical rigour of the Elo ranking system. The system presents the evaluator with just two AI outputs at a time, asking only for a simple preference: “A satisfies the security boundary better than B.”
The resulting relative ranking is then automatically normalised against user-defined anchor scores, ensuring the final score is both internally consistent and tied to an absolute quality scale. This enforces a transitive ordering across all deployments, producing a robust, defensible score.
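The mechanics can be sketched in a few lines of Python. This is a minimal illustration of the standard Elo update plus a linear anchor mapping, not RiskNodes’s actual implementation; the outputs A, B, C and the anchor ratings are invented for the example:

```python
def elo_update(ra: float, rb: float, a_wins: bool, k: float = 32.0) -> tuple[float, float]:
    """Standard Elo update: the preferred output gains rating from the other."""
    expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    ra_new = ra + k * (score_a - expected_a)
    rb_new = rb + k * ((1.0 - score_a) - (1.0 - expected_a))
    return ra_new, rb_new

def anchor_scale(rating: float, worst: float, best: float,
                 lo: float = 0.0, hi: float = 100.0) -> float:
    """Map an Elo rating onto an absolute scale fixed by two anchor outputs."""
    return lo + (rating - worst) * (hi - lo) / (best - worst)

ratings = {"A": 1000.0, "B": 1000.0, "C": 1000.0}
# Evaluator preferences: A over B, B over C, A over C.
for winner, loser in [("A", "B"), ("B", "C"), ("A", "C")]:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser], True)

ranked = sorted(ratings, key=ratings.get, reverse=True)
print(ranked)  # A first, C last
```

The anchor mapping is what ties the purely relative Elo numbers to an absolute quality scale: the evaluator pins a known-worst and known-best output, and every other rating interpolates between them.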
Faster, Auditable AI Evaluation
To further amplify these gains, our system integrates the power of local Large Language Models (LLMs) into the pairwise workflow. The LLM handles the bulk of comparisons when analysing branch changes, using your secure coding guidelines and previous human judgments as context to quickly establish a baseline ranking.
This Human-in-the-Loop (HIL) approach divides the labour efficiently: the LLM performs hundreds of simple comparisons across pull requests, while the human expert intervenes only for final validation of flagged lines. Crucially, every LLM judgment is returned with a concise, auditable note explaining the specific rationale (e.g., “Change A was preferred over B due to proper sanitisation of database inputs in A”). This fully auditable trail, combined with the system’s continuous feedback loop, keeps your AI agent evaluation process transparent and data-driven.
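A minimal sketch of the routing logic described above, assuming a hypothetical Judgment record in which the LLM returns its preference, its rationale, and a self-estimated confidence; the field names and threshold are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Judgment:
    preferred: str            # "A" or "B"
    rationale: str            # auditable note returned with every comparison
    confidence: float         # model self-estimate in [0, 1]

def route(judgment: Judgment, threshold: float = 0.8) -> str:
    """Accept confident LLM judgments; flag the rest for human validation."""
    return "auto-accepted" if judgment.confidence >= threshold else "flagged-for-human"

j = Judgment("A", "Change A was preferred over B due to proper sanitisation "
                  "of database inputs in A", confidence=0.92)
print(route(j))            # auto-accepted; rationale is kept in the audit trail
low = Judgment("B", "Both changes touch auth middleware; semantics unclear", 0.55)
print(route(low))          # flagged-for-human
```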
Read more on the pairwise comparison feature announcement
- Multiple Choice Auto-Scoring - Pre-configured point values for standardised checks
- AI-Powered Qualitative Analysis - Local LLM assessment of generated code diffs
- Relative Scoring Algorithms - Compare quality across multiple agent iterations
- Custom Scoring Rules - Define business logic for specialised architectural requirements
Real-World Scoring Example
```mermaid
flowchart TD
    A[Secure SDLC Checklist] --> B[Access Control - 40%]
    A --> C[Data Protection - 30%]
    A --> D[Audit Logging - 20%]
    A --> E[Architecture Adherence - 10%]
    B --> B1[Authentication - 60%]
    B --> B2[Authorisation Boundaries - 40%]
    C --> C1[Encryption at Rest - 70%]
    C --> C2[Transport Security - 30%]
```
This allows the same architectural model to be scored differently for different stakeholders: security teams might weight access control higher, while lead engineers prioritise architecture adherence.
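The weighted roll-up from the example framework can be sketched as follows. The leaf scores are invented, and the nested-dict encoding is illustrative rather than RiskNodes’s actual data model:

```python
# Framework mirrors the example: each entry is (section weight, child weights or None).
framework = {
    "Access Control": (0.40, {"Authentication": 0.60, "Authorisation Boundaries": 0.40}),
    "Data Protection": (0.30, {"Encryption at Rest": 0.70, "Transport Security": 0.30}),
    "Audit Logging": (0.20, None),
    "Architecture Adherence": (0.10, None),
}

def composite(scores: dict[str, float]) -> float:
    """Roll leaf scores (0-100) up the weighted tree to a single figure."""
    total = 0.0
    for section, (w, children) in framework.items():
        if children is None:
            total += w * scores[section]                      # leaf section
        else:
            total += w * sum(cw * scores[leaf] for leaf, cw in children.items())
    return total

scores = {"Authentication": 90, "Authorisation Boundaries": 70,
          "Encryption at Rest": 100, "Transport Security": 80,
          "Audit Logging": 60, "Architecture Adherence": 50}
print(composite(scores))   # ≈ 78.0
```

Swapping in a different weighting set (say, one that raises Architecture Adherence for lead engineers) changes the composite without touching the underlying leaf scores.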
Collaborative Response Management
Status-Driven Audits
For agentic deployments, each architectural check progresses through a workflow:
- Not Evaluated - Initial state, awaiting local LLM ingest
- Evaluated - AI review provided, ready for human spot-check
- Flagged - High-risk output explicitly routed for domain expert attention
- Rejected - Fails safety threshold, build blocked
- Approved - Validated and cleared for deployment
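The workflow above can be modelled as a small state machine. This sketch uses a hypothetical transition table, not the product’s internal representation, to show the legal moves between states:

```python
from enum import Enum

class Status(Enum):
    NOT_EVALUATED = "not_evaluated"
    EVALUATED = "evaluated"
    FLAGGED = "flagged"
    REJECTED = "rejected"
    APPROVED = "approved"

# Legal transitions for a single architectural check.
TRANSITIONS = {
    Status.NOT_EVALUATED: {Status.EVALUATED},
    Status.EVALUATED: {Status.FLAGGED, Status.APPROVED, Status.REJECTED},
    Status.FLAGGED: {Status.APPROVED, Status.REJECTED},
    Status.REJECTED: set(),     # terminal: build blocked
    Status.APPROVED: set(),     # terminal: cleared for deployment
}

def advance(current: Status, target: Status) -> Status:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target

state = advance(Status.NOT_EVALUATED, Status.EVALUATED)
state = advance(state, Status.FLAGGED)           # routed to a domain expert
print(advance(state, Status.APPROVED).value)     # approved
```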
Multi-Party Collaboration
Unlike single-developer AI tools, RiskNodes supports complex multi-stakeholder scenarios where different users (architects, data privacy officers, secops) have different permissions within the same agentic audit.
Role-Based Access Control
- Section-Level Permissions - Control who can see and override specific evaluation pillars
- Flag Assignment - Allocate unresolved risk flags to specific principal engineers
- Review Hierarchies - Multi-tier approval processes for LLM code acceptance
- Audit Trails - Complete history of which human or system modified the ledger
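A simplified sketch of how section-level permissions and an append-only audit trail might interact; the roles, section names, and ledger fields here are invented for illustration:

```python
from datetime import datetime, timezone

# Hypothetical role table: which evaluation pillars each role may override.
PERMISSIONS = {
    "architect": {"Architecture Adherence"},
    "secops": {"Access Control", "Audit Logging"},
    "privacy_officer": {"Data Protection"},
}

audit_trail: list[dict] = []   # append-only ledger of every modification attempt

def override(user: str, role: str, section: str, new_status: str) -> bool:
    allowed = section in PERMISSIONS.get(role, set())
    audit_trail.append({                      # denied attempts are logged too
        "who": user, "role": role, "section": section,
        "action": f"override -> {new_status}", "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed

print(override("maria", "secops", "Access Control", "approved"))     # True
print(override("maria", "secops", "Data Protection", "approved"))    # False
print(len(audit_trail))                                              # 2
```

Logging refused attempts alongside successful ones is what makes the trail useful as a complete history rather than just a record of accepted changes.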
Content Reusability and Efficiency
Assessment Template System
Create agent audit libraries that can be reused across repositories, ensuring consistent boundary checking across your entire sovereign infrastructure.
Import and Export Capabilities
- Section Import - Reuse proven security checks across multiple microservices
- Spreadsheet Integration - Import existing compliance matrices from Excel
- Template Libraries - Build organisational ledgers of safe AI boundary conditions
- Cross-Project Sharing - Leverage successful frameworks across development teams
Version Control and Evolution
- Template Versioning - Track improvements to safety checks as agents evolve
- Change Impact Analysis - Understand how modifying boundaries affects existing pipelines
- Best Practice Evolution - Continuously improve LLM instructions based on audit scores
Advanced Integration Capabilities
API-First Architecture
RiskNodes’s comprehensive APIs allow deployment audits to integrate seamlessly with CI/CD environments, automating guardrail validation on every push.
System Integration Options
- CI/CD Integration - Automatically ingest repository and commit metadata via webhooks
- Document Automation - Generate persistent safety records for regulatory compliance
- Analytics Platforms - Feed Agentic Intelligence Risk Management (AIRM) data into dashboards
- Approval Systems - Integrate directly with issue trackers (e.g. Jira) or pull requests
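As an illustration of webhook ingestion, the sketch below parses a GitHub-style push payload (the fields `repository.full_name`, `after`, and `ref` follow common Git-hosting conventions) into a pending audit run; the output shape is hypothetical, not RiskNodes’s actual schema:

```python
import json

def ingest_push_event(raw: bytes) -> dict:
    """Turn a push-event webhook payload into a pending audit run."""
    event = json.loads(raw)
    return {
        "repository": event["repository"]["full_name"],
        "commit": event["after"],
        "branch": event["ref"].removeprefix("refs/heads/"),
        "status": "not_evaluated",          # entry state of the audit workflow
    }

payload = json.dumps({
    "repository": {"full_name": "acme/payments-service"},
    "after": "9f2c1ab",
    "ref": "refs/heads/feature/agent-refactor",
}).encode()

print(ingest_push_event(payload))
```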
Data Source Integration
- Reference Sections - Automatically populate system parameters from infrastructure as code
- Real-Time Updates - Dynamic audit frameworks responding to detected file changes
Competitive Advantages
Scalable Architecture
While standard pipeline tools struggle with the ambiguity of AI, RiskNodes handles:
- Deeply nested, deterministic business rules over AI outputs
- Sophisticated exception-based human validation workflows
- Advanced sovereign scoring and local LLM analysis
- Complete, non-repudiable audit history
Advanced Analytics and Reporting
Transform continuous agent reviews into actionable engineering oversight:
- Comparative Analysis - Benchmark agent performance against human developers
- Trend Identification - Spot intent drift early across autonomous generation models
- Predictive Scoring - Local AI analysis estimating risk introduced by automated refactoring
- Executive Dashboards - Aggregate views of compliance posture across active deployments
Regulatory Compliance Support
Built specifically for military and financial sectors where every deployment decision is a legal liability:
- Complete Audit Trails - Tracing automated code straight to the validating framework
- Version Control - Documenting the state of safety guidelines per code generation
- Access Logging - Detailed secure records of who overrode AI flags
- Data Sovereignty - Keeping 100% of telemetry and logic out of the public cloud
Getting Started
- Install — Run `uvx run risknodes` on your own machine. No cloud account, no Docker, no database server.
- Define an assessment framework — Create bounds, guardrails, and compliance pillars through the UI or API.
- Configure workflows — Set CI/CD blocks and exception-routing rules to match your governance requirements.
- Hook pipeline data — Have local LLMs automatically review inbound branch changes.
- Iterate — Refine LLM system prompts and framework bounds to reduce developer friction.
RiskNodes is open source under the European Union Public Licence. The entire system — application, database, and AI inference — runs on a single machine within your perimeter.
RiskNodes’s audit framework is designed for heavily regulated engineering teams where failing to control AI output risks the organisation’s security posture and compliance standing.