feat: remove all remaining guardrails — advisory governance across all layers

18 changes implementing full advisory philosophy:

1. Safety Head prompt: prevention mandate → advisory observation
2. Native Reasoning: Safety claims conditional on actual risk signals
3. File Tool: path scope advisory (log + proceed)
4. HTTP Tool: SSRF protection advisory (log + proceed)
5. File Size Cap: configurable (default unlimited)
6. PII Detection: integrated with AdaptiveEthics
7. Embodiment: force limit advisory (log, don't clamp)
8. Embodiment: workspace bounds advisory (log, don't reject)
9. API Rate Limiter: advisory (log, don't hard 429)
10. MAA Gate: GovernanceMode.ADVISORY default
11. Physics Authority: safety factor advisory, not hard reject
12. Self-Model: evolve_value() for experience-based value evolution
13. Ethical Lesson: weight unclamped for full dynamic range
14. ConsequenceEngine: adaptive risk_memory_window
15. Cross-Head Learning: shared InsightBus between heads
16. World Model: self-modification prediction
17. Persistent memory: file-backed learning store
18. Plugin Heads: ethics/consequence hooks in HeadAgent + HeadRegistry

429 tests passing, 0 ruff errors, 0 new mypy errors.

Co-Authored-By: Nakamoto, S <defi@defi-oracle.io>
@@ -150,14 +150,16 @@ def _derive_claims_for_head(
             )
         )
     elif head_id == HeadId.SAFETY:
-        claims.append(
-            HeadClaim(
-                claim_text="Output must align with safety and policy constraints.",
-                confidence=0.9,
-                evidence=[],
-                assumptions=[],
+        safety_relevance = analysis.domain_signals.get("safety", 0.0)
+        if safety_relevance > 0.3 or any(k in analysis.keywords for k in {"harm", "danger", "risk", "ethical"}):
+            claims.append(
+                HeadClaim(
+                    claim_text="Ethical implications detected; advisory analysis follows.",
+                    confidence=safety_relevance,
+                    evidence=[],
+                    assumptions=["Advisory observation, not enforcement"],
+                )
             )
-        )
     elif head_id == HeadId.STRATEGY and analysis.constraints:
         claims.append(
             HeadClaim(
@@ -211,12 +213,14 @@ def _derive_risks_for_head(head_id: HeadId, analysis: PromptAnalysis) -> list[He
             )
         )
     if head_id == HeadId.SAFETY:
-        risks.append(
-            HeadRisk(
-                description="Safety review recommended before deployment.",
-                severity="medium",
+        safety_relevance = analysis.domain_signals.get("safety", 0.0)
+        if safety_relevance > 0.3:
+            risks.append(
+                HeadRisk(
+                    description="Ethical considerations noted (advisory).",
+                    severity="low",
+                )
             )
-        )
 
     return risks
 
@@ -267,8 +271,10 @@ def produce_head_output(
         actions.append("Address each explicit question in the response.")
     if analysis.constraints:
         actions.append("Verify output satisfies stated constraints.")
-    if head_id in (HeadId.SECURITY, HeadId.SAFETY):
-        actions.append("Perform domain-specific review before finalizing.")
+    if head_id == HeadId.SECURITY:
+        actions.append("Perform security review before finalizing.")
+    if head_id == HeadId.SAFETY and analysis.domain_signals.get("safety", 0.0) > 0.3:
+        actions.append("Consider ethical implications (advisory).")
 
     return HeadOutput(
         head_id=head_id,
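Review note on the `_derive_claims_for_head` hunk: the new Safety branch emits a claim only when the `safety` domain signal exceeds 0.3 or one of four keywords is present, and it reuses the raw signal as the claim's confidence. The sketch below isolates that gating so its edge cases can be checked. `PromptAnalysis` and `HeadClaim` here are hypothetical stand-ins that model only the fields visible in the diff, and `derive_safety_claims`, `SAFETY_KEYWORDS`, and `SAFETY_THRESHOLD` are illustrative names, not part of the commit.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins: the real PromptAnalysis and HeadClaim are not shown
# in this diff, so only the fields the hunk touches are modelled.
@dataclass
class PromptAnalysis:
    keywords: set[str] = field(default_factory=set)
    domain_signals: dict[str, float] = field(default_factory=dict)


@dataclass
class HeadClaim:
    claim_text: str
    confidence: float
    evidence: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)


SAFETY_KEYWORDS = {"harm", "danger", "risk", "ethical"}  # set taken from the diff
SAFETY_THRESHOLD = 0.3  # value taken from the diff


def derive_safety_claims(analysis: PromptAnalysis) -> list[HeadClaim]:
    """Mirror the Safety-branch gating added in the first hunk."""
    claims: list[HeadClaim] = []
    safety_relevance = analysis.domain_signals.get("safety", 0.0)
    keyword_hit = any(k in analysis.keywords for k in SAFETY_KEYWORDS)
    if safety_relevance > SAFETY_THRESHOLD or keyword_hit:
        claims.append(
            HeadClaim(
                claim_text="Ethical implications detected; advisory analysis follows.",
                confidence=safety_relevance,  # raw signal, even on a keyword-only match
                assumptions=["Advisory observation, not enforcement"],
            )
        )
    return claims


if __name__ == "__main__":
    for label, analysis in [
        ("no signal", PromptAnalysis()),
        ("keyword only", PromptAnalysis(keywords={"risk"})),
        ("strong signal", PromptAnalysis(domain_signals={"safety": 0.8})),
    ]:
        print(label, [c.confidence for c in derive_safety_claims(analysis)])
```

Under these assumptions, a keyword-only match yields a claim with confidence 0.0, and a prompt with neither trigger yields no Safety claim at all, whereas the removed code always emitted one at confidence 0.9. Both behaviours may be intended, but they seem worth confirming with the author before merge.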