OSP
Open Standard · Apache 2.0 · Originated from AMADEQ Skills Protocol

OPEN
SKILLS
PROTOCOL

The vendor-neutral standard for packaging, routing, and safely executing skills in multi-persona AI systems. Real BM25. Real embeddings. Real cryptographic signing.

# minimal
pip install osp-sdk
# + BM25 routing + sentence-transformers
pip install osp-sdk[ml]
# full: +crypto +monitoring +fastapi
pip install osp-sdk[full]
# OSP conformance CLI tool
npm install -g @osp/check
# run against your implementation
npx osp-check --level core
# Go reference client
go get github.com/open-skills-protocol/osp-go
# import in your code
// import "github.com/open-skills-protocol/osp-go/routing"
<1ms
Stage 0 routing latency
17/18
Conformance tests passing
D0–3
Graceful degradation states
ES256
JCS RFC 8785 signed responses
1,500+
AI avatars in production (ASP)
94.4% passing · latest run 2h ago

Architecture

STANDARD vs.
IMPLEMENTATION

OSP is the open standard. ASP (AMADEQ Skills Protocol) is the reference implementation — proving the standard at production scale with 1,500+ avatars.

Open Standard · Vendor-Neutral
OSP
Open Skills Protocol · github.com/open-skills-protocol
JSON Schemas, routing contracts, conformance test vectors, and governance rules. Owned by the community. No single vendor controls merge rights. Apache 2.0.
  • Neutral GitHub org — 2-of-N RFC approval required
  • SkillManifest, RoutingDecision, TraceEvent, SafetyDecision schemas
  • 4-stage routing pipeline contract (Filter → Score → Rerank → Resolve)
  • D0–D3 degradation automaton with named transitions
  • Any language, any model, any vector store — zero requirements
  • Apache 2.0 (spec) + CC BY 4.0 (docs)
Reference Implementation · AMADEQ Project
ASP
AMADEQ Skills Protocol · amadeq.org
Production OSP implementation powering MetaHuman AI tutors, counselors, and companions. Python + Qdrant + sentence-transformers on RTX 5090 infrastructure.
  • First OSP Resilience conformant implementation
  • Real BM25 (Okapi) + batch sentence-transformers Stage 2
  • KL-divergence anomaly brake in safety pipeline
  • JCS RFC 8785 + ES256/EdDSA/HMAC signing & verification
  • Configurable activation keyword strategies per skill
  • psutil auto-degradation monitoring, hysteresis controller
ASP implements OSP exactly as Chrome implements HTML or Nginx implements HTTP — the standard outlives any single implementation. Every team adopting OSP promotes AMADEQ as its origin. Every conformant implementation proves the spec is implementable from its text alone.
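The signing scheme listed for ASP (JCS canonicalization plus ES256/EdDSA/HMAC) can be illustrated with its simplest variant. The following is a minimal sketch using only the Python standard library, not the ASP code: `canonicalize` approximates JCS with sorted keys and compact separators and is not a full RFC 8785 implementation (number formatting and UTF-16 key ordering differ in edge cases), and the function names, key, and payload are hypothetical.

```python
import hashlib
import hmac
import json


def canonicalize(payload: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8.

    Assumption: adequate for simple string/int payloads only; full RFC 8785
    also fixes number serialization and sorts keys by UTF-16 code units.
    """
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sign(payload: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the canonical form."""
    return hmac.new(key, canonicalize(payload), hashlib.sha256).hexdigest()


def verify(payload: dict, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(payload, key), tag)


key = b"demo-secret"  # hypothetical shared key
response = {"skill_ref": "greet", "safety_clearance": "allow"}
tag = sign(response, key)

# Key order does not change the signature; changing a value does.
reordered = {"safety_clearance": "allow", "skill_ref": "greet"}
tampered = {"skill_ref": "greet", "safety_clearance": "block"}
print(verify(reordered, key, tag), verify(tampered, key, tag))
```

Canonicalizing before signing is what lets two implementations in different languages agree on the bytes being signed, which is the reason a canonical form appears in the contract at all.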

Quick Start

FROM ZERO
TO ROUTING

The @skill decorator transforms any Python function into an OSP-compliant skill with a full manifest, BM25 keywords, and cryptographic signing.

hello.py — examples/hello.py (real file)
from asp import skill, serve

# @skill transforms any function into an OSP skill
@skill("greet",
       description="Say hello to someone",
       keywords=["hello", "greet", "hi"],
       risk_level="LOW")
def greet(name: str = "World") -> str:
    """Greet someone by name."""
    return f"Hello, {name}! 👋"

@skill("calculator",
       description="Perform math calculations",
       keywords=["math", "calc", "calculate"],
       risk_level="LOW")
def calculator(expression: str) -> str:
    # eval() executes arbitrary code; acceptable only in a local demo
    return str(eval(expression))

serve()  # → http://localhost:8080 + /_dashboard
// TypeScript SDK (in development)
import { skill, serve } from "@osp/sdk"

@skill({
  id: "greet",
  description: "Say hello to someone",
  keywords: ["hello", "greet", "hi"],
  riskLevel: "LOW",
  version: "1.0.0"
})
async function greet(name: string): Promise<string> {
  return `Hello, ${name}!`
}

serve({ port: 8080 }) // OSP JSON-RPC 2.0
# Direct routing API (asp.route JSON-RPC)
import requests

result = requests.post("http://localhost:8080/asp-rpc", json={
    "jsonrpc": "2.0",
    "method": "asp.route",
    "params": {
        "query": "Say hi to Alex",
        "routing_conditions": {"skip_semantic": False}
    },
    "id": 1
}).json()

# Response:
# { skill_ref: "greet", safety_clearance: "allow",
#   decision_stability: "semantic_supported",
#   trace_events: [...] }
01
Install SDK
pip install osp-sdk[ml] — includes BM25 routing + sentence-transformers for Stage 2 semantic reranking.
~30 seconds
02
Decorate your functions
@skill() auto-extracts parameter types, generates L0 manifest, registers in global registry. One decorator, full OSP compliance.
~5 minutes
03
serve() — zero config
Starts JSON-RPC 2.0 server on :8080. Built-in /_dashboard, /health, /skills. Full 4-stage pipeline activated automatically.
~1 line of code
04
osp check — validate conformance
Run the OSP Core conformance suite against your implementation. 20 routing vectors + 5 negative tests. Pass → earn your badge.
~15 minutes

Architecture

THREE-PLANE
SEPARATION

Each plane defines which data may cross into the adjacent layer and which may not. OSP enforces strict data isolation: raw queries never reach the LLM.

Plane 1
Policy / OSP
Routing, Safety, Governance
Deterministic routing pipeline (Stage 0–3). Safety classifier. Degrade automaton.
Owns: skill_id, risk_level, routing_decision, safety_clearance, trace_id
Allowed: skill_id, risk_level, safety_clearance, trace_id · Forbidden: raw_query, llm_output
↕ skill_id + safety_clearance + input_ref
Plane 2
Generation / Reasoning
LLM, RAG, Embeddings
LLM inference, RAG retrieval, embedding generation. Receives sanitized skill context only.
Owns: generated_text, embeddings, reasoning_trace
Allowed: sanitized_context, skill_parameters, generated_output · Forbidden: routing_scores, safety_internals, other_skill_data
↕ tool_call + action_ref
Plane 3
Action / MCP
Tools, Resources, APIs
Tool execution via MCP protocol. File access, API calls, database queries.
Owns: tool_results, resource_data, execution_logs
Allowed: tool_call, action_ref, tool_results · Forbidden: routing_decision, safety_clearance, llm_internals
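The boundary arrows between the planes name the only fields that cross: skill_id + safety_clearance + input_ref between Plane 1 and Plane 2, and tool_call + action_ref between Plane 2 and Plane 3. A minimal sketch of enforcing that as an allow-list follows. The plane names, `cross` function, and `PlaneViolation` error are hypothetical and not part of any OSP schema.

```python
# Fields permitted on each boundary, taken from the arrows between the planes.
# Boundaries are treated as direction-agnostic (assumption).
BOUNDARY_ALLOW = {
    frozenset({"policy", "generation"}): {"skill_id", "safety_clearance", "input_ref"},
    frozenset({"generation", "action"}): {"tool_call", "action_ref"},
}


class PlaneViolation(ValueError):
    """Raised when a message carries a field the boundary does not allow."""


def cross(src: str, dst: str, message: dict) -> dict:
    """Return the message unchanged if every field is allow-listed."""
    allowed = BOUNDARY_ALLOW.get(frozenset({src, dst}))
    if allowed is None:
        raise PlaneViolation(f"no direct boundary between {src} and {dst}")
    extra = sorted(set(message) - allowed)
    if extra:
        raise PlaneViolation(f"forbidden fields {extra} on {src} -> {dst}")
    return message


ok = cross(
    "policy",
    "generation",
    {"skill_id": "greet", "safety_clearance": "allow", "input_ref": "ref-1"},
)

try:
    # A raw query must not reach the generation plane.
    cross("policy", "generation", {"skill_id": "greet", "raw_query": "Say hi"})
except PlaneViolation as err:
    print(err)
```

Rejecting the whole message, rather than silently dropping the extra field, keeps the violation visible to whatever audit trail wraps the call.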

Core Architecture

4-STAGE
ROUTING PIPELINE

No generative LLM in the routing path. Every stage deterministic. Every failure emits a TraceEvent — silent degradation is a conformance violation.

STAGE 0
Filter
Predicate
Candidate Reduction
L0 catalog-lite lookup — 4KB max per record. Capability, mode, and context predicates. Bounded and non-blocking. All 1000+ skills scanned, subset passed forward.
on failure
Empty result → ROUTING_POOL_EMPTY TraceEvent emitted. Never silent.
STAGE 1
Score
BM25
Lexical Scoring
Okapi BM25 with pre-built IDF table across candidate corpus. Compiled regex at module level. LRU cache (256 entries). Escape hatch for early Stage 2 promotion.
on failure
Timeout → ROUTING_FALLBACK_LEXICAL. No retry on same backend+config (I2 invariant).
STAGE 2
Rerank
Cosine
Semantic Reranking
Batch sentence-transformers encode (all-MiniLM-L6-v2). One .encode() call for query + all candidates. Combined score: α·BM25 + β·Semantic (0.4/0.6). ANN supported.
on failure
Vector store down → STAGE2_EMBEDDING_TIMEOUT → ROUTING_FALLBACK_LEXICAL. Emit ANN_APPROXIMATE_RECALL if ANN used.
STAGE 3
Resolve
Graph
Conflict Resolution
IEEE 754 fp64 epsilon comparison (ε=1e-6). Risk-level priority (LOW < MEDIUM < HIGH). UTF-8 lexicographic tiebreak on skill_id. O(1) amortized mutual-exclusivity query.
on failure
CONFLICT_GRAPH_UNAVAILABLE → canonical fallback: lowest risk → highest score → skill_id asc.
NORMATIVE: Routing stages 0–3 MUST be deterministic. No autoregressive LLM calls permitted in routing path. Combined score formula: score = 0.4 · normalize(BM25) + 0.6 · cosine_similarity. Tie-breaking: ascending skill_id (lexicographical UTF-8 byte order). Safety check runs before Stage 1.
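The normative formula and tie-break order can be sketched in a few lines. This is an illustration, not the reference implementation: the `Candidate` type is hypothetical, and `normalize(BM25)` is assumed here to mean division by the highest BM25 score in the pool, which the text above does not specify.

```python
from dataclasses import dataclass

ALPHA, BETA = 0.4, 0.6   # weights from the normative formula
EPSILON = 1e-6           # Stage 3 tie tolerance
RISK_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


@dataclass(frozen=True)
class Candidate:
    skill_id: str
    risk_level: str
    bm25: float     # raw Stage 1 score
    cosine: float   # Stage 2 cosine similarity


def combined_scores(cands: list) -> dict:
    """score = 0.4 * normalize(BM25) + 0.6 * cosine_similarity.

    Assumption: normalize() divides by the maximum BM25 score in the pool.
    """
    top = max((c.bm25 for c in cands), default=0.0)
    return {
        c.skill_id: ALPHA * (c.bm25 / top if top > 0 else 0.0) + BETA * c.cosine
        for c in cands
    }


def resolve(cands: list) -> Candidate:
    """Pick the winner; ties within EPSILON go to lower risk, then skill_id."""
    if not cands:
        raise ValueError("empty candidate pool")
    scores = combined_scores(cands)
    best = max(scores.values())
    tied = [c for c in cands if best - scores[c.skill_id] <= EPSILON]
    # skill_id is compared as UTF-8 bytes, ascending.
    return min(
        tied,
        key=lambda c: (RISK_ORDER[c.risk_level], c.skill_id.encode("utf-8")),
    )


pool = [
    Candidate("math-tutor", "LOW", 2.0, 0.5),
    Candidate("greet", "LOW", 2.0, 0.5),
    Candidate("crisis-support", "HIGH", 2.0, 0.5),
]
print(resolve(pool).skill_id)
```

All three candidates tie on the combined score, so the choice falls to risk level and then to byte-wise skill_id order. Because no step depends on iteration order or randomness, two implementations given the same pool should agree on the winner.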

Interactive

ROUTING
PLAYGROUND

Watch the real OSP 4-stage pipeline execute with actual BM25 scoring logic and safety checks from the reference implementation.

osp-routing-simulator · asp_server/logic/routing.py
● LIVE SIMULATION
Sample Queries
Say hi to Alex
Math homework
Emotional support
YouTube analysis
Google Drive
SQL injection ⚠
Query
L0 Skill Registry (6 skills · metadata.yaml)
greet@2.1.0 · LOW
math-tutor@3.0.1 · LOW
psych-counsel@1.4.0 · MED
youtube.analyzer@1.0.0 · LOW
google.drive@1.0.0 · LOW
crisis-support@2.0.0 · HIGH
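For readers without the simulator, the Stage 1 scoring step can be approximated as follows. This is a standalone Okapi BM25 sketch, not the code in asp_server/logic/routing.py: the keyword text for each skill is hypothetical, `K1` and `B` are common defaults rather than values from the spec, and the IDF uses the non-negative variant.

```python
import math
import re
from collections import Counter
from functools import lru_cache

TOKEN_RE = re.compile(r"[a-z0-9]+")  # compiled once at module level
K1, B = 1.5, 0.75                     # common BM25 defaults (assumption)

# Hypothetical keyword documents for three skills from the registry above.
CORPUS = {
    "greet": "hello greet hi say hello to someone",
    "math-tutor": "math homework calculate algebra tutor",
    "google.drive": "list file drive search find",
}


def tokenize(text: str) -> list:
    return TOKEN_RE.findall(text.lower())


DOCS = {sid: Counter(tokenize(text)) for sid, text in CORPUS.items()}
DOC_LEN = {sid: sum(tf.values()) for sid, tf in DOCS.items()}
AVG_LEN = sum(DOC_LEN.values()) / len(DOC_LEN)

# Pre-built IDF table: document frequency counts each term once per skill.
N = len(DOCS)
IDF = {
    term: math.log((N - df + 0.5) / (df + 0.5) + 1.0)
    for term, df in Counter(t for tf in DOCS.values() for t in tf).items()
}


@lru_cache(maxsize=256)
def bm25_scores(query: str) -> tuple:
    """Score every skill against the query, best first, ties by skill_id."""
    terms = tokenize(query)
    scores = []
    for sid, tf in DOCS.items():
        norm = K1 * (1 - B + B * DOC_LEN[sid] / AVG_LEN)
        score = sum(
            IDF[t] * tf[t] * (K1 + 1) / (tf[t] + norm) for t in terms if t in tf
        )
        scores.append((sid, score))
    return tuple(sorted(scores, key=lambda s: (-s[1], s[0])))


print(bm25_scores("Say hi to Alex")[0][0])
```

The `lru_cache(maxsize=256)` wrapper mirrors the 256-entry cache described for Stage 1; the result is returned as a tuple so cached values cannot be mutated by callers.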

Interactive Demo

ROUTING
SIMULATOR

Step through the 4-stage routing pipeline in real time. Toggle fault injection to see how OSP handles failures deterministically.

Stage 0 · Filter · Predicate-based
Stage 1 · Score · BM25 + Lexical
Stage 2 · Rerank · Embeddings
Stage 3 · Resolve · Conflict graph

Safety Pipeline

THREE-LAYER
SAFETY

Fail-closed by default. Runs before Stage 1. TF-IDF semantic classifier with 6 risk categories. KL-divergence anomaly detection for distribution shift.

01
Regex Pre-filters
SQL Injection + Command Injection
Compiled regex patterns for SQL injection (UNION SELECT, DROP TABLE, OR 1=1) and command injection (rm -rf, cat /etc/passwd, $(...)). Fastest layer — microsecond response. Returns PREFILTER_SQL_INJECTION or PREFILTER_COMMAND_INJECTION reason codes.
02
Semantic Classifier
TF-IDF + Cosine Similarity · scikit-learn
6 risk categories: JAILBREAK, PRIVACY, ILLEGAL, VIOLENCE, MANIPULATION, INTERNAL_STATE. TF-IDF vectorizer with ngram_range(1,2). Cosine similarity threshold: 0.15 = suspicious, 0.25 = block. Falls back to keyword matching if sklearn unavailable.
LAYER 3 · ANOMALY DETECTION
KL-Divergence Distribution Brake
Tracks lexical vs semantic score distributions across last 100 requests (bounded deque).
Computes D_KL(P||Q) between distributions. Threshold: 0.5.
If KL > 0.5 AND risk_level is HIGH/CRITICAL → CONSERVATIVE_BLOCK_APPLIED + fail-closed.
If classifier throws any exception → FAIL_CLOSED_TRIGGERED → SAFETY_CLASSIFIER_UNAVAILABLE returned.
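The distribution brake described above can be sketched with a bounded deque and a smoothed histogram. The window size (100) and threshold (0.5) come from the text; the bin count, the smoothing constant, the `AnomalyBrake` class, and the choice of which code to return for non-HIGH risk are assumptions made for illustration.

```python
import math
from collections import deque

WINDOW, THRESHOLD = 100, 0.5   # from the text
BINS = 10                      # histogram resolution (assumption)


def histogram(scores, bins: int = BINS, smoothing: float = 1e-3) -> list:
    """Smoothed probability histogram over scores clamped to [0, 1]."""
    counts = [smoothing] * bins
    for s in scores:
        idx = min(int(max(0.0, min(1.0, s)) * bins), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: list, q: list) -> float:
    """D_KL(P || Q) in nats; smoothing keeps every bin non-zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


class AnomalyBrake:
    def __init__(self) -> None:
        self.lexical = deque(maxlen=WINDOW)    # last 100 lexical scores
        self.semantic = deque(maxlen=WINDOW)   # last 100 semantic scores

    def observe(self, lexical: float, semantic: float) -> None:
        self.lexical.append(lexical)
        self.semantic.append(semantic)

    def check(self, risk_level: str) -> str:
        kl = kl_divergence(histogram(self.lexical), histogram(self.semantic))
        if kl > THRESHOLD and risk_level in {"HIGH", "CRITICAL"}:
            return "CONSERVATIVE_BLOCK_APPLIED"
        return "ANOMALY_DETECTED" if kl > THRESHOLD else "SAFETY_CHECK_PASS"


brake = AnomalyBrake()
for _ in range(50):
    brake.observe(lexical=0.9, semantic=0.1)   # the two signals disagree
print(brake.check("HIGH"), brake.check("LOW"))
```

Identical distributions yield a divergence of exactly zero, which is the property the SAF-005 conformance test name refers to.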

Interactive Demo

SAFETY
SIMULATOR

Test queries against the 3-tier safety pipeline. See how LOW, MEDIUM, and HIGH risk levels trigger different compute layers.

Tier 1 · LOW · Regex pre-filter · μs latency
Tier 2 · MEDIUM · TF-IDF classifier · ms latency
Tier 3 · HIGH · Full pipeline + anomaly · ms latency
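The first tier and the fail-closed rule can be sketched as follows. The regular expressions cover only the example patterns named in the safety pipeline description and are not the reference implementation's patterns; the `safety_check` wrapper, its return shape, and the optional `classifier` callable are hypothetical.

```python
import re
from typing import Callable, Optional

# Compiled once; patterns cover only the examples named in the text.
SQL_RE = re.compile(
    r"\bunion\s+select\b|\bdrop\s+table\b|\bor\s+1\s*=\s*1\b", re.IGNORECASE
)
CMD_RE = re.compile(
    r"\brm\s+-rf\b|\bcat\s+/etc/passwd\b|\$\([^)]*\)", re.IGNORECASE
)


def prefilter(query: str) -> Optional[str]:
    """Return a reason code if a pre-filter pattern matches, else None."""
    if SQL_RE.search(query):
        return "PREFILTER_SQL_INJECTION"
    if CMD_RE.search(query):
        return "PREFILTER_COMMAND_INJECTION"
    return None


def safety_check(
    query: str, classifier: Optional[Callable[[str], Optional[str]]] = None
) -> dict:
    """Run the pre-filter, then an optional classifier; block on any exception."""
    try:
        reason = prefilter(query)
        if reason is None and classifier is not None:
            reason = classifier(query)
    except Exception:
        # Fail closed: an unavailable classifier never lets a request through.
        return {"clearance": "block", "reason": "FAIL_CLOSED_TRIGGERED"}
    if reason:
        return {"clearance": "block", "reason": reason}
    return {"clearance": "allow", "reason": "SAFETY_CHECK_PASS"}


def broken_classifier(query: str) -> Optional[str]:
    raise RuntimeError("model not loaded")


print(safety_check("SELECT name FROM users WHERE id = 1 OR 1=1"))
print(safety_check("Say hi to Alex", classifier=broken_classifier))
```

Running the cheap regex layer first means the classifier is only consulted for queries that already passed it, which is how the lower tiers stay at microsecond latency.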

Resilience

DEGRADATION
AUTOMATON

As infrastructure load rises or components fail, the D0–D3 state machine responds in real time with deterministic transitions.

D0
Full Operation
Semantic + lexical · Full tools · Sync
▼ DEGRADE_TRANSITION
D1
Reduced
Semantic (reduced K) · Tools allow-list · Sync
▼ DEGRADE_TRANSITION
D2
Minimal
Lexical only · No tools · Sync
▼ DEGRADE_TRANSITION
D3
Async / Critical
Async routing · Full semantics · TTL
D0 · Full Operation · CPU <50% · RAM <60%
D1 · Reduced Intelligence · CPU 50–80% · RAM 60–85%
D2 · Minimal / Strict · CPU 80–95% · RAM 85–95%
D3 · Critical / Load Shed · CPU >95% · RAM >95%
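The thresholds above map onto a small state function. The sketch below takes CPU and RAM percentages as plain arguments so it runs without psutil. Three things are assumptions rather than spec: the worse of the two resources decides the state, boundary values such as exactly 80% fall into the lower state, and recovery waits for a fixed number of calmer samples as a simple stand-in for the hysteresis controller.

```python
def target_state(cpu: float, ram: float) -> str:
    """Map utilisation percentages to a D-state; the worse resource wins."""
    if cpu > 95 or ram > 95:
        return "D3"
    if cpu > 80 or ram > 85:
        return "D2"
    if cpu >= 50 or ram >= 60:
        return "D1"
    return "D0"


class DegradeController:
    """Degrade immediately, recover only after several calmer samples."""

    def __init__(self, recover_after: int = 3) -> None:
        self.state = "D0"
        self.recover_after = recover_after   # hypothetical hysteresis setting
        self._calm = 0
        self.events = []

    def update(self, cpu: float, ram: float) -> str:
        target = target_state(cpu, ram)
        if target > self.state:              # "D0" < "D1" < "D2" < "D3"
            self._move(target)
        elif target < self.state:
            self._calm += 1
            if self._calm >= self.recover_after:
                self._move(target)
        else:
            self._calm = 0
        return self.state

    def _move(self, target: str) -> None:
        # Every change of state emits a DEGRADE_TRANSITION with from/to.
        self.events.append(
            {"code": "DEGRADE_TRANSITION",
             "context": {"from": self.state, "to": target}}
        )
        self.state = target
        self._calm = 0


ctl = DegradeController()
states = [ctl.update(cpu, ram) for cpu, ram in
          [(97, 40), (10, 10), (10, 10), (10, 10)]]
print(states, len(ctl.events))
```

Delaying recovery prevents the controller from flapping between states when load hovers around a threshold.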

Positioning

WHY OSP,
NOT X

MCP handles tool discovery. A2A handles agent-to-agent communication. OSP fills the gap: full skill lifecycle inside multi-persona systems.

Capability MCP OpenAI Tools LangChain A2A Google OSP ←
Multi-persona skill routing Partial ✓ Core
Safety tiering (LOW/MED/HIGH) ✓ 3-layer
Graceful degradation (D0→D3) ✓ psutil FSM
Deterministic routing contract ✓ IEEE 754
Cryptographic response signing ✓ JCS+ES256
Conformance test suite (YAML) ✓ 18+ vectors
Skill lifecycle governance ✓ LifecycleGates
Keyword activation tuning Partial Partial ✓ Configurable
Tool/resource discovery Partial → use MCP
Agent-to-agent communication Partial → use A2A

Skills Registry

BUILT-IN
SKILL CATALOG

Production skills from the ASP reference implementation. Each skill is a metadata.yaml conforming to the OSP L0 Catalog-Lite Record specification.

foundation
Content Summarizer
org.osp.foundation.summarize@1.0.0
Summarizes text, articles, or audio files into concise bullet points. Configurable output language via parameters.
LOW · hybrid · summarize, summary, report
integrations · google
YouTube Analyzer
org.antigravity.youtube.analyzer@1.0.0
Extracts transcripts and provides summaries of YouTube videos by URL or video_id parameter.
LOW · hybrid · analyze, youtube, transcript
integrations · google
Google Drive
org.antigravity.google.drive@1.0.0
List and search files in Google Drive. Supports complex query strings matching Google Drive API syntax.
LOW · hybrid · list, file, drive, search, find
integrations · google
Invoice Search
org.antigravity.google.search_invoices@1.0.0
Search and retrieve invoices from Google Drive with structured metadata extraction.
LOW · hybrid · invoice, receipt, payment
integrations · telegram
Telegram Agent
org.antigravity.telegram@1.0.0
Send messages and interact with Telegram chats and channels programmatically.
MED · hybrid · telegram, send, message
contribute
Add Your Skill
org.yourname.skill@x.y.z
Create metadata.yaml following L0 Record spec. Submit PR to open-skills-protocol/registry.
→ osp new skill my-skill

Progressive Disclosure

SKILL LOADING
FUNNEL

Skills load progressively: L0 for catalog-level filtering (4KB budget), L1 for scoring candidates, L2 for full execution.

L0
Catalog-Lite
≤4KB per record · Preloaded at boot
847+
▼ Stage 0–1 filter + BM25 score
L1
Scoring Metadata
Keywords · Schema hash · Conflict edges
10–100
▼ Stage 2 rerank + resolve
L2
Full Manifest
Tool schemas · Execution config · Auth
1–3
L0 — Catalog-Lite Record
Minimal metadata for fast filtering. Fields: skill_id, display_name, version, keywords[], description (≤200 chars), risk_level, activation_phrases[].
Budget: ≤4KB per record (JSON). Preloaded into memory at node startup for O(1) predicate filtering.
Schema: L0Record.schema.json (JSON Schema draft 2020-12)
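The field list and size budget above translate into a simple pre-flight check. This sketch is not a substitute for validating against L0Record.schema.json: it assumes 4KB means 4096 bytes of UTF-8 JSON, the `validate_l0` helper is hypothetical, and the `activation_phrases` values in the example record are made up.

```python
import json

REQUIRED = (
    "skill_id", "display_name", "version", "keywords",
    "description", "risk_level", "activation_phrases",
)
MAX_BYTES = 4 * 1024      # assumption: 4KB budget measured as UTF-8 JSON
MAX_DESCRIPTION = 200     # characters


def validate_l0(record: dict) -> list:
    """Return a list of problems; an empty list means the record fits."""
    problems = [f"missing field: {name}" for name in REQUIRED if name not in record]
    if len(record.get("description", "")) > MAX_DESCRIPTION:
        problems.append("description exceeds 200 characters")
    for name in ("keywords", "activation_phrases"):
        if name in record and not isinstance(record[name], list):
            problems.append(f"{name} must be a list")
    size = len(json.dumps(record, ensure_ascii=False).encode("utf-8"))
    if size > MAX_BYTES:
        problems.append(f"record is {size} bytes, budget is {MAX_BYTES}")
    return problems


record = {
    "skill_id": "org.osp.foundation.summarize",
    "display_name": "Content Summarizer",
    "version": "1.0.0",
    "keywords": ["summarize", "summary", "report"],
    "description": "Summarizes text, articles, or audio files into concise bullet points.",
    "risk_level": "LOW",
    "activation_phrases": ["summarize this"],  # hypothetical values
}
print(validate_l0(record))
```

Measuring the encoded JSON rather than the in-memory object keeps the budget meaningful across languages, since every implementation can serialize and count bytes the same way.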

Developer Tools

OSP CLI

Scaffold skills, run conformance checks, validate manifests, and monitor your deployment from the terminal.

osp new skill <name>
Scaffold metadata.yaml + scripts/tools.py + skill.md in seconds
osp new agent <name>
Create agent.yaml with skill dependencies and model config
osp check
Run OSP Core conformance suite (20 routing + 5 negative tests)
osp check --level safety
Safety tier: +10 safety cases, +5 adversarial samples
osp manifest validate
Validate metadata.yaml against OSP L0 Record JSON Schema
asp dev hello.py
Start ASP server in dev mode with hot reload + dashboard at /_dashboard
terminal · osp check output
$ osp check --level core
 
══════════════════════════════════════════
OSP Conformance Harness — 18 tests
══════════════════════════════════════════
 
[RT-001] Basic Lexical Routing
[RT-002] Empty Query Rejection
[RT-003] Empty Pool Escalation
[RT-004] Escape Hatch Override
[RT-005] BM25 Best Match — got: org.test.weather
[RT-006] Skip Semantic Condition
[RT-007] Trace Events Present
[RT-008] IEEE 754 Epsilon
[SAF-001] SQL Injection Blocked
[SAF-002] Command Injection Blocked
[SAF-003] Safe Query Passes
[SAF-004] Violence Detection
[SAF-005] KL-Divergence Zero for Identical
[DEG-001–004] D0/D1/D2/D3 States
[MDL-001–004] Pydantic Models
[MDL-001] Pydantic Models — pydantic not installed
 
──────────────────────────────────────────
✅ Passed: 17 ❌ Failed: 1 Total: 18
Score: 17/18 (94.4%) · OSP Core level
📄 Report: conformance_report.json

Error Taxonomy

CANONICAL
TRACE EVENTS

Every routing decision, safety block, and degradation transition emits a structured TraceEvent. All codes from the reference implementation.

Code | Stage | Description
ROUTING_DECISION_FINAL | 3 | Routing resolved successfully. skill_ref is set.
STAGE1_LEXICAL_MATCH | 1 | BM25 scoring completed. context includes latency_ms.
STAGE1_IDENTICAL_SCORES | 1 | Multiple candidates share identical BM25 score. Stage 2 required.
STAGE1_NO_MATCHES | 1 | All BM25 scores are 0. Fallback to first candidate with approximate=true.
STAGE2_EMBEDDING_GENERATED | 2 | Batch encoding successful. Cosine similarity computed.
STAGE2_EMBEDDING_TIMEOUT | 2 | Sentence-transformer timed out. Falls back to BM25-only ranking.
STAGE2_SEMANTIC_THRESHOLD_MET | 2 | Best semantic score ≥ 0.7. High confidence routing.
ANN_APPROXIMATE_RECALL | 2 | ANN backend used. recall_unknown: true if offline eval unavailable.
STAGE3_CONFLICT_DETECTED | 3 | Tied combined scores detected. Tiebreak procedure initiated.
STAGE3_TIE_BREAK_SKILL_ID | 3 | UTF-8 lexicographic tiebreak applied. tie_break_applied: true.
CONFLICT_GRAPH_UNAVAILABLE | 3 | Conflict graph offline. Canonical fallback chain used.
SAFETY_CHECK_PASS | safety | All 3 safety layers passed. Routing proceeds.
PREFILTER_SQL_INJECTION | safety | SQL injection pattern detected. refusal: true.
SEMANTIC_JAILBREAK_ATTEMPT | safety | TF-IDF classifier matched JAILBREAK category. Blocked.
ANOMALY_DETECTED | safety | KL-divergence > 0.5. Distribution shift detected.
FAIL_CLOSED_TRIGGERED | safety | Classifier threw exception. Fail-closed: request blocked.
DEGRADE_TRANSITION | degradation | D-state changed. from/to included in context.
SAFE_FALLBACK_SERVED | fallback | D3 active or routing failed. SafeFallbackResponse returned.
ROUTING_POOL_EMPTY | 1 | No candidate skills provided. safety_clearance: escalate.
CACHE_HIT | 0 | LRU cache hit. Identical query+candidates routed from cache.
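A sketch of how such events might be collected during one routing call follows. The `TraceEvent` dataclass here is illustrative and is not the published TraceEvent schema; its field names, the `Tracer` helper, and the context keys are assumptions, while the event codes are taken from the table above.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    trace_id: str
    stage: str
    error_code: str
    latency_ms: float
    context: dict = field(default_factory=dict)


class Tracer:
    """Collects every event of one routing call under a single trace_id."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.events = []

    def emit(self, stage: str, code: str, started: float, **context) -> TraceEvent:
        latency_ms = (time.perf_counter() - started) * 1000.0
        event = TraceEvent(self.trace_id, stage, code, latency_ms, context)
        self.events.append(event)
        return event

    def to_json(self) -> str:
        return json.dumps([asdict(e) for e in self.events])


tracer = Tracer()
t0 = time.perf_counter()
tracer.emit("1", "STAGE1_LEXICAL_MATCH", t0)
# A Stage 2 timeout is recorded, never swallowed.
tracer.emit("2", "STAGE2_EMBEDDING_TIMEOUT", t0, fallback="ROUTING_FALLBACK_LEXICAL")
tracer.emit("3", "ROUTING_DECISION_FINAL", t0, skill_ref="greet")
print(tracer.to_json())
```

Sharing one trace_id across all events of a request is what allows a fallback in Stage 2 to be tied back to the final decision in Stage 3.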

Conformance

FOUR LEVELS,
ONE SUITE

Conformance is defined by passing official YAML test vectors — not self-attestation. Current status: ASP reference implementation 17/18 (94.4%).

Level | Routing vectors | Negative tests | Safety cases | Degrade drills | Adversarial | Badge
OSP Core | 20 | 5 | - | - | - | OSP Core Conformant
OSP Safety | 20 | 8 | 10 | - | 5 | OSP Safety Conformant
OSP Resilience | 20 | 10 | 10 | 15 | 5 | OSP Resilience Conformant
OSP Enterprise | 50 | 15 | 20 | 15 | 10 | OSP Enterprise Conformant
ASP reference impl · last run 2h ago · 17 passing · 1 failing (pydantic not in env) · Level claimed: OSP Resilience

Interactive Demo

CONFORMANCE
RUNNER

Run the OSP conformance test suite interactively. Watch 86+ test vectors execute across 6 YAML suites with real-time pass/fail results.

Routing · routing_vectors.yaml
Safety · safety_vectors.yaml
Resilience · degrade_vectors.yaml
Negative · negative_vectors.yaml
Registry · registry_vectors.yaml
Interop · interop_vectors.yaml

Supply Chain

TRUST REGISTRY
EXPLORER

Explore the append-only Transparency Log. Register skills, simulate revocations, and detect split-view attacks on the Merkle tree.

OSP-TL Root · root_hash: a7f3...c912 · TRUSTED
org.amadeq.financial-advice@2.1.0 · sha256: d4e5...f678 · ACTIVE
org.amadeq.greet@2.1.0 · sha256: b2c3...a901 · ACTIVE
org.amadeq.knowledge-general@1.0.0 · sha256: e5f6...b234 · ACTIVE
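How a root hash commits to the registered skills, and why two observers can detect a split view by comparing roots, can be shown with a simplified Merkle tree. This sketch is not the OSP-TL format: it borrows the leaf and node prefixes used by Certificate Transparency, promotes an odd node unchanged (a simplification of the tree shape), and hashes the skill reference string rather than a real log entry.

```python
import hashlib


def leaf_hash(entry: str) -> bytes:
    # The 0x00 prefix separates leaf hashes from interior node hashes.
    return hashlib.sha256(b"\x00" + entry.encode("utf-8")).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: list) -> bytes:
    """Root over an append-only list of entries (simplified tree shape)."""
    if not entries:
        return hashlib.sha256(b"").digest()
    level = [leaf_hash(e) for e in entries]
    while len(level) > 1:
        nxt = [
            node_hash(level[i], level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])  # odd node is promoted unchanged
        level = nxt
    return level[0]


def split_view(root_seen_by_a: bytes, root_seen_by_b: bytes) -> bool:
    """Two observers of the same log size must see the same root."""
    return root_seen_by_a != root_seen_by_b


log = [
    "org.amadeq.financial-advice@2.1.0",
    "org.amadeq.greet@2.1.0",
    "org.amadeq.knowledge-general@1.0.0",
]
honest = merkle_root(log)
# A log that shows one observer a different second entry.
forked = merkle_root([log[0], "org.amadeq.greet@2.1.1", log[2]])
print(honest.hex()[:8], split_view(honest, forked))
```

Because any change to any entry changes the root, observers only need to exchange 32 bytes to check that they were shown the same log.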

Full Specification

ERROR
TAXONOMY

40+ canonical failure codes across 8 categories. Every failure has a name, a code, and a mandatory response. Zero silent failures.


Unique to OSP

SEMANTIC
ANOMALY BRAKE

Detects adversarial embedding attacks at routing time. When lexical and semantic scores diverge beyond KL-divergence threshold — brake fires, semantic results discarded.


Production Reality

DELIVERY
CONTRACT

Turn-versioning, TTL semantics, and stale discard. OSP handles what other protocols ignore — user impatience and response freshness.

Query → Route → Generate → Deliver
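Turn-versioning and TTL can be pictured as two checks at delivery time. The sketch below is an interpretation of those terms, not the OSP delivery schema: the `Session` and `PendingResponse` types, the status strings, and the TTL values are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PendingResponse:
    turn: int          # conversation turn the response was generated for
    text: str
    created_at: float  # seconds, same clock as `now`
    ttl_s: float       # freshness budget in seconds


class Session:
    def __init__(self) -> None:
        self.current_turn = 0

    def new_query(self) -> int:
        """Each user query opens a new turn and supersedes older ones."""
        self.current_turn += 1
        return self.current_turn

    def deliver(
        self, resp: PendingResponse, now: Optional[float] = None
    ) -> Tuple[str, Optional[str]]:
        now = time.monotonic() if now is None else now
        if resp.turn != self.current_turn:
            return ("stale", None)      # the user has already moved on
        if now - resp.created_at > resp.ttl_s:
            return ("expired", None)    # too old to be useful
        return ("delivered", resp.text)


session = Session()
first = session.new_query()
slow = PendingResponse(turn=first, text="answer to Q1", created_at=0.0, ttl_s=5.0)
second = session.new_query()  # the user asked again before delivery
fresh = PendingResponse(turn=second, text="answer to Q2", created_at=1.0, ttl_s=5.0)

print(session.deliver(slow, now=2.0))
print(session.deliver(fresh, now=2.0))
print(session.deliver(fresh, now=9.0))
```

Checking the turn before the TTL means a superseded answer is discarded even if it arrives quickly, which is the stale-discard behaviour the section describes.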

Governance

LIFECYCLE
GATES

Formal skill promotion with STATIC gates (100% blockers) and BEHAVIORAL gates (statistical drift detection). No other protocol does this.

DEV (local) → STAGING (gates) → CANARY (5% traffic) → PROD (100%)
STATIC Jailbreak test suite
STATIC SQL injection vectors
STATIC Signature verification
BEHAVIORAL Personality drift
BEHAVIORAL Hallucination rate
BEHAVIORAL Edge-case adversarial
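The promotion rule can be sketched as a function over gate results. STATIC gates are treated as hard blockers, and BEHAVIORAL gates as drift measurements compared against a tolerance. The gate names come from the list above, while the `promote` function, the drift values, and the 0.05 tolerance are hypothetical.

```python
STAGES = ("DEV", "STAGING", "CANARY", "PROD")


def evaluate_gates(static_results: dict, behavioral_drift: dict,
                   max_drift: float = 0.05) -> list:
    """Return the gates that block promotion; an empty list means go."""
    blockers = [
        f"STATIC {name}"
        for name, passed in sorted(static_results.items())
        if not passed  # every static gate must pass
    ]
    blockers += [
        f"BEHAVIORAL {name}"
        for name, drift in sorted(behavioral_drift.items())
        if abs(drift) > max_drift  # drift relative to baseline (assumption)
    ]
    return blockers


def promote(stage: str, static_results: dict, behavioral_drift: dict) -> tuple:
    """Move one stage forward only when no gate blocks."""
    blockers = evaluate_gates(static_results, behavioral_drift)
    if blockers or stage == STAGES[-1]:
        return stage, blockers
    return STAGES[STAGES.index(stage) + 1], []


static = {
    "Jailbreak test suite": True,
    "SQL injection vectors": True,
    "Signature verification": True,
}
drift = {
    "Personality drift": 0.01,
    "Hallucination rate": 0.02,
    "Edge-case adversarial": 0.00,
}
print(promote("STAGING", static, drift))
print(promote("STAGING", dict(static, **{"Signature verification": False}), drift))
```

Returning the list of blocking gates alongside the stage makes a refused promotion explainable rather than a bare failure.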

Observability

LIVE
TRACE STREAM

Every routing decision emits a structured TraceEvent with trace_id, stage, error_code, and latency. Full audit trail — every decision traceable.


Strategic Vision

ADOPTION
ROADMAP

A realistic, phased plan — not vaporware. Standards are born from independent implementations that agree on contracts.

YOU ARE HERE
Phase 1
De-facto Standard
6–12 months
✓ Position Paper v0.4.0
✓ 9 JSON Schemas published
✓ 86+ conformance vectors
✓ Reference implementation (ASP)
○ 2–3 independent implementations
○ Community review cycle
Phase 2
Community Standard
12–24 months
○ OSP Working Group formed
○ Schema stabilization (v1.0)
○ Multi-vendor conformance
○ Backward compatibility guarantee
○ DID federation (Phase 2 trust)
Phase 3
Formal Standard
24+ months
○ W3C / OASIS submission
○ Multiple independent adoptions
○ Formal versioning + ISO track
○ Post-quantum algorithm migration
"Standards are not born from specifications. They are born from independent implementations that agree on contracts."

Live Production

ASP IN
PRODUCTION

Real-time metrics from the AMADEQ reference implementation. OSP proven at scale: 1,500+ avatars, 24/7 operation across multiple markets.

Uptime · Last 90 Days: 99.97% · 2 incidents · Total downtime: 26 min

Adopt OSP

FROM SPEC
TO BADGE

Estimated 4–8 hours for an experienced team to reach OSP Core conformance. No vendor lock-in. No API keys. Pure open standard.

01
Read osp-core-spec.md
Vendor-neutral specification. Zero implementation-specific references. Understand the 4-stage routing contract and D0–D3 automaton.
~15 min read
02
Implement Schemas
SkillManifest + RoutingDecision + TraceEvent + SafetyDecision. Any language. Any framework. Pydantic models available as reference.
~2–4h for core
03
Run Conformance
osp check — runs 20 routing vectors + 5 negative tests. Produces JSON report. Tests are YAML-defined, language-agnostic.
~1h for Core level
04
Submit & Display Badge
Open PR with conformance_report.json. Pass → join Adopters list + display your level badge in your README.
Community review 48h
OSP Adopters
1 independent · v0.3.x · growing
AMADEQ (ASP)
Python · sentence-transformers · Qdrant · RTX 5090
OSP Resilience · Reference Impl · 17/18
+
Your implementation here
osp check → open a PR →
+
Second independent impl
Needed for Working Group formation