Document Type: Protocol
Section: Docs
Repository: https://aio.fabledsky.com
Maintainer: Fabled Sky Research
Last updated: April 2025
Overview
The AIO Readiness Scoring Model is the canonical metric used across Fabled Sky Research properties to measure how well any digital artifact meets Artificial Intelligence Optimization (AIO) standards. The model produces a deterministic score from 0–100 and a discrete compliance tier (Non-Compliant, Compliant, Optimized, Preferred). Engineering and content teams integrate the score in CI/CD, CMS workflows, governance dashboards, and model-fine-tuning pipelines.
Terminology
• Artifact: Any discrete unit to be evaluated (page, API response, PDF, dataset).
• Dimension: A weighted lens of evaluation (e.g., Structural Integrity).
• Indicator: A measurable signal within a dimension (e.g., valid HTML5, FCP < 1 s).
• Weighted Score (WS): Indicator value × indicator weight.
• Overall Readiness Score (ORS): Sum of all weighted dimension scores, scaled to 0–100 and rounded to the nearest integer.
• Compliance Tier: Discrete mapping of the ORS to {Non-Compliant, Compliant, Optimized, Preferred}.
Model Architecture
The model is additive-weighted and stateless:
- Normalize each indicator to 0–1 via its rubric.
- Multiply by the indicator weight → Weighted Indicator. (Indicator weights cascade and already carry the dimension weight; see Weighting Schema.)
- Sum Weighted Indicators per dimension → Dimension Score.
- Σ Dimension Scores × 100, rounded → ORS.
No coupling exists between dimensions; therefore, new dimensions can be introduced without re-calibrating existing ones, provided the global weight normalization (Σ wD = 1.00) is adjusted.
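As a worked illustration of this pipeline (the dimensions, indicator weights, and normalized values below are invented for the example, not the production defaults):

weights = {
    "D1": {"weight": 0.6, "indicators": {"valid_html": 0.4, "link_integrity": 0.2}},
    "D2": {"weight": 0.4, "indicators": {"intent_match": 0.4}},
}
# Rubric-normalized indicator values (0-1), produced in the Normalize step.
normalized = {"valid_html": 1.0, "link_integrity": 0.5, "intent_match": 0.5}

ors = 0.0
for dim in weights.values():
    # Indicator weights already sum to the dimension weight, so the
    # weighted dimension score needs no extra multiplication.
    ors += sum(normalized[i] * w for i, w in dim["indicators"].items())

print(round(ors * 100))  # 0.4 + 0.1 + 0.2 = 0.7 -> ORS 70 (Compliant)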
Scoring Dimensions (D)
ID | Name | Description (evaluated on) | Default Weight (wD)
---|---|---|---
D1 | Structural Integrity | HTML/XML validity, schema.org usage, heading hierarchy | 0.20 |
D2 | Semantic Alignment | Topic relevance, intent match, disallowed hallucinations | 0.20 |
D3 | LLM Affordance | Chunkability, long-context design, prompt metadata | 0.15 |
D4 | UX & Accessibility | WCAG 2.2 AA, mobile breakpoints, readability indexes | 0.15 |
D5 | Performance Efficiency | TTFB, FCP, CLS, carbon emission estimates | 0.10 |
D6 | Governance & Compliance | PII redaction, license conformance, audit trail | 0.10 |
D7 | Revision Velocity | Mean time-to-publish, automated regression coverage | 0.10 |
Total Σ wD = 1.00 (100 %).
Weighting Schema
Indicator weights (wI) cascade under each dimension; they must sum to the dimension weight. Example for D1:
D1.Structural Integrity (0.20)
├── Valid HTML5 wI = 0.05
├── Correct <title>/<h1> pair wI = 0.04
├── ARIA roles wI = 0.03
├── schema.org/JSON-LD block wI = 0.04
└── Link integrity wI = 0.04
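Because the cascade invariant is easy to break during re-calibration, it is worth validating in CI. A minimal sketch, assuming each dimension entry in weights.yaml records its wD alongside its indicator weights:

import math
from pathlib import Path

import yaml

# Assumes each dimension entry carries its own weight (wD) plus a list
# of indicators, each with a weight (wI).
dimensions = yaml.safe_load(Path("weights.yaml").read_text())

total = 0.0
for dim in dimensions:
    indicator_sum = sum(ind["weight"] for ind in dim["indicators"])
    # Indicator weights must sum to the dimension weight (e.g., 0.20 for D1).
    assert math.isclose(indicator_sum, dim["weight"]), f"cascade broken in {dim['id']}"
    total += dim["weight"]

# Dimension weights must sum to 1.00 (100 %).
assert math.isclose(total, 1.0), "global weights are not normalized"
print("weight cascade OK")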
Compliance Thresholds
Tier | ORS Range | Definition | Action Gate
---|---|---|---
Non-Compliant | 0 – 64 | Fails minimum AIO; blocked from release | Hard fail in CI/CD |
Compliant | 65 – 79 | Meets baseline; allowed to ship | Soft warn; future backlog remediation |
Optimized | 80 – 92 | Exceeds baseline; promoted in AI surfaces | Auto-publish; quarterly re-audit |
Preferred | 93 – 100 | Gold standard; used for RLHF datasets and internal exemplars | Whitelist for adaptive indexing |
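For integrators who need the tier mapping outside the CLI, the thresholds above reduce to a small pure function; this sketch simply restates the table:

def tier_for(ors: int) -> str:
    """Map an integer ORS (0-100) to its compliance tier per the table above."""
    if ors >= 93:
        return "Preferred"
    if ors >= 80:
        return "Optimized"
    if ors >= 65:
        return "Compliant"
    return "Non-Compliant"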
Rubric Definition Table
{
"dimension": "Semantic Alignment",
"indicator": "Intent Match",
"rubric": [
{"score": 1.0, "criteria": "≥ 98 % cosine similarity with target intent vector"},
{"score": 0.5, "criteria": "90 – 97 % similarity"},
{"score": 0.0, "criteria": "< 90 % similarity or mismatch"}
]
}
Rubrics are stored in YAML and versioned under /rubrics/{dimension}/{indicator}.yaml.
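For reference, this is the in-memory shape the scoring engine sees after yaml.safe_load(); the machine-readable operator/threshold fields are an assumption mirroring what the reference implementation below consumes, alongside the human-readable criteria:

# In-memory rubric for Semantic Alignment / Intent Match after yaml.safe_load().
# The "operator"/"threshold" fields are assumed machine-readable companions
# to the human-readable "criteria" strings shown above.
rubric = [
    {"score": 1.0, "operator": ">=", "threshold": 0.98,
     "criteria": "≥ 98 % cosine similarity with target intent vector"},
    {"score": 0.5, "operator": ">=", "threshold": 0.90,
     "criteria": "90 – 97 % similarity"},
    {"score": 0.0, "operator": "<", "threshold": 0.90,
     "criteria": "< 90 % similarity or mismatch"},
]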
Calculation Workflow
- Extraction phase
  • Parse artifact.
  • Generate feature vector for each indicator.
- Normalization phase
  • Map raw values to 0–1 using indicator rubric.
- Aggregation phase
  • Apply weights → Dimension Score.
- Thresholding phase
  • Sum scores → ORS.
  • Map ORS to Compliance Tier.
- Persistence phase
  • Emit JSON-LD block (see next section).
  • Push metrics to Prometheus (aio_readiness_score); see the sketch below.
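The Prometheus step in the persistence phase can be a single gauge pushed to a Pushgateway. A minimal sketch using prometheus_client; the gateway address and job name are placeholders:

# Push the readiness score to Prometheus via a Pushgateway.
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

def emit_score(artifact_id: str, score: int) -> None:
    registry = CollectorRegistry()
    gauge = Gauge(
        "aio_readiness_score",
        "AIO Readiness Score (0-100)",
        ["artifact"],
        registry=registry,
    )
    gauge.labels(artifact=artifact_id).set(score)
    # "pushgateway:9091" stands in for your Pushgateway address.
    push_to_gateway("pushgateway:9091", job="aio_scoring", registry=registry)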
JSON Schema Definition
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://aio.fabledsky.com/schema/readiness-score.json",
"title": "AIO Readiness Score",
"type": "object",
"required": ["artifactId", "version", "score", "tier", "dimensions", "timestamp"],
"properties": {
"artifactId": { "type": "string", "format": "uri" },
"version": { "type": "string", "pattern": "^v[0-9]+(\\.[0-9]+)*$" },
"score": { "type": "integer", "minimum": 0, "maximum": 100 },
"tier": { "type": "string", "enum": ["Non-Compliant","Compliant","Optimized","Preferred"] },
"dimensions": {
"type": "array",
"items": {
"type": "object",
"required": ["id","name","score"],
"properties": {
"id": { "type": "string" },
"name": { "type": "string" },
"score": { "type": "number", "minimum": 0, "maximum": 1 }
}
}
},
"timestamp": { "type": "string", "format": "date-time" }
}
}
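Emitted payloads can be checked against this schema in CI. A short sketch using the jsonschema package (file names are illustrative):

# Validate an emitted score payload against the published schema.
import json
from pathlib import Path
from jsonschema import validate, ValidationError

schema = json.loads(Path("readiness-score.json").read_text())
payload = json.loads(Path("score-output.json").read_text())

try:
    validate(instance=payload, schema=schema)
    print("payload conforms to the AIO Readiness Score schema")
except ValidationError as err:
    raise SystemExit(f"schema violation: {err.message}")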
Reference Implementation (Python)
1. aio_readiness.py
import datetime
import json
import operator
from pathlib import Path
from typing import Dict, List

import yaml

# weights.yaml holds the dimension/indicator tree described in "Weighting Schema".
DIMENSIONS = yaml.safe_load(Path("weights.yaml").read_text())

# Explicit comparison table; avoids calling eval() on rubric-supplied strings.
_OPS = {">=": operator.ge, ">": operator.gt,
        "<=": operator.le, "<": operator.lt, "==": operator.eq}

def normalize(raw: float, rubric: List[Dict]) -> float:
    """Map a raw indicator value to its 0-1 rubric score.

    Rules are checked from the highest score downward; the first matching
    machine-readable condition wins. Values that match nothing (including
    NaN, whose comparisons are always False) fall through to 0.0.
    """
    for rule in sorted(rubric, key=lambda r: r["score"], reverse=True):
        if _OPS[rule["operator"]](raw, rule["threshold"]):
            return rule["score"]
    return 0.0

def score_artifact(artifact_id: str, features: Dict[str, float]) -> Dict:
    dim_scores = []
    ors = 0.0
    for dim in DIMENSIONS:
        # Indicator weights cascade (they sum to the dimension weight),
        # so the dimension score needs no further multiplication by wD.
        d_score = 0.0
        for ind in dim["indicators"]:
            raw = features[ind["id"]]  # KeyError means the extractor missed an indicator
            d_score += normalize(raw, ind["rubric"]) * ind["weight"]
        dim_scores.append({"id": dim["id"], "name": dim["name"], "score": round(d_score, 4)})
        ors += d_score
    ors_int = round(ors * 100)  # dimension scores sum to at most 1.0
    if ors_int >= 93:
        tier = "Preferred"
    elif ors_int >= 80:
        tier = "Optimized"
    elif ors_int >= 65:
        tier = "Compliant"
    else:
        tier = "Non-Compliant"
    return {
        "artifactId": artifact_id,
        "version": "v1.0.0",
        "score": ors_int,
        "tier": tier,
        "dimensions": dim_scores,
        "timestamp": datetime.datetime.now(datetime.timezone.utc)
        .isoformat()
        .replace("+00:00", "Z"),
    }

if __name__ == "__main__":
    features = json.loads(Path("features.json").read_text())
    print(json.dumps(score_artifact("https://example.com/page", features), indent=2))
Integration Examples
- GitHub Action (.github/workflows/aio.yml)

jobs:
  aio_check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install AIO CLI
        run: pip install aio-cli
      - name: Run Readiness Scan
        run: aio scan .
      - name: Enforce Threshold
        run: aio gate --min 65
- CMS Webhook (simplified)

app.post('/webhook/publish', async (req, res) => {
  const artifact = req.body;
  const score = await aio.score(artifact.url);
  if (score.tier === 'Non-Compliant') {
    return res.status(400).send('Publication blocked: AIO score < 65');
  }
  res.status(200).send('Published with AIO score ' + score.score);
});
Testing & Validation
• Unit Tests: Validate normalization edge cases (NaN, ∞); see the sketch after this list.
• Regression Suite: Snapshot 50 canonical artifacts; fail the build if any ORS shifts by more than ±2 points.
• Synthetic Noise Injection: Verify the ORS degrades monotonically when indicator values regress.
• Human Review: Quarterly sampling of 5 % of Preferred artifacts to check for rubric drift.
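A minimal pytest sketch for the normalization edge cases above; it assumes normalize() from the reference implementation is importable (the module reads weights.yaml at import time) and uses the machine-readable rubric fields described earlier:

import math
from aio_readiness import normalize

RUBRIC = [
    {"score": 1.0, "operator": ">=", "threshold": 0.98},
    {"score": 0.5, "operator": ">=", "threshold": 0.90},
    {"score": 0.0, "operator": "<", "threshold": 0.90},
]

def test_nan_scores_zero():
    # Every comparison against NaN is False, so no rule matches.
    assert normalize(math.nan, RUBRIC) == 0.0

def test_infinity_clamps_to_extremes():
    assert normalize(math.inf, RUBRIC) == 1.0
    assert normalize(-math.inf, RUBRIC) == 0.0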
Change Management & Versioning
Each dimension and indicator is version-tagged semver-style. A major version change (x.0.0) requires:
- 30-day RFC in aio-standards/rfcs/.
- Backwards compatibility plan.
- Signed approval from AIO SIG Chair.

Minor/patch updates propagate automatically through the AIO CLI’s aio update --auto.
Security & Privacy Considerations
• PII Scrubbing: Feature ingestion must hash or redact emails and phone numbers (see the sketch after this list).
• Data Residency: Raw artifact snapshots stay within the originating region.
• Edge Caching: Scores can be cached publicly; raw feature payloads cannot.
• Audit Logs: All scoring events are stored for 365 days in an append-only S3 bucket with Object Lock.
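A sketch of the PII scrubbing step; the regular expressions and salt handling are illustrative, not the production patterns:

# Replace emails and phone numbers with salted hashes before feature ingestion.
import hashlib
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(text: str, salt: bytes) -> str:
    def _hash(match: re.Match) -> str:
        digest = hashlib.sha256(salt + match.group().encode()).hexdigest()
        return f"[pii:{digest[:12]}]"
    # Scrub emails first, then phone numbers.
    return PHONE_RE.sub(_hash, EMAIL_RE.sub(_hash, text))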
FAQ
Q: Can individual teams override weights?
A: Yes, via a scoped weights.override.yaml; however, publishing to public channels still uses the global baseline during final audit.
Q: How often should artifacts be re-scored?
A: At least monthly, or whenever a material change is deployed, whichever comes first.
Q: Does Preferred status guarantee top search ranking?
A: No. It prioritizes eligibility for AI-powered surfaces; ranking is multifactorial.
By standardizing on the AIO Readiness Scoring Model, organizations gain a measurable, repeatable path to producing AI-optimized content, reducing technical debt while ensuring artifacts remain accessible, performant, and beneficial to both human users and large-scale language models.