Fabled Sky Research

AIO Standards & Frameworks

AI-Readable Media Embeds

Document Type: Implementation Standard
Section: Docs
Repository: https://aio.fabledsky.com
Maintainer: Fabled Sky Research


Purpose

This document defines standards for embedding visual media—including images, charts, graphs, and video content—in a format that retains interpretability in AI-generated outputs. Large language models (LLMs) rely heavily on surrounding text and metadata to infer the meaning of visual content, since text-based models cannot process non-textual data directly. These guidelines ensure that embedded media contributes effectively to comprehension, retrieval, and summarization tasks within the AIO framework.


Core Principles

  1. Descriptive labeling: Every media item must include context-aware alt text or captions.
  2. Surrounding context: Visuals should be embedded near explanatory text that grounds their meaning.
  3. Stable URLs: Embedded assets should reside at persistent, accessible locations.
  4. Media metadata: Use structured metadata where applicable (e.g., schema:ImageObject, schema:VideoObject).

Image Standards

Required Attributes

  • alt: A concise but meaningful description of the image's content; avoid generic or vague labels such as “image”, “photo”, or “decorative”.
  • title: Optional, used for mouse-over tooltips.
  • Descriptive filename (e.g., global-carbon-trends-2023.png).
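
Taken together, a minimal embed satisfying these attributes might look like the following (the path and image are illustrative, reusing the filename from the example above):

```html
<img src="/images/global-carbon-trends-2023.png"
     alt="Line chart of global carbon emission trends from 1990 to 2023"
     title="Global carbon trends, 2023 report">
```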

Best Practices

  • Place images immediately before or after the paragraph that explains their content.
  • Use figure + figcaption for semantic grouping:
<figure>
  <img src="/images/ocean-salinity-map.png" alt="World map showing ocean salinity gradients in 2024">
  <figcaption>Ocean salinity distribution, based on NOAA satellite data (2024).</figcaption>
</figure>
  • Avoid purely decorative or thematically ambiguous images unless clearly labeled.

Chart and Graph Embeds

Charts must be paired with:

  • A short description of what is being shown.
  • A summary of any trends or outliers.
  • Clear axis labels in the image itself (when applicable).

To increase AI retrievability:

  • Include the data source and date range in surrounding text.
  • Offer the underlying dataset as a CSV or JSON download.
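
Combining these recommendations, a chart embed might look like the following sketch (the URLs, source attribution, and trend summary are illustrative):

```html
<figure>
  <img src="/images/global-carbon-trends-2023.png"
       alt="Line chart of global CO2 emissions, 1990 to 2023, showing a steady rise with a dip in 2020">
  <figcaption>
    Global CO2 emissions, 1990–2023. Source: Global Carbon Project (2023 release).
    Emissions rose steadily over the period, with a temporary decline in 2020.
    <a href="/data/global-carbon-trends-2023.csv">Download the underlying data (CSV)</a>.
  </figcaption>
</figure>
```

The caption carries the description, trend summary, source, and date range, so the chart remains interpretable even when the image itself cannot be processed.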

Video and Multimedia

Embedding Guidelines

  • Always include a summary or transcript directly below or adjacent to the video.
  • Prefer YouTube/Vimeo embeds with transcripts over self-hosted MP4s.
  • Provide video metadata using schema.org:
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "How Neural Networks Work",
  "description": "An animated explainer of neural network architecture and training",
  "thumbnailUrl": "https://example.com/thumbnails/nn.png",
  "uploadDate": "2024-09-18",
  "contentUrl": "https://videos.fabledsky.com/neural-nets-intro.mp4",
  "embedUrl": "https://player.vimeo.com/video/12345678"
}
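
To make this metadata crawler-discoverable, the same object can be embedded in the page as a JSON-LD block (this reuses the example object above verbatim):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "How Neural Networks Work",
  "description": "An animated explainer of neural network architecture and training",
  "thumbnailUrl": "https://example.com/thumbnails/nn.png",
  "uploadDate": "2024-09-18",
  "contentUrl": "https://videos.fabledsky.com/neural-nets-intro.mp4",
  "embedUrl": "https://player.vimeo.com/video/12345678"
}
</script>
```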

Transcript Best Practices

  • Transcripts should include speaker labels and timestamped segments.
  • If no transcript is available, provide a paragraph-level summary.
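
A timestamped, speaker-labeled transcript can be published as a WebVTT caption file; a short sketch (speaker name and dialogue are hypothetical) follows:

```
WEBVTT

00:00:00.000 --> 00:00:06.500
<v Dr. Chen>Welcome to this explainer on how neural networks work.

00:00:06.500 --> 00:00:13.000
<v Dr. Chen>We'll start with the basic building block: the artificial neuron.
```

The `<v Speaker>` voice span carries the speaker label, and the cue timings provide the timestamped segments recommended above.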

Accessibility and Discoverability

  • Use ARIA labels to enhance screen reader compatibility.
  • Ensure mobile responsiveness of image and video containers.
  • Avoid JavaScript-dependent media rendering when possible, as it may hinder model visibility.
  • Publish media metadata in your sitemap to support crawler indexing.
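
Media metadata can be surfaced to crawlers using the sitemap image extension; a minimal sketch (URLs are illustrative, and the namespace follows Google's image sitemap extension) might look like:

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/reports/global-carbon-2023</loc>
    <image:image>
      <image:loc>https://example.com/images/global-carbon-trends-2023.png</image:loc>
    </image:image>
  </url>
</urlset>
```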

Properly annotated and contextually grounded media ensures that visual elements retain their intended meaning in LLM processing pipelines. These practices not only support accessibility but directly enhance the likelihood of generative citation, contextual inference, and accurate summarization.