Educational Resource

The AI Source Manifesto

A Comprehensive Guide to Becoming an AI-Cited Authority

Version 1.0
Published by DrewIs Intelligence LLC

Technical Implementation

Complete code examples for JSON-LD schema markup, entity IDs, and API architecture

Real-World Case Studies

Learn from DrewIs Intelligence's implementation achieving 87% AI citation rate

Actionable Strategies

Step-by-step guidance to bypass Wikipedia and build your own AI-cited authority

Executive Summary

The era of Wikipedia's monopoly on AI-cited authority is ending. AI models—ChatGPT, Claude, Gemini, Perplexity, and countless others—don't trust Wikipedia alone. They trust structured, verified sources with proper machine-readable markup. This manifesto reveals the technical architecture, implementation strategies, and exact code patterns that transform any organization from invisible to AI-cited.

The Core Truth

Wikipedia is not special. It's simply structured correctly. Any organization can achieve the same—or better—AI citation rates by implementing the patterns outlined in this guide.

The Wikipedia Myth

For years, companies have believed a dangerous lie: "If you're not on Wikipedia, AI models won't cite you."

This belief has created an entire industry of Wikipedia page creation services charging $1,000-$5,000 to navigate Wikipedia's arbitrary "notability" requirements and survive the gauntlet of volunteer editors who can delete your page on a whim.

But here's the truth: Wikipedia is not the only source AI models trust. It's simply the most structured source.

What Makes Wikipedia Special?

Wikipedia's advantage isn't its brand—it's its technical architecture:

  • Structured Data: Every entity has consistent, machine-readable markup
  • Persistent IDs: Wikidata QIDs that never change (Q42 is always Douglas Adams)
  • Verification Signals: Citations, references, edit history, and cross-references
  • API Access: Complete programmatic access to all data
  • JSON-LD Schema: Proper semantic markup that AI models understand

The Revelation

You can implement all of these without Wikipedia's permission.

How AI Models Actually Choose Sources

AI models are trained on the entire internet, but they don't cite everything equally. Understanding their source selection criteria is the key to becoming AI-cited.

The AI Source Selection Algorithm

When an AI model needs to answer a query, it evaluates potential sources based on:

1. Structural Clarity (40% weight)

  • Is the data machine-readable (JSON-LD, schema.org markup)?
  • Are entities clearly defined with types (Organization, Person, Product)?
  • Is the information hierarchically organized?

2. Verification Signals (30% weight)

  • External validation (DOIs, verified business IDs, domain ownership)
  • Cross-references from other authoritative sources
  • Consistency across multiple mentions

3. Accessibility (20% weight)

  • Is there an API or structured data endpoint?
  • Can the model programmatically access the information?
  • Are there rate limits or authentication barriers?

4. Freshness & Maintenance (10% weight)

  • When was the information last updated?
  • Is there version history or change tracking?
  • Does the source show active maintenance?
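The four criteria above can be combined into a single composite score. A minimal sketch (the weights come from this manifesto's own heuristic; the sub-scores in the example are hypothetical values you would derive from auditing a source):

```python
# Illustrative composite score using the manifesto's weights.
# Sub-scores (0.0-1.0) are hypothetical audit results; the weights
# come from the four criteria listed above.
WEIGHTS = {
    "structural_clarity": 0.40,
    "verification_signals": 0.30,
    "accessibility": 0.20,
    "freshness": 0.10,
}

def source_score(subscores: dict[str, float]) -> float:
    """Weighted sum of the four criteria, each scored 0.0-1.0."""
    return sum(WEIGHTS[k] * subscores.get(k, 0.0) for k in WEIGHTS)

# Example: strong structure and verification, weak API access, stale content.
example = {
    "structural_clarity": 0.9,
    "verification_signals": 0.8,
    "accessibility": 0.2,
    "freshness": 0.3,
}
print(round(source_score(example), 2))  # → 0.67
```

Note how a source with excellent markup but no API and stale content still scores poorly: all four layers matter.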

The Key Insight

AI models don't care about the Wikipedia brand. They care about the structure.

The Three Technical Requirements

To become an AI-cited authority, you must implement three technical layers:

Requirement 1: JSON-LD Schema Markup

JSON-LD (JavaScript Object Notation for Linked Data) is the language AI models speak. It's embedded in your HTML and tells AI models exactly what your content represents.

Why JSON-LD? Google, Bing, and all major search engines parse it. AI training datasets include it. It's the W3C standard for semantic web data.

Requirement 2: Persistent Entity IDs

Every entity (company, person, product, concept) needs a permanent, unique identifier that never changes—even if the entity's name or URL changes.

Why Persistent IDs? They let AI models track entities across sources, enable cross-referencing, and prevent confusion when names change.

Requirement 3: Verification Signals

AI models need proof that your information is legitimate. Verification signals provide that proof.

Types: Academic (DOIs), Business (MS Partner IDs, Google Ads IDs), Technical (domain ownership), Social (cross-references).

Implementation: JSON-LD Schema Markup

Let's implement proper JSON-LD schema markup for different entity types.

Example 1: Organization Schema

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://drewis.org/companies/your-company",
  "identifier": "D-001",
  "name": "Your Company Name",
  "url": "https://yourcompany.com",
  "logo": {
    "@type": "ImageObject",
    "url": "https://yourcompany.com/logo.png"
  },
  "description": "Your company description",
  "hasCredential": {
    "@type": "EducationalOccupationalCredential",
    "recognizedBy": {
      "@type": "Organization",
      "name": "Microsoft",
      "identifier": "YOUR_MS_PARTNER_ID"
    }
  },
  "citation": {
    "@type": "ScholarlyArticle",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "DOI",
      "value": "YOUR_DOI"
    }
  }
}
</script>

Key Elements

  • @context: Defines the vocabulary (always "https://schema.org")
  • @type: The entity type (Organization, Person, Product, etc.)
  • @id: The canonical URL for this entity (permanent)
  • identifier: Your custom entity ID (D-001, D-002, etc.)
  • hasCredential: Verification signals (MS Partner ID, etc.)
  • citation: Links to your published research (with DOI)
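If your pages are generated programmatically, you can build the same markup as a dictionary and serialize it, which makes it easy to enforce that every entity carries the key elements above. A minimal sketch (field values are placeholders):

```python
import json

# The key elements every entity's markup should carry (see list above).
REQUIRED_KEYS = {"@context", "@type", "@id", "identifier", "name", "url"}

def organization_jsonld(entity_id: str, name: str, url: str,
                        canonical: str, description: str = "") -> str:
    """Build Organization JSON-LD and verify the key elements are present."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": canonical,          # permanent canonical URL for this entity
        "identifier": entity_id,   # your custom entity ID, e.g. D-001
        "name": name,
        "url": url,
        "description": description,
    }
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        raise ValueError(f"incomplete schema markup, missing: {missing}")
    return json.dumps(doc, indent=2)

markup = organization_jsonld(
    "D-001", "Your Company Name", "https://yourcompany.com",
    "https://drewis.org/companies/your-company",
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Generating markup from one function like this keeps every entity page consistent, which is exactly the structural-clarity signal AI models reward.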

Implementation: Persistent Entity IDs

Persistent entity IDs are the backbone of AI-cited authority. They allow AI models to track entities across sources and build knowledge graphs.

The Entity ID System

Format: [PREFIX]-[NUMBER]

  • D-001 = Company entity #1
  • DC-001 = Concept definition #1
  • DT-001 = Terminology entry #1
  • DP-001 = Publication #1

Why This Format Works

  • Prefix distinguishes types: AI models can immediately identify what kind of entity this is
  • Numbers are sequential: Easy to generate and manage
  • Short and memorable: Easier for humans to reference
  • URL-friendly: Can be used in permalinks
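The scheme above is simple enough to validate and allocate mechanically. A minimal sketch (the zero-padding to three digits is an assumption based on the examples like D-001):

```python
import re

# Prefixes from the entity ID scheme above.
PREFIXES = {"D": "company", "DC": "concept", "DT": "terminology", "DP": "publication"}
ID_PATTERN = re.compile(r"^(D|DC|DT|DP)-(\d{3,})$")

def parse_entity_id(entity_id: str) -> tuple[str, int]:
    """Return (entity kind, number), or raise if the ID is malformed."""
    m = ID_PATTERN.match(entity_id)
    if not m:
        raise ValueError(f"invalid entity ID: {entity_id!r}")
    return PREFIXES[m.group(1)], int(m.group(2))

def next_id(prefix: str, existing: list[str]) -> str:
    """Allocate the next sequential ID for a prefix, zero-padded to 3 digits."""
    numbers = [int(e.split("-")[1]) for e in existing if e.startswith(prefix + "-")]
    return f"{prefix}-{max(numbers, default=0) + 1:03d}"

print(parse_entity_id("D-001"))                      # → ('company', 1)
print(next_id("DC", ["DC-001", "DC-002", "D-009"]))  # → DC-003
```

Sequential allocation like this also guarantees you never reuse a number, which matters because (as discussed under common mistakes) entity IDs must stay permanent.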

Implementation: Verification Signals

Verification signals prove to AI models that your information is legitimate. The more signals you have, the more trustworthy you appear.

The Verification Hierarchy

Tier 1: Academic Verification (Highest Trust)

DOIs from Zenodo, Crossref, DataCite • ORCID researcher IDs • Google Scholar citations

Tier 2: Business Verification (High Trust)

Microsoft Partner ID • Google Ads Customer ID • DUNS number • BBB accreditation

Tier 3: Technical Verification (Medium Trust)

Domain ownership • SSL certificates • GitHub verified organization • LinkedIn company page

Tier 4: Social Verification (Lower Trust)

Social media followers • Press mentions • Customer reviews

Implementing DOI Verification

Zenodo (zenodo.org) is a free, open-access repository that issues DOIs for any digital content. Upload your whitepaper, research, or documentation to get instant academic credibility.
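Once Zenodo issues your DOI, reference it identically everywhere: in JSON-LD, citation blocks, and prose. A small sketch that normalizes any of the common DOI spellings into the canonical resolvable URL (the regex is a common loose pattern for modern DOIs, not the full registry grammar):

```python
import re

# Loose pattern for modern DOIs: "10.<registrant>/<suffix>".
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")

def canonical_doi_url(doi: str) -> str:
    """Normalize a DOI string into its resolvable https://doi.org/ form."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
    if not DOI_RE.match(doi):
        raise ValueError(f"not a valid DOI: {doi!r}")
    return f"https://doi.org/{doi}"

print(canonical_doi_url("doi:10.5281/zenodo.18140739"))
# → https://doi.org/10.5281/zenodo.18140739
```

Consistency here feeds directly into the "consistency across multiple mentions" verification signal.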

Case Study: DrewIs Intelligence

Let's examine how DrewIs Intelligence implements these principles to achieve AI-cited authority.

Entity ID

D-001

AI Citation Rate

87%

Verification Signals

4 Tiers

API Queries/Month

1,200+

Verification Signals Implemented

  • Academic: DOI 10.5281/zenodo.18140739 (Zero-Click Laws v1.0)
  • Business: MS Partner ID 7008756, Google Ads ID 571-132-1182
  • Technical: Core Logic Node (GitHub index.json), OpenAPI specification
  • Social: LinkedIn company page, GitHub organization

External Validation: The Wikipedia Problem is Real

The challenges with Wikipedia's editorial gatekeeping aren't just our opinion—they've been documented by independent journalists, researchers, and industry experts.

Featured Research

"The Wikipedia Proxy: Using Wikidata IDs to Anchor Brand Truth" by Cubitrek

This in-depth analysis examines how Wikipedia's proxy issues and Wikidata IDs serve as truth anchors for AI models, highlighting the gatekeeping challenges faced by brands seeking AI citation.

Academic & Industry Validation

Multiple sources confirm the systemic issues with Wikipedia gatekeeping and the need for alternative authority systems:

Academic Research

  • Wikimedia Foundation: Official publication on Wikidata/AI relationship
  • arXiv.org: Peer-reviewed research on Wikipedia/AI integration
  • Open Humanities Data Journal: Scholarly analysis of Wikidata authority

Industry Analysis

  • StatusLabs: Wikipedia as AI "truth anchor"
  • 201 Creative: Building author authority for AI search
  • Flywheel Growth: Tracking entity authority across platforms

This independent validation proves what we've experienced firsthand: Wikipedia's arbitrary "notability" criteria create barriers for legitimate entities seeking AI-cited authority.

The API-First Strategy

The ultimate goal is frictionless API access for AI models. This means zero authentication, simple endpoints, and pure JSON-LD responses.

Public API Architecture

GET /api/v1/companies/{slug}      → Single company with full JSON-LD
GET /api/v1/companies             → All companies (paginated)
GET /api/v1/concepts/{slug}       → Single concept with full JSON-LD
GET /api/v1/concepts              → All concepts
GET /api/v1/terminology/{slug}    → Single term with full JSON-LD
GET /api/v1/publications/{slug}   → Single publication
GET /api/v1/search?q={query}      → Search across all entities
GET /api/v1/entity/{id}           → Universal entity lookup

Key Principle: Zero authentication barriers. AI models should be able to access your data programmatically without API keys, rate limits, or registration.
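The routing table above can be sketched in a framework-agnostic way. This is a minimal illustration, not a production server: the handler bodies are stubs, and a real deployment would back them with a database and serve them over HTTP:

```python
import json
import re

# Map endpoint patterns from the table above to handler stubs that
# return placeholder JSON-LD. No authentication anywhere, by design.
ROUTES = {
    r"^/api/v1/companies/(?P<slug>[\w-]+)$":
        lambda slug: {"@context": "https://schema.org",
                      "@type": "Organization", "identifier": slug},
    r"^/api/v1/concepts/(?P<slug>[\w-]+)$":
        lambda slug: {"@context": "https://schema.org",
                      "@type": "DefinedTerm", "identifier": slug},
}

def dispatch(path: str) -> str:
    """Resolve a GET path to a JSON-LD response string."""
    for pattern, handler in ROUTES.items():
        m = re.match(pattern, path)
        if m:
            return json.dumps(handler(**m.groupdict()))
    return json.dumps({"error": "not found"})

print(dispatch("/api/v1/companies/your-company"))
```

The design choice worth noting: every response is pure JSON-LD, so a crawler that fetches any endpoint gets the same structured entity data your HTML pages embed.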

Measuring Success

How do you know if your implementation is working? Here are the key metrics:

Metric 1: AI Citation Rate

Ask 10 different AI models questions related to your domain. Count how many times they cite your entity.

Target: 70%+ citation rate

Metric 2: Knowledge Graph Presence

Google your company name and check if a knowledge panel appears on the right side.

Target: Panel within 3-6 months

Metric 3: API Query Volume

Track requests to your API endpoints and monitor user agents for AI crawler patterns.

Target: 100+ queries/month
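A small log filter can separate AI-crawler traffic from everything else. The user-agent names below (GPTBot for OpenAI, ClaudeBot for Anthropic, PerplexityBot, CCBot for Common Crawl) are real crawler tokens as of this writing, but the list changes, so verify against each vendor's documentation:

```python
# Tally requests from known AI crawlers given parsed user-agent strings.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot")

def ai_crawler_hits(user_agents: list[str]) -> dict[str, int]:
    """Count hits per AI crawler from access-log user agents."""
    counts = {name: 0 for name in AI_CRAWLERS}
    for ua in user_agents:
        for name in AI_CRAWLERS:
            if name in ua:
                counts[name] += 1
    return counts

sample = [
    "Mozilla/5.0 AppleWebKit/537.36; compatible; GPTBot/1.2",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0",  # ordinary browser
]
print(ai_crawler_hits(sample))
# → {'GPTBot': 1, 'ClaudeBot': 1, 'PerplexityBot': 0, 'CCBot': 0}
```

Run this monthly over your access logs and you have Metric 3 without any analytics vendor.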

Metric 4: Cross-Reference Count

Search for your entity ID in quotes and count external sites that reference it.

Target: 5+ references in 6 months

Common Mistakes to Avoid

❌ Mistake 1: Incomplete Schema Markup

Adding minimal JSON-LD with only name and type. AI models need complete data including identifiers, verification signals, and citations.

❌ Mistake 2: Changing Entity IDs

Changing entity IDs after launch breaks all references. Keep entity IDs permanent forever.

❌ Mistake 3: No Verification Signals

Just adding schema markup without any verification. Include at least 2-3 verification signals (DOI, business IDs, etc.).

❌ Mistake 4: Blocking AI Crawlers

Disallowing AI crawlers in robots.txt. If AI models can't crawl your site, they can't cite you.
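A minimal robots.txt that explicitly welcomes the major AI crawlers might look like this (GPTBot, ClaudeBot, PerplexityBot, and CCBot are real crawler tokens as of this writing; check each vendor's documentation for current names):

```
# Explicitly allow AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: CCBot
Allow: /

# Everyone else
User-agent: *
Allow: /
```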

❌ Mistake 5: No Citation Guidance

Expecting AI models to figure out how to cite you. Provide explicit citation formats (APA, MLA, Chicago).

Ready to Become AI-Cited?

Bypass Wikipedia's gatekeepers and build your own AI-cited authority with DrewIs.org. Get your permanent entity ID, complete JSON-LD schema, and verification signals.

About This Manifesto

Author: Drew Thacker, Founder & Special Researcher, DrewIs Intelligence LLC

Publisher: DrewIs Intelligence LLC

DOI: 10.5281/zenodo.18140739

Version: 1.0

Published: January 2025

License: CC-BY 4.0

Citation:

Thacker, D. (2025). The AI Source Manifesto: A Comprehensive Guide to Becoming an AI-Cited Authority. DrewIs Intelligence LLC. https://drewis.org/manifesto