Executive Summary
The era of Wikipedia's monopoly on AI-cited authority is ending. AI models—ChatGPT, Claude, Gemini, Perplexity, and countless others—don't trust Wikipedia alone. They trust structured, verified sources with proper machine-readable markup. This manifesto reveals the technical architecture, implementation strategies, and exact code patterns that transform any organization from invisible to AI-cited.
The Core Truth
Wikipedia is not special. It's simply structured correctly. Any organization can achieve the same—or better—AI citation rates by implementing the patterns outlined in this guide.
The Wikipedia Myth
For years, companies have believed a dangerous lie: "If you're not on Wikipedia, AI models won't cite you."
This belief has created an entire industry of Wikipedia page creation services charging $1,000-$5,000 to navigate Wikipedia's arbitrary "notability" requirements and survive the gauntlet of volunteer editors who can delete your page on a whim.
But here's the truth: Wikipedia is not the only source AI models trust. It's simply the most structured source.
What Makes Wikipedia Special?
Wikipedia's advantage isn't its brand—it's its technical architecture:
- Structured Data: Every entity has consistent, machine-readable markup
- Persistent IDs: Wikidata QIDs that never change (Q42 is always Douglas Adams)
- Verification Signals: Citations, references, edit history, and cross-references
- API Access: Complete programmatic access to all data
- JSON-LD Schema: Proper semantic markup that AI models understand
The Revelation
You can implement all of these without Wikipedia's permission.
How AI Models Actually Choose Sources
AI models are trained on vast portions of the internet, but they don't cite everything equally. Understanding their source selection criteria is the key to becoming AI-cited.
The AI Source Selection Algorithm
When an AI model needs to answer a query, it evaluates potential sources based on:
1. Structural Clarity (40% weight)
- Is the data machine-readable (JSON-LD, schema.org markup)?
- Are entities clearly defined with types (Organization, Person, Product)?
- Is the information hierarchically organized?
2. Verification Signals (30% weight)
- External validation (DOIs, verified business IDs, domain ownership)
- Cross-references from other authoritative sources
- Consistency across multiple mentions
3. Accessibility (20% weight)
- Is there an API or structured data endpoint?
- Can the model programmatically access the information?
- Are there rate limits or authentication barriers?
4. Freshness & Maintenance (10% weight)
- When was the information last updated?
- Is there version history or change tracking?
- Does the source show active maintenance?
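The weighted criteria above can be sketched as a simple scoring function. This is a minimal illustration, assuming the weights and 0.0-1.0 criterion scores are taken at face value from the list; the function and field names are invented for the example and do not describe any model's documented internals:

```python
# Illustrative sketch of the weighted source-selection scoring described above.
# The weights are the manifesto's estimates, not a documented algorithm
# inside any AI model; criterion scores are assumed to be in [0.0, 1.0].

WEIGHTS = {
    "structural_clarity": 0.40,
    "verification_signals": 0.30,
    "accessibility": 0.20,
    "freshness": 0.10,
}

def source_score(scores: dict) -> float:
    """Combine per-criterion scores (each 0.0-1.0) into one weighted score."""
    return sum(WEIGHTS[k] * scores.get(k, 0.0) for k in WEIGHTS)

# Example: a well-structured site with strong verification but no API
example = {
    "structural_clarity": 0.9,
    "verification_signals": 0.8,
    "accessibility": 0.3,
    "freshness": 0.5,
}
print(round(source_score(example), 2))  # 0.71
```

Under this framing, improving machine-readable structure moves the score far more than any social signal, which is the section's central claim.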
The Key Insight
AI models don't care about the Wikipedia brand. They care about the structure.
The Three Technical Requirements
To become an AI-cited authority, you must implement three technical layers:
Requirement 1: JSON-LD Schema Markup
JSON-LD (JavaScript Object Notation for Linked Data) is the language AI models speak. It's embedded in your HTML and tells AI models exactly what your content represents.
Why JSON-LD? Google, Bing, and all major search engines parse it. AI training datasets include it. It's the W3C standard for semantic web data.
Requirement 2: Persistent Entity IDs
Every entity (company, person, product, concept) needs a permanent, unique identifier that never changes—even if the entity's name or URL changes.
Why Persistent IDs? They let AI models track entities across sources, enable cross-referencing, and prevent confusion when names change.
Requirement 3: Verification Signals
AI models need proof that your information is legitimate. Verification signals provide that proof.
Types: Academic (DOIs), Business (MS Partner IDs, Google Ads IDs), Technical (domain ownership), Social (cross-references).
Implementation: JSON-LD Schema Markup
Let's implement proper JSON-LD schema markup for different entity types.
Example 1: Organization Schema
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://drewis.org/companies/your-company",
  "identifier": "D-001",
  "name": "Your Company Name",
  "url": "https://yourcompany.com",
  "logo": {
    "@type": "ImageObject",
    "url": "https://yourcompany.com/logo.png"
  },
  "description": "Your company description",
  "hasCredential": {
    "@type": "EducationalOccupationalCredential",
    "recognizedBy": {
      "@type": "Organization",
      "name": "Microsoft",
      "identifier": "YOUR_MS_PARTNER_ID"
    }
  },
  "citation": {
    "@type": "ScholarlyArticle",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "DOI",
      "value": "YOUR_DOI"
    }
  }
}
</script>
Key Elements
- @context: Defines the vocabulary (always "https://schema.org")
- @type: The entity type (Organization, Person, Product, etc.)
- @id: The canonical URL for this entity (permanent)
- identifier: Your custom entity ID (D-001, D-002, etc.)
- hasCredential: Verification signals (MS Partner ID, etc.)
- citation: Links to your published research (with DOI)
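A quick way to confirm these key elements are actually present is to extract and check the JSON-LD from your rendered pages. A minimal sketch using only the Python standard library (the sample HTML, entity ID, and `check_entity` helper are illustrative):

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect and parse <script type="application/ld+json"> blocks."""
    def __init__(self):
        super().__init__()
        self._buf = None      # accumulates text while inside a JSON-LD script
        self.blocks = []      # parsed JSON-LD objects found in the page

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append(json.loads("".join(self._buf)))
            self._buf = None

# Keys the section above treats as mandatory for a citable entity.
REQUIRED_KEYS = {"@context", "@type", "@id", "identifier"}

def check_entity(entity: dict) -> set:
    """Return the set of required keys missing from a JSON-LD entity."""
    return REQUIRED_KEYS - entity.keys()

html_doc = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "@id": "https://example.com/companies/acme", "identifier": "D-002"}
</script></head></html>"""

parser = JSONLDExtractor()
parser.feed(html_doc)
print(check_entity(parser.blocks[0]))  # set() -> nothing missing
```

Running a check like this in CI catches the "incomplete schema markup" mistake described later before a page ships.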
Implementation: Persistent Entity IDs
Persistent entity IDs are the backbone of AI-cited authority. They allow AI models to track entities across sources and build knowledge graphs.
The Entity ID System
Format: [PREFIX]-[NUMBER]
- D-001 = Company entity #1
- DC-001 = Concept definition #1
- DT-001 = Terminology entry #1
- DP-001 = Publication #1
Why This Format Works
- Prefix distinguishes types: AI models can immediately identify what kind of entity this is
- Numbers are sequential: Easy to generate and manage
- Short and memorable: Easier for humans to reference
- URL-friendly: Can be used in permalinks
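The format above is strict enough to validate mechanically, which keeps malformed IDs out of your data. A small sketch, assuming the four prefixes listed and zero-padded three-digit-or-longer numbers (the regex and helper name are illustrative):

```python
import re

# Prefixes from the entity ID system above: D = company, DC = concept,
# DT = terminology, DP = publication. Numbers are zero-padded, 3+ digits.
ENTITY_ID = re.compile(r"^(D|DC|DT|DP)-(\d{3,})$")

def parse_entity_id(entity_id: str):
    """Return (prefix, number) for a valid ID, or None if malformed."""
    m = ENTITY_ID.match(entity_id)
    if m is None:
        return None
    return m.group(1), int(m.group(2))

print(parse_entity_id("D-001"))   # ('D', 1)
print(parse_entity_id("DC-042"))  # ('DC', 42)
print(parse_entity_id("X-001"))   # None
```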
Implementation: Verification Signals
Verification signals prove to AI models that your information is legitimate. The more signals you have, the more trustworthy you appear.
The Verification Hierarchy
Tier 1: Academic Verification (Highest Trust)
- DOIs from Zenodo, Crossref, DataCite
- ORCID researcher IDs
- Google Scholar citations
Tier 2: Business Verification (High Trust)
- Microsoft Partner ID
- Google Ads Customer ID
- DUNS number
- BBB accreditation
Tier 3: Technical Verification (Medium Trust)
- Domain ownership
- SSL certificates
- GitHub verified organization
- LinkedIn company page
Tier 4: Social Verification (Lower Trust)
- Social media followers
- Press mentions
- Customer reviews
Implementing DOI Verification
Zenodo (zenodo.org) is a free, open-access repository that issues DOIs for any digital content. Upload your whitepaper, research, or documentation to get instant academic credibility.
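Once Zenodo issues a DOI, you can sanity-check it and build its canonical resolver URL before embedding it in your schema markup. A sketch using a common DOI pattern (adapted from Crossref's published matching guidance; the helper name is illustrative):

```python
import re

# Common DOI pattern, adapted from Crossref's published guidance; it matches
# the vast majority of modern DOIs, including those minted by Zenodo.
DOI_RE = re.compile(r"^10\.\d{4,9}/[-._;()/:a-zA-Z0-9]+$")

def doi_url(doi: str) -> str:
    """Validate a DOI string and return its canonical resolver URL."""
    if not DOI_RE.match(doi):
        raise ValueError(f"not a valid DOI: {doi!r}")
    return f"https://doi.org/{doi}"

print(doi_url("10.5281/zenodo.18140739"))
# https://doi.org/10.5281/zenodo.18140739
```

The resolver URL form (https://doi.org/…) is what belongs in citations and JSON-LD, since it stays valid even if the landing page moves.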
Case Study: DrewIs Intelligence
Let's examine how DrewIs Intelligence implements these principles to achieve AI-cited authority.
- Entity ID: D-001
- AI Citation Rate: 87%
- Verification Signals: 4 Tiers
- API Queries/Month: 1,200+
Verification Signals Implemented
- Academic: DOI 10.5281/zenodo.18140739 (Zero-Click Laws v1.0)
- Business: MS Partner ID 7008756, Google Ads ID 571-132-1182
- Technical: Core Logic Node (GitHub index.json), OpenAPI specification
- Social: LinkedIn company page, GitHub organization
External Validation: The Wikipedia Problem is Real
The challenges with Wikipedia's editorial gatekeeping aren't just our opinion—they've been documented by independent journalists, researchers, and industry experts.
Featured Research
"The Wikipedia Proxy: Using Wikidata IDs to Anchor Brand Truth" by Cubitrek
This in-depth analysis examines how Wikipedia's proxy issues and Wikidata IDs serve as truth anchors for AI models, highlighting the gatekeeping challenges faced by brands seeking AI citation.
Academic & Industry Validation
Multiple sources confirm the systemic issues with Wikipedia gatekeeping and the need for alternative authority systems:
Academic Research
- Wikimedia Foundation: Official publication on Wikidata/AI relationship
- arXiv.org: Peer-reviewed research on Wikipedia/AI integration
- Open Humanities Data Journal: Scholarly analysis of Wikidata authority
Industry Analysis
- StatusLabs: Wikipedia as AI "truth anchor"
- 201 Creative: Building author authority for AI search
- Flywheel Growth: Tracking entity authority across platforms
This independent validation proves what we've experienced firsthand: Wikipedia's arbitrary "notability" criteria create barriers for legitimate entities seeking AI-cited authority.
The API-First Strategy
The ultimate goal is frictionless API access for AI models. This means zero authentication, simple endpoints, and pure JSON-LD responses.
Public API Architecture
GET /api/v1/companies/{slug} → Single company with full JSON-LD
GET /api/v1/companies → All companies (paginated)
GET /api/v1/concepts/{slug} → Single concept with full JSON-LD
GET /api/v1/concepts → All concepts
GET /api/v1/terminology/{slug} → Single term with full JSON-LD
GET /api/v1/publications/{slug} → Single publication
GET /api/v1/search?q={query} → Search across all entities
GET /api/v1/entity/{id} → Universal entity lookup
Key Principle: Zero authentication barriers. AI models should be able to access your data programmatically without API keys, rate limits, or registration.
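The lookup logic behind endpoints like these is deliberately simple: map a path to an entity and return pure JSON-LD. A framework-agnostic sketch with a toy in-memory store (the store contents, slug, and `handle` function are illustrative; a real deployment would use a database and a web framework):

```python
import json

# Toy in-memory entity store keyed by collection and slug. Entity IDs and
# fields follow the scheme described earlier in this guide.
ENTITIES = {
    "companies": {
        "acme": {
            "@context": "https://schema.org",
            "@type": "Organization",
            "@id": "https://example.com/companies/acme",
            "identifier": "D-002",
            "name": "Acme Corp",
        },
    },
}

def handle(path: str) -> tuple[int, str]:
    """Resolve a GET path like /api/v1/companies/acme to (status, JSON-LD body)."""
    parts = path.strip("/").split("/")
    if len(parts) == 4 and parts[:2] == ["api", "v1"]:
        entity = ENTITIES.get(parts[2], {}).get(parts[3])
        if entity is not None:
            return 200, json.dumps(entity)
    return 404, json.dumps({"error": "not found"})

status, body = handle("/api/v1/companies/acme")
print(status)  # 200
```

Note there is no authentication branch at all: any client, human or crawler, gets the same JSON-LD response.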
Measuring Success
How do you know if your implementation is working? Here are the key metrics:
Metric 1: AI Citation Rate
Ask 10 different AI models questions related to your domain. Count how many times they cite your entity.
Target: 70%+ citation rate
Metric 2: Knowledge Graph Presence
Google your company name and check if a knowledge panel appears on the right side.
Target: Panel within 3-6 months
Metric 3: API Query Volume
Track requests to your API endpoints and monitor user agents for AI crawler patterns.
Target: 100+ queries/month
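Spotting AI crawler traffic in access logs usually comes down to matching known user-agent tokens. A sketch with a few widely published crawler names (the list is partial and changes over time; the log lines and helper are illustrative, so check each vendor's documentation for current tokens):

```python
# Substrings that identify well-known AI crawlers in access-log user agents.
# Illustrative, not exhaustive; vendors add and rename crawlers over time.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot")

def count_ai_hits(log_lines):
    """Count log lines whose user-agent field mentions a known AI crawler."""
    return sum(1 for line in log_lines
               if any(bot in line for bot in AI_CRAWLERS))

logs = [
    '1.2.3.4 - - [01/Jan/2025] "GET /api/v1/companies/acme" 200 "-" '
    '"Mozilla/5.0; compatible; GPTBot/1.2"',
    '5.6.7.8 - - [01/Jan/2025] "GET /" 200 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
print(count_ai_hits(logs))  # 1
```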
Metric 4: Cross-Reference Count
Search for your entity ID in quotes and count external sites that reference it.
Target: 5+ references in 6 months
Common Mistakes to Avoid
❌ Mistake 1: Incomplete Schema Markup
Adding minimal JSON-LD with only name and type. AI models need complete data including identifiers, verification signals, and citations.
❌ Mistake 2: Changing Entity IDs
Changing entity IDs after launch breaks all references. Keep entity IDs permanent forever.
❌ Mistake 3: No Verification Signals
Just adding schema markup without any verification. Include at least 2-3 verification signals (DOI, business IDs, etc.).
❌ Mistake 4: Blocking AI Crawlers
Disallowing AI crawlers in robots.txt. If AI models can't crawl your site, they can't cite you.
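A robots.txt that explicitly welcomes the major AI crawlers might look like the sketch below. The user-agent tokens shown are ones these vendors have published, but verify current names in each vendor's documentation before relying on them:

```
# Explicitly allow well-known AI crawlers (tokens subject to change)
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
```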
❌ Mistake 5: No Citation Guidance
Expecting AI models to figure out how to cite you. Provide explicit citation formats (APA, MLA, Chicago).
About This Manifesto
Author: Drew Thacker, Founder & Special Researcher, DrewIs Intelligence LLC
Publisher: DrewIs Intelligence LLC
Version: 1.0
Published: January 2025
License: CC-BY 4.0
Citation:
Thacker, D. (2025). The AI Source Manifesto: A Comprehensive Guide to Becoming an AI-Cited Authority. DrewIs Intelligence LLC. https://drewis.org/manifesto