Methodology
How we score.
About seventy percent of the audit is deterministic — DOM parsing, schema validation, lexical statistics. The LLM only touches the rewrites and the human-language finding explanations. That gives you consistent scores, fast audits, and a path to running the whole thing on weights you control.
Dimension · 1 of 4
LLMEO · the main reading
Will ChatGPT, Perplexity, and Gemini cite this draft? We score the parts of writing that decide.
Structure
20%
Single H1, sequential heading levels, lists or tables present, paragraphs ≤ 4 sentences. LLMs reach for content with structured chunks they can quote verbatim.
Direct-answer formatting
20%
TL;DR or "in short" in the first 150 words; section openers that lead with the answer ("X is …", "Y means …"); question-style headings. LLMs lean heavily on the first 100–200 words to decide what the piece is about.
Schema.org markup
15%
Presence and validity of Article, BlogPosting, FAQPage, HowTo, Person (for the author), and Organization JSON-LD. This is the single most reliable way to tell AI engines what a page is and who wrote it. Live-URL audits only.
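A presence check of this kind can be sketched with the Python standard library. This is a minimal illustration, not the product's parser: the class name `JsonLdCollector` is hypothetical, it only reads top-level `@type` values, and it checks that each block is parseable JSON rather than valid against schema.org.

```python
import json
from html.parser import HTMLParser

class JsonLdCollector(HTMLParser):
    """Collects top-level @type values from JSON-LD script blocks (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.types = set()
        self.invalid_blocks = 0

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._in_jsonld:
            return
        self._in_jsonld = False
        try:
            doc = json.loads("".join(self._buf))
        except json.JSONDecodeError:
            self.invalid_blocks += 1  # block present but not parseable
            return
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            declared = item.get("@type") if isinstance(item, dict) else None
            names = declared if isinstance(declared, list) else [declared]
            self.types.update(n for n in names if isinstance(n, str))
```

Feed the server-returned HTML to `JsonLdCollector().feed(html)` and compare `types` against the list above; nested types (such as a Person inside `author`) would need a recursive walk.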
Citability
15%
External-link count and quality. Links to primary-source domains (.gov, .edu, major journals, well-known publishers) and attribution markers ("according to", direct quotes with attribution) count for more than blogroll links.
Snippet-friendliness
10%
The fraction of your sentences that fall in the 12–28 word range that AI engines tend to quote. Definition-style openers earn a bonus.
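The fraction itself is easy to sketch. The version below assumes a naive sentence splitter and whitespace word counts, and it leaves out the definition-opener bonus because its size is not stated; `split_sentences` and `snippet_fraction` are hypothetical names.

```python
import re

def split_sentences(text):
    """Naive splitter: break after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def snippet_fraction(text, lo=12, hi=28):
    """Share of sentences whose word count falls in [lo, hi]."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    in_band = sum(1 for s in sentences if lo <= len(s.split()) <= hi)
    return in_band / len(sentences)
```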
Freshness signals
10%
datePublished and dateModified in schema, "Updated YYYY" cues in the body, recent year mentions. Live-URL audits only.
E-E-A-T
10%
Author byline + Person schema with credentials, "About the author" block, recognizable attribution. Live-URL audits only.
For pasted drafts we score the four parts you control while writing — structure, direct-answer formatting, citability, snippet-friendliness — and surface the rest after publishing as a re-audit on the live URL.
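One way to read the pasted-draft case is as a weighted average over only the parts that can be scored. The sketch below assumes the weights of the unavailable parts are dropped and the remaining weights renormalized; the rescaling rule is an assumption, and the key names and `weighted_score` are hypothetical.

```python
# Weights from the LLMEO list above; key names are illustrative.
LLMEO_WEIGHTS = {
    "structure": 20,
    "direct_answer": 20,
    "schema": 15,
    "citability": 15,
    "snippet": 10,
    "freshness": 10,
    "eeat": 10,
}

def weighted_score(part_scores, weights=LLMEO_WEIGHTS):
    """Weighted mean of 0-100 part scores over the parts actually present."""
    used = {k: w for k, w in weights.items() if k in part_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scorable parts")
    return sum(part_scores[k] * w for k, w in used.items()) / total
```

With only structure, direct-answer formatting, citability, and snippet-friendliness supplied, the divisor becomes 65 rather than 100, so a pasted draft can still reach the full range.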
Dimension · 2 of 4
SEO
The classics — title, meta, headings, schema, links — for the readers who still arrive via Google.
Heading structure
20%
Exactly one H1; no skipped levels (for example, H1 → H3 with no H2 in between); H2 subheadings in any article over ~400 words.
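These three rules are deterministic once the heading levels have been extracted. A minimal sketch, assuming the levels arrive as a list of integers in document order; the function name and issue labels are hypothetical.

```python
def heading_issues(levels, word_count):
    """levels: heading levels in document order, e.g. [1, 2, 3, 2]."""
    issues = []
    if levels.count(1) != 1:
        issues.append("h1_count")
    # A skipped level is any step down the outline of more than one level.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append("skipped_level")
            break
    if word_count > 400 and 2 not in levels:
        issues.append("missing_h2")
    return issues
```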
Content depth
15%
Word count bands. Under 150 words is thin; 400–800 is enough for most articles; 800–2,500 scores best.
Title tag
15%
Present and 30–65 characters. Google truncates around 60 — the tail of longer titles is dropped from the SERP. Live-URL audits only.
Meta description
10%
Present, 100–170 characters. Live-URL audits only.
Linking
10%
External link count, plus internal links on URL audits. Zero outbound citations in a 300+ word piece is a real penalty.
Canonical
8%
rel="canonical" present to prevent duplicate-URL competition. Live-URL audits only.
Open Graph
8%
og:title, og:description, og:image. Affects how the page renders when shared on social or in Slack. Live-URL audits only.
Mobile viewport
7%
meta viewport tag for responsive rendering. Live-URL audits only.
Image alts
7%
Alt-text coverage across all images. Live-URL audits only.
Dimension · 3 of 4
Readability
Flesch-Kincaid, sentence variance, passive voice. Plain English, measured. The Hemingway part.
Flesch-Kincaid grade
40%
Target band is grade 8–11 for general professional readers. Below 6 reads childish; above 14 is dissertation territory.
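For reference, the standard Flesch-Kincaid grade formula is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below applies it to precomputed counts; the vowel-group syllable counter is a rough heuristic chosen for illustration, not necessarily the one the audit uses.

```python
import re

def count_syllables(word):
    """Rough heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def in_target_band(grade, lo=8, hi=11):
    """True when the grade sits in the 8-11 target band."""
    return lo <= grade <= hi
```

A 100-word passage with 5 sentences and 150 syllables comes out at about grade 9.9, inside the target band.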
Sentence variance
20%
Coefficient of variation of sentence length: standard deviation divided by mean. A ratio below 0.3 sounds metronomic, a hallmark of AI-generated prose.
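The ratio can be computed directly from a list of per-sentence word counts. A minimal sketch, assuming the population standard deviation; whether the audit uses the population or sample form is not stated.

```python
import statistics

def sentence_cv(lengths):
    """Standard deviation of sentence lengths divided by their mean."""
    if len(lengths) < 2:
        return 0.0
    mean = statistics.fmean(lengths)
    if mean == 0:
        return 0.0
    return statistics.pstdev(lengths) / mean
```

Three sentences of 10 words each give 0.0; one of 5 words and one of 15 give 0.5.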
Average sentence length
15%
12–22 words is the comfortable band for most prose.
Passive voice
15%
Heuristic match of be-verbs + past participles. Under 10% of sentences is fine; 20%+ blurs agency and hurts AI extraction.
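A heuristic of this shape can be sketched with one regular expression. The pattern below is an illustration only: it matches a be-verb, an optional "-ly" adverb, and a word ending in "ed" or "en" or drawn from a short irregular list, so it will produce false positives such as "is often".

```python
import re

# be-verb + optional adverb + likely past participle (illustrative pattern).
PASSIVE = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+"
    r"(?:\w+ly\s+)?"
    r"(?:\w+(?:ed|en)|made|done|built|sent|held|known|shown|told|found)\b",
    re.IGNORECASE,
)

def passive_fraction(sentences):
    """Share of sentences containing at least one passive-looking match."""
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if PASSIVE.search(s))
    return hits / len(sentences)
```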
Coleman-Liau
10%
Letter-based readability index. Triangulates with Flesch-Kincaid.
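The standard Coleman-Liau index is 0.0588 × L − 0.296 × S − 15.8, where L is letters per 100 words and S is sentences per 100 words. A minimal sketch from raw counts; the function name is hypothetical.

```python
def coleman_liau(letters, words, sentences):
    """Coleman-Liau index from raw letter, word, and sentence counts."""
    per_100_letters = letters / words * 100
    per_100_sentences = sentences / words * 100
    return 0.0588 * per_100_letters - 0.296 * per_100_sentences - 15.8
```

Because it counts letters rather than syllables, it does not depend on a syllable heuristic, which is what makes it a useful cross-check.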
Dimension · 4 of 4
Originality & Voice
Internal repetition, burstiness, cliché density. We tell you when it sounds like a robot.
Internal repetition
30%
3-, 4-, and 5-gram repetition. Roughly 4% or more of n-grams repeating signals a writing problem; 15% or more means the piece is looping.
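One plausible way to measure this is sketched below. It assumes the rate is the share of n-gram occurrences whose n-gram appears more than once, pooled across the three sizes; the exact definition is not given here, so treat both the pooling and the tokenizer as assumptions.

```python
import re
from collections import Counter

def ngram_repetition(text, sizes=(3, 4, 5)):
    """Share of n-gram occurrences that belong to a repeated n-gram."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    total = repeated = 0
    for n in sizes:
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        counts = Counter(grams)
        total += len(grams)
        repeated += sum(c for c in counts.values() if c > 1)
    return repeated / total if total else 0.0
```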
Sentence variance
20%
Same burstiness measure as Readability, weighted toward voice rather than comprehension.
Cliché density
20%
Curated list of LLM-tell phrases — "delve into", "navigate the landscape", "tapestry of", "in today's fast-paced world", "leverage", "synergy", "harness the power of", and ~60 more.
Lexical diversity
15%
Type-token ratio (unique tokens / total tokens). Below 0.4 on a 500+ word doc is narrow vocabulary.
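The ratio and the threshold can be sketched in a few lines. The tokenizer (lowercased runs of letters and apostrophes) is an assumption, and `is_narrow` is a hypothetical helper.

```python
import re

def type_token_ratio(text):
    """Unique tokens divided by total tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def is_narrow(text, min_words=500, threshold=0.4):
    """Flag narrow vocabulary only on documents of min_words or more."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(tokens) >= min_words and type_token_ratio(text) < threshold
```

The length guard matters because the ratio falls naturally as a document grows, so short texts would otherwise look artificially diverse.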
Citation density
15%
External links + attribution markers per 1,000 words. Original work usually engages with other work.
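As arithmetic, the density is the combined count scaled to 1,000 words. A minimal sketch, assuming links and attribution markers are counted upstream and weighted equally; the function name is hypothetical.

```python
def citation_density(external_links, attribution_markers, word_count):
    """External links plus attribution markers per 1,000 words."""
    if word_count <= 0:
        return 0.0
    return (external_links + attribution_markers) * 1000 / word_count
```

A 2,500-word piece with 3 external links and 2 attribution markers has a density of 2.0.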
This is "Originality & Voice", not "Plagiarism". We don't have a web-scale corpus to compare against — so we don't pretend. We score the signals we *can* measure honestly.
Grade scale
A through F.
A
90+
B
80–89
C
70–79
D
60–69
F
<60
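The scale maps onto a simple threshold lookup. The sketch below assumes scores are compared without rounding, so 89.5 is a B; how fractional scores are handled is not stated.

```python
def letter_grade(score):
    """Map a 0-100 score to a letter using the bands above."""
    for floor, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= floor:
            return letter
    return "F"
```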
What we don’t do
And won’t pretend to.
- Web-scale plagiarism check. We don’t have an index. The Originality score measures internal signals, not similarity to anything you didn’t write.
- AI-generated text detection. Public detectors are known to be unreliable. We surface telltale phrasing patterns; we don’t hand you a probability.
- JS-rendered SPA scraping. Our URL fetcher reads the HTML the server returns. Pages that render their entire body via client-side JS will look thin to us — we’ll warn you and recommend the draft-paste path.
- Real-time citation probes. We don’t query ChatGPT or Perplexity directly with your prompts and check whether your page appears. Coming as a paid feature in v1.1.