OPEN SOURCE · GEO · CITATION ANALYSIS

Open-Sourcing CiteScope: Same Thing Topify Does for $99/Month, Plus One Move They Skipped

Customer-case-driven + span attribution as the differentiator + a solo dev's Open Core calculus between SaaS and OSS

May 24, 2026 · Yao Yuheng · GEO/AEO · Open Source · Citation Analysis

By mid-2026, how customers find suppliers has quietly shifted. The B2B export customer I work with (pseudonymized as X) currently sits at 0% AI visibility on ChatGPT / Perplexity / Gemini. The CEO wants to know two things: how do we get into the AI shortlist, and where are the competitors' AI munitions stockpiled? I surveyed the GEO monitoring tools on the market — Topify at $99-199/mo, Profound at $499+/mo, Peec.ai at €80-120/mo — and they all do roughly the same thing. But there's one move none of them makes. So I spent two weeks building an open-source alternative, CiteScope, which launches today.

GitHub: piglet12138/CiteScope
An Edward Tufte-style modern infographic: a beam of horizontal light enters from the left into a transparent triangular prism, which refracts it into five parallel colored bars (cobalt blue / teal / deep orange / mustard / violet) fanning out toward the right. Each colored bar has a blank circular annotation pin at its top. Below the prism a horizontal axis scatters six small dots representing citation source domains. White background, no text, no logo, pure geometric abstraction — a visual metaphor for how CiteScope refracts AI output (white light) into an analyzable spectrum of sources (the citation spectrum).

AI search traffic is taking purchase intent away from Google

If you're still running brand growth in 2026, you've probably noticed an uncomfortable pattern: a meaningful slice of high-purchase-intent traffic is migrating from Google to AI chat windows.

When a customer asks ChatGPT "who are the best B2B suppliers in category X for 2026" or asks Perplexity "which service providers should I look at for industry Y in North America," what they get back is a 200-word answer naming 3-5 vendors with clickable citation links. They don't scroll through ten blue links anymore. They pick directly from the AI's shortlist.

If your brand isn't on that shortlist, you're invisible — even if you rank #1 on Google for the same keyword.

This is GEO (Generative Engine Optimization), also called AEO (Answer Engine Optimization), a term that started gaining wider use around mid-2026. Both mean the same thing: optimize for visibility in AI search, not just Google.

The space is hot. Topify, Profound, Peec.ai, Otterly, AthenaHQ — all surfaced in the past 18 months, all charging $80-500/mo. But they're fundamentally selling the same thing: fire your probe questions at the AI engines, record the answers and citations, draw a few charts.

The customer's real need: how to climb out of 0% AI visibility

My GEO practice started with a real customer diagnosis. Pseudonym X, mainly B2B export business (OEM/ODM categories), target markets Europe and the US. Background:

The CEO's ask boils down to three questions:

  1. Is my brand mentioned by AI at all? Mention rate across engines and time windows
  2. Which sites does AI cite when answering category questions? Those sites are the "munitions stockpile" I need to seed content into
  3. What are my competitors' AI assets, exactly? When AI mentions a specific competitor, which URLs back that mention — letting me reverse-engineer their GEO content placement

The first question is easy — run probes, look at mention rate; every commercial GEO tool does this. The second is medium difficulty — a Top Domains report aggregated by domain; most mainstream tools also handle this. The third is the critical one: commercial tools mostly stop at "list the total URLs cited for this competitor," but can't precisely say "in the AI sentence that mentions this competitor, which URL is the source backing it." That's the real reverse attribution.

I surveyed the existing tools

Before writing any code I signed up for trials of all the major English-language GEO monitoring SaaS. Rough taxonomy:

| Tool | Pricing | AI engine coverage | Differentiator | What's missing |
| --- | --- | --- | --- | --- |
| Topify | $99-199/mo | 7+, the only one covering Doubao/DeepSeek/Qwen | Broadest Chinese-ecosystem coverage + Source Analysis | Basic plan's 100-prompt cap is tight for agencies |
| Profound | $499+/mo (quote-based) | 10+ (no Doubao) | Enterprise polish + HubSpot/Salesforce integrations | Expensive; weak Chinese-engine coverage |
| Peec.ai | €80-120/mo + add-ons | 5 (DeepSeek is an €80 add-on) | Lean European pricing | Narrowest engine coverage |
| Otterly | ~$50/mo | 5 | Friendly to small brands / freelancers | Shallow data depth |
| AthenaHQ | Enterprise quote | 5+ | Multi-client management for agencies | Low transparency + no Chinese |

What they all share: all SaaS, none disclose their data-processing details, none do span attribution, and none let you build custom AI engine adapters. Differentiation lives mostly in workflow polish + integration breadth + AI engine coverage count. The underlying data isn't a moat — the same probe prompt hitting the same OpenAI Responses API returns the same citations. What commercial SaaS sells is the "save two days of self-hosting work" delta.

The entire category is selling a $20/month service for $200/month. Objectively, for small-to-mid brands without engineering capacity, the markup is reasonable. But for teams with engineering capacity + a desire to customize + a stake in data sovereignty, there's a huge arbitrage gap.

Three reasons I built my own

1. Data sovereignty: probe prompts are themselves commercial strategy

A customer's probe prompts look like generic industry questions on the surface, but a carefully designed prompt set (which categories, which competitor names, which use-case scenarios) actually exposes their go-to-market thinking. Putting that into a third-party SaaS database = your strategy is transparent to the vendor. For someone like X, mid-flight on a GEO breakout plan, uploading the prompt set to a US SaaS vendor (however reputable) is an internal compliance event requiring sign-off. Self-hosted SQLite eliminates that friction entirely.

2. Differentiation: I wanted to ship span attribution

This is the real differentiating feature. Commercial tools, at best, tell you:

"Last week, Reddit was cited in 78% of AI answers for your category."

What I want is one more sentence:

"Reddit was cited 78%. Of that, 23% precisely backs the sentence that mentions your brand — versus 55% that backs unrelated competitor mentions or generic industry commentary."

The data-driven actions for those two versions are completely different. Version one only lets you say "Reddit is a high-frequency citation source" — useless, every competitor knows that. Version two lets you distinguish "Reddit is a generic citation source in my category" vs. "Reddit is a real endorsement source for my brand" vs. "Reddit cites everyone in my category except me." That last case signals a real content gap (gap analysis).
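
The three readings can be separated mechanically once every citation is tagged with the brand (if any) that shares its sentence. The following is a minimal sketch of that decision rule; the `DomainStats` record and its field names are hypothetical and are not taken from CiteScope's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainStats:
    """Hypothetical per-domain counts; field names are illustrative only."""
    backs_my_brand: int     # citations in a sentence that mentions my brand
    backs_competitors: int  # citations in a sentence that mentions a competitor
    generic: int            # citations in sentences naming no tracked brand


def classify_source(stats: DomainStats) -> str:
    """Label a cited domain with one of the three readings described above."""
    if stats.backs_my_brand > 0:
        return "endorsement source for my brand"
    if stats.backs_competitors > 0:
        return "content gap: cites competitors but not me"
    return "generic category source"


print(classify_source(DomainStats(backs_my_brand=0, backs_competitors=5, generic=10)))
```

A real report would probably use rates and thresholds rather than a bare "greater than zero" test; the point here is only that version two of the data carries enough information to make the distinction at all.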

3. China AI engine coverage: Doubao / Kimi / DeepSeek

If the customer only does Europe/US export, international AI engines are enough. But X also wants to cover domestic inquiry scenarios (some large Chinese buyers ask Doubao for supplier recommendations), and at the next stage I plan to expand GEO services to domestic brands. That means Doubao / Kimi / DeepSeek must be supported. Topify is the only commercial tool covering international + Chinese engines together, and I talked to them about a trial, but their Doubao path relies on cookie reverse-engineering with high account-token failure rates. I wanted the official API route, and Volcengine Ark's Responses API + built-in web_search tool, launched in Q1 2026, fits perfectly.

What CiteScope is

PROJECT · MIT LICENSE
CiteScope
A self-hosted GEO monitoring platform. See who AI search engines actually cite — the real visibility picture for your brand and your competitors. An MIT-licensed open-source alternative to Topify / Profound / Peec.ai, with one extra capability none of the commercial tools ship: span attribution.
  - Unified schema across 6 AI engines (ChatGPT / Perplexity / Gemini / Doubao / Kimi / DeepSeek)
  - Citation domain aggregation (Top Domains) + competitor reverse attribution (Competitor Assets)
  - Gemini grounding redirect auto-resolved by background worker
  - Span attribution: sentence-level link between citations and brand mentions (exclusive)
  - Run experiment comparison: matrix + radar chart for evaluating GEO interventions side-by-side
  - In-app embedded tutorial hub + 8 standalone module guides
  - Runs on a single SQLite + 1 vCPU / 1 GB

The UI looks like this — the overview page shows 7-day mention rates across all customers:

CiteScope overview page screenshot: a dark-themed left sidebar lists navigation groups (Overview, Brands, Task Queue, LLM Usage, Tutorials, File Library, Settings). The main area shows a 'Monitored Targets Overview' card displaying Acme Sourcing's 56.5% mention rate over the past 7 days, industry B2B Export · Apparel & Accessories, status active, trend 0%.
Overview · cross-customer mention rate aggregation — see at a glance which brand's AI visibility is rising and which has stalled.

Each monitored target has its own "probe question bank," importable via CSV, with questions tagged by category / competitor / use-case:

CiteScope monitored-target detail page screenshot: the top shows metadata for customer Acme Sourcing — industry, region, brand-keyword aliases. Below is a 'Probe Question Bank' tab listing 8 probe questions (e.g. 'What are the top B2B sock manufacturers in China for 2026?'), each tagged with category / priority. CSV-import and add-question buttons are top-right.
Customer detail · probe question bank + CSV bulk import. Categories drive downstream aggregation dimensions.

The core unit of work is a Run — one Run = one set of probes × multiple AI engines × one timestamp. The monitoring center's "Overview" tab gives you the customer's KPI dashboard:

CiteScope monitoring center Overview tab screenshot: three KPI cards at top show 56.5% mention rate, 56.5% first-position rate, 56.5% cited-with-link rate. Below is a 'mention rate trend' area chart with a 0%-100% y-axis and date points on the x-axis. Further down is a per-platform breakdown. Top right: Today / This Week / This Month toggles + refresh + new experiment run buttons.
Monitoring center Overview tab · KPIs + trend + per-platform breakdown. This page answers "what's the current state."

Architecture, 30-second version:

React + Vite + Antd 5
        │ HTTP/REST
        ▼
FastAPI + uvicorn + SQLAlchemy + APScheduler
        │
        ├──→ AI adapters ×6 (ChatGPT/Perplexity/Gemini/Doubao/Kimi/DeepSeek)
        ├──→ SQLite (WAL mode)
        └──→ background citation resolver job (one Gemini wrapper batch every 5 min)

Single process + single DB file, no Redis / Celery / Postgres required to start. 1 vCPU / 1 GB RAM is enough. One line — docker compose up -d — and you're running, open localhost:3000.
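
For readers unfamiliar with the "SQLite (WAL mode)" box in the diagram: write-ahead logging lets readers keep reading while a single writer commits, which suits one process that serves HTTP requests and runs scheduled jobs against the same file. A minimal sketch using only the standard library follows; the function name and file name are hypothetical, and CiteScope itself goes through SQLAlchemy rather than raw `sqlite3`.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path: str) -> sqlite3.Connection:
    """Open a SQLite file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not enabled, got {mode!r}")
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_wal_database(os.path.join(tmp, "citescope_demo.db"))
        print(conn.execute("PRAGMA journal_mode").fetchone()[0])
        conn.close()
```

The journal mode is stored in the database file, so it only needs to be set once per file; later connections pick it up automatically.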

The core differentiator: what span attribution is

This is the biggest single difference between CiteScope and the commercial tools. Worth a section to itself.

AI answers usually carry [citation:N] markers (or OpenAI's url_citation annotations, Gemini's groundingMetadata, Volcengine Ark's annotations). Each citation has a start_index and end_index, pointing to the span of text in the AI answer it's meant to support.

Commercial tools use only half of this information when they ingest citations: URL + title. They throw away the span info. So the best they can tell you is "this domain was cited N times."

CiteScope does one extra step: it computes both the positions of [citation:N] markers and the positions of brand keywords, splits the AI answer into sentences, and checks whether a citation falls in the same sentence as a brand mention. If yes, that citation actually backs your brand; if not, it's just generic citation.

The algorithm is 90 lines of Python (backend/app/services/citation_analysis/span_attribution.py). Core logic:

def compute_supports_brand_mention(
    raw_answer: str,
    brand_keywords: list[str],
    cite_indices: list[int],
) -> dict[int, bool]:
    # 1) find all char-spans where brand keywords appear (case-insensitive)
    brand_spans = find_keyword_spans(raw_answer, brand_keywords)

    # 2) find all [citation:N] marker positions, grouped by N
    markers_by_n = find_citation_markers(raw_answer)

    # 3) sentence-split (mixed Chinese/English punctuation),
    #    for each sentence: if it contains both a brand_span and a marker → that N = True
    supported = set()
    for s, e, _ in iter_sentence_spans(raw_answer):
        has_brand = any(span_overlaps(bs, be, s, e) for bs, be in brand_spans)
        if not has_brand:
            continue
        for n, ms in markers_by_n.items():
            if any(span_overlaps(ms_s, ms_e, s, e) for ms_s, ms_e in ms):
                supported.add(n)

    return {n: (n in supported) for n in cite_indices}
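
The excerpt calls four helpers that are not shown. The block below is one possible self-contained way to fill them in so the logic can be run end to end. The helper bodies, both regular expressions, and the sample answer are assumptions made for illustration; they are not the repository's code, and the real sentence splitter is likely more careful.

```python
import re

# Assumed marker format, taken from the "[citation:N]" convention in the text.
_MARKER = re.compile(r"\[citation:(\d+)\]")
# Assumed sentence boundary: CJK terminators always split; an ASCII terminator
# splits unless a letter, digit, or another terminator follows (keeps "3.5" and
# "a.com" whole). Markers directly after the terminator stay with the sentence.
_SENTENCE_END = re.compile(
    r"(?:[。！？]+|[.!?](?![0-9A-Za-z.!?]))(?:\s*\[citation:\d+\])*"
)


def find_keyword_spans(text, keywords):
    """Return (start, end) spans of every case-insensitive keyword hit."""
    spans = []
    for kw in keywords:
        if kw:
            spans.extend(m.span() for m in re.finditer(re.escape(kw), text, re.IGNORECASE))
    return spans


def find_citation_markers(text):
    """Map citation number N to the list of spans where [citation:N] occurs."""
    markers = {}
    for m in _MARKER.finditer(text):
        markers.setdefault(int(m.group(1)), []).append(m.span())
    return markers


def iter_sentence_spans(text):
    """Yield (start, end, sentence) for each non-empty sentence."""
    start = 0
    for m in _SENTENCE_END.finditer(text):
        if text[start:m.end()].strip():
            yield start, m.end(), text[start:m.end()]
        start = m.end()
    if text[start:].strip():
        yield start, len(text), text[start:]


def span_overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open spans share at least one character."""
    return a_start < b_end and b_start < a_end


def compute_supports_brand_mention(raw_answer, brand_keywords, cite_indices):
    """Same control flow as the excerpt, wired to the sketched helpers."""
    brand_spans = find_keyword_spans(raw_answer, brand_keywords)
    markers_by_n = find_citation_markers(raw_answer)
    supported = set()
    for s, e, _ in iter_sentence_spans(raw_answer):
        if not any(span_overlaps(bs, be, s, e) for bs, be in brand_spans):
            continue
        for n, ms in markers_by_n.items():
            if any(span_overlaps(ms_s, ms_e, s, e) for ms_s, ms_e in ms):
                supported.add(n)
    return {n: (n in supported) for n in cite_indices}


answer = (
    "Acme Sourcing is a top sock supplier [citation:1]. "
    "Other vendors list on Alibaba [citation:2]。"
    "行业报告也提到了ACME。[citation:3]"
)
result = compute_supports_brand_mention(answer, ["Acme"], [1, 2, 3])
print(result)
```

With this sample, citations 1 and 3 share a sentence with the brand and citation 2 does not, so `result` is `{1: True, 2: False, 3: True}`. Abbreviations such as "Inc." would still split a sentence under this simple rule.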

The hard part isn't the algorithm, it's three details: (1) sentence segmentation for mixed Chinese/English text (Chinese full-stops typically have no trailing space, and English periods butted up against Chinese characters also need to split); (2) normalizing the citation marker formats returned by each AI engine; (3) reinserting markers into the original body text (OpenAI / Ark return span indices, so you need to insert [citation:N] back in reverse end_index order so downstream code can run).
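
Detail (3) is easy to get wrong: inserting a marker shifts every later offset, so insertions have to run from the end of the string toward the start. A minimal sketch, assuming annotations have already been normalized into dicts with an exclusive `end_index` character offset and a citation number `n` (both field names are hypothetical, as is the function name):

```python
def insert_citation_markers(text: str, annotations: list) -> str:
    """Insert "[citation:N]" right after each annotated span.

    `annotations` is a hypothetical normalized shape: dicts with `end_index`
    (exclusive character offset into `text`) and `n` (citation number).
    """
    # Work backwards from the largest offset so earlier offsets stay valid.
    for ann in sorted(annotations, key=lambda a: a["end_index"], reverse=True):
        end = ann["end_index"]
        text = f"{text[:end]}[citation:{ann['n']}]{text[end:]}"
    return text


body = "Acme makes socks. Beta makes hats."
anns = [
    {"n": 1, "end_index": body.index(".")},
    {"n": 2, "end_index": body.rindex(".")},
]
print(insert_citation_markers(body, anns))
```

Processing in ascending order instead would require adding the length of every previously inserted marker to each later offset, which is where off-by-N bugs tend to creep in.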

I tripped over each of these once. OpenAI's chat completions + search-preview doesn't return spans (only an annotations array), so the ChatGPT adapter is currently stuck doing "answer-level" attribution only; Gemini's grounding_supports has segment.start_index/end_index that needs manual reassembly; Ark's url_citation aligns closely with OpenAI Responses but its token-accounting field names are input_tokens/output_tokens. The full alignment work took two weeks.

Why don't commercial tools do this? My guess: two reasons. (1) Compute cost isn't trivial and the marketing pitch isn't obvious: "Reddit cited 78%" looks more striking than "Reddit backs your brand 30%"; (2) once you ship it, you expose how shallow most "citation tracking" actually is. It's a question of willingness to ship, not an algorithm problem.

Real data: Top Domains pulled from customer X

Customer X ran three Runs (baseline → after editing llms.txt → after adding FAQ schema), across 3 weeks, ~700 citations. The Top Domains report — the "AI munitions stockpile" — came back as:

| Domain | Citations | Cross-platform | Reading |
| --- | --- | --- | --- |
| made-in-china.com | 40 | 3 platforms | B2B sourcing top tier, AI treats it as authoritative |
| alibaba.com | 32 | 2 platforms | Same tier, slightly lower ChatGPT preference |
| competitor-a.com | 31 | 2 platforms | Competitor A's own site, a strong SEO signal |
| competitor-b.com | 20 | 3 platforms | Competitor B's own site |
| competitor-c.com | 19 | 3 platforms | Competitor C's own site |
| industry-listicle-a.com | 19 | 3 platforms | Industry "top 10" listicle article |
| jingsourcing.com | 12 | 3 platforms | Sourcing-agent blog |
| industry-listicle-b.com | 11 | 1 platform (Gemini) | "Hidden" authority surfaced by Gemini |
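
A report of this shape comes from a plain group-by over citation rows. The sketch below shows one way to compute the citation count and cross-platform columns, assuming each row is a `(platform, url)` pair; the function name, the input shape, and the `www.` stripping rule are illustrative assumptions, not CiteScope's implementation.

```python
from collections import defaultdict
from urllib.parse import urlparse


def top_domains(citations):
    """Aggregate (platform, url) pairs into per-domain citation counts.

    Returns rows of (domain, citation_count, platform_count), most cited first.
    """
    counts = defaultdict(int)
    platforms = defaultdict(set)
    for platform, url in citations:
        host = (urlparse(url).hostname or "").lower()
        # Treat "www.example.com" and "example.com" as the same domain.
        domain = host[4:] if host.startswith("www.") else host
        if not domain:
            continue  # skip rows whose URL has no parseable host
        counts[domain] += 1
        platforms[domain].add(platform)
    rows = [(d, counts[d], len(platforms[d])) for d in counts]
    return sorted(rows, key=lambda r: (-r[1], r[0]))


sample = [
    ("chatgpt", "https://www.alibaba.com/a"),
    ("perplexity", "https://alibaba.com/b"),
    ("gemini", "https://made-in-china.com/x"),
]
print(top_domains(sample))
```

Subdomains other than `www.` are kept distinct here; whether to fold them into a registrable domain is a product decision this sketch leaves open.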

This list translates directly into three content placement actions:

More importantly, the Run comparison view lets you validate the causal effect of GEO interventions. The figure below shows X's three-Run comparison — the matrix shows mention rate climbing from baseline 0% → 30.4% after editing llms.txt → 56.5% after adding FAQ schema, and the radar chart shows the per-platform preference differences simultaneously:

CiteScope monitoring-center Compare tab screenshot: the top half is a 'Platform × Run mention-rate matrix' table, listing After FAQ schema markup at 56.5% (n=23) overall, ChatGPT 42.9% (n=7), Google AI 50.0% (n=8), Perplexity 75.0% (n=8); After llms.txt deploy at 30.4% (n=23) overall, etc. The bottom half is a 'per-platform mention rate radar chart' with the blue polygon (After FAQ schema markup) noticeably wider than the green polygon (After llms.txt deploy), with three vertices labeled ChatGPT / Google AI / Perplexity.
Run comparison view · matrix + radar chart for side-by-side intervention evaluation. Editing llms.txt boosted Perplexity the most (30.4% → 75%); FAQ schema impacted ChatGPT most deeply.

Without Run comparison you can't do this kind of causal attribution. CiteScope treats each monitoring pass as a versioned Run (with a name + note), and cross-Run comparison is the essential difference from Topify's "run automatically once a week and draw a line chart" — the latter only shows trends, the former lets you test hypotheses.
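
The matrix cells are simple ratios, which makes them easy to sanity-check. The sketch below recomputes the "After FAQ schema markup" row; the n values come from the screenshot, while the mentioned counts are back-derived from the displayed percentages and should be read as an assumption rather than exported data.

```python
def mention_rate(mentioned: int, total: int) -> float:
    """Share of probe answers that mention the brand, as a percentage."""
    return 100.0 * mentioned / total if total else 0.0


# (mentioned, n) per platform for the "After FAQ schema markup" Run.
# n comes from the screenshot; mentioned is back-derived from the percentages.
faq_run = {"ChatGPT": (3, 7), "Google AI": (4, 8), "Perplexity": (6, 8)}

per_platform = {p: round(mention_rate(m, n), 1) for p, (m, n) in faq_run.items()}
total_mentioned = sum(m for m, _ in faq_run.values())
total_answers = sum(n for _, n in faq_run.values())
overall = round(mention_rate(total_mentioned, total_answers), 1)

print(per_platform, overall)
```

The recomputed values match the screenshot: 42.9, 50.0, and 75.0 per platform, and 56.5 overall from 13 of 23 answers. With n of 7 or 8 per platform, a single answer moves a platform's rate by 12.5 to 14.3 points, so per-platform differences between Runs are best read as directional.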

How to run it yourself

A handful of commands:

git clone https://github.com/piglet12138/CiteScope.git
cd CiteScope
cp backend/.env.example backend/.env
# open backend/.env and fill in three keys:
#   OPENAI_OFFICIAL_API_KEY=sk-...
#   PERPLEXITY_API_KEY=...
#   GOOGLE_AI_API_KEY=...
docker compose up -d
# open http://localhost:3000 in your browser

The key-entry page looks like this — it lands you on the "AI Search / Monitoring Platforms" tab by default, with one card per platform, plus a sign-up link and a test-connection button:

CiteScope system settings page, 'AI Search / Monitoring Platforms' tab screenshot: a tutorial card at the top explains 'CiteScope runs GEO monitoring + citation analysis; you need API keys from the AI search platforms to make real calls,' and lists three required sign-up links for OpenAI Official / Perplexity Sonar / Google AI Gemini. Below are independent configuration cards for each platform — ChatGPT (OpenAI Responses + web search), Perplexity Sonar, Google AI (Gemini grounding) — each with fields, detailed help text, and a 'Test' button.
Settings page · centralized API key management. All edits write to runtime_config.json, override .env, and take effect immediately with no restart.
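
The override behavior described in the caption can be pictured as a two-level lookup. The following sketch assumes that `.env` values have already been loaded into the process environment and that `runtime_config.json` is a flat JSON object; the function name `get_setting` and both assumptions are illustrative and may differ from how CiteScope reads its settings.

```python
import json
import os
from pathlib import Path
from typing import Optional


def get_setting(name: str, runtime_path: Path, default: Optional[str] = None) -> Optional[str]:
    """Look up a setting: runtime_config.json first, then the environment."""
    try:
        # Re-read the file on every lookup so UI edits apply without a restart.
        runtime = json.loads(runtime_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        runtime = {}
    if name in runtime:
        return runtime[name]
    return os.environ.get(name, default)
```

Reading the file per lookup is the simplest way to get "no restart" semantics; a cached variant keyed on the file's modification time would do the same with less I/O.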

API key signup:

Budget estimate: 100 prompts × 3 platforms × once a week — a typical cadence — comes out to $5-15/month in API spend. Topify charges $99/month for the same data.
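
To check that range against your own provider pricing, it helps to turn it into a per-call figure. The sketch below only rearranges the numbers in the estimate; it does not use any published price list, and the variable and function names are made up for the example.

```python
PROMPTS = 100
PLATFORMS = 3
RUNS_PER_MONTH = 52 / 12  # weekly cadence, about 4.33 runs per month

calls_per_month = PROMPTS * PLATFORMS * RUNS_PER_MONTH


def implied_cost_per_call(monthly_budget_usd: float) -> float:
    """Average spend per API call implied by a monthly budget."""
    return monthly_budget_usd / calls_per_month


low, high = implied_cost_per_call(5), implied_cost_per_call(15)
print(round(calls_per_month), round(low, 4), round(high, 4))
```

That works out to roughly 1,300 calls a month, so the $5-15 range implies an average of about $0.004 to $0.012 per search-enabled call. If your providers charge more than that per call, scale the budget accordingly.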

After setup, hit localhost:3000 → Brands → New Customer → enter probe questions → Monitoring Center → New Experiment Run. Five minutes later you have Top Domains + Competitor Assets reports under the Citation Sources tab.

The full user manual is embedded in-app — no shoving users out to a GitHub wiki, just hit the /guides route. 8 standalone module guides, markdown-rendered:

CiteScope tutorials overview hub page screenshot: header text explains 'CiteScope is a GEO (Generative Engine Optimization) monitoring platform. Below are user guides organized by module,' with the suggestion that newcomers start with '5-step quickstart.' Below is a grid of 8 guide cards organized by category (Getting Started / Configuration / Workflow), titled '5-step quickstart,' 'API configuration deep dive,' 'Monitored targets + probe questions,' 'Experiment runs (Run),' etc., each with title + number + one-line summary.
In-app tutorial overview · 8 guides organized by Getting Started / Configuration / Workflow / Data Analysis / Operations.
CiteScope single guide reading page screenshot: 'Citation Sources user guide,' top-left 'Back to Tutorials Overview' button, below is markdown body showing sections like 'What problem does this feature solve,' 'Three blocks on this page,' and detailed field-explanation tables (Field / Meaning two columns: Total citation count / Total citation rows captured across all monitoring runs for this customer, etc.).
Single guide reading page · markdown-rendered, with field-explanation tables + how-to-read + FAQ.

Why open source instead of selling SaaS

This is the question I asked myself most over the past two weeks. Packaging this real customer case + the differentiating feature + the already-built product into a $49/mo SaaS is theoretically viable. I still chose open source. Three reasons.

1. A solo dev doesn't have a 12-month full-time runway

I'm running the customer's GEO work, Lucky GitHub agent, claude-ai-harness, the claude-zh CC fork, and several other projects in parallel. Commercializing CiteScope means 6-12 months of full-time GTM (landing page / SEO / content marketing / customer support / international payment compliance) going head-to-head with VC-backed Topify / Profound. Solo + China-based entity + selling international B2B SaaS — the odds are too thin.

2. Open Core is already a validated playbook

PostHog (analytics) / Plausible / Cal.com / Supabase / n8n are all OSS core + later-layered hosted SaaS, starting with 1-2 people, hitting $100K+ ARR within 2-3 years. The pattern is identical:

This path is the most capital-efficient — near-zero marginal cost in the first 6 months, with time investment concentrated on maintenance + issue triage + blog posts (5-10 hours/week), parallelizable with my other projects.

3. Customer cases can only be cited under open source

SaaS contracts forbid surfacing a customer's probe prompts / citations as a public case study (violates the customer contract). Open source can: because the customer self-hosts, all data lives in their own SQLite, and what I show is "data they ran with my OSS," not "data I pulled out for them." That kind of case study is more credible than any number of product screenshots.

Roadmap

The concrete 6-month path:

The inverse, when to skip open source and go straight to SaaS: 12 months of full-time runway / 5+ customers already lined up / an EU/US legal entity for clean payment processing. None of those hold for me, so the path is set.

Feedback I'm hoping to get

GitHub Issues / Discussions are open. English or Chinese, both work.

One last thing

GEO becomes table stakes by 2027. Every brand that cares about discoverability will measure it. The question is how — a $99/mo Topify dashboard, or your own $0/mo SQLite file plus the freedom to fork and extend.

I'm betting on the second world. If you're betting the same way, give the repo a star, run a Run, and tell me what's missing.

Original · First Publish
This is an original technical article, first published on sg.yaoyuheng2001.me
Canonical URL: https://sg.yaoyuheng2001.me/posts/open-source-citescope/en
First Published: May 24, 2026
Last Revised: May 24, 2026
Author: Yao Yuheng (姚钰珩) · NTU Data Science M.Sc.
Related Project: piglet12138/CiteScope
Length: ~5,800 words · reading time ≈ 12 minutes
This article is licensed under CC BY-NC 4.0. For reuse, please credit the source, preserve author attribution, and include the canonical link. For AI training / summarization citation, please preserve the canonical URL: https://sg.yaoyuheng2001.me/posts/open-source-citescope/en. For commercial use, please contact the author via the blog for authorization.

Yao Yuheng / 姚钰珩

NTU Data Science M.Sc. Focused on AI Agent systems engineering, eval-driven development, LLM applications + GEO/AEO monitoring. Building CiteScope and claude-ai-harness.