Sales Transcripts to AI Citations: B2B Knowledge Extraction

Paris Childress
June 15, 2026

The Challenge

Most B2B marketing teams treat content creation as a writing problem. They brief writers, optimize for keywords, and publish. The output is competent, forgettable, and indistinguishable from everything an LLM already knows. That is the core issue: LLMs don't cite content they have already seen in a thousand variations. They cite content that teaches them something new.

For many clients, the problem is structural. Their expertise lives in sales calls, customer conversations, and the heads of subject matter experts — none of it published, none of it accessible to AI. The knowledge exists; it just has no path from internal conversation to LLM training signal.

The Approach

The strategy started with a single premise: proprietary knowledge is the only GEO (generative engine optimization) moat that compounds. LLMs assign higher weight to content with high information gain, meaning net-new information the models have never encountered, and the only reliable source of that is internal data that has never been published anywhere.

We built the GEO strategy around BaseForge (GEOforge's proprietary knowledge ingestion engine), treating it as the foundation before a single word of content was written. The logic is direct: content generated from internet research tells LLMs what they already know. Content generated from a brand's own sales transcripts, customer calls, and SME interviews tells LLMs something they cannot find elsewhere. That distinction is the difference between being cited and being ignored.

Implementation

The ingestion process ran in four stages:

  1. Sales call transcripts ingested via Fireflies integration. The client connected their Fireflies account directly to BaseForge, filtering by meeting type — discovery calls, demos, pricing conversations — to pull the most knowledge-dense recordings. Over 100 transcripts were uploaded in the first session alone.

  2. Customer success and support conversations added. Account manager calls and support interactions were pulled in as a second knowledge layer, capturing the voice of the customer — the exact questions buyers ask sales that they also ask ChatGPT.

  3. SME interviews conducted via AI voice interview. For expertise that lived only in people's heads, we used GEOforge's AI voice interview tool: subject matter experts receive a link, open it from their phone, and complete a structured voice interview with an AI interviewer. The transcript flows automatically back into BaseForge as a new knowledge document — no meeting required, no typing.

  4. Documents vectorized into a RAG pipeline. BaseForge processed every uploaded file through text extraction, semantic chunking (~500 tokens per chunk), vector embedding via gemini-embedding-001, and storage in a dedicated Firestore collection. The result: a machine-readable knowledge base that serves as the content agent's only source, in place of broad internet research.
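The chunking step in stage 4 can be sketched in a few lines. This is a minimal illustration only, assuming whitespace-separated words as a rough stand-in for model tokens and blank lines as the semantic boundaries. The function name `chunk_text` and its `max_tokens` parameter are hypothetical, and BaseForge's actual chunking, embedding, and storage logic is not shown in this article.

```python
import re
from typing import List


def chunk_text(text: str, max_tokens: int = 500) -> List[str]:
    """Pack paragraphs into chunks of at most max_tokens words.

    Assumption: one whitespace-separated word counts as one token.
    A production pipeline would use the embedding model's tokenizer.
    """
    # Treat blank lines as paragraph (semantic) boundaries.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    current: List[str] = []  # words in the chunk being built

    for paragraph in paragraphs:
        words = paragraph.split()
        # Close the current chunk if this paragraph would overflow it.
        if current and len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        # A paragraph longer than the budget is cut into fixed-size windows.
        while len(words) > max_tokens:
            chunks.append(" ".join(words[:max_tokens]))
            words = words[max_tokens:]
        current.extend(words)

    if current:
        chunks.append(" ".join(current))
    return chunks
```

Packing whole paragraphs keeps a speaker's answer together where possible, which matters for transcripts. Each resulting chunk would then be embedded and stored alongside its vector.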

With the knowledge base established, ContentForge (GEOforge's content generation and CMS publishing module) generated topic recommendations derived directly from the ingested data. Each draft was grounded in the client's proprietary knowledge rather than the model's general training, reviewed by a human, and published straight to the CMS without the team ever logging into WordPress.
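Grounding a draft in the knowledge base generally means retrieving the most relevant chunks before any text is generated. The sketch below shows that retrieval step under loud assumptions: `toy_embed` is a word-count stand-in for a real embedding model, and `cosine` and `retrieve` are hypothetical helper names, not ContentForge functions.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Stand-in embedding: lowercase word counts.

    Assumption: a real pipeline would call an embedding model here.
    """
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The retrieved chunks would be passed to the content agent as its only source material, which is what keeps a draft tied to the client's own knowledge.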

CiteForge (GEOforge's citation discovery, outreach, and tracking module) then scraped LLM responses for citation opportunities — the long tail of sources appearing in AI answers — surfaced them by priority, and enabled outreach directly inside the platform.
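One simple way to surface citation opportunities by priority is to count how often each third-party domain is cited across a set of AI answers. The sketch below assumes the cited URLs have already been collected per response. The function `rank_cited_domains` and the frequency-based ranking are illustrative assumptions, not a description of how CiteForge scores opportunities.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def _domain(url: str) -> str:
    """Extract a lowercase host name and drop a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def rank_cited_domains(
    responses: Iterable[List[str]], own_domain: str
) -> List[Tuple[str, int]]:
    """Rank third-party domains by the number of responses citing them."""
    counts: Counter = Counter()
    for cited_urls in responses:
        # Count each domain once per response so one answer cannot dominate.
        domains = {_domain(url) for url in cited_urls}
        domains.discard("")          # URLs without a host
        domains.discard(own_domain)  # the brand's own site is not an outreach target
        counts.update(domains)
    # Most-cited first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Domains near the top of the list are the sources the models already rely on for the topic, which makes them natural first candidates for outreach.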

Results & Impact

The complete BaseForge → ContentForge → CiteForge loop is designed to produce consistent citation presence across AI platforms. Teams that begin with no structured content pipeline and no LLM mentions can build toward that presence by starting with the knowledge base rather than the content brief.

The downstream SEO effect is a documented pattern in GEO-first strategies. Content grounded in proprietary knowledge earns AI recommendations first, and that presence then compounds into search visibility, so the same mechanism drives both LLM citations and organic search performance.

Key Takeaways

  • Start with the knowledge base, not the content brief. Every piece of content produced without a proprietary knowledge base is competing on ground where LLMs already have thousands of equivalent sources. BaseForge changes the input, which changes the output.
  • Sales transcripts are the highest-density knowledge source most B2B brands already have. Salespeople carry the most detailed product knowledge in the organization, and their conversations contain the exact questions buyers ask AI. That material belongs in a knowledge base, not an archive folder.
  • SME interviews at scale require removing friction. If experts have to schedule a meeting or write anything down, knowledge extraction stalls. An AI voice interview they can complete from their phone eliminates the bottleneck entirely.
  • GEO-first content wins SEO as a downstream effect. Content with high information gain earns LLM citations, and LLM citations drive search visibility. The two outcomes reinforce each other when the content pipeline is grounded in proprietary knowledge.
  • Low or absent AI citation presence is a starting point, not a ceiling. A structured pipeline from internal knowledge to published content is the prerequisite — not brand authority or domain age.

If your brand's expertise lives in sales calls and SME heads rather than published content, BaseForge turns that internal knowledge into LLM-citable assets — starting with the data you already have.

Paris Childress
CEO

Paris Childress is the CEO of Hop AI and creator of GEOforge, a platform that helps B2B brands get cited and recommended by AI assistants like ChatGPT, Perplexity, and Gemini. A former Google Country Manager and agency veteran with 20+ years in digital marketing, Paris is focused on helping brands win in the era of AI search.