The knowledge base grows without manual audit
What it does
The AI agent regularly scans the stream of customer requests in the helpdesk and matches them against the current knowledge base content. The result is a list of topics that customers ask about but for which documentation is missing or outdated. The support team gets a prioritized article backlog and ready-made drafts instead of a vague sense that 'we need to update the KB'.
Automation turns the repetitive audit work into a background process that runs on a schedule. The team's focus shifts from finding gaps to reviewing and refining the content that customers actually need right now. This is especially noticeable in SaaS products, where the pace of releases outpaces the pace of documentation updates.
The key difference from a manual audit is regularity. The team no longer searches for gaps once a quarter, when someone happens to remember the KB. The AI agent runs on every scheduled cycle and raises a flag as soon as a new topic reaches a set volume of requests. This keeps the drift between customer reality and documentation content small.
What automation does, step by step
- Collects tickets, chats, and calls from the helpdesk for a specified period — a day, a week, or a quarter.
- Filters out operational requests (billing, access, bugs), leaving content-related topics for analysis.
- Normalizes the data: removes personal information and brings it to a unified format.
- Groups requests by topic using clustering — similar questions are placed in the same group.
- For each topic, extracts the key intent: what exactly the customer wanted to learn or solve.
- Searches for a matching article in the knowledge base CMS by title, content, tags, and metadata.
- Evaluates topic coverage: whether an article exists, whether it is sufficiently complete, and whether it is outdated.
- Generates a list of gaps, ordered by request frequency and business priority.
- Generates an article draft based on real customer conversations and support agent responses.
- Sends the draft to the responsible editor or knowledge manager, for example as a review ticket.
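The normalization step in the list above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the regular expressions, the placeholder tokens, and the field names (`subject`, `body`, `channel`) are hypothetical and do not describe any particular helpdesk schema. Regexes catch only pattern-like PII such as emails and phone numbers; personal names and postal addresses would need a separate entity-recognition step.

```python
import re

# Hypothetical patterns. They are deliberately simple and will produce
# false positives (for example, long order numbers look like phone numbers).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace emails and phone-like numbers with placeholders,
    then collapse all whitespace runs into single spaces."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return " ".join(text.split())


def normalize_ticket(raw: dict) -> dict:
    """Bring a raw helpdesk record to one unified shape.
    The input and output field names are assumptions."""
    return {
        "id": str(raw.get("id", "")),
        "channel": raw.get("channel", "ticket"),
        "text": scrub(f"{raw.get('subject', '')}\n{raw.get('body', '')}"),
    }
```

Running the scrubbed output through the same patterns a second time is a cheap way to audit that nothing pattern-like slipped through.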
What automation does not do
- Does not publish articles without human review — drafts always go through an editor and a product expert.
- Does not replace the product knowledge of support agents: the AI agent relies on answers already given in tickets rather than inventing its own.
- Does not resolve the problem of outdated articles automatically — it identifies candidates for updating, but the decision to rewrite remains with the team.
How it works
The automation is built as custom code around an LLM, integrated with the helpdesk and the knowledge base CMS. It runs on a schedule, daily or weekly depending on ticket volume. The architecture is divided into three layers: data collection, analysis, and artifact generation.
Technical flow
- Data extraction. The worker retrieves closed tickets from the helpdesk via API for the selected period. Fields: subject, description, correspondence, category, CSAT, close date.
- Cleaning and anonymization. The script removes PII (names, addresses, numbers), normalizes the text, and splits it into chunks for processing.
- Ticket clustering. Embeddings via a text-embedding model, grouping by cosine similarity. Output — topics with ticket counts and average CSAT.
- Knowledge base search. For each topic, a query to the CMS via API or RAG on top of the article export. Returns the top-3 candidates by relevance.
- Coverage assessment. The LLM analyzes the topic and retrieved articles, and outputs a verdict: covered, partially covered, gap, outdated.
- Prioritization. Ranking by formula: ticket frequency × negative CSAT × absence of coverage.
- Draft generation. The LLM creates an article structure based on real conversations and sample agent responses.
- Delivery. The draft lands in the CMS as a draft or in the ticket tracker as a task for the knowledge manager.
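The clustering and prioritization steps above can be sketched as follows. This is one possible reading, not the definitive implementation: the greedy single-pass grouping, the similarity threshold of 0.8, the 1-5 CSAT scale, and the numeric weights assigned to each coverage verdict are all assumptions. The source names only the formula's three factors and the four verdicts.

```python
import math
from dataclasses import dataclass


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(embeddings: list[list[float]], threshold: float = 0.8) -> list[list[int]]:
    """Greedy grouping: a ticket joins the first cluster whose seed vector
    is similar enough, otherwise it starts a new cluster.
    Returns lists of ticket indices."""
    clusters: list[list[int]] = []
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(vec, embeddings[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Assumed weights for "absence of coverage", one per verdict.
COVERAGE_GAP = {"covered": 0.0, "partially covered": 0.5, "outdated": 0.7, "gap": 1.0}


@dataclass
class Topic:
    name: str
    ticket_count: int
    avg_csat: float  # assumed 1-5 scale
    verdict: str     # one of the COVERAGE_GAP keys


def priority(t: Topic) -> float:
    """ticket frequency x negative CSAT x absence of coverage."""
    negative_csat = (5.0 - t.avg_csat) / 4.0  # 0 = fully satisfied, 1 = fully dissatisfied
    return t.ticket_count * negative_csat * COVERAGE_GAP[t.verdict]


def rank(topics: list[Topic]) -> list[Topic]:
    """Order topics from highest to lowest priority."""
    return sorted(topics, key=priority, reverse=True)
```

Because the score is a pure product, a topic with perfect CSAT or a `covered` verdict scores zero regardless of ticket volume. A team that wants high-volume topics to surface anyway could add a small floor to each factor.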
Step-by-step implementation
- Deploy the worker in the cloud (AWS Lambda, Cloud Run, or a self-hosted container).
- Configure API access to the helpdesk and CMS, and prepare a service account with the required permissions.
- Collect a historical ticket sample covering 3-6 months to calibrate clustering.
- Index the current knowledge base — export articles and build the vector index.
- Configure prompts for the LLM: coverage assessment, draft generation, formatting.
- Test on historical data and compare the results against a manual team audit.
- Run a pilot on one product or ticket category.
- Expand to the full base after validating draft quality.
Solution components
| Component | Purpose |
|---|---|
| Helpdesk API | Source of tickets and ticket metadata |
| CMS / content API | Source of KB articles and draft publication channel |
| LLM | Clustering, coverage assessment, text generation |
| Vector store | KB article index for fast search |
| Scheduler | Scheduled execution and queue management |
| Dashboard | Viewing gaps and draft status by the editor |
For the first version, a minimal stack is sufficient: scheduler, ticket collection script, LLM call for clustering and assessment, a simple dashboard or email distribution of drafts. Complex optimizations — editor feedback loop, continuous learning on accepted drafts — are added after the pilot.
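For a first version, the knowledge base search that returns the top-3 candidates does not necessarily need a dedicated vector store. The sketch below assumes article embeddings are already computed and small enough to hold in memory; the `articles` mapping and the `top_k` helper are hypothetical names used only for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], articles: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Brute-force search: score every article against the topic vector
    and return the k best (title, similarity) pairs, best first."""
    scored = sorted(
        ((cosine(query_vec, vec), title) for title, vec in articles.items()),
        reverse=True,
    )
    return [(title, score) for score, title in scored[:k]]
```

A linear scan like this costs one similarity computation per article on every query, which is usually acceptable for a knowledge base of a few thousand articles; a real vector index becomes worthwhile beyond that.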
Result quality depends on how cleanly ticket categories are labeled in the helpdesk. If categorization is chaotic, the first step is to agree on a taxonomy with the support team before launching the pilot.
Prerequisites
To launch the automation, the team needs access to the relevant systems and a minimum level of knowledge base readiness.
Data and access
- API access to the helpdesk with read permission for tickets from the last 3-6 months.
- API access or knowledge base export from the CMS: titles, content, tags, update dates.
- A service account with permission to create drafts in the CMS or create tasks in the ticket tracker.
- A deployment environment for the worker: cloud or internal infrastructure.
- LLM provider keys.
Team readiness
- A designated knowledge manager or editor who accepts drafts and takes them through to publication.
- An agreed-upon taxonomy of helpdesk ticket categories — without a clean tag structure, the output will be noisy.
- A review rule: who checks the generated draft before publication and within what timeframe.
- Access to product experts who can clarify technical details in articles.
Timelines
A pilot with one product or category takes 2-4 weeks. This includes API integration, knowledge base indexing, prompt configuration, and validation on historical data. Expansion to the full knowledge base takes an additional 1-2 weeks after the pilot. Timelines increase if ticket categorization in the helpdesk requires preliminary cleanup.
Pain points
- Review is a bottleneck
- Knowledge lives in people's heads, not in documents
FAQ
How long does implementation take?
A pilot with one product or category takes 2-4 weeks: one week for helpdesk API and CMS connection, one week for knowledge base indexing and clustering calibration, another 1-2 weeks for draft validation against historical data. Expanding to the full knowledge base adds 1-2 weeks. Timelines grow if ticket categorization in the helpdesk requires preliminary cleanup.
What if we don't have a structured knowledge base?
The automation also works with fragmented documentation such as Notion pages, Google Docs, PDFs, and Confluence. But the less structured the source, the noisier the output at the start. A practical path: collect an export of everything available, run a pilot, then use the identified gaps as a reason to organize the taxonomy. A more structured knowledge base can then take shape gradually as the content grows.
What are the risks and what breaks in practice?
The main risk is low draft quality when ticket categorization is noisy: the LLM generates an article from contradictory conversations, and the editor receives an unusable draft. The second risk is PII leakage if anonymization is configured poorly. The third is dependency on a paid LLM, with costs growing with ticket volume. All three are addressed during the pilot by agreeing on a taxonomy, auditing the anonymization, and setting an LLM budget.
Does automation work in our industry?
The automation applies to most companies with an established customer support function, especially in SaaS / Tech, where customers actively reach out about issues that can be documented. In industries with regulated content (healthcare, finance), a compliance review layer is added before publication. For products with rare, complex cases the effect is lower; there, gaps are closed through agent interviews rather than through ticket volume.
How do you know automation is producing results?
Baseline metrics: number of gaps closed per month, share of tickets whose topics are already covered in the KB, and time from gap detection to article publication. A leading indicator is the share of drafts accepted by the editor without significant edits. Within 2-3 months a shift becomes visible: the knowledge base covers more requests, and agents answer the same question manually less often.
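These metrics can be computed from a simple log of drafts. The sketch below is illustrative only: the `Draft` record, its field names, and the choice of the median as the summary statistic are assumptions, not part of the described solution.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class Draft:
    detected: date              # when the gap was flagged
    published: Optional[date]   # None while the draft is still in review
    accepted_as_is: bool        # editor accepted it without significant edits


def acceptance_rate(drafts: list[Draft]) -> float:
    """Share of drafts accepted without significant edits."""
    return sum(d.accepted_as_is for d in drafts) / len(drafts) if drafts else 0.0


def median_days_to_publish(drafts: list[Draft]) -> Optional[float]:
    """Median time from gap detection to publication, in days.
    Unpublished drafts are ignored; returns None if nothing is published."""
    days = [(d.published - d.detected).days for d in drafts if d.published]
    return median(days) if days else None


def kb_coverage(tickets_total: int, tickets_with_kb_topic: int) -> float:
    """Share of tickets whose topic already has a KB article."""
    return tickets_with_kb_topic / tickets_total if tickets_total else 0.0
```

Tracking the same three numbers month over month makes the shift described above measurable rather than anecdotal.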
Will automation replace the knowledge manager?
No. The AI agent takes the routine part off the knowledge manager (finding gaps, producing the initial draft) and leaves the expert part: alignment with the product team, stylistic editing, decisions on publication priorities. Without a human in the loop, knowledge base quality degrades: the LLM does not see product context and does not decide which articles matter for the brand and support strategy.
Want this in your business?
Book a free audit — we'll show how this automation will work for you.