#51 · Product & Engineering

AI triage of GitHub/Jira issues

AI triage of GitHub/Jira issues automates the classification and routing of incoming tickets in the Product & Engineering department, reducing time-to-label from 18 hours to 2 hours. An LLM-based agent reads each new issue, extracts key entities (component, type, priority, affected module), applies labels, semantically searches for duplicates among open tickets from the past 6-12 months, and assigns an owner according to the team's ownership rules. The automation takes repetitive work off the senior engineer: the 3 hours per week previously spent triaging incoming tickets shrink to 20 minutes of reviewing edge cases. It suits SaaS and product teams with an active issue flow, where manual triage means constant context switching and labeling errors. It does not replace engineering judgment on disputed cases: the agent applies initial labels and links duplicates, while final decisions remain with the tech lead. Implementation takes 2-4 weeks, assuming API access to GitHub or Jira and an approved label taxonomy are already in place.

Expected effect: 90% · Triage time
Complexity: Week (1-5 days)
Tool type: Custom code
ROI: Time saved
Industries: SaaS / Tech, Other / Horizontal
Integrations: Code repository, Issue tracking
Patterns: Extraction from Unstructured, Classification and Routing

What it does

An LLM-based agent connects to your GitHub repository and/or Jira instance and processes each new issue as soon as it is created. The system extracts meaning from the unstructured text of the title and description, classifies the ticket according to the team's internal taxonomy, and performs standard triage actions without engineer involvement.

What the automation does in practice:

  1. Picks up a new issue via webhook (GitHub Issues API or Jira Webhooks) immediately after creation.
  2. Extracts key entities from the description: type (bug/feature/question), affected component or module, mentioned versions, environment, stack trace.
  3. Determines priority according to team rules: severity, affected users, business impact.
  4. Sets labels and components in accordance with the approved taxonomy.
  5. Searches for duplicates using semantic search across open issues from the past 6-12 months.
  6. Assigns an owner: by component, by module ownership, by round-robin within the team.
  7. Leaves a comment with a concise structured summary for the engineer — what broke, where, how to reproduce, similar tickets.
  8. Escalates to the Slack team channel if the issue is marked as critical or resembles a production incident.
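
The owner assignment in step 6 can be sketched as follows. This is a minimal illustration, not the actual implementation: the `OWNERSHIP` map, the `FALLBACK_TEAM` rotation, and the `OwnerRouter` class are all hypothetical names introduced here, and the sketch assumes that each component maps to a list of engineers who are rotated round-robin.

```python
from itertools import cycle

# Hypothetical ownership map: component -> engineers who can own its issues.
OWNERSHIP = {
    "billing": ["alice", "bob"],
    "auth": ["carol"],
}
# Assumed default rotation for components missing from the map.
FALLBACK_TEAM = ["dave", "erin"]


class OwnerRouter:
    """Picks an owner by component, rotating round-robin within each team."""

    def __init__(self, ownership, fallback):
        # One independent, endless rotation per component.
        self._rotations = {name: cycle(members) for name, members in ownership.items()}
        self._fallback = cycle(fallback)

    def assign(self, component):
        # Unknown components share the fallback rotation.
        rotation = self._rotations.get(component, self._fallback)
        return next(rotation)


router = OwnerRouter(OWNERSHIP, FALLBACK_TEAM)
first_owner = router.assign("billing")   # first engineer in the billing rotation
second_owner = router.assign("billing")  # next engineer in the same rotation
```

The rotation state lives in memory here. A real service would likely persist it so that restarts do not reset the round-robin order.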

Results observed in deployment: the senior engineer's manual triage dropped from 3 hours a week to 20 minutes of reviewing edge cases. Time-to-label dropped from 18 hours to 2 hours. Duplicates are caught automatically; previously it could take 1-2 days before someone noticed a repeat.
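
For reference, the relative reductions behind these figures can be checked with a few lines of arithmetic. The helper name `reduction_percent` is introduced here for illustration only. Both results come out just under 89%, which is consistent with a headline figure rounded to 90%.

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


time_to_label = reduction_percent(18, 2)    # hours: 18 h -> 2 h
weekly_triage = reduction_percent(180, 20)  # minutes: 3 h -> 20 min

print(f"time-to-label reduction: {time_to_label:.1f}%")   # 88.9%
print(f"weekly triage reduction: {weekly_triage:.1f}%")   # 88.9%
```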

What automation does NOT do

Boundaries of the agent's responsibility:

  • Does not make engineering decisions. Triage sets the labels and assigns an owner, but does not decide whether to fix now or which sprint to schedule the work in. That stays with the tech lead.
  • Does not close issues. The agent flags and links even obvious duplicates, but a human performs the final closure as a safeguard against false matches.
  • Does not understand your internal architecture without context. The agent needs a populated label taxonomy, a module ownership map, and examples of correctly labeled tickets.

How it works

Architecturally, the automation is a thin service between the issue tracker and the LLM: a webhook catches the event, context is assembled (issue text, similar open tickets, module ownership, previously labeled examples), the LLM returns structured JSON with the classification, and the response is applied to the issue via the platform API. External API calls are wrapped in retry logic, and contested cases go through human-in-the-loop review.
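
The retry wrapper mentioned above could look roughly like this. It is a sketch under assumptions: the function name `call_with_retry`, the attempt count, and the backoff parameters are illustrative choices, not values from the described deployment.

```python
import random
import time


def call_with_retry(fn, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on an exception, wait with exponential backoff plus jitter and retry.

    `sleep` is injectable so the wrapper can be exercised without real waiting.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # 0.5 s, 1 s, 2 s, ... plus a little jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

In practice the wrapper would usually retry only transient failures (timeouts, rate limits, 5xx responses) rather than every exception. The broad `except` here keeps the sketch short.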

Technical sequence

  1. A webhook from GitHub or Jira arrives at the triage service (FastAPI or Node) when an issue is created or edited (GitHub: the issues event with action opened or edited; Jira: jira:issue_created and jira:issue_updated).
  2. The service assembles context: title, description, author, existing labels, list of potential duplicates (top-5 by embedding similarity from the vector index).
  3. The service builds a prompt for the AI model from the label taxonomy, the module ownership map, few-shot examples of previously correctly labeled issues, and the text of the new ticket.
  4. The model (Claude, called via the Anthropic SDK) returns JSON: { labels, priority, component, owner, duplicate_candidates, summary, confidence }.
  5. The service validates the JSON against a schema and checks the confidence level: if it is below the threshold (e.g., 0.75), the service marks the issue with the tag needs-human-triage and neither applies the classification labels nor assigns an owner.
  6. For confident cases, the service applies labels and the assignee via the GitHub or Jira API, and leaves a comment with a summary and links to similar tickets.
  7. Critical issues are escalated to the team's Slack channel via an incoming webhook.
  8. Every action is written to an audit log — tracking what the agent changed and why.
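
Steps 5 and 6 of the sequence above can be sketched as a single planning function. The field names follow the JSON shape listed in step 4 and the 0.75 threshold is the example value from step 5. Everything else is an assumption made for illustration: the function name `plan_actions`, the action tuples, and the choice to send malformed replies or unknown labels to human triage.

```python
import json

# Expected fields and types, following the JSON shape described in the text.
REQUIRED_FIELDS = {
    "labels": list,
    "priority": str,
    "component": str,
    "owner": str,
    "duplicate_candidates": list,
    "summary": str,
    "confidence": (int, float),
}
CONFIDENCE_THRESHOLD = 0.75  # example value from the text; calibrate per team
HUMAN_TRIAGE = [("add_label", "needs-human-triage")]


def plan_actions(raw_response, allowed_labels):
    """Turn the model's JSON reply into a list of (action, value) tuples."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return HUMAN_TRIAGE  # assumption: unparseable replies go to a human
    if not isinstance(data, dict):
        return HUMAN_TRIAGE
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            return HUMAN_TRIAGE
    if not all(isinstance(label, str) for label in data["labels"]):
        return HUMAN_TRIAGE
    # Labels outside the approved taxonomy are treated like low confidence.
    unknown = set(data["labels"]) - set(allowed_labels)
    if unknown or data["confidence"] < CONFIDENCE_THRESHOLD:
        return HUMAN_TRIAGE
    actions = [("add_label", label) for label in data["labels"]]
    actions.append(("assign", data["owner"]))
    actions.append(("comment", data["summary"]))
    return actions
```

Returning a plan instead of calling the tracker directly keeps the decision logic testable and makes it straightforward to write each planned action to the audit log before it is executed.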

Solution components

| Component | Role |
| --- | --- |
| Webhook receiver | Receiving events from GitHub Issues and Jira Webhooks |
| Context builder | Collecting description, similar tickets, ownership map |
| Vector index | Storing embeddings of open issues for duplicate search |
| LLM client (AI model) | Classification and entity extraction |
| Schema validator | Validating JSON response and confidence level |
| Action executor | Writing labels, assignee, comment via API |
| Slack notifier | Escalating critical tickets |
| Audit log | Full history of agent decisions |
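
To illustrate how the context builder might hand its output to the LLM client, here is a minimal prompt-assembly sketch. The function name `build_prompt`, the dictionary keys (`title`, `body`, `labels`, `number`), and the prompt wording are all assumptions made for this example. A real prompt would be tuned to the team's taxonomy.

```python
import json


def build_prompt(issue, taxonomy, ownership, examples, candidates):
    """Assemble a plain-text classification prompt from the collected context."""
    parts = [
        "You triage issues. Reply with JSON only, using the keys: "
        "labels, priority, component, owner, duplicate_candidates, summary, confidence.",
        "Allowed labels: " + ", ".join(sorted(taxonomy)),
        "Module ownership: " + json.dumps(ownership, sort_keys=True),
        "Labeled examples:",
    ]
    # Few-shot examples: previously labeled issues with their correct labels.
    for example in examples:
        parts.append(f"- {example['title']} -> {', '.join(example['labels'])}")
    parts.append("Possible duplicates (open issues):")
    for candidate in candidates:
        parts.append(f"- #{candidate['number']}: {candidate['title']}")
    parts.append("New issue:")
    parts.append(f"Title: {issue['title']}")
    parts.append(f"Description: {issue['body']}")
    return "\n".join(parts)
```

The resulting string would then be sent to the model through whichever client the service uses. That call is omitted here.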

Why custom code rather than a ready-made no-code tool

For issue triage, three things matter that ready-made builders cover poorly: precise control over the prompt and few-shot examples (your taxonomy is specific), a vector index for duplicate search with persistently stored embeddings, and a transparent audit log. All of this fits in ~500-800 lines of Python or TypeScript with an LLM client, a vector DB (pgvector, Qdrant), and two API integrations. Typical stack: FastAPI + PostgreSQL with pgvector + the AI model via the Anthropic SDK + Octokit or the Jira REST API.
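
The duplicate search itself comes down to ranking open issues by embedding similarity. The sketch below does this in plain Python over an in-memory dictionary, purely to show the idea. The function names and the `min_score` cutoff are assumptions, and in the stack described above a vector DB such as pgvector or Qdrant would run this query instead.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_duplicates(query_vec, index, k=5, min_score=0.8):
    """Return up to k (issue_number, score) pairs, best match first.

    `index` maps issue numbers to embeddings; `min_score` is an assumed cutoff.
    """
    scored = [(number, cosine(query_vec, vec)) for number, vec in index.items()]
    scored = [(number, score) for number, score in scored if score >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The embeddings are assumed to come from whichever embedding model the team chooses. Producing them is outside the scope of this sketch.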

Human-in-the-loop as the default

The agent does not operate on full trust. Two mechanisms ensure safety:

  1. Confidence threshold: when the confidence is below the threshold, the issue is marked needs-human-triage, and neither classification labels nor an owner are applied.
  2. Weekly review: once a week, the tech lead checks 20 random agent decisions against the correct labeling and updates the few-shot examples so that quality does not degrade.

Result: most issues are labeled without engineer involvement. Contested cases go to manual triage, but they arrive with an agent-prepared summary and duplicate candidates, which still speeds up the work.
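
The weekly review can be supported by a small helper that draws the random sample from the audit log and scores it once the tech lead has filled in the correct labels. This is a sketch with hypothetical names. The entry fields `agent_labels` and `human_labels` are assumptions about how an audit record might be shaped.

```python
import random


def sample_for_review(audit_entries, n=20, seed=None):
    """Pick n random agent decisions for the weekly review (all of them if fewer exist)."""
    rng = random.Random(seed)
    if len(audit_entries) <= n:
        return list(audit_entries)
    return rng.sample(audit_entries, n)


def agreement_rate(reviewed):
    """Share of reviewed decisions where the agent's labels match the reviewer's."""
    if not reviewed:
        return 0.0
    matches = sum(
        1 for entry in reviewed
        if set(entry["agent_labels"]) == set(entry["human_labels"])
    )
    return matches / len(reviewed)
```

Tracking the agreement rate week over week gives an early signal of drift, and the mismatched entries are natural candidates for new few-shot examples.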

Prerequisites

To start triage, the team needs ready data, access credentials, and minimal organizational preparation.

Data and access:

  • GitHub Personal Access Token (scope repo) or a Jira API token with write permissions for labels, assignee, and comments.
  • Approved label taxonomy — a list of types (bug/feature/task), priorities, and components. Usually 15-40 categories.
  • Module ownership map: which person or team is responsible for which component or module.
  • 100-200 correctly labeled issues from the past 6-12 months — for few-shot examples and building a vector index of duplicates.
  • Anthropic API key for the AI model.
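
Before launch it can help to check that the taxonomy and the ownership map agree with each other. The sketch below is illustrative only: the `TAXONOMY` and `OWNERSHIP` structures, their contents, and the function name `find_config_gaps` are hypothetical.

```python
# Hypothetical taxonomy and ownership map; real ones come from the team.
TAXONOMY = {
    "types": ["bug", "feature", "task"],
    "priorities": ["p0", "p1", "p2", "p3"],
    "components": ["billing", "auth", "search"],
}
OWNERSHIP = {"billing": "team-payments", "auth": "team-identity"}


def find_config_gaps(taxonomy, ownership):
    """Report components with no owner and owners mapped to unknown components."""
    components = set(taxonomy["components"])
    return {
        "components_without_owner": sorted(components - set(ownership)),
        "owners_for_unknown_components": sorted(set(ownership) - components),
    }


gaps = find_config_gaps(TAXONOMY, OWNERSHIP)
# Here "search" has no owner, so issues for it could not be routed.
```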

Infrastructure:

  • A server or managed platform for the triage service (VPS, Render, Railway, AWS Lambda — any option will work).
  • PostgreSQL with pgvector or Qdrant/Weaviate for the vector index.
  • A Slack workspace and incoming webhook if escalation of critical issues is needed.

Team readiness:

  • A tech lead or senior engineer as the automation owner — aligns the taxonomy and reviews quality for the first 2-3 weeks.
  • 30-40 minutes per week from the owner for a weekly review of the agent's decisions in the first month, then 15 minutes.
  • The team accepts the rules: duplicates are linked automatically, but a person closes them.

Implementation timeline:

For the "Week" complexity tier, the expected timeline is 2-4 weeks. Week 1: taxonomy alignment and few-shot dataset preparation. Week 2: service implementation and webhook connection. Week 3: shadow mode (the agent writes decisions to a log but does not apply them) and confidence threshold calibration. Week 4: transition to production with human review.
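
Threshold calibration in shadow mode can be approached as a simple search over the logged decisions. The sketch below assumes each logged decision has been reviewed and reduced to a `(confidence, was_correct)` pair. The function name, the record shape, and the 0.95 target precision are assumptions for illustration, not values from the described rollout.

```python
def calibrate_threshold(records, target_precision=0.95):
    """Pick the lowest confidence threshold whose accepted decisions meet the target.

    `records` is a list of (confidence, was_correct) pairs from shadow mode.
    Returns None if no threshold reaches the target precision.
    """
    candidates = sorted({confidence for confidence, _ in records})
    for threshold in candidates:
        # Decisions at or above the threshold would have been auto-applied.
        accepted = [ok for confidence, ok in records if confidence >= threshold]
        precision = sum(accepted) / len(accepted)
        if precision >= target_precision:
            return threshold
    return None
```

A lower threshold automates more issues but lets more mistakes through, so the target precision is the knob the tech lead would adjust.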

Pain points

  • Errors in manual operations
  • Repetitive routine tasks
  • Constant context switching

FAQ

How long does implementation take from start to production?

For a team with a prepared label taxonomy and ready API access, 2-4 weeks. Week 1 goes to aligning categories and collecting a few-shot dataset, week 2 to service implementation and webhooks, week 3 to shadow mode for calibration, and week 4 to launch with human review. If no taxonomy exists yet, add 1-2 weeks to formalize it with the tech lead.

What should we do if we don't have a clear label taxonomy?

This is a normal situation for teams that have grown organically. Before launching the automation, hold a short labeling workshop: 2-3 one-hour sessions with the tech lead and senior engineers to formalize the list of types, priorities, and components. Verify coverage against the last 200-300 issues. Without this step, the agent cannot deliver stable labeling results.

What are the risks and what breaks with incorrect configuration?

Three main risks: misclassification caused by a weak taxonomy (addressed by few-shot examples and the weekly review), false positives on duplicates (limited by the confidence threshold and by the rule that a human always closes issues), and Slack overload from overly aggressive escalation. All three are mitigated by running in shadow mode before launch: the agent writes decisions to a log but does not apply them until the tech lead confirms quality.

Is automation suitable for non-SaaS teams — agencies, internal IT, product teams in corporations?

Yes. Triage works wherever there is an issue tracker with an active ticket flow — GitHub Issues, Jira, Linear, GitLab. SaaS and product teams get the most value due to incoming volume, but agencies with multiple client projects and internal IT departments adopt a similar approach — the taxonomy simply becomes two-level (client + category).

Can it be used with Linear or GitLab instead of GitHub/Jira?

Yes, the architecture is tracker-agnostic. Linear and GitLab provide similar webhooks and REST/GraphQL APIs for writing labels, assignees, and comments. The webhook receiver and action executor need to be adapted, which takes 1-2 days of additional work. The confidence threshold semantics, prompt template, and duplicate vector index are reused without changes.
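
One way to keep the core tracker-agnostic is to have the action executor depend on a small interface rather than on a specific API client. The sketch below is an assumption about how that could be structured. `Tracker`, `InMemoryTracker`, and `apply_triage` are hypothetical names, and the in-memory class stands in for a real GitHub, Jira, Linear, or GitLab client.

```python
from typing import Protocol


class Tracker(Protocol):
    """Operations the triage core needs from any issue tracker."""

    def add_labels(self, issue_id: str, labels: list[str]) -> None: ...
    def assign(self, issue_id: str, owner: str) -> None: ...
    def comment(self, issue_id: str, body: str) -> None: ...


class InMemoryTracker:
    """Stand-in implementation that records calls instead of hitting an API."""

    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def add_labels(self, issue_id: str, labels: list[str]) -> None:
        self.calls.append(("add_labels", issue_id, list(labels)))

    def assign(self, issue_id: str, owner: str) -> None:
        self.calls.append(("assign", issue_id, owner))

    def comment(self, issue_id: str, body: str) -> None:
        self.calls.append(("comment", issue_id, body))


def apply_triage(tracker: Tracker, issue_id: str, decision: dict) -> None:
    """Apply a validated decision through whichever tracker adapter is supplied."""
    tracker.add_labels(issue_id, decision["labels"])
    tracker.assign(issue_id, decision["owner"])
    tracker.comment(issue_id, decision["summary"])
```

Supporting another tracker then means writing one more class with these three methods, while `apply_triage` and everything upstream of it stay unchanged.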

What about private data in issues — can information leak externally?

Data is passed to the AI model via the Anthropic API in accordance with their data usage policy for the commercial API. For strict compliance requirements, a redaction step is added: the service removes sensitive fields (emails, tokens, stack traces with PII) before sending to the LLM. A full audit log of the agent's actions is written locally in your infrastructure and is available for audit.
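
A redaction step of this kind can start as a pair of regular expressions applied before the text leaves your infrastructure. The patterns below are illustrative assumptions, not a complete list. They cover email addresses and two example token prefixes (`ghp_` and `sk-`), and a real deployment would extend them to the credential formats it actually handles.

```python
import re

# Simple email matcher; intentionally permissive for redaction purposes.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Illustrative secret patterns only; extend for the formats you actually use.
SECRET = re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|sk-[A-Za-z0-9_-]{20,})")


def redact(text):
    """Replace emails and token-like strings with placeholders before the LLM call."""
    text = EMAIL.sub("[EMAIL]", text)
    return SECRET.sub("[SECRET]", text)
```

Regex-based redaction is best-effort: it catches known shapes and misses free-form PII such as names in stack traces, so stricter setups may need a dedicated PII detection step.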


Related automations

#52 · Product & Engineering

AI code review for every PR

AI code review for every PR automates initial code review in the Product & Engineering department and achieves 110% growth in PR throughput (from 11.4 to 23.9 PRs per developer). The automation connects to the Git repository and triggers an AI agent on every pull request: it reviews code against the team's rubric, leaves inline comments, suggests improvements, and escalates complex cases to a human. As a result, seniors spend less time on mechanical checks, PR size decreases by 82% — developers shift to small incremental commits. Post-review changes decrease by 39%, bugs per developer — by 20%. Suited for SaaS teams and tech startups of 5–50 people, where code review has become a bottleneck slowing down the release cycle. Grow2.ai builds the automation around your codebase: a rubric aligned to your team's rules, connection to your existing Git provider, CI/CD integration, and a dashboard with review metrics.

110% · PR throughput
Weekend (1-2 days) · Vertical SaaS · Quality improved
#53 · Product & Engineering

Release notes from git commits and PRs

Release notes from git commits and PRs automates the preparation of release notes in the Product & Engineering department: notes are ready in minutes instead of 1–2 hours of manual work per release. An LLM-based agent collects commits and merged pull requests from the repository since the last release, groups changes by category (features, fixes, breaking changes, internal), filters out technical noise, and generates a human-readable draft for different audiences: the technical team, management, and customers. An engineer reviews the final text and publishes it. The solution suits SaaS companies with regular releases (weekly sprints or continuous delivery) and teams where a tech lead or product manager spends an hour or two after each deployment manually compiling the changelog, updating management, and writing progress reports.

Release notes are prepared in minutes instead of 1–2 hours per release.

Weekend (1-2 days) · Custom code · Time saved
#54 · Product & Engineering

User feedback synthesis into feature priorities

User feedback synthesis into feature priorities automates collection, classification, and summarization of user feedback from multiple channels in Product & Engineering and delivers quality prioritization: the Product Manager sees real pain points based on data, not anecdotal evidence from the last conversation. The AI agent pulls raw feedback from helpdesk tickets, communication channels, and interview records, classifies each mention by topic and user segment, summarizes recurring patterns into structured insights. Output is a ranked list of pain points with mention frequency, quote examples, and links to source references. The roadmap is built on data, not on who complains loudest in Slack. The solution fits SaaS / Tech teams and horizontal products with an active user feedback stream and unstructured sources. Automation eliminates two specific pain points: time on manual feedback reports and user knowledge stuck in the heads of individual support staff or PMs.

PM sees real pain points, not anecdotal evidence. Solution roadmap based on data.

Week (1-5 days) · Custom code · Quality improved
#55 · Product & Engineering

Automated bug fix (from message to prod)

Automated bug fix (from message to prod) automates the full defect resolution cycle in the Product & Engineering department, from a user message in chat or a helpdesk ticket to deployment of the fix in production, and achieves a median of 90 seconds from message to prod with 95% deployable code and 98% triage accuracy. The AI agent receives a signal from Slack, Intercom, Zendesk, or GitHub Issues, extracts a structured description of the problem, identifies the offending commit, reproduces the defect in a sandbox, generates a patch, runs tests, and creates a pull request with an explanation. For simple, localized errors the cycle runs autonomously; architectural ones are routed to an engineer with ready context and a draft solution. API cost is approximately $0.08 per fix. The automation reduces customer response time, removes minor bug fixes from the engineer's backlog, frees the team for product work, and reduces accumulated tech debt on minor defects.

90 s · Message to deployed fix
Month (2-4 weeks) · Agent framework · Time saved