Predicting how long a bug will take to resolve is more than a curiosity: it can sharpen planning, help teams meet SLAs, and warn engineering leaders about bottlenecks early.
In this post, we'll walk through how to build a production-grade microservice using FastAPI and XGBoost that predicts bug resolution time from Jira-like data.

🔧 Technologies Used:
- 🐍 Python 3.10+
- 🚀 FastAPI
- 📦 XGBoost
- 📄 Pandas + scikit-learn
- 🐳 Docker (optional for deployment)
🧠 Why Predict Resolution Time?
- ⏰ Set expectations across teams and stakeholders
- 📈 Improve sprint forecasting
- 🚨 Detect risks early in QA pipelines
- 💡 Feed into auto-prioritization models
According to Atlassian, inaccurate estimates are one of the top causes of missed sprint goals. ML can help you move from guesswork to insight.
📁 Step 1: Prepare Your Dataset
Start with a CSV export from your issue tracker (such as Jira).
Required Columns:
- summary
- description
- priority
- created
- resolved
Calculate Resolution Time:
import pandas as pd
df = pd.read_csv("jira_issues.csv")
df['created'] = pd.to_datetime(df['created'])
df['resolved'] = pd.to_datetime(df['resolved'])
df['resolution_time_hours'] = (df['resolved'] - df['created']).dt.total_seconds() / 3600
df = df[df['resolution_time_hours'] > 0]
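To see what the filter on the last line does, here is a minimal sketch on a made-up three-row export (the timestamps are hypothetical). It assumes that unresolved issues arrive with an empty `resolved` value: those become NaN after the subtraction and fail the `> 0` check, as do rows whose `resolved` timestamp precedes `created`.

```python
import pandas as pd

# Hypothetical toy export: one resolved issue, one still open,
# and one whose timestamps are in the wrong order.
toy = pd.DataFrame({
    "created": ["2024-01-01 09:00", "2024-01-02 10:00", "2024-01-05 12:00"],
    "resolved": ["2024-01-01 15:30", None, "2024-01-04 12:00"],
})
toy["created"] = pd.to_datetime(toy["created"])
toy["resolved"] = pd.to_datetime(toy["resolved"])
toy["resolution_time_hours"] = (
    (toy["resolved"] - toy["created"]).dt.total_seconds() / 3600
)

# NaN > 0 is False, so the open issue is dropped along with the negative one.
kept = toy[toy["resolution_time_hours"] > 0]
print(kept["resolution_time_hours"].tolist())
```

Only the first row survives, so open issues never reach training. That is usually what you want, but it also means the model only learns from bugs that were actually closed.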
🔍 Step 2: Feature Engineering
import numpy as np
from sklearn.preprocessing import LabelEncoder
# Text length as a rough proxy for issue complexity
df['text'] = df['summary'].fillna('') + ' ' + df['description'].fillna('')
df['text_len'] = df['text'].str.len()
# Encode categorical
le = LabelEncoder()
df['priority_encoded'] = le.fit_transform(df['priority'])
# Final features
features = ['text_len', 'priority_encoded']
target = 'resolution_time_hours'
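One thing to keep in mind: LabelEncoder assigns integer codes in sorted label order, so the codes do not follow severity. The sketch below mimics that ordering with plain Python and contrasts it with an explicit ordinal mapping. The label names and the `encode_priority` helper are hypothetical; adjust them to whatever your tracker exports.

```python
# Hypothetical priority labels, ranked from least to most severe.
PRIORITY_ORDER = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}


def encode_priority(priority: str, default: int = 1) -> int:
    """Map a priority label to its rank; unknown labels fall back to `default`."""
    return PRIORITY_ORDER.get(priority.strip().title(), default)


labels = ["Critical", "High", "Low", "Medium"]
# Sorted-order codes, which is how LabelEncoder numbers its classes.
alphabetical = {label: code for code, label in enumerate(sorted(labels))}

print(alphabetical)             # codes follow the alphabet, not severity
print(encode_priority("high"))  # rank from the explicit mapping
```

If you swap in a mapping like this, the API has to apply the same mapping at prediction time, otherwise training and serving will disagree on what each code means.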
⚙️ Step 3: Train the XGBoost Model
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
X = df[features]
y = df[target]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = XGBRegressor(n_estimators=100, max_depth=5, learning_rate=0.1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, y_pred))
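An MAE number is hard to judge in isolation. One common sanity check, sketched here as a suggestion rather than part of the original pipeline, is to compare it with a baseline that always predicts the median resolution time of the training set. The helper names and toy values are hypothetical, and the code uses only the standard library so it runs on its own.

```python
from statistics import median


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def median_baseline_mae(y_train, y_test):
    """MAE of a 'model' that always predicts the training median."""
    guess = median(y_train)
    return mae(y_test, [guess] * len(y_test))


# Made-up resolution times in hours.
y_train_toy = [2.0, 4.0, 8.0, 30.0, 6.0]
y_test_toy = [3.0, 10.0, 5.0]
baseline = median_baseline_mae(y_train_toy, y_test_toy)
print("Baseline MAE:", round(baseline, 2))
```

With the real data you could call `median_baseline_mae(y_train.tolist(), y_test.tolist())`. If the XGBoost MAE is not clearly lower than this baseline, the two features are probably not carrying much signal.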
📦 Step 4: Package It as a FastAPI Microservice
Install FastAPI:
pip install fastapi uvicorn joblib
Save the model and the label encoder:
import joblib
joblib.dump(model, "xgb_model.pkl")
joblib.dump(le, "label_encoder.pkl")
Create app.py:
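from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()

# Load the artifacts saved above once, at startup
model = joblib.load("xgb_model.pkl")
le = joblib.load("label_encoder.pkl")

class BugInput(BaseModel):
    summary: str
    description: str
    priority: str

@app.post("/predict-resolution-time")
def predict_resolution_time(bug: BugInput):
    # Rebuild the same two features used in training
    text = (bug.summary or "") + " " + (bug.description or "")
    text_len = len(text)
    # Reject priorities the encoder never saw instead of failing with a 500
    if bug.priority not in le.classes_:
        raise HTTPException(status_code=422, detail=f"Unknown priority: {bug.priority}")
    priority_encoded = le.transform([bug.priority])[0]
    features = np.array([[text_len, priority_encoded]])
    # Cast the NumPy float to a built-in float so it serializes to JSON
    pred_hours = float(model.predict(features)[0])
    return {
        "predicted_resolution_time_hours": round(pred_hours, 2)
    }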
🧪 Step 5: Test It
Run the API:
uvicorn app:app --reload
Test using curl or the Swagger UI that FastAPI serves at http://localhost:8000/docs:
curl -X POST "http://localhost:8000/predict-resolution-time" \
-H "Content-Type: application/json" \
-d '{"summary": "API failure when clicking Save", "description": "Internal server error with stack trace in logs", "priority": "High"}'
✅ You’ll get a JSON response like:
{
"predicted_resolution_time_hours": 14.26
}
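If you would rather call the endpoint from Python than from curl, a minimal client might look like the sketch below. It uses only the standard library; `build_request` and `predict_hours` are hypothetical helper names, and the base URL assumes the local uvicorn server from this step.

```python
import json
import urllib.request


def build_request(summary, description, priority,
                  base_url="http://localhost:8000"):
    """Build the same POST request as the curl command above."""
    payload = {
        "summary": summary,
        "description": description,
        "priority": priority,
    }
    return urllib.request.Request(
        f"{base_url}/predict-resolution-time",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_hours(summary, description, priority, timeout=5.0):
    """Send the request and return the predicted hours (needs a running server)."""
    req = build_request(summary, description, priority)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["predicted_resolution_time_hours"]
```

With the server running, `predict_hours("API failure when clicking Save", "Internal server error with stack trace in logs", "High")` should return the same number as the curl call.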
🧪 Optional: Dockerize It
Here’s a sample Dockerfile:
FROM python:3.11
WORKDIR /app
COPY . .
RUN pip install fastapi uvicorn scikit-learn xgboost joblib
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
🧠 Real-World Use
- GitHub Engineering experimented with ML to predict issue completion time.
- Microsoft Research showed how SVM and gradient boosting models could reduce issue estimation error by 36%.
- At Bugflows, we’ve trained domain-specific models that integrate priority, issue type, and historical bug complexity to forecast accurate timelines with <20% average deviation.
🔮 What’s Next?
- Integrate sprint-level metadata (e.g., workload, velocity)
- Use text embeddings (e.g., BERT) so the model sees what the summary and description say, not just how long they are
- Push predictions to Slack or Jira using webhooks
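As a rough sketch of the last item, the snippet below posts a prediction to a Slack incoming webhook, which accepts a JSON body with a `text` field. The helper names and the issue key are hypothetical, and the webhook URL is something you would create in your own Slack workspace.

```python
import json
import urllib.request


def format_prediction_message(issue_key, predicted_hours):
    """Build a short human-readable message for a single prediction."""
    return f"Predicted resolution time for {issue_key}: {predicted_hours:.1f} h"


def post_to_webhook(webhook_url, text, timeout=5.0):
    """POST the message as JSON to the given webhook URL; returns the HTTP status."""
    body = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


message = format_prediction_message("BUG-123", 14.26)
print(message)
```

Calling `post_to_webhook(your_webhook_url, message)` after each prediction would be enough for a first integration; writing the value back to Jira would go through its own API instead.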
📬 TL;DR
- ✅ Predict resolution time for bugs
- ✅ Train on your own historical issue data
- ✅ Deploy as a real-time microservice
🚀 Want to Skip the Setup?
Bugflows offers plug-and-play models like this out of the box: just connect your Jira or GitHub, and you're ready to go.