QA Service Monitoring

Returns aggregated metrics for the QA service including request counts, latency, accuracy, and hallucination rates.

Endpoint

GET /monitoring

Response

  • requests_total (integer, required): Total number of QA requests processed since service startup.
  • avg_latency_ms (number, required): Average response latency across all requests in milliseconds.
  • avg_retrieval_accuracy (number, required): Average retrieval accuracy score (0.0 to 1.0). Measures how well retrieved documents match the query.
  • hallucination_rate (number, required): Rate of responses flagged as potential hallucinations (0.0 to 1.0). Higher values indicate more responses without supporting citations.
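
The fields above can be mirrored in a small typed parser on the client side. The following is a minimal sketch, not code from the service: `MonitoringMetrics` and `parse_metrics` are hypothetical names, and the range checks only encode the 0.0 to 1.0 bounds stated in the field descriptions.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class MonitoringMetrics:
    """Client-side view of the /monitoring response (hypothetical helper type)."""
    requests_total: int
    avg_latency_ms: float
    avg_retrieval_accuracy: float
    hallucination_rate: float


def parse_metrics(payload: Mapping[str, Any]) -> MonitoringMetrics:
    """Validate a decoded /monitoring JSON body and return typed metrics."""
    required = ("requests_total", "avg_latency_ms",
                "avg_retrieval_accuracy", "hallucination_rate")
    missing = [name for name in required if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    metrics = MonitoringMetrics(
        requests_total=int(payload["requests_total"]),
        avg_latency_ms=float(payload["avg_latency_ms"]),
        avg_retrieval_accuracy=float(payload["avg_retrieval_accuracy"]),
        hallucination_rate=float(payload["hallucination_rate"]),
    )
    # Both scores are documented as falling in the range 0.0 to 1.0.
    for name in ("avg_retrieval_accuracy", "hallucination_rate"):
        value = getattr(metrics, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range: {value}")
    return metrics
```

A caller would typically pass in the result of `json.loads` applied to the response body, then read the attributes of the returned object.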

Example Request

cURL
curl http://localhost:8000/monitoring

Example Response

{
  "requests_total": 1250,
  "avg_latency_ms": 856.3421,
  "avg_retrieval_accuracy": 0.7823,
  "hallucination_rate": 0.0456
}

Metrics Interpretation

Latency

  • < 500ms: Excellent performance
  • 500-1000ms: Good performance
  • > 1000ms: Consider optimization (reduce chunk size, use faster embeddings)

Retrieval Accuracy

  • > 0.8: High quality matches
  • 0.6-0.8: Moderate quality, acceptable
  • < 0.6: Poor retrieval, review document chunking strategy

Hallucination Rate

  • < 0.05: Low risk, model stays grounded
  • 0.05-0.10: Moderate risk, monitor closely
  • > 0.10: High risk, review prompts and citation validation
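
The thresholds above translate directly into simple classifiers, for example to drive a dashboard or an alert. This is an illustrative sketch only: the function names and returned labels are hypothetical, and because the tables do not say which band the exact boundary values (500, 1000, 0.6, 0.8, 0.05, 0.10) belong to, the sketch assumes they fall into the middle band.

```python
def classify_latency(avg_latency_ms: float) -> str:
    """Map average latency in milliseconds to the bands listed above."""
    if avg_latency_ms < 500:
        return "excellent"
    if avg_latency_ms <= 1000:  # assumption: 500 and 1000 count as "good"
        return "good"
    return "consider optimization"


def classify_retrieval_accuracy(score: float) -> str:
    """Map average retrieval accuracy (0.0 to 1.0) to a quality band."""
    if score > 0.8:
        return "high"
    if score >= 0.6:  # assumption: 0.6 and 0.8 count as "moderate"
        return "moderate"
    return "poor"


def classify_hallucination_rate(rate: float) -> str:
    """Map hallucination rate (0.0 to 1.0) to a risk band."""
    if rate < 0.05:
        return "low risk"
    if rate <= 0.10:  # assumption: 0.05 and 0.10 count as "moderate risk"
        return "moderate risk"
    return "high risk"
```

Applied to the example response, these return "good" for a latency of 856.3421 ms, "moderate" for an accuracy of 0.7823, and "low risk" for a hallucination rate of 0.0456.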

Implementation

The monitoring metrics are updated after each QA request (see src/qa_api.py:70-78 and src/qa_api.py:211-219):
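from typing import Any, Dict

# Module-level running totals, updated after each QA request.
monitoring: Dict[str, Any] = {
    'requests_total': 0,              # number of QA requests handled
    'latency_ms_total': 0.0,          # summed latency in milliseconds
    'retrieval_accuracy_total': 0.0,  # summed retrieval accuracy scores
    'hallucination_count': 0,         # responses flagged as potential hallucinations
}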
Metrics reset on service restart.
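
The dictionary stores running totals, while the endpoint reports averages and a rate. One plausible way to get from one to the other is to divide each total by `requests_total`. The sketch below assumes exactly that; it is not taken from src/qa_api.py, and `record_request` and `snapshot` are hypothetical helper names.

```python
from typing import Any, Dict

# Same shape as the running totals shown above.
monitoring: Dict[str, Any] = {
    'requests_total': 0,
    'latency_ms_total': 0.0,
    'retrieval_accuracy_total': 0.0,
    'hallucination_count': 0,
}


def record_request(latency_ms: float, retrieval_accuracy: float,
                   hallucinated: bool) -> None:
    """Fold one finished QA request into the running totals (hypothetical)."""
    monitoring['requests_total'] += 1
    monitoring['latency_ms_total'] += latency_ms
    monitoring['retrieval_accuracy_total'] += retrieval_accuracy
    if hallucinated:
        monitoring['hallucination_count'] += 1


def snapshot() -> Dict[str, Any]:
    """Derive the response fields from the totals (assumed formula)."""
    total = monitoring['requests_total']
    if total == 0:
        # Avoid dividing by zero before the first request has been handled.
        return {
            'requests_total': 0,
            'avg_latency_ms': 0.0,
            'avg_retrieval_accuracy': 0.0,
            'hallucination_rate': 0.0,
        }
    return {
        'requests_total': total,
        'avg_latency_ms': monitoring['latency_ms_total'] / total,
        'avg_retrieval_accuracy': monitoring['retrieval_accuracy_total'] / total,
        'hallucination_rate': monitoring['hallucination_count'] / total,
    }
```

Because the totals live in process memory, this design also explains why the values start from zero again after a restart.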

Related Endpoints

  • QA Ask - Submit questions to the QA system
  • QA Stream - Stream answers in real time
  • QA Health - Check if the QA service is ready
