AI Categorization

Overview

FinAI uses Google Gemini 2.5 Flash to categorize your transactions automatically. The model receives YOUR personal category list with every request and suggests the best-fitting category along with a confidence score.

How It Works

The AI categorization engine is powered by Google’s Gemini 2.5 Flash model through the official Google GenAI SDK:

# From app/ai_service.py:15-22
class ExpenseAI:
    def __init__(self):
        api_key = os.environ.get('GEMINI_API_KEY')
        self.client = genai.Client(api_key=api_key)
        self.model_name = 'gemini-2.5-flash'

The model name is set to gemini-2.5-flash (not 2.0) in app/ai_service.py:22. Flash is the low-latency tier of the Gemini family, which suits interactive categorization.

Prediction Flow

User Enters Transaction Description

When creating a transaction, you type a description such as “Mua xăng ở Petrolimex” (Buy gas at Petrolimex).

Frontend Calls Prediction API

The UI sends a request to the prediction endpoint:

POST /api/predict-category
Content-Type: application/json

{
  "description": "Mua xăng ở Petrolimex"
}

System Fetches User's Categories

The backend retrieves YOUR personal category list (app/routes/ai.py:36-37):

user_cats = Category.query.filter_by(user_id=user_id).all()
cat_names = [cat.name for cat in user_cats]
# Example: ['Ăn uống', 'Xăng xe', 'Đóng họ', 'Giải trí']

AI Analyzes and Predicts

The AI engine receives the description and your category list, then picks the best match from that list.

Response with Confidence Score

The API returns the predicted category with a confidence score from 0 to 100. In this example, the category_type value "chi" marks an expense category:

{
  "status": "success",
  "category_id": "cat789",
  "category_name": "Xăng xe",
  "category_type": "chi",
  "confidence": 95
}
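The model only ever sees category names, while the response above carries an ID and a type, so the backend has to map the predicted name back to a category record. How app/routes/ai.py does this is not shown on this page; the following is a minimal sketch, assuming a simplified `Category` record and a hypothetical helper named `find_category_by_name`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Category:
    # Simplified stand-in for the real Category model (assumption).
    id: str
    name: str
    type: str  # e.g. "chi" for an expense category


def find_category_by_name(
    categories: Sequence[Category], name: str
) -> Optional[Category]:
    """Return the first category whose name matches exactly, else None."""
    for cat in categories:
        if cat.name == name:
            return cat
    return None


cats = [
    Category("cat123", "Ăn uống", "chi"),
    Category("cat789", "Xăng xe", "chi"),
]
match = find_category_by_name(cats, "Xăng xe")
```

An exact string match keeps the lookup predictable; whether the real route also normalizes case or whitespace is not stated here.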

The Prediction Prompt

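The AI receives a prompt, written in Vietnamese, that enforces strict categorization rules. It tells the model to act as an automatic financial classification system, to assign the transaction to exactly ONE of the user's existing categories, to return only valid JSON with no explanation, to use “Khác” (Other) when no category fits, and to report its confidence from 0 to 100: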

# From app/ai_service.py:33-47
prompt = f"""
Bạn là một hệ thống phân loại tài chính tự động. 
Nhiệm vụ: Phân loại giao dịch sau vào đúng MỘT trong các danh mục người dùng đã tạo.

Danh mục hiện có: {', '.join(user_categories)}
Giao dịch cần phân loại: "{text}"

Yêu cầu BẮT BUỘC: 
- Chỉ trả về định dạng JSON hợp lệ. Không giải thích gì thêm.
- Nếu không có danh mục nào phù hợp, category hãy để là "Khác".
- confidence là độ tự tin của bạn (từ 0 đến 100).

Định dạng trả về:
{{"category": "Tên danh mục", "confidence": 95}}
"""
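Because the prompt is an f-string, the braces of the example JSON are doubled (`{{` and `}}`) so that Python emits them literally instead of treating them as placeholders. The sketch below uses a shortened version of the prompt and a hypothetical function name, `build_prompt`, purely to show how the text renders.

```python
def build_prompt(text: str, user_categories: list[str]) -> str:
    # Shortened stand-in for the real prompt; only the interpolation matters.
    return f"""
Danh mục hiện có: {', '.join(user_categories)}
Giao dịch cần phân loại: "{text}"

Định dạng trả về:
{{"category": "Tên danh mục", "confidence": 95}}
"""


prompt = build_prompt("Mua xăng ở Petrolimex", ["Ăn uống", "Xăng xe"])
print(prompt)
```

The rendered text contains the category list joined by commas and a single pair of braces around the example JSON.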

Enforcing JSON Response

To ensure consistent parsing, the system uses structured JSON output mode:

# From app/ai_service.py:50-56
response = self.client.models.generate_content(
    model=self.model_name,
    contents=prompt,
    config=types.GenerateContentConfig(
        response_mime_type="application/json"  # Forces JSON output
    )
)

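This keeps the model from wrapping its answer in conversational text, so the response can be parsed as JSON. It does not by itself guarantee that the expected keys are present or that the category is valid, which is why the result is still checked after parsing.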

Safety Checks

The system validates AI predictions to prevent errors:

# From app/ai_service.py:59-63
result = json.loads(response.text.strip())

# Ensure AI doesn't hallucinate non-existent categories
if result.get('category') not in user_categories:
    result['category'] = "Khác"

If the AI suggests a category that doesn’t exist in your personal list, the system automatically defaults to “Khác” (Other).
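The excerpt shows only the category check. A slightly more defensive variant might also handle malformed JSON and out-of-range confidence values. The sketch below is an illustration under those assumptions, not the code in app/ai_service.py; `parse_prediction` and `FALLBACK_CATEGORY` are hypothetical names.

```python
import json
from typing import Optional

FALLBACK_CATEGORY = "Khác"  # "Other", as used in the prompt


def parse_prediction(raw: str, user_categories: list[str]) -> Optional[dict]:
    """Parse the model's JSON text; return None if it is unusable."""
    try:
        result = json.loads(raw.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(result, dict):
        return None

    # Replace any category the user does not own with the fallback.
    category = result.get("category")
    if category not in user_categories:
        category = FALLBACK_CATEGORY

    # Clamp confidence into the 0-100 range the prompt asks for.
    try:
        confidence = float(result.get("confidence", 0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = max(0.0, min(100.0, confidence))

    return {"category": category, "confidence": confidence}
```

Note that the fallback “Khác” is returned even when the user has no category with that name, so a caller still needs to handle a name it cannot resolve.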

Personalized Learning

Unlike generic expense trackers, FinAI’s AI is personalized to YOUR categories:

Example: Two users, different results

User A’s categories: Ăn uống, Xăng xe, Học phí
Transaction: “Mua sách lập trình” (Buy a programming book)
AI prediction: “Học phí” (95% confidence)

User B’s categories: Ăn uống, Xăng xe, Sách vở, Học phí
Transaction: “Mua sách lập trình”
AI prediction: “Sách vở” (98% confidence)

(Ăn uống: food and drink; Xăng xe: fuel; Học phí: tuition; Sách vở: books.)

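Because your category list is part of every prompt, the same description can map to different categories for different users. The model itself is not retrained per user; the personalization comes entirely from the categories it is given.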

Integration with Transactions

When you create a transaction, the AI prediction is stored alongside your final choice:

# From app/routes/transaction.py:51-63
new_trans = Transaction(
    id=str(uuid.uuid4())[:8],
    user_id=user_id,
    category_id=data.get('category_id'),      # Your final choice
    ai_category_id=data.get('ai_category_id'), # What AI suggested
    ai_confidence=data.get('ai_confidence'),   # AI confidence score
    # ... other fields
)

Storing both values shows whether you accepted or overrode the suggestion, which creates an audit trail for measuring AI accuracy and supports future improvements.

AI Accuracy Tracking

The system includes an AILog model to track prediction accuracy:

# From app/models.py:134-143
class AILog(db.Model):
    __tablename__ = 'ai_lichsu'
    id = db.Column('MaAI_Log', db.String(8), primary_key=True)
    user_id = db.Column('MaNguoiDung', db.String(8))
    transaction_id = db.Column('MaGiaoDich', db.String(8))
    predicted_cat = db.Column('DanhMucDuDoan', db.String(8))
    actual_cat = db.Column('DanhMucChinhXac', db.String(8))
    confidence = db.Column('DoTinCay', db.Float)
    feedback = db.Column('PhanHoi', db.String(50)) # 'dung' (correct), 'sai' (wrong)

This data can be used to measure AI performance and identify areas for improvement.
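As one way to use this data, accuracy can be computed as the share of logged predictions whose predicted category equals the category the user finally chose. The sketch below works on plain tuples rather than the real AILog model; `LogRow` and `accuracy` are hypothetical names, and the sample rows are made up for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class LogRow(NamedTuple):
    # Plain stand-in for an AILog row (assumption): only the fields used here.
    predicted_cat: str
    actual_cat: str
    confidence: float


def accuracy(rows: Iterable[LogRow], min_confidence: float = 0.0) -> Optional[float]:
    """Share of predictions that matched the user's final category.

    Only rows with confidence >= min_confidence are counted.
    Returns None when no row qualifies.
    """
    selected = [r for r in rows if r.confidence >= min_confidence]
    if not selected:
        return None
    correct = sum(1 for r in selected if r.predicted_cat == r.actual_cat)
    return correct / len(selected)


rows = [
    LogRow("cat789", "cat789", 95),
    LogRow("cat123", "cat456", 60),
    LogRow("cat123", "cat123", 92),
    LogRow("cat456", "cat456", 80),
]
print(accuracy(rows))                     # 0.75
print(accuracy(rows, min_confidence=90))  # 1.0
```

Comparing accuracy at different confidence thresholds is a simple check on whether the reported confidence actually tracks correctness.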

Error Handling

If the AI service fails, the system gracefully degrades:

# From app/ai_service.py:67-69
except Exception as e:
    print(f"Gemini Predict Error: {e}")
    return None

When prediction fails:

The API returns {"status": "no_match"} (app/routes/ai.py:61)
Users can manually select a category
The transaction still saves successfully
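The fallback behavior above can be sketched as a plain function, independent of the web framework. This is an assumption about the shape of the logic, not the code in app/routes/ai.py: `handle_predict` is a hypothetical name, `predict` stands in for the AI call (returning None on failure), and categories are passed as a simple name-to-record mapping.

```python
from typing import Callable, Optional


def handle_predict(
    description: str,
    categories: dict[str, dict],
    predict: Callable[[str, list[str]], Optional[dict]],
) -> dict:
    """Map a prediction result to one of the three response shapes."""
    if not description or not description.strip():
        return {"status": "error", "message": "No description"}

    result = predict(description, list(categories))
    # None means the AI call failed; an unknown name cannot be linked to a record.
    if result is None or result.get("category") not in categories:
        return {"status": "no_match"}

    cat = categories[result["category"]]
    return {
        "status": "success",
        "category_id": cat["id"],
        "category_name": result["category"],
        "category_type": cat["type"],
        "confidence": result.get("confidence"),
    }
```

Treating a failed call and an unresolvable category the same way keeps the frontend simple: anything other than "success" means the user picks a category manually.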

API Reference

Predict Category

POST /api/predict-category
Content-Type: application/json
Authentication: Required (session-based)

{
  "description": "Mua cafe tại Highlands Coffee"
}

Success Response:

{
  "status": "success",
  "category_id": "cat123",
  "category_name": "Ăn uống",
  "category_type": "chi",
  "confidence": 92
}

No Match Response:

{
  "status": "no_match"
}

Error Response:

{
  "status": "error",
  "message": "No description"
}
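For callers outside the bundled UI, the request can be built with the Python standard library. The sketch below is illustrative only: the base URL, the cookie value, and the helper names `build_predict_request` and `suggested_category_id` are assumptions, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: local development server


def build_predict_request(
    description: str, session_cookie: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request for /api/predict-category."""
    body = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/predict-category",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Cookie": session_cookie,  # assumption: session cookie from login
        },
    )


def suggested_category_id(response_body: dict):
    """Return the suggested category ID, or None for no_match and error."""
    if response_body.get("status") == "success":
        return response_body.get("category_id")
    return None


req = build_predict_request(
    "Mua cafe tại Highlands Coffee", "session=<your-session-cookie>"
)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Checking the status field first means the no-match and error responses both fall through to manual category selection.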

Configuration

The AI service requires a Google Gemini API key in your environment:

# .env file
GEMINI_API_KEY=your_api_key_here

The system loads this automatically using python-dotenv (app/ai_service.py:13).
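Since `os.environ.get` returns None when the variable is unset, a missing key may only surface later, when the client is created or first used. One option is to fail fast at startup with a clear message. The helper below, `require_gemini_api_key`, is a hypothetical sketch and not part of app/ai_service.py.

```python
import os


def require_gemini_api_key() -> str:
    """Return GEMINI_API_KEY or raise a clear error when it is missing."""
    api_key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to your .env file or environment."
        )
    return api_key
```

If the .env file is loaded with python-dotenv, this check would run after that loading step.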

Performance Characteristics

Model: Google Gemini 2.5 Flash
Response Time: ~200-500ms average
Output Format: Structured JSON
Context Length: User categories + description

Best Practices

Create Descriptive Categories

The more specific your categories, the better the AI can distinguish between them.
Good: Cafe, Nhà hàng, Siêu thị (cafe, restaurant, supermarket)
Less effective: Ăn uống (food and drink; too broad)

Provide Clear Descriptions

Include specific details in transaction descriptions:
Good: “Mua xăng tại Petrolimex Q1” (Buy gas at Petrolimex, District 1)
Less effective: “Đổ xăng” (Fill up on gas)

Review AI Suggestions

Always review the AI’s suggestion before confirming. The confidence score helps you gauge reliability:

90-100%: Very likely correct
70-89%: Probably correct, worth reviewing
Below 70%: Less certain, should verify
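The bands above translate directly into a small helper, for example to label suggestions in a UI. The function name `confidence_band` is hypothetical; the thresholds are the ones listed above.

```python
def confidence_band(confidence: float) -> str:
    """Translate a 0-100 confidence score into review guidance."""
    if confidence >= 90:
        return "very likely correct"
    if confidence >= 70:
        return "probably correct, worth reviewing"
    return "less certain, should verify"
```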

Related Features

Transactions - Where AI predictions are applied
Chatbot - Natural language financial assistant
Reports - Analyze categorized spending patterns
