
Introduction

The Gemini sample apps showcase end-to-end implementations of generative AI applications on Google Cloud. Each application demonstrates real-world architectures, integration patterns, and best practices for building AI-powered solutions.

GenWealth

Financial advisory platform with AlloyDB AI, semantic search, and RAG chatbot

FixMyCar

Automotive assistant using RAG with Vertex AI Search on GKE

Finance Advisor

Multi-modal search with Spanner’s full-text, vector, and graph capabilities

Live Telephony

Real-time voice AI with Gemini Live API and Twilio integration

Common Architecture Patterns

Database-Integrated AI

Multiple sample apps demonstrate how to leverage database-native AI capabilities:
  • AlloyDB AI (GenWealth): Semantic search and embeddings directly in PostgreSQL
  • Spanner ML (Finance Advisor): Full-text search, vector similarity, and graph traversal
  • Vertex AI Search (FixMyCar): Managed search with OCR and document processing

RAG Implementation Strategies

All sample apps implement Retrieval-Augmented Generation with different approaches:
-- Hybrid search combining embeddings with filters
SELECT first_name, last_name, email, age, risk_profile, bio,
  bio_embedding <=> google_ml.embedding('text-embedding-005', 
    'young aggressive investor')::vector AS distance
FROM user_profiles
WHERE risk_profile = 'high'
  AND age BETWEEN 18 AND 50
ORDER BY distance
LIMIT 50;
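The query above combines a structured filter with a vector-distance ranking. The same filter-then-rank pattern can be sketched outside the database in a few lines of self-contained Python; the profiles, 2-D embeddings, and query vector below are toy illustrative values, not the AlloyDB API:

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 0.0 for identical direction, up to 2.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

# Toy profiles with made-up 2-D "bio embeddings" (real ones have hundreds of dims).
profiles = [
    {"name": "Ada",  "age": 28, "risk_profile": "high", "bio_embedding": [0.9, 0.1]},
    {"name": "Ben",  "age": 62, "risk_profile": "high", "bio_embedding": [0.8, 0.3]},
    {"name": "Cara", "age": 35, "risk_profile": "low",  "bio_embedding": [0.2, 0.9]},
]

# Stand-in for embedding('text-embedding-005', 'young aggressive investor').
query_embedding = [1.0, 0.0]

# Equivalent of: WHERE risk_profile = 'high' AND age BETWEEN 18 AND 50
#                ORDER BY bio_embedding <=> query_embedding
matches = sorted(
    (p for p in profiles if p["risk_profile"] == "high" and 18 <= p["age"] <= 50),
    key=lambda p: cosine_distance(p["bio_embedding"], query_embedding),
)
print([p["name"] for p in matches])  # → ['Ada']
```

At scale the database replaces this full scan with a vector index, which is what the `<=>` operator with AlloyDB's vector indexing provides.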

Deployment Architectures

Sample apps demonstrate various deployment patterns:
Application       Runtime        Key Components
GenWealth         Cloud Run      AlloyDB, Cloud Functions, Eventarc
FixMyCar          GKE Autopilot  Vertex AI Search, Java Spring
Finance Advisor   Cloud Run      Spanner, Streamlit
Live Telephony    Cloud Run      FastAPI, Twilio, Gemini Live API

Tech Stack Overview

AI & ML Services

  • Gemini Models: 2.0 Flash for text/multimodal generation
  • Vertex AI Embeddings: text-embedding-005, textembedding-gecko@003
  • Vertex AI Search: Document AI OCR processor integration
  • Gemini Live API: Real-time streaming audio/voice

Data & Storage

  • AlloyDB for PostgreSQL: Vector embeddings, LLM integration
  • Cloud Spanner: Full-text search, vector indexes, graph queries
  • Cloud Storage: Document ingestion pipelines
  • Vertex AI Vector Search: Managed ANN search

Application Frameworks

  • Frontend: Angular, Streamlit, React
  • Backend: TypeScript/Node, Java Spring Boot, Python FastAPI
  • Orchestration: Cloud Functions, Eventarc, Pub/Sub

Document Processing Pipelines

GenWealth and FixMyCar showcase automated document ingestion:
1. Document Upload: Upload PDFs to a Cloud Storage bucket
2. OCR Processing: Document AI extracts text with layout preservation
3. Chunking & Embedding: LangChain splits text and generates embeddings
4. Vector Storage: Store in AlloyDB or a Vertex AI Search datastore
5. Analysis & Enrichment: Generate summaries and structured metadata with Gemini
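The chunking step above can be sketched as a fixed-size splitter with overlap, a simplified stand-in for a LangChain text splitter; the chunk_size and overlap values are illustrative, not what the sample apps use:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks with overlap, so content that
    spans a chunk boundary appears in both neighboring chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("x" * 1200, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # → 3 [500, 500, 300]
```

Each chunk would then be passed to the embedding model and stored alongside its source-document metadata.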

GenWealth Pipeline Architecture

Conversational AI Patterns

Stateful Chat Management

-- Store and retrieve chat context
SELECT llm_prompt, llm_response
FROM llm(
  prompt => 'I have $25250 to invest. What do you suggest?',
  llm_role => 'You are a financial chatbot named Penny',
  mission => 'Assist clients with financial education and advice',
  enable_history => true
);
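The enable_history flag above keeps prior turns available to the model. A minimal application-side sketch of the same idea is an in-memory rolling window of turns prepended to each prompt; ChatSession is a hypothetical helper, not the GenWealth llm() implementation:

```python
class ChatSession:
    """Keep a rolling window of chat turns to prepend to each prompt."""

    def __init__(self, system_role, max_turns=10):
        self.system_role = system_role
        self.max_turns = max_turns
        self.history = []  # list of (speaker, text) tuples

    def build_prompt(self, user_message):
        # Prompt = role instruction + prior turns + the new user message.
        lines = [self.system_role]
        lines += [f"{speaker}: {text}" for speaker, text in self.history]
        lines.append(f"user: {user_message}")
        return "\n".join(lines)

    def record(self, user_message, model_response):
        self.history.append(("user", user_message))
        self.history.append(("model", model_response))
        # Trim the oldest turns beyond the window (2 entries per turn).
        self.history = self.history[-2 * self.max_turns:]

session = ChatSession("You are a financial chatbot named Penny", max_turns=2)
session.record("I have $25250 to invest.", "Let's review your goals first.")
prompt = session.build_prompt("What do you suggest?")
print(prompt)
```

GenWealth stores this state in the database instead, so history survives across stateless Cloud Run requests.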

Getting Started

Each sample app is designed as an isolated demo environment. They are not production-hardened and should be customized for security, reliability, and scale before production deployment.

Prerequisites

  • Google Cloud project with billing enabled
  • Vertex AI API enabled
  • Cloud Shell or local gcloud CLI
  • Docker for container builds (where applicable)

Common Setup Pattern

1. Clone Repository:
   git clone https://github.com/GoogleCloudPlatform/generative-ai.git
   cd generative-ai/gemini/sample-apps/
2. Choose Sample App: Navigate to the specific app directory (genwealth, fixmycar, etc.)
3. Configure Environment: Update environment variables in env.sh or .env file
4. Run Installation Script: Execute the automated deployment script (typically takes 20-35 minutes)
5. Explore Features: Access the deployed UI and experiment with AI capabilities

Performance & Scale Considerations

Latency Optimization

  • GenWealth: AlloyDB over Private Service Connect for sub-10 ms queries
  • FixMyCar: GKE Autopilot autoscaling with Vertex AI Search caching
  • Live Telephony: Cloud Run min-instances=1, session affinity, CPU throttling disabled

Cost Management

  • Use AlloyDB zonal instances for dev/test (production should be regional)
  • GKE Autopilot scales to zero when idle
  • Cloud Run scales based on concurrency settings
  • Vertex AI Search pricing based on queries and data volume

Architecture Decision Records

Why AlloyDB over Cloud SQL for GenWealth?

  • Native vector similarity search with pgvector
  • Direct Vertex AI LLM integration via google_ml extension
  • Superior performance for OLTP + analytics hybrid workloads
  • Built-in embeddings generation without external API calls

Why GKE for FixMyCar?

  • Java Spring Boot application with custom resource requirements
  • Persistent connections to Vertex AI Search
  • Service mesh capabilities for observability
  • Workload Identity for fine-grained IAM

Why Spanner for Finance Advisor?

  • Multi-modal search: full-text, semantic (vector), and graph in single query
  • Unlimited scale for global financial services workloads
  • Strong consistency with 99.999% availability
  • Native graph support for fund-of-funds relationships

Next Steps

Deploy GenWealth

Build a trustworthy financial advisor with AlloyDB AI

Explore FixMyCar

Implement RAG for automotive troubleshooting

Try Finance Advisor

Experience Spanner’s multi-modal search capabilities

Voice AI with Telephony

Build real-time conversational AI over the phone
