Profile Analysis allows you to interrogate individual CVs using a RAG (Retrieval-Augmented Generation) chain. The system reads a PDF, creates vector embeddings, and answers questions about the candidate’s profile.
Use LangChain’s PyPDFLoader to extract text from the CV:
    from langchain_community.document_loaders import PyPDFLoader

    loader = PyPDFLoader("path/to/cv.pdf")
    docs = loader.load()
2. Create Vector Store
Convert the document into embeddings and store them in FAISS:

    from langchain_community.vectorstores import FAISS

    # `embeddings` must be an embedding model instance created earlier in your setup
    vectorstore = FAISS.from_documents(docs, embeddings)
    retriever = vectorstore.as_retriever()
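To make the idea of a vector store concrete, here is a minimal pure-Python sketch of "embed, then retrieve by similarity". It is a toy stand-in, not how FAISS or a real embedding model works internally: the `embed` function uses simple word counts, and the CV chunks are hypothetical examples.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical CV chunks, for illustration only.
chunks = [
    "Education: Software Engineering student, ninth semester",
    "Skills: Python, PowerBI, Java, Spring Boot",
    "Achievements: first place in a university hackathon",
]
best = top_k("python and java skills", chunks, k=1)
print(best)
```

A real retriever follows the same shape (vectorize the query, rank stored vectors, return the top matches) but uses learned embeddings that capture meaning rather than exact word overlap.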
3. Define Prompt Template
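Create a specialized prompt for career mentoring:

    template = """Eres un Mentor de Carrera Tecnológica y experto en empleabilidad joven.
    Tu misión es analizar el perfil de este estudiante basándote SOLO en el siguiente contexto (su CV):
    {context}
    Pregunta: {question}"""

In English, the prompt reads: "You are a Technology Career Mentor and an expert in youth employability. Your mission is to analyze this student's profile based ONLY on the following context (their CV)," followed by the retrieved context and the question ("Pregunta").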
The full script below combines these steps, selecting a random CV from the source folder:

    import os
    import random

    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.vectorstores import FAISS
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough

    # `embeddings` and `llm` are not created in this script; define them beforehand.

    # 1. SELECTION
    carpeta_fuente = "cvs_estudiantes_final"

    # Verify that PDF files exist
    archivos_disponibles = [f for f in os.listdir(carpeta_fuente) if f.endswith(".pdf")]
    if not archivos_disponibles:
        raise Exception("No CVs found. Make sure to run CV generation first.")

    # Choose one randomly
    archivo_elegido = random.choice(archivos_disponibles)
    ruta_archivo = os.path.join(carpeta_fuente, archivo_elegido)
    print(f"📂 Selected student profile: '{archivo_elegido}'")
    print("⏳ Reading PDF and creating vectors...")

    # 2. LOAD AND VECTORIZATION
    loader = PyPDFLoader(ruta_archivo)
    docs = loader.load()
    vectorstore = FAISS.from_documents(docs, embeddings)
    retriever = vectorstore.as_retriever()

    # 3. PROMPT
    template = """Eres un Mentor de Carrera Tecnológica y experto en empleabilidad joven.
    Tu misión es analizar el perfil de este estudiante basándote SOLO en el siguiente contexto (su CV):
    {context}
    Pregunta: {question}"""
    prompt = ChatPromptTemplate.from_template(template)

    # 4. EXECUTION
    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    # 5. QUESTION (focused on potential and skills, not years of experience)
    # English: "What standout projects or academic experience does this student
    # have, and what is their main tech stack?"
    pregunta = "¿Qué proyectos destacados o experiencia académica tiene este estudiante y cuál es su stack tecnológico principal?"
    respuesta = chain.invoke(pregunta)

    print(f"\nMENTOR QUESTION: {pregunta}")
    print("-" * 50)
    print(f"🤖 PROFILE ANALYSIS:\n{respuesta}")
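The piped chain can be hard to read at first, so the following plain-Python sketch mirrors its data flow without LangChain: retrieve context, fill the prompt, call the model, return text. The names `run_rag`, `fake_retrieve`, and `fake_llm` are hypothetical stand-ins introduced only for illustration. In the real chain the retriever returns `Document` objects rather than plain strings.

```python
from typing import Callable


def run_rag(question: str,
            retrieve: Callable[[str], list[str]],
            template: str,
            llm: Callable[[str], str]) -> str:
    """Mirror the chain's data flow: retrieve -> fill prompt -> call model -> text."""
    context = "\n\n".join(retrieve(question))  # retriever output fills {context}
    prompt = template.format(context=context, question=question)  # question passes through unchanged
    return llm(prompt).strip()  # final plain-text answer


# Stand-ins so the sketch runs without a model or vector store (hypothetical).
def fake_retrieve(question: str) -> list[str]:
    return ["Skills: Python, PowerBI", "Hackathon: first place"]


def fake_llm(prompt: str) -> str:
    return f"(model reply to a prompt mentioning PowerBI: {'PowerBI' in prompt})\n"


template = "Context:\n{context}\n\nQuestion: {question}"
answer = run_rag("What is the main stack?", fake_retrieve, template, fake_llm)
print(answer)
```

Each `|` in the LangChain version corresponds to one of these steps, with the output of one stage becoming the input of the next.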
The system uses a Career Mentor persona to focus on potential rather than experience:
Mentor Prompt

    template = """Eres un Mentor de Carrera Tecnológica y experto en empleabilidad joven.
    Tu misión es analizar el perfil de este estudiante basándote SOLO en el siguiente contexto (su CV):
    {context}
    Pregunta: {question}"""

This prompt:
- Sets expertise context (tech career mentoring)
- Focuses on young talent employability
- Constrains responses to CV content only

Technical Prompt

    template = """You are a Technical Recruiter specializing in junior developers.
    Analyze this student's CV and focus on:
    - Academic projects and practical experience
    - Technical skills and tools
    - Learning potential and growth mindset

    CV Content:
    {context}

    Question: {question}"""

Skills Assessment

    template = """As a Skills Assessment Expert, evaluate this student's profile.
    Rate their proficiency in:
    - Programming languages
    - Frameworks and tools
    - Project complexity
    - Team collaboration

    Student CV:
    {context}

    Question: {question}"""
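One way to switch between these personas is to keep the templates in a dictionary and pick one by name. The sketch below assumes that approach. The `PERSONAS` mapping and `build_prompt` helper are hypothetical, and the template texts are abbreviated stand-ins for the full prompts.

```python
# Abbreviated stand-ins for the three templates; only the placeholders matter here.
PERSONAS = {
    "mentor": "You are a Technology Career Mentor.\n\n{context}\n\nQuestion: {question}",
    "technical": "You are a Technical Recruiter.\n\nCV Content:\n{context}\n\nQuestion: {question}",
    "skills": "As a Skills Assessment Expert...\n\nStudent CV:\n{context}\n\nQuestion: {question}",
}


def build_prompt(persona: str, context: str, question: str) -> str:
    """Fill the chosen persona template with the CV context and the question."""
    if persona not in PERSONAS:
        raise ValueError(f"Unknown persona '{persona}'. Choose from: {sorted(PERSONAS)}")
    return PERSONAS[persona].format(context=context, question=question)


prompt_text = build_prompt("technical", "Skills: Java, Spring Boot", "Main stack?")
print(prompt_text)
```

In the LangChain pipeline, the selected template string would be passed to `ChatPromptTemplate.from_template(...)` instead of being formatted by hand.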
Here are effective questions to ask about student profiles:
Projects & Experience
    questions = [
        # Spanish: "What standout projects does this student have?"
        "¿Qué proyectos destacados tiene este estudiante?",
        "What academic projects has this candidate completed?",
        "Describe the most impressive achievement in this CV",
        "What practical experience does this student have?",
    ]
Technical Skills
    questions = [
        # Spanish: "What is their main tech stack?"
        "¿Cuál es su stack tecnológico principal?",
        "What programming languages does this candidate know?",
        "List all frameworks and tools mentioned",
        "What is their strongest technical area?",
    ]
Potential Assessment
    questions = [
        # Spanish: "Why should we hire this student as an intern?"
        "¿Por qué deberíamos contratar a este estudiante como practicante?",
        "What makes this candidate stand out?",
        "Assess this student's learning potential",
        "What role would be the best fit for this profile?",
    ]
Cultural Fit
    questions = [
        # Spanish: "What kind of team would best suit this profile?"
        "¿Qué tipo de equipo se ajustaría mejor a este perfil?",
        "Does this student show leadership potential?",
        "What are their collaboration skills based on the CV?",
        "Describe this candidate's work style",
    ]
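To run a whole list of questions against one CV, a small loop around `chain.invoke` is enough. The sketch below assumes only that the chain exposes an `invoke(question)` method, as in the script earlier. `ask_all` and `EchoChain` are hypothetical names, and `EchoChain` exists only so the example runs without a model.

```python
def ask_all(chain, questions):
    """Invoke the chain once per question and collect the answers by question."""
    answers = {}
    for question in questions:
        try:
            answers[question] = chain.invoke(question)
        except Exception as exc:  # keep going if a single call fails
            answers[question] = f"ERROR: {exc}"
    return answers


class EchoChain:
    """Hypothetical stand-in exposing the same .invoke() method as the real chain."""

    def invoke(self, question):
        if not question.strip():
            raise ValueError("empty question")
        return f"answer to: {question}"


results = ask_all(EchoChain(), ["What is their strongest technical area?", " "])
for question, answer in results.items():
    print(f"{question!r} -> {answer}")
```

Catching errors per question means one failed call does not discard the answers already collected.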
Customize how the system retrieves relevant information:
    # Basic retriever
    retriever = vectorstore.as_retriever()

    # With custom parameters
    retriever = vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 3},  # Return top 3 most relevant chunks
    )

    # MMR (Maximum Marginal Relevance) for diversity
    retriever = vectorstore.as_retriever(
        search_type="mmr",
        search_kwargs={
            "k": 4,
            "fetch_k": 10,       # Fetch 10, return 4 most diverse
            "lambda_mult": 0.5,  # Balance relevance vs diversity
        },
    )

    # Similarity with score threshold
    retriever = vectorstore.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={
            "score_threshold": 0.8,  # Only high-confidence matches
            "k": 5,
        },
    )
Best Practice: For student CVs, use similarity search with k=3. These CVs are typically short (1-2 pages), so a few highly relevant chunks work better than a larger, more diverse set.
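The `lambda_mult` trade-off in the MMR configuration is easier to see on a toy example. The sketch below follows the standard greedy MMR formulation (score = `lambda_mult` × relevance − (1 − `lambda_mult`) × redundancy). It is an illustration under that assumption, not LangChain's actual implementation, and the 2-D unit vectors are made up.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def mmr(query_vec, doc_vecs, k=4, fetch_k=10, lambda_mult=0.5):
    """Greedy MMR: return indices of up to k documents, in selection order."""
    # Candidate pool: the fetch_k documents most similar to the query.
    by_relevance = sorted(range(len(doc_vecs)),
                          key=lambda i: dot(query_vec, doc_vecs[i]), reverse=True)
    candidates = by_relevance[:fetch_k]
    selected = []

    def score(i):
        relevance = dot(query_vec, doc_vecs[i])
        # Redundancy: similarity to the closest already-selected document.
        redundancy = max((dot(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = (1.0, 0.0)
docs = [(0.96, 0.28), (0.936, 0.352), (0.8, -0.6)]  # toy unit vectors

balanced = mmr(query, docs, k=2, lambda_mult=0.5)
relevance_only = mmr(query, docs, k=2, lambda_mult=1.0)
print(balanced, relevance_only)
```

Here `balanced` skips the near-duplicate second vector in favor of the dissimilar third one, while `relevance_only` keeps the two closest matches. For a short CV with little redundancy between chunks, plain similarity is usually sufficient.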
Example output:

Based on the CV, here's the profile analysis:

### Outstanding Projects and Academic Experience

Fernanda Paredes is a 9th-semester Software Engineering student (UTP) seeking her first professional opportunity as a Data Analyst Trainee.

**Experience and projects:**

1. **Academic Project as Data Analyst Trainee (Jun 2025 - Feb 2026):** Focus on data analysis
2. **Outstanding Achievement (Hackathon):** First place in university Hackathon for developing a recycling app. Demonstrates ability to work under pressure, innovation, and practical application of software development knowledge.

### Main Technology Stack

| Area | Technologies |
| :--- | :--- |
| **Data Analysis / BI** | Python, PowerBI |
| **Software Development** | Java, Spring Boot |

**Mentor's Conclusion:**

Fernanda has a solid foundation in development tools (Java, Spring Boot) and has shown proactivity in the data area (Python, PowerBI), which aligns with her goal of becoming a Data Analyst Trainee. Winning a Hackathon is a strong indicator of high potential and execution capability.
Before running the analysis, validate that the source folder exists and contains PDFs:

    import os

    carpeta_fuente = "cvs_estudiantes_final"

    # Check if directory exists
    if not os.path.exists(carpeta_fuente):
        raise FileNotFoundError(f"Directory '{carpeta_fuente}' not found. Run CV generation first.")

    # Check if files exist
    archivos_disponibles = [f for f in os.listdir(carpeta_fuente) if f.endswith('.pdf')]
    if not archivos_disponibles:
        raise Exception("No PDF files found. Generate CVs before running analysis.")

    print(f"Found {len(archivos_disponibles)} CVs")
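The same checks can be wrapped in a reusable helper that also performs the random selection. This is a minimal sketch using `pathlib`. The `pick_cv` name is hypothetical, and matching the `.pdf` extension case-insensitively is an added assumption not present in the original check.

```python
import random
import tempfile
from pathlib import Path


def pick_cv(folder: str, seed=None) -> Path:
    """Validate the folder and return the path of one randomly chosen PDF."""
    directory = Path(folder)
    if not directory.is_dir():
        raise FileNotFoundError(f"Directory '{folder}' not found. Run CV generation first.")
    # Case-insensitive match, so 'cv.PDF' is accepted as well.
    pdfs = sorted(p for p in directory.iterdir() if p.suffix.lower() == ".pdf")
    if not pdfs:
        raise ValueError("No PDF files found. Generate CVs before running analysis.")
    return random.Random(seed).choice(pdfs)


# Demo in a temporary folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cv_001.PDF").touch()
    (Path(tmp) / "notes.txt").touch()
    chosen = pick_cv(tmp).name

print(chosen)
```

Passing a `seed` makes the selection reproducible, which can help when comparing prompts against the same CV.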