Overview
The PerfilEstudiante (Student Profile) model is a Pydantic BaseModel that defines the schema for extracting and validating structured data from student resumes. It is designed specifically for student and early-career talent, emphasizing academic achievements, projects, and technical potential over traditional work experience.
This model is optimized for student recruitment scenarios where academic projects, hackathon achievements, and technical skills are more valuable indicators than years of experience.
Model Definition
The model is defined in the notebook at source/notebook/Talent_Scout_3000x.ipynb:902-919.
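The notebook cell itself is not reproduced here, so the following is only a sketch of what such a model could look like, assembled from the field descriptions on this page. Only `ciclo_actual`, `tipo_perfil`, and `potencial_contratacion` are field names that appear in this document; every other field name below is a hypothetical placeholder and may differ from the notebook.

```python
from typing import List

from pydantic import BaseModel, Field


class PerfilEstudiante(BaseModel):
    """Sketch of the student profile schema (not the notebook's exact code)."""

    # Personal information (field names are hypothetical)
    nombre: str = Field(description="Full name of the student as it appears on the CV")
    email: str = Field(description="University or personal email address")
    ubicacion: str = Field(description="City and country of the candidate")

    # Academic profile (universidad and carrera are hypothetical names)
    universidad: str = Field(description="University or technical institute")
    carrera: str = Field(description="Academic major or degree program")
    ciclo_actual: str = Field(description="Current cycle, e.g. '7mo Ciclo' or 'Egresado'")

    # Technical talent (field names are hypothetical)
    stack_principal: List[str] = Field(description="Top 5 languages, frameworks, or technologies")
    proyectos_destacados: List[str] = Field(description="Notable projects, thesis work, or hackathon results")

    # Profile evaluation
    tipo_perfil: str = Field(description="Backend, Frontend, Data, Fullstack, or Gestión")
    potencial_contratacion: str = Field(description="1-2 sentence hiring justification")
```

The `description` strings matter in this kind of setup because an output parser can pass them to the LLM as part of the schema, so they double as extraction instructions.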
Field Reference
Personal Information
Full name of the student as it appears on the CV.
Example: "Fernanda Paredes"
University or personal email address. Typically includes a university domain for students.
Example: "[email protected]"
City and country of the candidate.
Example: "Lima, Perú"
Academic Profile
Name of the university or technical institute where the student is enrolled.
Valid values: "UTP", "UPC", "UNI", "San Marcos", "U. Lima", "Senati", "Cibertec"
Example: "UTP"
Academic major or degree program the student is pursuing.
Example: "Ingeniería de Software", "Ciencias de la Computación"
Current semester or cycle in the degree program. Can also indicate “Egresado” (graduated) status.
Format: "7mo Ciclo", "VI Ciclo", "Egresado"
Example: "9no Ciclo"
Technical Talent
List of the top 5 programming languages, frameworks, or technologies the student is proficient in. Extracted from the projects and experience sections.
Names of notable academic projects, thesis work, freelance projects, or hackathon achievements. Focus on what was built, not company names.
Profile Evaluation
Classification of the student’s technical specialization based on their skill stack.
Valid values:
- "Backend": Java + Spring, Python APIs, database focus
- "Frontend": React, Vue, Angular, UI/UX tools
- "Data": Python + Pandas, PowerBI, SQL analytics
- "Fullstack": Both frontend and backend technologies
- "Gestión": Administrative, business, or non-technical roles
Typical mappings:
- Python + Pandas/PowerBI → "Data"
- React + Node.js → "Fullstack"
- Java + Spring Boot → "Backend"
Example: "Data"
A brief justification (1-2 sentences) explaining why this student is a strong candidate for an internship or junior role. Should highlight potential over experience.
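In the workflow described on this page the classification is produced by the LLM, not by code. Still, the stack-to-category mappings given for the profile classification can be approximated with a small rule-based function, which may be useful as a sanity check on the model's output. The function name `clasificar_perfil` and the keyword sets are assumptions made for this sketch.

```python
from typing import Iterable

# Keyword sets are illustrative assumptions, loosely based on the listed categories.
FRONTEND = {"react", "vue", "angular"}
BACKEND = {"java", "spring", "spring boot", "node.js"}
DATA = {"pandas", "powerbi"}


def clasificar_perfil(stack: Iterable[str]) -> str:
    """Hypothetical helper: map a skill list to one of the five categories."""
    skills = {skill.strip().lower() for skill in stack}
    has_frontend = bool(skills & FRONTEND)
    has_backend = bool(skills & BACKEND)
    has_data = bool(skills & DATA)

    if has_frontend and has_backend:
        return "Fullstack"
    if has_data:
        return "Data"
    if has_backend:
        return "Backend"
    if has_frontend:
        return "Frontend"
    # Nothing technical recognized: fall back to the non-technical category.
    return "Gestión"


print(clasificar_perfil(["React", "Node.js"]))
```

A disagreement between this heuristic and the extracted `tipo_perfil` does not mean the LLM is wrong; it only flags a profile that may deserve a manual look.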
Validation Rules
- String fields cannot be empty
- List fields must contain at least one element
- tipo_perfil should match one of the five predefined categories
- ciclo_actual should follow Spanish ordinal format (e.g., “7mo Ciclo”)
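This page does not show how these rules are enforced, so the following is one possible way to express them, assuming Pydantic v2. The class name `PerfilValidado`, the field name `stack_principal`, and the exact regular expression are assumptions for illustration; the model is trimmed to three fields to keep the sketch short.

```python
import re
from typing import List, Literal

from pydantic import BaseModel, Field, ValidationError, field_validator

# Assumed pattern: Spanish ordinal ("7mo Ciclo"), Roman numeral ("VI Ciclo"), or "Egresado".
CICLO_PATTERN = re.compile(r"(?:(?:\d{1,2}(?:ro|do|to|mo|vo|no)|[IVX]+) Ciclo|Egresado)")


class PerfilValidado(BaseModel):
    # String fields cannot be empty.
    ciclo_actual: str = Field(min_length=1)
    # List fields must contain at least one element (field name is hypothetical).
    stack_principal: List[str] = Field(min_length=1)
    # tipo_perfil must be one of the five predefined categories.
    tipo_perfil: Literal["Backend", "Frontend", "Data", "Fullstack", "Gestión"]

    @field_validator("ciclo_actual")
    @classmethod
    def check_ciclo_format(cls, value: str) -> str:
        if not CICLO_PATTERN.fullmatch(value):
            raise ValueError(f"unexpected cycle format: {value!r}")
        return value


try:
    PerfilValidado(ciclo_actual="seventh", stack_principal=[], tipo_perfil="DevOps")
except ValidationError as error:
    # Each violated rule is reported as a separate entry.
    print(len(error.errors()), "validation errors")
```

Since the rules for `tipo_perfil` and `ciclo_actual` are phrased as "should", a stricter model like this may reject CVs that a human would accept; logging the failures instead of discarding them is a reasonable alternative.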
Usage Example
Basic Instantiation
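A minimal sketch of creating a profile by hand, assuming Pydantic v2. The model is redeclared compactly so the snippet runs on its own. Field names other than `ciclo_actual`, `tipo_perfil`, and `potencial_contratacion` are hypothetical, and the email, stack, project, and justification values are illustrative placeholders, not data from the notebook.

```python
from typing import List

from pydantic import BaseModel


class PerfilEstudiante(BaseModel):
    # Compact redeclaration; several field names are hypothetical placeholders.
    nombre: str
    email: str
    ubicacion: str
    universidad: str
    carrera: str
    ciclo_actual: str
    stack_principal: List[str]
    proyectos_destacados: List[str]
    tipo_perfil: str
    potencial_contratacion: str


perfil = PerfilEstudiante(
    nombre="Fernanda Paredes",
    email="fernanda@example.com",  # placeholder address
    ubicacion="Lima, Perú",
    universidad="UTP",
    carrera="Ingeniería de Software",
    ciclo_actual="9no Ciclo",
    stack_principal=["Python", "Pandas", "PowerBI"],  # illustrative values
    proyectos_destacados=["Dashboard de ventas (proyecto de curso)"],  # illustrative
    tipo_perfil="Data",
    potencial_contratacion="Shows solid data fundamentals through course projects.",  # illustrative
)

# Serialize to JSON, e.g. for logging or storage.
print(perfil.model_dump_json(indent=2))
```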
With JSON Output Parser
The model is typically used with LangChain’s JsonOutputParser for structured extraction from CV text:
Batch Processing
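A batch run can be sketched as a loop that calls an extraction function per CV and collects the resulting dictionaries into a pandas DataFrame. The helper `procesar_cvs`, the stub extractor, the `archivo` column, and the file names are all hypothetical; in practice the extractor would wrap the LLM chain.

```python
from typing import Callable, Dict

import pandas as pd


def procesar_cvs(cvs: Dict[str, str], extraer: Callable[[str], dict]) -> pd.DataFrame:
    """Hypothetical helper: run `extraer` on each CV text and tabulate the results."""
    filas = []
    for archivo, texto in cvs.items():
        try:
            perfil = extraer(texto)
        except Exception as error:
            # Keep the batch going when a single CV fails to parse.
            print(f"Skipping {archivo}: {error}")
            continue
        filas.append({"archivo": archivo, **perfil})
    return pd.DataFrame(filas)


def extractor_de_prueba(texto: str) -> dict:
    """Stub standing in for the real extraction chain."""
    if not texto.strip():
        raise ValueError("empty CV text")
    return {"tipo_perfil": "Data", "ciclo_actual": "9no Ciclo"}


df = procesar_cvs(
    {"cv_1.pdf": "Estudiante de 9no ciclo...", "cv_2.pdf": ""},
    extractor_de_prueba,
)
print(df)
```

Catching failures per file is a design choice for this sketch: one malformed CV should not discard the profiles already extracted from the rest of the batch.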
Each CV is processed in turn, and the extracted profiles are collected into a DataFrame.
Real-World Example
Here’s an actual extracted profile from the notebook execution:
Related Models
Extraction Schema
Learn how to configure the JsonOutputParser for CV extraction
Talent Mining Guide
Step-by-step guide for batch CV processing
Design Philosophy
Why emphasize projects over experience?
For student recruitment, academic projects and hackathon achievements are stronger indicators of technical capability and learning agility than months of work experience. A student who won first place in a university hackathon demonstrates problem-solving, teamwork, and execution skills.
Why include potencial_contratacion?
This field forces the LLM to synthesize its understanding of the candidate into a hiring justification, providing explainability for recruitment decisions. It goes beyond data extraction to reasoning.
Why use Spanish field names?
The original project targets Latin American university students (Peru), so Spanish field names maintain cultural context and reduce translation errors in CV parsing.