Overview
The Resume Generator extracts text from uploaded resume files (PDF and DOCX) for AI processing. This guide helps you troubleshoot parsing errors and file format issues.
How Resume Parsing Works
The parsing logic is implemented in backend/src/controllers/interview.controller.js:38-84:
- File Type Detection - Uses magic bytes (file signatures), MIME type, and file extension
- Text Extraction - PDF files use pdf-parse, DOCX files use mammoth
- Fallback Strategy - If local parsing fails, sends original file to Google Gemini AI
- Error Handling - Returns clear error messages for unsupported or unreadable files
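The file type detection step above can be sketched as follows. This is a minimal illustration, not the controller's actual code: the function name detectResumeType and the multer-style field names (buffer, mimetype, originalname) are assumptions. It relies only on well-known file signatures: PDF files begin with %PDF, and DOCX files are ZIP archives that begin with the bytes PK 03 04.

```javascript
// Hypothetical helper: the controller's real detection code is not shown in this guide.
const PDF_MAGIC = Buffer.from("%PDF", "latin1"); // bytes 25 50 44 46
const ZIP_MAGIC = Buffer.from([0x50, 0x4b, 0x03, 0x04]); // "PK\x03\x04", DOCX is a ZIP container
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

function startsWith(buffer, magic) {
  return (
    buffer.length >= magic.length &&
    buffer.subarray(0, magic.length).equals(magic)
  );
}

function extensionOf(filename) {
  const dot = filename.lastIndexOf(".");
  return dot === -1 ? "" : filename.slice(dot).toLowerCase();
}

// Returns "pdf", "docx", or null when no signal matches a supported format.
function detectResumeType({ buffer, mimetype = "", originalname = "" }) {
  const ext = extensionOf(originalname);
  // Magic bytes are the strongest signal, so check them first.
  if (startsWith(buffer, PDF_MAGIC)) return "pdf";
  // A ZIP signature alone is ambiguous, so require a second signal for DOCX.
  if (startsWith(buffer, ZIP_MAGIC) && (mimetype === DOCX_MIME || ext === ".docx")) {
    return "docx";
  }
  // Fall back to declared metadata when the signature is inconclusive.
  if (mimetype === "application/pdf" || ext === ".pdf") return "pdf";
  if (mimetype === DOCX_MIME || ext === ".docx") return "docx";
  return null;
}
```

Because .xlsx and plain .zip files share the ZIP signature, the sketch asks for a matching MIME type or extension before reporting DOCX.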
The application accepts files up to 5MB in size (backend/src/middlewares/file.middleware.js:6-8).
Common Parsing Errors
Unable to read the uploaded PDF
Error Message: "Unable to read the uploaded resume. Please upload a valid PDF/DOCX or provide 'selfDescription'."
Error Location: backend/src/controllers/interview.controller.js:79-82
Causes:
- Scanned PDF (image-only, no text layer)
- Encrypted/password-protected PDF
- Corrupted PDF file
- Non-standard PDF format
1. Check if PDF is Scanned (Image-Only)
Scanned PDFs contain images of text, not actual text:
- Try selecting text in the PDF - if you can’t select it, it’s scanned
- Use OCR (Optical Character Recognition) to convert to text-based PDF:
  - Adobe Acrobat: Tools → Scan & OCR → Recognize Text
  - Online tools: Smallpdf OCR, Adobe Online
2. Remove Password Protection
- Open PDF in Adobe Reader or Preview
- If prompted for password, unlock it
- Save as new unencrypted PDF
- Upload the new file
3. Re-export/Re-save the PDF
- Open the PDF in your PDF reader
- Print to PDF or Save As → PDF
- This often fixes corruption issues
4. Convert from DOCX
If you have the original Word document:
- Upload the .docx file directly (better parsing)
- Or export fresh PDF from Word: File → Save As → PDF
5. Use Self Description Instead
If the file cannot be parsed, provide your information manually in the selfDescription field.
Fallback Behavior: If local parsing fails but the file format is valid (PDF/DOCX), the application sends the original file to Google Gemini AI, which may successfully extract text even from scanned documents.
Could not extract text from the resume file
Error Message: "Could not extract text from the resume file. Please upload a text-based PDF/DOCX or provide 'selfDescription'."
Error Location: backend/src/controllers/interview.controller.js:86-90
Cause: The PDF/DOCX was parsed successfully, but no text was extracted (completely empty file or only images).
Solutions:
1. Verify File Has Content
- Open the file and ensure it contains text
- Check it’s not just a blank page or image
2. Check for Image-Only Content
- If resume is just images/graphics without text, it won’t extract
- Recreate resume with actual text, not images of text
3. Export from Original Source
- If created in Canva, LinkedIn, or design tools:
  - Use export options that preserve text
  - Or manually type content into Word/Google Docs
4. Use Self Description
Provide the resume content as text in the selfDescription field.
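The "parsed but empty" condition behind this error can be illustrated with a short check. This is a sketch under assumptions: the helper names are hypothetical, and the rule that a non-empty selfDescription is enough to proceed is inferred from the wording of the error message, not confirmed from the controller.

```javascript
const EMPTY_TEXT_MESSAGE =
  "Could not extract text from the resume file. " +
  "Please upload a text-based PDF/DOCX or provide 'selfDescription'.";

// Hypothetical helper: true only when the parser returned visible characters.
function hasExtractableText(text) {
  // Image-only pages often yield an empty string or only whitespace and form feeds.
  return typeof text === "string" && text.trim().length > 0;
}

// Assumption: the request is usable if either input carries real text.
function assertUsableInput(extractedText, selfDescription) {
  if (hasExtractableText(extractedText) || hasExtractableText(selfDescription)) {
    return;
  }
  throw new Error(EMPTY_TEXT_MESSAGE);
}
```

Trimming before measuring the length matters because a parser can succeed on an image-only file and still return nothing but page-break whitespace.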
Unsupported resume file type
Error Message: "Unsupported resume file type. Please upload a PDF or DOCX."
Error Location: backend/src/controllers/interview.controller.js:62-66
Cause: The uploaded file is not a PDF or DOCX format.
Detection Logic: backend/src/controllers/interview.controller.js:44-53
Supported Formats:
- ✅ PDF (.pdf)
- ✅ Microsoft Word 2007+ (.docx)
- ❌ Old Word format (.doc) - convert to .docx
- ❌ Text files (.txt) - copy content to selfDescription
- ❌ Images (.jpg, .png) - use OCR first or create text document
- ❌ Rich Text Format (.rtf) - convert to .docx or .pdf
- ❌ Google Docs - export as .docx or .pdf first
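The format list above can be expressed as a small lookup that turns a file name into conversion advice before upload. This is an illustrative sketch only: adviceFor is a hypothetical helper, and the advice strings are copied from the list, not taken from the application.

```javascript
// null marks a format that is already supported; strings are the suggested fix.
const FORMAT_ADVICE = new Map([
  [".pdf", null],
  [".docx", null],
  [".doc", "Convert to .docx"],
  [".txt", "Copy content to selfDescription"],
  [".jpg", "Use OCR first or create text document"],
  [".png", "Use OCR first or create text document"],
  [".rtf", "Convert to .docx or .pdf"],
]);

// Hypothetical helper: returns null for supported files, otherwise a hint.
function adviceFor(filename) {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot).toLowerCase();
  if (!FORMAT_ADVICE.has(ext)) {
    return "Unsupported resume file type. Please upload a PDF or DOCX.";
  }
  return FORMAT_ADVICE.get(ext);
}
```

An extension check like this is only a convenience for the user. It does not replace the signature-based detection on the server, since a renamed file keeps its original contents.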
Convert to Supported Format:
From .doc (Old Word):
- Open in Microsoft Word
- File → Save As → Word Document (.docx)
From Google Docs:
- File → Download → Microsoft Word (.docx)
- Or File → Download → PDF Document (.pdf)
From .rtf or .txt:
- Open in word processor
- Save As → .docx or export to PDF
From images:
- Use OCR tool to extract text
- Create new document with extracted text
- Save as .docx or .pdf
File size exceeds limit
Symptom: Upload fails or times out, no specific error message.
Cause: File exceeds 5MB size limit.
Limit Configuration: backend/src/middlewares/file.middleware.js:6-8
Solutions:
1. Compress PDF
- Online tools: Smallpdf Compress, iLovePDF
- Adobe Acrobat: File → Save As Other → Reduced Size PDF
- Preview (Mac): File → Export → Reduce File Size
2. Reduce DOCX Size
- Compress images in document:
  - Select image → Format → Compress Pictures
  - Choose “Email (96 ppi)” or “Web (150 ppi)”
- Remove unnecessary formatting
- Delete embedded objects
3. Simplify Resume
- Remove high-resolution images/photos
- Use simple formatting instead of complex graphics
- Consider text-only version for parsing
4. Split Information
- Upload simplified resume file
- Add detailed information in the selfDescription field
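Because an oversized upload fails without a specific error message, checking the size before uploading can surface the problem earlier. The sketch below assumes that the 5MB limit means 5 * 1024 * 1024 bytes. checkResumeSize is a hypothetical helper and is not part of the application.

```javascript
// Assumption: "5MB" is interpreted as binary megabytes.
const MAX_RESUME_BYTES = 5 * 1024 * 1024;

// Hypothetical helper: reports whether a file of the given size would be accepted.
function checkResumeSize(sizeInBytes) {
  if (!Number.isInteger(sizeInBytes) || sizeInBytes < 0) {
    throw new TypeError("sizeInBytes must be a non-negative integer");
  }
  const overBy = sizeInBytes - MAX_RESUME_BYTES;
  if (overBy > 0) {
    const overByKb = Math.ceil(overBy / 1024);
    return {
      ok: false,
      message: `File is ${overByKb} KB over the 5MB limit. Compress it or move detail into selfDescription.`,
    };
  }
  return { ok: true, message: "" };
}
```

In a browser, the size could come from the size property of the selected File object before the request is sent.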
DOCX parsing errors
Error: Various errors when uploading DOCX files
Parser Used: mammoth library (backend/src/controllers/interview.controller.js:60-61)
Common Issues:
1. Complex Formatting
- Heavily formatted documents may not parse correctly
- Solution: Simplify formatting before upload
2. Tables and Columns
- Complex table structures may lose formatting
- Solution: Convert tables to simple bullet points
3. Embedded Objects
- Charts, SmartArt, embedded files may cause issues
- Solution: Remove or replace with text descriptions
4. Track Changes/Comments
- Documents with tracked changes may not parse cleanly
- Solution: Accept all changes before saving
  - Review → Accept → Accept All Changes
5. Old .doc Format
- .doc files are not supported (only .docx)
- Solution: Save As → .docx format
Tips for reliable DOCX parsing:
- Use simple, clean formatting
- Avoid complex layouts (multiple columns, text boxes)
- Stick to standard fonts and styles
- Remove headers/footers if possible
- Save in latest Word format (.docx)
AI Fallback Processing
When local parsing fails, the application falls back to sending the original file to Google Gemini AI.
Fallback Logic: backend/src/controllers/interview.controller.js:67-83
Benefits:
- Google Gemini can often read scanned PDFs (with OCR)
- Handles encrypted or complex formatted documents
- Provides better parsing for image-heavy resumes
- Graceful degradation - tries local parsing first, then AI
Limitations:
- Requires valid PDF or DOCX file
- Still needs selfDescription if file is completely unreadable
- May be slower than local parsing
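The "local first, then AI" order described above can be sketched as a small control-flow wrapper. This is not the controller's code: extractResumeText, parseLocally, and sendToAi are hypothetical names, with parseLocally standing in for the pdf-parse or mammoth call and sendToAi standing in for the Gemini request. The final error message is the one quoted earlier in this guide, and raising it when both steps fail is an assumption.

```javascript
const UNREADABLE_MESSAGE =
  "Unable to read the uploaded resume. " +
  "Please upload a valid PDF/DOCX or provide 'selfDescription'.";

// Hypothetical wrapper: both callbacks are injected so the flow can be tested in isolation.
async function extractResumeText(file, { parseLocally, sendToAi }) {
  try {
    // Step 1: try the fast local parser.
    const text = await parseLocally(file);
    return { source: "local", text };
  } catch (localError) {
    try {
      // Step 2: local parsing failed, so hand the original file to the AI service.
      const text = await sendToAi(file);
      return { source: "ai", text };
    } catch (aiError) {
      // Step 3: nothing could read the file, so surface a clear message.
      const error = new Error(UNREADABLE_MESSAGE);
      error.cause = localError; // keep the first failure for debugging
      throw error;
    }
  }
}
```

Returning the source alongside the text makes it easy to log how often the slower AI path is taken.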
Best Resume File Practices
Format
- Use PDF or DOCX format
- Ensure text is selectable (not scanned)
- Keep file size under 5MB
- Use standard fonts
Content
- Use simple, clean formatting
- Avoid complex tables/columns
- Don’t embed large images
- Save from original source (Word, Google Docs)
Testing Your Resume File
Quick Test Checklist:
- Can you select text?
  - Open PDF and try highlighting text
  - If you can’t select text → it’s scanned, use OCR
- Is it under 5MB?
  - Check file properties
  - If over 5MB → compress file
- Is it .pdf or .docx?
  - Check file extension
  - If not → convert to supported format
- Does it open normally?
  - Open in PDF reader or Word
  - No password prompts or errors?
- Is formatting simple?
  - Mostly text with basic formatting
  - Not heavy on images/graphics
Alternative: Self Description
If you continue to have parsing issues, you can provide resume information as text in the selfDescription field.
Providing both a resume file AND self description gives the AI the most context and typically produces the best interview reports.
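For the text-only route, a short helper can assemble a self description from a few structured facts. This is a sketch: composeSelfDescription, its parameter names, and the output layout are all assumptions, and the sample values in the usage example are made up. The guide does not prescribe a format for selfDescription.

```javascript
// Hypothetical helper: turns structured facts into plain text for the selfDescription field.
function composeSelfDescription({
  name,
  title,
  yearsOfExperience,
  skills = [],
  highlights = [],
}) {
  const lines = [
    `${name}, ${title} with ${yearsOfExperience} years of experience.`,
    skills.length > 0 ? `Skills: ${skills.join(", ")}.` : "",
    ...highlights.map((item) => `- ${item}`),
  ];
  // Drop empty lines so optional sections leave no gaps.
  return lines.filter((line) => line.length > 0).join("\n");
}

// Usage with invented sample data:
const selfDescription = composeSelfDescription({
  name: "Jane Doe",
  title: "Backend Engineer",
  yearsOfExperience: 5,
  skills: ["Node.js", "PostgreSQL"],
  highlights: ["Led migration to microservices"],
});
```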
Related Documentation
- Common Issues: General troubleshooting including auth, API, and validation errors
- API Reference: Detailed API documentation for interview report generation