Overview
The IPS data sources module manages literature references from scientific databases. It supports bulk import, deduplication, and automatic KPI calculation for tracking literature coverage.
Supported Sources
- PMC - PubMed Central
- LILACS - Latin American and Caribbean Health Sciences Literature
- SCIELO-WEB - Scientific Electronic Library Online
- OTRA - Other sources
List Literature Sources
Path Parameters
IPS report ID
Response
Returns an array of literature sources sorted by year (descending) and title:
Response Example
Source record ID
Database source: PMC, LILACS, SCIELO-WEB, or OTRA
Article title
Full bibliographic citation
Article URL
Publication year
Journal name
Article abstract or summary
Number of times this article was detected in searches
Save Literature Sources (Bulk)
Save multiple literature references with optional server-side translation.
Path Parameters
IPS report ID
Request Body
Array of literature items to save
Enable server-side translation of titles and abstracts
Target language code for translation (e.g., "es", "en")
Item Schema
Each item in the `items` array should contain:
Article title
Source database: PMC, LILACS, SCIELO, or other
Alternative to `collection` - normalized source identifier
Article URL
Publication year (0-9999)
Journal name
Article abstract or summary
Pre-formatted citation (auto-generated if not provided)
Number of times detected in search results
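As an illustration of the request body described above, the following sketch assembles a payload in Python. Only `items`, `translate`, `collection`, `citation`, and `detected_total` are names that appear on this page; `target_lang` and the remaining item keys (`title`, `url`, `year`) are placeholder names assumed for the example, so check them against the actual schema before use.

```python
import json


def build_bulk_payload(items, translate=False, target_lang=None):
    """Assemble a request body for the bulk save endpoint (illustrative only)."""
    items = list(items)
    for item in items:
        year = item.get("year")  # hypothetical key name
        # The schema above limits the publication year to 0-9999.
        if year is not None and not 0 <= int(year) <= 9999:
            raise ValueError(f"year out of range: {year}")
    payload = {"items": items, "translate": bool(translate)}
    if translate and target_lang:
        payload["target_lang"] = target_lang  # hypothetical key name
    return payload


example_item = {
    "title": "Example article title",      # hypothetical key name
    "collection": "PMC",
    "url": "https://example.org/article",  # hypothetical key name
    "year": 2021,                          # hypothetical key name
    "detected_total": 3,
}

payload = build_bulk_payload([example_item], translate=True, target_lang="es")
body = json.dumps(payload)  # JSON string ready to send as the request body
```

Validating the year on the client side is optional; it simply surfaces out-of-range values before the request is sent.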
Response
Number of items processed in this request
Updated KPI metrics for literature coverage
Total number of sources for this report after save
Deduplication
The endpoint automatically deduplicates based on `(url, title, year)` combinations:
- Exact duplicates are skipped
- Existing entries are updated if new data is richer (longer citation, has abstract, etc.)
- Trash entries (navigation links, T&C pages) are filtered out
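The rules above can be sketched roughly as follows. This is not the backend's implementation: the lowercase/strip key normalization, the item key names (`url`, `title`, `year`, `citation`, `abstract`), and the exact "richer" test are assumptions made for illustration, and trash filtering is left out.

```python
def dedup_key(item):
    """Build the (url, title, year) key; case and whitespace folding is an assumption."""
    return (
        (item.get("url") or "").strip().lower(),
        (item.get("title") or "").strip().lower(),
        item.get("year"),
    )


def is_richer(new, old):
    """Treat a longer citation or a newly supplied abstract as richer data."""
    longer_citation = len(new.get("citation") or "") > len(old.get("citation") or "")
    gains_abstract = bool(new.get("abstract")) and not old.get("abstract")
    return longer_citation or gains_abstract


def merge_sources(existing, incoming):
    """Skip exact duplicates, update entries when the incoming item is richer."""
    by_key = {dedup_key(s): s for s in existing}
    for item in incoming:
        key = dedup_key(item)
        current = by_key.get(key)
        if current is None:
            by_key[key] = item
        elif is_richer(item, current):
            # Overlay only the non-empty incoming fields on the stored entry.
            by_key[key] = {**current, **{k: v for k, v in item.items() if v}}
    return list(by_key.values())
```

Keying a dictionary on the tuple keeps the merge linear in the number of items, which matters for imports of several hundred references.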
Save Literature Sources (Simple)
Compatibility endpoint for simpler bulk save operations.
Path Parameters
IPS report ID
Query Parameters
If `true`, delete all existing sources before inserting new ones
Request Body
Array of source items (same schema as bulk endpoint).
Response
Delete All Sources
Path Parameters
IPS report ID
Response
Cleanup Citations
Clean and normalize citation formatting for all sources in a report.
Path Parameters
IPS report ID
Response
Number of citations cleaned/normalized
What Gets Cleaned
- Standardizes author formatting
- Normalizes journal abbreviations
- Formats DOI links consistently
- Removes duplicate whitespace
- Standardizes year formatting
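Two of the steps listed above, whitespace removal and DOI link formatting, could look like the sketch below. The target DOI form (`https://doi.org/...`) and the regular expression are assumptions for illustration; the page does not state which form the endpoint produces, and author, journal, and year handling are not covered here.

```python
import re

# Matches "doi: 10.x/...", "https://doi.org/10.x/..." and "http://dx.doi.org/10.x/...".
_DOI = re.compile(
    r"(?:https?://(?:dx\.)?doi\.org/|doi:\s*)(10\.\d{4,9}/\S+)",
    re.IGNORECASE,
)


def clean_citation(text):
    """Collapse repeated whitespace and rewrite DOI references to one URL form."""
    text = re.sub(r"\s+", " ", text).strip()
    return _DOI.sub(lambda m: "https://doi.org/" + m.group(1), text)
```

The function is idempotent, so running it again on an already cleaned citation leaves the text unchanged.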
KPI Metrics
The system automatically calculates and updates these KPI metrics when sources are saved:
Report-Level KPIs
Stored in the `ips_report` table:
- `literaturas_detected` - Total literature items detected
- `pmc_detected` - Items from PubMed Central
- `lilacs_detected` - Items from LILACS
- `scielo_detected` - Items from SciELO
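One plausible way to derive these four values from a report's sources is shown below. It assumes each KPI is a count of source records per normalized collection code; the page does not say whether the backend counts records or sums `detected_total`, so treat this as a sketch of the idea only.

```python
from collections import Counter


def compute_kpis(sources):
    """Count sources per normalized collection code (assumed KPI definition)."""
    counts = Counter(s.get("collection") for s in sources)
    return {
        "literaturas_detected": len(sources),
        "pmc_detected": counts["PMC"],
        "lilacs_detected": counts["LILACS"],
        "scielo_detected": counts["SCIELO-WEB"],
    }
```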
Access KPIs
KPIs are automatically updated when saving sources and included in report responses:
Citation Auto-Generation
If a citation is not provided, the system auto-generates one using available fields:
Format: `Authors. (Year). Title. Journal. DOI/URL`
Example:
- Use `citation` if provided
- Use `cite` or `apa` field if available
- Build from: `authors`/`creators` + `year` + `title` + `journal` + `doi`/`url`
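The fallback order above can be approximated with a small function. The field names come from this page; treating the three steps as a strict priority order, joining author lists with commas, and preferring `doi` over `url` are assumptions of this sketch, not documented behavior.

```python
def build_citation(item):
    """Return a citation using the provided field, or build one from parts."""
    # Steps 1 and 2: prefer an explicit citation, then `cite` or `apa`.
    for key in ("citation", "cite", "apa"):
        value = (item.get(key) or "").strip()
        if value:
            return value
    # Step 3: assemble "Authors. (Year). Title. Journal. DOI/URL".
    authors = item.get("authors") or item.get("creators") or ""
    if isinstance(authors, (list, tuple)):
        authors = ", ".join(authors)
    year = item.get("year")
    link = item.get("doi") or item.get("url") or ""
    parts = [
        authors.strip(),
        f"({year})" if year else "",
        (item.get("title") or "").strip(),
        (item.get("journal") or "").strip(),
        link.strip(),
    ]
    # Drop empty parts and avoid doubled periods when joining.
    return ". ".join(p.rstrip(".") for p in parts if p)
```

Missing fields are simply skipped, so an item with only a title and a year still yields a usable, if short, citation.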
Source Normalization
Collection names are normalized to standard codes:

| Input | Normalized |
|---|---|
| PMC, PubMed, PubMed Central | PMC |
| LILACS, lilacs | LILACS |
| SCIELO, SciELO, Scielo | SCIELO-WEB |
| Any other value | OTRA |
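A lookup-based sketch of the mapping in the table is shown below. Matching case-insensitively and trimming whitespace are assumptions suggested by the input variants in the table. Mapping the canonical code `SCIELO-WEB` to itself is also an assumption; the table does not list it as an input.

```python
_ALIASES = {
    "pmc": "PMC",
    "pubmed": "PMC",
    "pubmed central": "PMC",
    "lilacs": "LILACS",
    "scielo": "SCIELO-WEB",
    "scielo-web": "SCIELO-WEB",  # assumption: canonical code maps to itself
}


def normalize_collection(value):
    """Map a free-form collection name to PMC, LILACS, SCIELO-WEB, or OTRA."""
    key = (value or "").strip().lower()
    return _ALIASES.get(key, "OTRA")
```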
Translation Support
When `translate: true` is set, the backend can automatically translate titles and abstracts:
- Translation service must be configured at `/api/v1/nlp/translate/batch`
- Uses same authentication token
- Timeout: 40 seconds
- Falls back to original text if translation fails
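The fallback behavior described above might be structured like this. `translate_batch` is a hypothetical callable standing in for whatever client calls `/api/v1/nlp/translate/batch`; its signature and return shape are assumptions, and only the 40-second timeout and the fall-back-to-original rule come from this page.

```python
def translate_or_keep(texts, translate_batch, timeout=40):
    """Return translated texts, or the originals if translation fails."""
    try:
        translated = translate_batch(texts, timeout=timeout)
    except Exception:
        # Any failure, including a timeout, keeps the original text.
        return list(texts)
    if not isinstance(translated, list) or len(translated) != len(texts):
        return list(texts)
    # Keep the original wherever the service returned an empty result.
    return [t if t else original for t, original in zip(translated, texts)]
```

Passing the client in as a parameter keeps the fallback logic testable without a running translation service.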
Best Practices
Bulk Import
- Use `/fuentes/bulk` for large imports with deduplication
- Set `replace: false` to merge with existing data
- Include `detected_total` to track search result frequencies
- Provide DOIs when available for better citation quality
Citation Quality
- Include full journal names, not abbreviations
- Provide year for proper sorting
- Include abstracts for better context
- Use DOIs instead of PMIDs when possible
Performance
- Batch imports in chunks of 100-500 items
- Use translation sparingly (a translation call can take up to the 40-second timeout)
- Run cleanup after bulk imports, not during
- Query with pagination for reports with >1000 sources
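Splitting a large import into batches of the recommended size takes only a few lines. The default of 250 is an arbitrary value inside the 100-500 range suggested above, and the function name is chosen for this example.

```python
def chunked(items, size=250):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded batch can then be sent as the `items` array of a separate bulk save request.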