Why Syft Space for researchers
Private by default
Your research data never leaves your control. Share insights, not raw data.
Instant recall
Query years of papers and notes in seconds with natural language.
Collaborate safely
Let collaborators query your knowledge base without sharing raw files.
Preserve context
AI responses cite specific papers and sections, preserving academic rigor.
Use cases
Personal research assistant
Create a searchable AI assistant from all your research materials.
What to index:
- Published papers and preprints
- Research notes and lab notebooks
- Literature review collections
- Conference presentations
- Grant proposals and reports
Example queries:
- “What methods did I use for analyzing protein structures in 2022?”
- “Summarize all papers I’ve read about CRISPR gene editing”
- “What were the key findings from the Smith et al. dataset?”
Benefits:
- Never lose track of previous work
- Quickly reference methodology from past papers
- Onboard new lab members with instant knowledge access
Lab knowledge base
Make your entire lab’s research searchable for all team members.
What to index:
- All lab publications
- Standard operating procedures
- Equipment manuals and protocols
- Meeting notes and decisions
- Experimental results and datasets
Example queries:
- “How do we calibrate the mass spectrometer?”
- “What experiments have we done with this particular compound?”
- “Summarize our findings on drug resistance mechanisms”
Benefits:
- Reduce repetitive questions
- Preserve institutional knowledge when people leave
- Speed up literature reviews
- Improve research reproducibility
Literature review assistant
Index relevant papers to create a domain-specific search engine.
What to index:
- Papers from your field
- Key review articles
- Relevant preprints
- Books and chapters
Example queries:
- “What are the main approaches to quantum error correction?”
- “Compare the effectiveness of different vaccine platforms”
- “What gaps exist in current climate models?”
Benefits:
- Stay current with the literature
- Find connections across papers
- Speed up systematic reviews
- Identify research gaps
Dataset querying
Make research datasets queryable without exposing raw data.
What to index:
- Dataset documentation
- Analysis reports
- Methodology papers
- Data dictionaries
Example queries:
- “What variables are available in the genomic dataset?”
- “How was the patient cohort selected?”
- “What preprocessing steps were applied?”
Benefits:
- Share dataset insights with collaborators
- Comply with data privacy requirements
- Enable meta-analyses
- Reduce data access requests
Getting started
Install Syft Space
Create a dataset
Organize your research materials:
Place your PDFs and documents in the ingestion path. Syft Space automatically indexes them.
Start querying
Best practices
Organizing research materials
Create topic-specific datasets
Organize by research area or project:
- protein-folding: Papers on protein structure
- lab-protocols: SOPs and procedures
- lit-review-2025: Papers for current review
- grant-materials: Proposals and reports
Include document metadata
Add context to help AI understand your materials:
- Add frontmatter with dates, authors, keywords
- Use descriptive filenames: smith_2024_crispr_methods.pdf
- Include README files explaining dataset contents
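A small script can check filenames against a naming convention before the files go into the ingestion folder. The sketch below assumes an `author_year_topic.pdf` pattern inferred from the example filename above. The pattern and the helper name `parse_filename` are illustrative assumptions, not something Syft Space requires.

```python
import re
from typing import Optional

# Assumed convention: <author>_<4-digit year>_<topic words>.pdf
FILENAME_PATTERN = re.compile(
    r"^(?P<author>[a-z]+)_(?P<year>\d{4})_(?P<topic>[a-z0-9_]+)\.pdf$"
)


def parse_filename(name: str) -> Optional[dict]:
    """Return author, year, and topic if the name follows the convention, else None."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    return {
        "author": match.group("author"),
        "year": int(match.group("year")),
        "topic": match.group("topic").replace("_", " "),
    }


if __name__ == "__main__":
    # The second name is a typical scanner default that carries no context.
    for name in ["smith_2024_crispr_methods.pdf", "scan0001.pdf"]:
        print(name, "->", parse_filename(name))
```

Files for which `parse_filename` returns `None` are candidates for renaming before indexing.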
Keep materials updated
Syft Space watches for new files:
- Drop new papers in your ingestion folder
- Automatic indexing happens in the background
- Update existing files to refresh the index
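How Syft Space detects changes internally is not described here. As a rough illustration of the general idea, a folder watcher can compare two snapshots of file sizes and modification times to find new and updated files. The function names `snapshot` and `diff_snapshots` are hypothetical.

```python
from pathlib import Path


def snapshot(folder: Path) -> dict:
    """Map each file under the folder to its (size, modification time in ns)."""
    state = {}
    for path in folder.rglob("*"):
        if path.is_file():
            stat = path.stat()
            state[str(path.relative_to(folder))] = (stat.st_size, stat.st_mtime_ns)
    return state


def diff_snapshots(old: dict, new: dict) -> dict:
    """Classify files as added or changed between two snapshots."""
    added = sorted(name for name in new if name not in old)
    changed = sorted(
        name for name in new if name in old and new[name] != old[name]
    )
    return {"added": added, "changed": changed}
```

A watcher built this way would take a snapshot on a timer and re-index only the files listed under `added` and `changed`.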
Privacy and sharing
Keep it private
Don’t publish endpoints if you want to keep research private. Query locally only.
Share with collaborators
Use access control policies to limit who can query your endpoint.
Public knowledge
Publish to SyftHub to share insights with the research community.
Rate limit queries
Prevent abuse by limiting queries per user or per day.
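To make the per-user, per-day limit concrete, here is a minimal sketch of a daily quota counter. It illustrates the concept only. The class name `DailyQuota` and its interface are assumptions and do not reflect how Syft Space implements or configures rate limits.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Allow each user at most `limit` queries per calendar day."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._counts = defaultdict(int)  # (user, day) -> queries used so far

    def allow(self, user: str, day: date) -> bool:
        """Record a query and return True if the user is still under the limit."""
        key = (user, day)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

Because the counter is keyed by user and day, one collaborator exhausting their quota does not affect anyone else, and every quota resets on the next day.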
Query optimization
Be specific in queries
Better: “What methods did Smith et al. use for RNA sequencing?”
Worse: “Tell me about RNA”
Adjust similarity threshold
- Lower threshold (0.5-0.7): Broader search, more results
- Higher threshold (0.8-0.9): Stricter matching, fewer results
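The effect of the threshold can be seen in a toy example. The sketch below assumes that scores are cosine similarities between embedding vectors, which is a common choice. The source does not say which metric Syft Space uses, and the document names and two-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, documents, threshold):
    """Return (name, score) pairs scoring at or above the threshold, best first."""
    scored = [
        (name, cosine_similarity(query_vec, vec)) for name, vec in documents.items()
    ]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy 2-D "embeddings"; real embeddings have hundreds of dimensions.
documents = {
    "rna_seq_methods": [1.0, 0.1],
    "rna_overview": [1.0, 0.7],
    "mass_spec_manual": [0.2, 1.0],
}
query = [1.0, 0.0]

broad = search(query, documents, threshold=0.6)    # two documents pass
strict = search(query, documents, threshold=0.85)  # only the closest match passes
```

With these vectors, the lower threshold keeps both RNA documents, while the higher one keeps only `rna_seq_methods`.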
Example: PhD researcher
Setup:
- 200 papers from literature review
- 50 personal research notes
- 10 published papers
- 3 lab protocols
Workflow:
- Drop all PDFs in ~/research/papers/
- Create one dataset: phd-research
- Query before writing:
  - “What have I found about X?”
  - “Which papers support this claim?”
  - “What methods are commonly used?”
Results:
- Wrote thesis 30% faster
- Never lost track of sources
- Easily answered reviewer questions
- Shared knowledge base with advisor
Example: Research lab
Setup:
- 1,000+ lab papers and protocols
- 5 years of meeting notes
- Equipment manuals
- Experimental procedures
Workflow:
- Lab server accessible to all members
- Everyone can query: “How do we…?”
- New members onboard by querying the knowledge base
- Reduce time spent answering repetitive questions
Results:
- Onboarding time reduced by 50%
- Institutional knowledge preserved
- More time for actual research
- Better protocol compliance
Advanced features
Multiple endpoints for different audiences
Integration with research tools
Syft Space provides a REST API that can integrate with:
- Jupyter notebooks for interactive queries
- Zotero or Mendeley for reference management
- Notion or Obsidian for note-taking
- Custom research dashboards
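As an illustration of what a query from a Jupyter notebook might look like, the sketch below builds and sends a JSON POST request using only the Python standard library. The URL layout (`/endpoints/<slug>/query`), the `query` field name, and the helper names are assumptions made for this example. Consult the API reference for the actual routes and payload.

```python
import json
import urllib.request


def build_query_request(
    base_url: str, endpoint_slug: str, question: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a natural-language question."""
    # Hypothetical URL layout and JSON field name; not taken from the Syft Space API.
    url = f"{base_url.rstrip('/')}/endpoints/{endpoint_slug}/query"
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

In a notebook cell, this could be used as `ask(build_query_request(base_url, "phd-research", "Which papers support this claim?"))`, where `base_url` is a placeholder for the address of your own Syft Space instance.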
Learn more
Datasets
Learn about managing research materials
Models
Choose the right AI model for your needs
Endpoints
Create and configure query endpoints
API reference
Build custom integrations
Questions? Join our community or check the installation guide.