
Vision

nanobot aims to be the simplest, most hackable AI agent framework while maintaining full functionality. We’re not trying to be the biggest or most feature-rich — we’re optimizing for clarity, simplicity, and research-readiness.

Design Goals

  1. Keep it tiny: Target ~5,000 core agent lines (currently ~4,000)
  2. Stay readable: Every line should be understandable
  3. Make it hackable: Easy to modify and extend
  4. Remain practical: Real features, not toy examples

Current Status (v0.1.4.post3)

✅ What’s Working

  • 10 chat channels: Telegram, Discord, WhatsApp, Feishu, Email, Slack, QQ, DingTalk, Matrix, Mochat
  • 15+ LLM providers: OpenRouter, Anthropic, OpenAI, DeepSeek, Gemini, Groq, and more
  • Built-in tools: Shell, filesystem, web, spawn, cron, message, MCP
  • MCP support: Model Context Protocol integration
  • Multi-modal: Images, voice transcription (Groq Whisper)
  • Memory system: Persistent MEMORY.md
  • Subagents: Background task spawning
  • Scheduled tasks: Cron-based scheduling + heartbeat
  • Session isolation: Per-user/thread conversations
  • Prompt caching: Anthropic/OpenRouter support
  • OAuth providers: OpenAI Codex, GitHub Copilot
  • Thinking mode: Experimental reasoning support

🔧 Current Limitations

  • Manual testing only: No automated test suite
  • Basic memory: Simple markdown, no vector search
  • Limited multimodal: most channels can receive images but not send them
  • No streaming: Responses sent after completion
  • Simple context: No advanced retrieval

Roadmap

Phase 1: Enhanced Multi-Modal (Q2 2026)

Goal: See, hear, and create media. Features:
  • Vision: Image understanding (GPT-4V, Claude 3)
  • Image generation: DALL-E, Stable Diffusion integration
  • Video support: Receive and analyze videos
  • Voice output: Text-to-speech responses
  • Audio analysis: Analyze audio files beyond transcription
Channels impacted:
  • Telegram (send images)
  • Discord (send images)
  • WhatsApp (send images)
  • All channels (receive images for vision)
Tools:
  • generate_image(prompt: str) -> image_path
  • analyze_image(image_path: str) -> description
  • speak(text: str) -> audio_path
Estimated complexity: Medium (LiteLLM supports vision, need tool wrappers)
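
As a sketch of the plumbing the vision tools would sit on, here is how a text-plus-image user message is typically assembled for OpenAI-style vision APIs, which LiteLLM forwards unchanged. The helper name is illustrative, not an existing nanobot function:

```python
import base64


def build_vision_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Pair text with an inline image in OpenAI-style chat message format.

    The image is embedded as a base64 data URL, which is what most
    vision-capable chat APIs (and LiteLLM's passthrough) accept.
    """
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode()}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

A tool like analyze_image(image_path) would read the file, build this message, and send it through the existing provider layer.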

Phase 2: Long-Term Memory (Q3 2026)

Goal: Never forget important context. Features:
  • Vector search: Semantic memory retrieval
  • Automatic summarization: Compress old conversations
  • Entity tracking: Remember people, places, facts
  • Memory importance scoring: Prioritize key information
  • Multi-document memory: Organize by topic/project
Implementation ideas:
  • Use lightweight vector DB (ChromaDB, DuckDB)
  • Auto-summarize conversations > N messages
  • Extract entities with LLM calls
  • Store in ~/.nanobot/memory/ with indexes
Tools:
  • remember(fact: str, importance: int)
  • recall(query: str) -> relevant_facts
  • forget(fact_id: str)
Estimated complexity: High (vector DB integration, summarization logic)
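
To make the proposed remember/recall/forget surface concrete, here is a dependency-free toy stand-in. A real implementation would embed facts with a model and query a vector DB such as ChromaDB; this sketch ranks by bag-of-words cosine similarity weighted by importance:

```python
import math
from collections import Counter


class ToyMemory:
    """Keyword-overlap stand-in for the proposed vector-search memory."""

    def __init__(self) -> None:
        self._facts: dict[int, tuple[str, int]] = {}  # id -> (fact, importance)
        self._next_id = 0

    def remember(self, fact: str, importance: int = 1) -> int:
        fact_id = self._next_id
        self._next_id += 1
        self._facts[fact_id] = (fact, importance)
        return fact_id

    def forget(self, fact_id: int) -> None:
        self._facts.pop(fact_id, None)

    def recall(self, query: str, top_k: int = 3) -> list[str]:
        q = Counter(query.lower().split())
        q_norm = math.sqrt(sum(v * v for v in q.values()))

        def score(fact: str, importance: int) -> float:
            # Cosine similarity over word counts, scaled by importance.
            f = Counter(fact.lower().split())
            dot = sum(q[w] * f[w] for w in q)
            norm = q_norm * math.sqrt(sum(v * v for v in f.values()))
            return (dot / norm if norm else 0.0) * importance

        ranked = sorted(self._facts.values(), key=lambda item: score(*item), reverse=True)
        return [fact for fact, _ in ranked[:top_k]]
```

Swapping the scoring function for real embeddings is the main change a production version would need; the tool surface stays the same.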

Phase 3: Better Reasoning (Q4 2026)

Goal: Multi-step planning and self-reflection. Features:
  • Chain-of-thought: Explicit reasoning steps
  • Task decomposition: Break complex tasks into subtasks
  • Self-critique: Evaluate and revise outputs
  • Plan visualization: Show reasoning tree to user
  • Alternative exploration: Consider multiple approaches
Implementation ideas:
  • Add reason() tool for internal thinking
  • Multi-pass agent loop (plan → execute → reflect)
  • Reasoning prompt templates
  • Visualization in web UI (future)
Tools:
  • plan(goal: str) -> steps[]
  • critique(output: str) -> improvements[]
  • reflect() -> insights
Estimated complexity: High (new agent loop patterns)
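
The multi-pass loop above can be sketched independently of any provider. Here `llm` stands in for whatever completion call nanobot routes through, and the "OK" stop token is an assumption; a real loop would need a stricter termination check:

```python
from typing import Callable


def plan_execute_reflect(goal: str, llm: Callable[[str], str], max_rounds: int = 2) -> str:
    """Sketch of the proposed plan -> execute -> reflect agent loop."""
    plan = llm(f"Break this goal into numbered steps: {goal}")
    output = llm(f"Carry out this plan:\n{plan}")
    for _ in range(max_rounds):
        critique = llm(f"Critique this output for goal '{goal}':\n{output}")
        if "OK" in critique:  # assumed stop signal, not a robust check
            break
        output = llm(f"Revise the output using this critique:\n{critique}\n\nOutput:\n{output}")
    return output
```

Because the loop only depends on a `str -> str` callable, it can be unit-tested with a stubbed model before being wired into the real agent loop.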

Phase 4: More Integrations (Ongoing)

Goal: Work with more platforms and tools. Channels:
  • Twitter/X: Post tweets, reply to mentions
  • LinkedIn: Messaging integration
  • SMS: Twilio integration
  • Mastodon: Fediverse support
  • iMessage: Apple Messages (via bridge)
  • Signal: Private messaging
  • Zulip: Team chat
  • Mattermost: Open-source Slack alternative
Providers:
  • Together AI: Fast inference
  • Fireworks: Model zoo
  • Replicate: Run any model
  • Cohere: Command models
  • AI21: Jurassic models
  • Mistral: Mistral AI (if not via OpenRouter)
Tools:
  • Calendar: Google Calendar, Outlook integration
  • Email send: Proactive email sending
  • File sync: Dropbox, Google Drive
  • Database: SQL query execution
  • Code execution: Jupyter kernels
  • Browser: Playwright/Selenium automation
Estimated complexity: Medium per integration
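
What "one integration" amounts to is roughly one adapter behind a small channel interface. The interface below is hypothetical (nanobot's real one may differ, and real adapters would be async); each roadmap channel — Mastodon, Signal, Zulip, and so on — would be one subclass:

```python
from abc import ABC, abstractmethod


class Channel(ABC):
    """Hypothetical minimal channel interface; illustrative only."""

    @abstractmethod
    def send(self, chat_id: str, text: str) -> None:
        """Deliver an outbound message to the platform."""

    @abstractmethod
    def poll(self) -> list[tuple[str, str]]:
        """Return (chat_id, text) pairs for new inbound messages."""


class EchoChannel(Channel):
    """In-memory stand-in used only to exercise the interface."""

    def __init__(self) -> None:
        self.inbox: list[tuple[str, str]] = []
        self.outbox: list[tuple[str, str]] = []

    def send(self, chat_id: str, text: str) -> None:
        self.outbox.append((chat_id, text))

    def poll(self) -> list[tuple[str, str]]:
        msgs, self.inbox = self.inbox, []
        return msgs
```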

Phase 5: Self-Improvement (2027)

Goal: Learn from feedback and mistakes. Features:
  • User feedback loop: Rate responses, agent learns
  • Error tracking: Log and analyze failures
  • Automatic retries: Fix mistakes without user intervention
  • Preference learning: Adapt to user style
  • Skill discovery: Auto-install useful skills
Implementation ideas:
  • Feedback storage in ~/.nanobot/feedback/
  • Error pattern detection
  • Preference profiles in config
  • Skill marketplace integration (ClawHub)
Tools:
  • rate_response(rating: int, feedback: str)
  • analyze_errors() -> patterns[]
  • adjust_preferences(key: str, value: any)
Estimated complexity: Very High (ML components, feedback loops)

Community Priorities

Based on GitHub discussions and Discord feedback:

High Demand

  1. Web UI: Browser-based interface (like OpenWebUI)
  2. Streaming responses: Real-time output
  3. Function calling improvements: Parallel tool execution
  4. Better error messages: More helpful diagnostics
  5. RAG support: Document Q&A

Medium Demand

  1. Plugin system: Third-party tool installation
  2. Multi-agent coordination: Agents working together
  3. Custom prompts: User-defined system prompts
  4. Voice UI: Speak to agent directly
  5. Mobile app: iOS/Android companion

Low Demand (but interesting)

  1. Agent marketplace: Share and download agents
  2. Blockchain integration: Web3 tools
  3. IoT control: Smart home integration
  4. AR/VR: Spatial computing

Non-Goals

What we’re not building (to keep nanobot simple):
  • Enterprise features: SSO, multi-tenancy, admin panels
  • Complex UIs: rich web dashboards (keep it CLI-first)
  • Heavy dependencies: avoid large frameworks (Django, etc.)
  • Monolithic architecture: stay modular and hackable
  • Kitchen sink: don’t add every possible feature
If you need these, consider building on top of nanobot or using a different framework.

How to Contribute

Want to help with the roadmap?
  1. Pick an item from the roadmap above
  2. Open a GitHub Discussion to discuss your approach
  3. Create a PR with your implementation
  4. Get feedback from maintainers
  5. Iterate until it’s ready to merge
See the Contributing Guide for details.

Versioning Strategy

Current: v0.1.x (Alpha)

  • Rapid iteration
  • Breaking changes allowed
  • Focus on core features

Future: v0.2.x (Beta)

  • Stable API
  • Deprecation warnings before breaking changes
  • Focus on polish and reliability

Long-term: v1.0.0 (Stable)

  • Production-ready
  • Semantic versioning
  • Long-term support

Release Cadence

  • Patch releases (v0.1.4.post1): As needed (bug fixes)
  • Minor releases (v0.1.5): Every 1-2 weeks (new features)
  • Major releases (v0.2.0): When API changes significantly

Feature Requests

Have an idea? Here’s how to suggest it:
  1. Check existing issues/discussions: Might already be planned
  2. Open a GitHub Discussion: Describe the feature and use case
  3. Gauge community interest: See if others want it too
  4. Estimate complexity: How many lines of code?
  5. Propose implementation: How would it fit into nanobot?
Good feature requests:
  • Align with nanobot’s goals (simple, hackable)
  • Have clear use cases
  • Don’t add excessive complexity
  • Can be implemented in less than 500 lines
Bad feature requests:
  • “Add everything from framework X”
  • Niche features used by less than 1% of users
  • Require heavy dependencies
  • Violate the “keep it simple” principle

Research Areas

For academic/research use:
  1. Memory architectures: Better long-term memory designs
  2. Multi-agent systems: Agent communication protocols
  3. Tool learning: Automatic tool discovery and composition
  4. Context optimization: Smarter prompt compression
  5. Reasoning methods: Novel planning and reflection techniques

Metrics

How we measure success:
  • Lines of code: Keep core agent under 5,000 lines
  • Startup time: CLI mode under 1 second
  • Dependencies: Minimize third-party packages
  • Documentation: Every feature documented
  • Community: Active Discord, GitHub discussions
  • Real usage: People actually use it daily

Timeline

Quarter   Focus             Key Features
Q2 2026   Multi-modal       Vision, image generation, voice output
Q3 2026   Memory            Vector search, summarization, entity tracking
Q4 2026   Reasoning         Chain-of-thought, task decomposition
Q1 2027   Integrations      New channels, providers, tools
Q2 2027   Self-improvement  Feedback loops, error learning
Q3 2027   Polish            Web UI, streaming, better UX
Q4 2027   v1.0              Production-ready release

Long-Term Vision (2028+)

  • Autonomous agents: Proactively help without prompting
  • Agent collaboration: Multiple agents working together
  • Continuous learning: Improve over time from usage
  • Universal interface: Control anything via natural language
  • Personal AI OS: nanobot as your digital assistant layer

Frequently Asked Questions

Will nanobot always be free?

Yes. MIT licensed, forever.

Will nanobot stay lightweight?

That’s the core principle. If something makes it too heavy, we won’t add it.

Can I use nanobot for commercial projects?

Yes, MIT license allows commercial use.

Will there be a hosted version?

No plans currently. Self-hosting keeps it simple.

How can I sponsor development?

Contributions and feedback are the best support. Star the repo on GitHub!

What if my favorite feature isn’t on the roadmap?

Open a discussion! If there’s community interest and it aligns with nanobot’s goals, it might get added.
