Understanding content channels, metadata structures, import/sync operations, and content distribution in Kolibri
Kolibri manages educational content through a channel-based system. Content is organized into channels, each containing a hierarchical tree of content nodes representing videos, exercises, documents, and other learning materials.
Each channel is stored as a separate SQLite database file:
Located in KOLIBRI_HOME/content/databases/
Named using the channel ID: {channel_id}.sqlite3
Contains all content metadata for that channel
Schema version tracked for backwards compatibility
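The naming convention above can be sketched as a small path helper. The function name `channel_database_path` and the `kolibri_home` argument are hypothetical and used only for illustration; Kolibri ships its own path utilities, which may differ.

```python
import os


def channel_database_path(kolibri_home, channel_id):
    """Build the expected path of a channel database (illustrative helper)."""
    # Layout described above: KOLIBRI_HOME/content/databases/{channel_id}.sqlite3
    return os.path.join(
        kolibri_home, "content", "databases", "{}.sqlite3".format(channel_id)
    )


# Example with a placeholder home directory and a 32-character channel ID.
path = channel_database_path("/tmp/kolibri_home", "a" * 32)
print(path)
```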
import fnmatch
import os

# is_valid_uuid is a Kolibri helper defined elsewhere in the codebase.


def get_channel_ids_for_content_database_dir(content_database_dir):
    """Returns a list of channel IDs for channel databases in a directory."""
    if not os.path.isdir(content_database_dir):
        return []

    # Get all database files
    db_list = fnmatch.filter(os.listdir(content_database_dir), "*.sqlite3")
    db_names = [db.split(".sqlite3", 1)[0] for db in db_list]

    # Validate UUIDs
    valid_db_names = [name for name in db_names if is_valid_uuid(name)]
    return valid_db_names
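The function above relies on `is_valid_uuid`, whose body is not shown here. A minimal stand-in, assuming it only needs to check that a name parses as a UUID, might look like the following. This is a sketch built on the standard `uuid` module, not Kolibri's actual implementation, and `uuid.UUID` is fairly permissive (it also accepts hyphenated and brace-wrapped forms).

```python
import uuid


def is_valid_uuid(value):
    """Return True if value parses as a UUID (hypothetical stand-in)."""
    try:
        uuid.UUID(value)
    except (ValueError, TypeError, AttributeError):
        return False
    return True


# Only names that parse as UUIDs are treated as channel databases.
names = ["0123456789abcdef0123456789abcdef", "notes", "backup-2020"]
valid = [name for name in names if is_valid_uuid(name)]
print(valid)
```

Filtering this way means stray `.sqlite3` files in the directory are ignored rather than being mistaken for channels.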
Channel databases are read-only after import. Updates require importing a new version of the channel.
Kolibri distinguishes between two types of identifiers:
Node ID (id): Unique identifier for this specific content node instance
Content ID (content_id): Shared identifier for substantially similar content
# Multiple nodes can share the same content_id
# This allows tracking user progress across duplicate content
node1 = ContentNode.objects.get(id='abc-123')
node2 = ContentNode.objects.get(id='def-456')

# Both might have the same content_id
assert node1.content_id == node2.content_id

# User interactions with either node are tracked together
Use content_id when tracking user interactions to properly handle duplicate content across channels.
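To illustrate why keying on `content_id` matters, here is a small sketch using plain dictionaries instead of Kolibri's models. The data, `record_progress`, and `progress_for_node` are all hypothetical; Kolibri's real logging models are more involved.

```python
from collections import defaultdict

# (node_id, content_id) pairs: two nodes in different channels share one content_id.
nodes = [
    ("abc-123", "content-1"),
    ("def-456", "content-1"),
    ("ghi-789", "content-2"),
]

# Progress is keyed by content_id, not by node id.
progress_by_content_id = defaultdict(float)


def record_progress(content_id, progress):
    """Keep the highest progress seen for a piece of content."""
    progress_by_content_id[content_id] = max(
        progress_by_content_id[content_id], progress
    )


def progress_for_node(node_id):
    """Look up progress for a node through its content_id."""
    content_id = dict(nodes)[node_id]
    return progress_by_content_id[content_id]


# The learner works on the content through node abc-123 ...
record_progress("content-1", 0.5)
# ... and the duplicate node def-456 reports the same progress.
print(progress_for_node("def-456"))
```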
Content nodes form a tree hierarchy using MPTT (Modified Preorder Tree Traversal):
# Get all descendants of a topic
topic = ContentNode.objects.get(id='topic-id')
all_children = topic.get_descendants()

# Get only direct children
direct_children = topic.get_children()

# Get ancestors (breadcrumb trail) of any node
node = ContentNode.objects.get(id='node-id')
path = node.get_ancestors(include_self=True)
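For readers unfamiliar with MPTT, the idea behind those queries can be sketched without Django. Each node stores left and right boundary values assigned by a preorder walk, and descendant or ancestor lookups become simple range comparisons. The `Node` tuple and the sample tree below are illustrative; real MPTT models also carry a tree identifier, which this single-tree sketch leaves out.

```python
from collections import namedtuple

Node = namedtuple("Node", ["id", "lft", "rght", "level"])

# Boundaries assigned by a preorder walk of:
# root -> (topic -> (video, exercise), document)
nodes = [
    Node("root", 1, 10, 0),
    Node("topic", 2, 7, 1),
    Node("video", 3, 4, 2),
    Node("exercise", 5, 6, 2),
    Node("document", 8, 9, 1),
]
by_id = {n.id: n for n in nodes}


def descendants(node):
    """Nodes whose interval lies strictly inside this node's interval."""
    return [n.id for n in nodes if node.lft < n.lft and n.rght < node.rght]


def ancestors(node, include_self=False):
    """Nodes whose interval strictly contains this node's, root first."""
    found = [n for n in nodes if n.lft < node.lft and n.rght > node.rght]
    if include_self:
        found.append(node)
    return [n.id for n in sorted(found, key=lambda n: n.lft)]


print(descendants(by_id["topic"]))
print(ancestors(by_id["video"], include_self=True))
```

The appeal of this layout is that a whole subtree can be fetched with one range filter instead of a recursive walk.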
class LocalFile(models.Model):
    """Tracks the local state of files on device storage."""

    # ID is the checksum of the file
    id = models.CharField(max_length=32, primary_key=True)
    extension = models.CharField(max_length=40, choices=file_formats.choices)
    available = models.BooleanField(default=False)
    file_size = models.IntegerField(blank=True, null=True)
Multiple File objects can reference the same LocalFile, enabling deduplication when the same file is used across multiple content nodes.
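The deduplication described above follows from using the checksum as the primary key. The sketch below assumes an MD5 hex digest, which fits the 32-character `id` field but is an assumption here; `attach_file` and the two in-memory stores are hypothetical stand-ins for the `File` and `LocalFile` tables.

```python
import hashlib


def checksum(data):
    """MD5 hex digest: 32 characters, matching max_length=32 (assumed algorithm)."""
    return hashlib.md5(data).hexdigest()


local_files = {}   # checksum -> stored file info (stands in for LocalFile rows)
file_records = []  # (node_id, checksum) pairs (stands in for File rows)


def attach_file(node_id, data):
    """Link a node to a file, storing the bytes' record only once."""
    file_id = checksum(data)
    local_files.setdefault(file_id, {"file_size": len(data), "available": True})
    file_records.append((node_id, file_id))
    return file_id


a = attach_file("node-a", b"same video bytes")
b = attach_file("node-b", b"same video bytes")
c = attach_file("node-c", b"different bytes")
print(len(file_records), len(local_files))
```

Three references end up pointing at two stored files, because identical bytes always hash to the same key.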
Download/Copy Channel Database: Obtain the channel SQLite database
Read Metadata: Extract channel metadata and content tree structure
Copy to Local Database: Import metadata into Kolibri’s database
Download Content Files: Fetch associated media files
Update Availability: Mark content as available once files are present
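The final step of the process above can be sketched in isolation. The rule used here, that a node becomes available once every file it references is present, is a simplifying assumption, and `available_nodes` and the sample data are hypothetical; Kolibri's own annotation logic may treat some files differently.

```python
# File checksums referenced by each content node (illustrative data).
node_files = {
    "video-node": ["f1", "f2"],
    "exercise-node": ["f3"],
}

# Checksums of files that have finished downloading.
downloaded = {"f1", "f3"}


def available_nodes(node_files, downloaded):
    """Return ids of nodes whose referenced files are all present."""
    return sorted(
        node_id
        for node_id, files in node_files.items()
        if all(f in downloaded for f in files)
    )


before = available_nodes(node_files, downloaded)
downloaded.add("f2")  # the remaining file for video-node arrives
after = available_nodes(node_files, downloaded)
print(before, after)
```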
# Import a channel from Kolibri Studio
kolibri manage importchannel network {channel_id}

# Import from a local drive
kolibri manage importchannel disk {channel_id} /path/to/drive

# Import specific content nodes
kolibri manage importcontent network {channel_id} --node_ids={node_id}
Kolibri tracks which content is available on the device:
class ContentNode(base_models.ContentNode):
    # Total available resources on device under this topic
    on_device_resources = models.IntegerField(default=0, null=True, blank=True)

    # Total coach-only resources
    num_coach_contents = models.IntegerField(default=0, null=True, blank=True)
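As a rough illustration of what a per-topic counter like `on_device_resources` represents, the sketch below counts available resources beneath a topic by walking a nested dictionary. The tree structure and the recursive function are assumptions made for this example; Kolibri computes and stores these values through its own annotation code rather than on the fly.

```python
# Illustrative tree: topics contain children, resources carry an availability flag.
tree = {
    "id": "topic",
    "kind": "topic",
    "children": [
        {"id": "video", "kind": "video", "available": True, "children": []},
        {"id": "exercise", "kind": "exercise", "available": False, "children": []},
        {
            "id": "subtopic",
            "kind": "topic",
            "children": [
                {"id": "doc", "kind": "document", "available": True, "children": []},
            ],
        },
    ],
}


def on_device_resources(node):
    """Count available non-topic nodes at or below this node."""
    if node["kind"] != "topic":
        return 1 if node.get("available") else 0
    return sum(on_device_resources(child) for child in node["children"])


total = on_device_resources(tree)
print(total)
```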
class ChannelMetadata(base_models.ChannelMetadata):
    # Indicates not all content is imported
    partial = models.BooleanField(default=False)

    # Tracks which nodes are available
    published_size = models.IntegerField(default=0)
    total_resource_count = models.IntegerField(default=0)
    included_languages = JSONField(default=[])