Meikipop includes several built-in OCR providers optimized for Japanese text recognition. Each provider offers different trade-offs between accuracy, speed, cost, and resource requirements.
The dummy provider is designed as a template for creating custom providers. It returns fixed mock data for testing.
src/ocr/providers/dummy/provider.py
    class DummyProvider(OcrProvider):
        """
        A template for creating new OCR providers. When this provider is selected,
        it returns a fixed set of Japanese text to allow for testing of the popup
        window without a real OCR backend.
        """
        NAME = "Dummy OCR (Developer Template)"
Implementation highlights:
Returns hardcoded Japanese text with both horizontal and vertical examples
Demonstrates proper coordinate normalization
Shows character-level and word-level Word objects
Fully commented for educational purposes
Use cases:
Developing and testing UI without a real OCR backend
Template for creating custom providers
Understanding the data transformation process
Example output:
    # Returns two paragraphs:
    # 1. Horizontal: "これは横書きテキストです"
    # 2. Vertical: "縦書き"
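As a rough illustration of what a template provider like this involves, the sketch below builds a fixed mock result with normalized coordinates. The `Box`, `Word`, and `Paragraph` dataclasses, their fields, and the `mock_scan()` function are assumptions made for this example, not meikipop's actual definitions; `src/ocr/providers/dummy/provider.py` remains the reference for the real interface. Only character-level words are shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    # Hypothetical box type: centre and size as fractions (0.0-1.0) of the image.
    center_x: float
    center_y: float
    width: float
    height: float


@dataclass
class Word:
    text: str
    box: Box


@dataclass
class Paragraph:
    full_text: str
    words: List[Word]
    is_vertical: bool


def normalize(x: int, y: int, w: int, h: int, img_w: int, img_h: int) -> Box:
    """Convert a pixel rectangle (top-left x, y, width, height) to a normalized Box."""
    return Box((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def mock_scan(img_w: int, img_h: int) -> List[Paragraph]:
    """Return one horizontal and one vertical paragraph of fixed text."""
    size = 40  # pixel size of each mock character cell
    horizontal = "これは横書きテキストです"
    vertical = "縦書き"
    # Horizontal text advances along x; vertical text advances along y.
    h_words = [Word(ch, normalize(100 + i * size, 100, size, size, img_w, img_h))
               for i, ch in enumerate(horizontal)]
    v_words = [Word(ch, normalize(700, 200 + i * size, size, size, img_w, img_h))
               for i, ch in enumerate(vertical)]
    return [Paragraph(horizontal, h_words, False), Paragraph(vertical, v_words, True)]
```

Dividing by the image dimensions keeps the boxes independent of the screenshot resolution, so the same mock data can be drawn over a capture of any size.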
The meikiocr provider uses a high-performance local model specifically optimized for Japanese video game text.
src/ocr/providers/meikiocr/provider.py
    class MeikiOcrProvider(OcrProvider):
        """
        An OCR provider that uses the high-performance meikiocr library.
        This provider is specifically optimized for recognizing Japanese text
        from video games.
        """
        NAME = "meikiocr (local)"

        def __init__(self):
            self.ocr_client = MeikiOCR()
            logger.info(f"Running on: {self.ocr_client.active_provider}")
Implementation highlights:
Uses the meikiocr Python library
Converts PIL images to NumPy RGB arrays
Returns character-level boxes for precise lookups
Groups individual lines into paragraphs using postprocessing
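To illustrate why character-level boxes help with precise lookups, here is a minimal sketch that hit-tests a cursor position against per-character boxes and returns the text from that character onward. The `CharBox` type and the `text_from_cursor()` helper are hypothetical names for this example; meikipop's own lookup code may work differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CharBox:
    # Hypothetical per-character box in normalized (0.0-1.0) image coordinates.
    text: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def text_from_cursor(chars: List[CharBox], x: float, y: float) -> Optional[str]:
    """Return the text starting at the character under (x, y), or None if no box is hit."""
    for i, ch in enumerate(chars):
        if ch.contains(x, y):
            return "".join(c.text for c in chars[i:])
    return None
```

With only line-level boxes, the starting character would have to be estimated from the cursor's relative position within the line, which is less reliable for text with uneven character widths.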
The owocr provider connects to a running owocr daemon via WebSocket, allowing flexible deployment options.
src/ocr/providers/owocr/provider.py
    class OwocrWebsocketProvider(OcrProvider):
        """
        An OCR provider that connects to a running owocr instance via websockets.
        This provider uses the synchronous websockets client to maintain a
        persistent connection.
        """
        NAME = "owocr (Websocket)"
Implementation highlights:
Maintains persistent WebSocket connection
Automatic reconnection on connection loss
Uses direct IP (127.0.0.1) to avoid localhost resolution delays
Scan sequence:
1. Send image: converts the PIL Image to BMP format and sends it as a binary message.
2. Receive acknowledgment: waits for a “True” confirmation (5-second timeout).
3. Receive results: waits for a JSON response with the OCR results (30-second timeout).
4. Transform data: converts owocr’s format to meikipop’s Paragraph objects.
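The send, acknowledge, and receive exchange described above can be sketched roughly as follows. This is an illustration under assumptions, not the provider's code: `scan_once()` is a hypothetical helper, and `conn` stands for any object exposing `send(data)` and `recv(timeout=seconds)`, which is the shape the synchronous websockets client connection is expected to have. The structure of the JSON payload is not specified here, so it is returned as parsed.

```python
import json
from typing import Any, Optional


def scan_once(conn: Any, bmp_bytes: bytes) -> Optional[Any]:
    """Run one send/ack/result exchange over an already-open connection.

    `conn` is assumed to expose send(data) and recv(timeout=seconds).
    """
    # Step 1: send the BMP-encoded image as a binary message.
    conn.send(bmp_bytes)

    # Step 2: wait up to 5 seconds for the "True" acknowledgment.
    ack = conn.recv(timeout=5)
    if ack != "True":
        return None

    # Step 3: wait up to 30 seconds for the JSON payload with OCR results.
    raw = conn.recv(timeout=30)

    # Step 4 (not shown): map the parsed data to Paragraph objects.
    return json.loads(raw)
```

The short acknowledgment timeout lets a dead daemon be detected quickly, while the longer result timeout leaves room for the recognition itself.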
Retry logic:
    for attempt in range(2):
        try:
            if self.websocket is None:
                if not self._connect():
                    return None
            # ... perform scan
        except ConnectionClosed:
            logger.warning("Websocket connection lost. Will attempt to reconnect...")
            self.websocket = None
            if attempt == 0:
                continue  # Retry once
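For readers who want to experiment with the retry-once pattern in isolation, the following self-contained sketch reproduces its control flow without a real socket. `ReconnectingClient` and its `connect` callback are hypothetical, and Python's built-in `ConnectionError` stands in for the `ConnectionClosed` exception used in the excerpt.

```python
from typing import Callable, Optional


class ReconnectingClient:
    """Hypothetical client that reconnects lazily and retries a failed call once."""

    def __init__(self, connect: Callable[[], Optional[Callable[[bytes], str]]]):
        self._connect = connect   # returns a send function, or None on failure
        self._send = None

    def scan(self, payload: bytes) -> Optional[str]:
        for attempt in range(2):
            try:
                if self._send is None:
                    self._send = self._connect()
                    if self._send is None:
                        return None          # could not (re)connect
                return self._send(payload)
            except ConnectionError:
                self._send = None            # drop the dead connection
                if attempt == 0:
                    continue                 # retry once with a fresh connection
        return None                          # second attempt also failed
```

Capping the loop at two attempts means a stale connection is repaired transparently, while a daemon that is actually down fails fast instead of blocking the caller.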
This provider uses Chrome’s Screen AI component for local, offline OCR processing.
src/ocr/providers/screenai/provider.py
    class ScreenAiOcr(OcrProvider):
        NAME = "Chrome Screen AI (local)"

        # Class-level variables to ensure the native DLL is only initialized ONCE
        _is_initialized = False
        _lib = None
Implementation highlights:
Uses Chrome’s native Screen AI library via ctypes
Singleton pattern for library initialization (once per app lifetime)
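A minimal sketch of this once-per-process initialization, assuming a hypothetical `NativeOcr` class and an injected `loader` callable in place of the real ctypes loading code. The lock is an addition for this example and is not something the source states the provider uses.

```python
import threading
from typing import Callable, Optional


class NativeOcr:
    """Hypothetical provider whose native library must be loaded only once."""

    # Class-level state is shared by every instance.
    _is_initialized = False
    _lib: Optional[object] = None
    _lock = threading.Lock()

    def __init__(self, loader: Callable[[], object]):
        cls = type(self)
        with cls._lock:                      # guard against concurrent first use
            if not cls._is_initialized:
                cls._lib = loader()          # e.g. a ctypes.CDLL(...) call
                cls._is_initialized = True

    @property
    def lib(self) -> object:
        return type(self)._lib
```

Because the flags live on the class rather than the instance, re-creating the provider (for example when the user switches providers and back) reuses the already loaded library.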
Most providers use the shared group_lines_into_paragraphs() utility:
    from src.ocr.providers.postprocessing import group_lines_into_paragraphs

    # After converting to line-level Paragraph objects
    raw_lines: List[Paragraph] = [...]
    final_paragraphs = group_lines_into_paragraphs(raw_lines)
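    import re

    # Matches hiragana, katakana, and CJK ideographs (kanji)
    JAPANESE_REGEX = re.compile(r'[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FAF]')

    # Inside the loop over recognized lines: skip lines without Japanese characters
    line_has_japanese = any(JAPANESE_REGEX.search(w.plain_text) for w in line.words)
    if not line_has_japanese:
        continue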
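The filter above is a fragment from inside a loop. A standalone version that can be run as-is might look like the sketch below; the `Word` and `Line` dataclasses are simplified stand-ins invented for this example (only the `plain_text` and `words` attributes come from the excerpt), and `keep_japanese_lines()` is a hypothetical helper name.

```python
import re
from dataclasses import dataclass
from typing import List

JAPANESE_REGEX = re.compile(r'[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FAF]')


@dataclass
class Word:
    plain_text: str    # hypothetical stand-in for meikipop's Word


@dataclass
class Line:
    words: List[Word]  # hypothetical stand-in for a recognized line


def keep_japanese_lines(lines: List[Line]) -> List[Line]:
    """Drop lines in which no word contains hiragana, katakana, or kanji."""
    return [line for line in lines
            if any(JAPANESE_REGEX.search(w.plain_text) for w in line.words)]
```

A line is kept as soon as one of its words contains a Japanese character, so mixed lines such as a kanji label followed by a number survive the filter.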