Utils module

The utils module provides essential utility functions for the manga downloader, including filename sanitization, asynchronous image downloading, and PDF generation.

clean_filename()

Sanitizes strings to create valid file or directory names compatible with Windows and Linux.

def clean_filename(text: str) -> str

Parameters

text

str

required

The raw title string to be sanitized.

Returns

return

str

A safe filename string with HTML tags stripped and reserved characters removed. Returns "untitled" if the input is empty or results in an empty string after sanitization.

Behavior

Strips HTML tags using regex pattern <[^>]+>
Removes reserved characters: \ / * ? : " < > |
Trims leading and trailing whitespace
Returns "untitled" as fallback for empty inputs
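The steps listed above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name `clean_filename_sketch` and the two compiled patterns are hypothetical, and the order of operations (tags first, then reserved characters, then trimming) is an assumption based on the list.

```python
import re

# Hypothetical helper patterns; the real module may define these differently.
_TAG_RE = re.compile(r"<[^>]+>")            # HTML tags, as in the documented pattern
_RESERVED_RE = re.compile(r'[\\/*?:"<>|]')  # characters reserved in file names


def clean_filename_sketch(text: str) -> str:
    """Illustrative re-implementation of the documented behavior."""
    if not text:
        return "untitled"
    without_tags = _TAG_RE.sub("", text)
    cleaned = _RESERVED_RE.sub("", without_tags).strip()
    # Fall back when sanitization leaves nothing usable.
    return cleaned or "untitled"
```

Stripping tags before reserved characters matters in this sketch: removing `<` and `>` first would leave the tag names behind in the result.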

Usage example

from core.utils import clean_filename

# Basic sanitization
title = clean_filename("My Manga: Chapter 1")
print(title)  # Output: "My Manga Chapter 1"

# HTML tag removal
title = clean_filename("<div>Manga Title</div>")
print(title)  # Output: "Manga Title"

# Empty input handling
title = clean_filename("")
print(title)  # Output: "untitled"

download_image()

Downloads a single image asynchronously and saves it to a local directory under an index-based, zero-padded filename so that pages sort in the correct order.

async def download_image(
    session: aiohttp.ClientSession,
    url: str,
    folder: str,
    index: int,
    log_callback: Callable[[str], None],
    headers: dict
) -> Optional[str]

Parameters

session

aiohttp.ClientSession

required

The active asynchronous HTTP session for making requests.

url

str

required

The direct image URL to download.

folder

str

required

The local directory path to save the image (typically a temporary folder).

index

int

required

The page order index used for filename generation. Ensures correct sorting in PDF creation.

log_callback

Callable[[str], None]

required

Callback function to send error messages and logs to the frontend.

headers

dict

required

HTTP headers (User-Agent, Referer, etc.) sent with the request to get past bot protections such as Cloudflare.

Returns

return

Optional[str]

The absolute path to the downloaded image file if successful, otherwise None.

Behavior

Automatically detects file extension from URL (.jpg, .png, .webp, .jpeg, .avif)
Generates zero-padded filenames (e.g., 001.jpg, 042.png)
Downloads with provided headers to bypass protections
Logs errors through callback on failure
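The filename rules above (extension detection plus zero padding) might look like the sketch below. The helper name `page_filename`, the three-digit width, and the `.jpg` default for unrecognized extensions are assumptions made for illustration. The documented examples (001.jpg, 042.png) are consistent with three digits, but the page does not state the width or the fallback extension.

```python
import os
from urllib.parse import urlparse

# Extensions named in the documentation.
KNOWN_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".avif"}


def page_filename(url: str, index: int, default_ext: str = ".jpg") -> str:
    """Build a sortable filename such as '001.jpg' from a URL and a page index."""
    # Use only the path component so query strings do not pollute the extension.
    path = urlparse(url).path
    ext = os.path.splitext(path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        ext = default_ext  # assumed fallback, not confirmed by the docs
    return f"{index:03d}{ext}"
```

Zero padding keeps lexicographic order equal to page order, which is what lets a later step sort the files by name before building the PDF.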

Usage example

import aiohttp
from core.utils import download_image
from core.config import HEADERS_TMO

async def download_pages():
    async with aiohttp.ClientSession() as session:
        path = await download_image(
            session=session,
            url="https://example.com/page1.jpg",
            folder="/path/to/temp",
            index=1,
            log_callback=print,
            headers=HEADERS_TMO
        )
        if path:
            print(f"Downloaded: {path}")

create_pdf()

Compiles a list of image paths into a single PDF file using img2pdf (preferred) or Pillow (fallback).

def create_pdf(
    image_paths: List[str],
    output_pdf: str,
    log_callback: Callable[[str], None]
) -> bool

Parameters

image_paths

List[str]

required

List of absolute file paths to images to be compiled into the PDF.

output_pdf

str

required

Absolute path where the generated PDF should be saved.

log_callback

Callable[[str], None]

required

Callback function to emit log messages about the PDF generation process.

Returns

return

bool

True if PDF generation succeeded, False if it failed.

Behavior

Image conversion: Automatically converts RGBA, LA, and transparent images to RGB/JPEG
Primary method: Uses img2pdf library for lossless PDF generation
Fallback method: Uses Pillow if img2pdf fails or is unavailable
Error handling: Logs detailed error messages through callback
Path logging: Displays relative path from PDF folder for cleaner logs
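The primary/fallback arrangement described above can be illustrated with a small, library-agnostic sketch. Everything here is hypothetical: `build_pdf_with_fallback`, the `Converter` type, and the log prefixes are stand-ins, and in the real module the two converters would wrap img2pdf and Pillow respectively.

```python
from typing import Callable, List, Sequence, Tuple

# A converter takes the image paths and the output path, and raises on failure.
Converter = Callable[[List[str], str], None]


def build_pdf_with_fallback(
    image_paths: List[str],
    output_pdf: str,
    converters: Sequence[Tuple[str, Converter]],
    log_callback: Callable[[str], None],
) -> bool:
    """Try each converter in order; return True on the first success."""
    for name, convert in converters:
        try:
            convert(image_paths, output_pdf)
            log_callback(f"[INFO] PDF written with {name}")
            return True
        except Exception as exc:  # any failure moves on to the next converter
            log_callback(f"[WARN] {name} failed: {exc}")
    log_callback("[ERROR] No converter could build the PDF")
    return False
```

Returning a boolean instead of raising matches the documented `create_pdf()` contract, where the caller checks the result and the details go through the log callback.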

Usage example

from core.utils import create_pdf

image_files = [
    "/temp/001.jpg",
    "/temp/002.png",
    "/temp/003.webp"
]

success = create_pdf(
    image_paths=image_files,
    output_pdf="/output/manga_chapter1.pdf",
    log_callback=print
)

if success:
    print("PDF created successfully")

finalize_pdf_flow()

Orchestrates the final steps of PDF creation: generates the PDF, optionally opens it, and cleans up temporary files.

def finalize_pdf_flow(
    image_paths: List[str],
    pdf_name: str,
    log_callback: Callable[[str], None],
    temp_dir: Optional[str] = None,
    open_result: bool = True
)

Parameters

image_paths

List[str]

required

List of downloaded image file paths to compile.

pdf_name

str

required

Filename for the output PDF (not a full path, just the filename).

log_callback

Callable[[str], None]

required

Callback function to emit status messages.

temp_dir

Optional[str]

default:"None"

Optional path to a temporary directory to delete after PDF creation.

open_result

bool

default:"True"

Whether to automatically open the PDF file and its folder after creation (Windows only).

Returns

This function does not return a value.

Behavior

Creates PDF folder if it doesn’t exist
Generates PDF using create_pdf()
Opens PDF folder and file if open_result=True (Windows only via os.startfile())
Deletes temporary directory if provided
Logs completion status
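The folder creation, Windows-only opening, and cleanup steps above can be approximated with standard-library calls. The helper names below (`ensure_folder`, `open_if_supported`, `remove_temp_dir`) are hypothetical and only sketch one plausible way to structure these steps. `os.startfile()` exists only on Windows, which is why the guard is needed.

```python
import os
import shutil
import sys
from typing import Optional


def ensure_folder(path: str) -> str:
    """Create the output folder if it does not exist and return its path."""
    os.makedirs(path, exist_ok=True)
    return path


def open_if_supported(path: str) -> bool:
    """Open a file or folder with the default application on Windows only."""
    if sys.platform == "win32" and hasattr(os, "startfile"):
        os.startfile(path)  # not available on Linux or macOS
        return True
    return False


def remove_temp_dir(temp_dir: Optional[str]) -> None:
    """Delete the temporary directory if one was provided and still exists."""
    if temp_dir and os.path.isdir(temp_dir):
        shutil.rmtree(temp_dir, ignore_errors=True)
```

On other platforms `open_if_supported()` simply returns False, so callers can pass `open_result=True` without a platform check of their own.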

Usage example

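from core.utils import finalize_pdf_flow

# Example paths as returned by earlier download_image() calls, in page order
downloaded_images = [
    "/temp/manga_download/001.jpg",
    "/temp/manga_download/002.jpg"
]

finalize_pdf_flow(
    image_paths=downloaded_images,
    pdf_name="My_Manga_Chapter_1.pdf",
    log_callback=print,
    temp_dir="/temp/manga_download",
    open_result=True
)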

download_and_make_pdf()

Main orchestration function that handles the complete workflow: downloads images in batches, creates PDF, and cleans up.

async def download_and_make_pdf(
    image_urls: List[str],
    output_name: str,
    headers: dict,
    log_callback: Callable[[str], None],
    check_cancel: Callable[[], bool],
    progress_callback: Optional[Callable[[int, int], None]] = None,
    is_path: bool = False,
    open_result: bool = True
) -> None

Parameters

image_urls

List[str]

required

List of image URLs to download and compile into PDF.

output_name

str

required

Output filename (or full path if is_path=True).

headers

dict

required

HTTP headers for image download requests.

log_callback

Callable[[str], None]

required

Callback function for log messages.

check_cancel

Callable[[], bool]

required

Function that returns True if the user cancelled the operation.

progress_callback

Optional[Callable[[int, int], None]]

default:"None"

Optional callback for progress updates, called with (current, total).

is_path

bool

default:"False"

If True, treats output_name as a full path instead of just a filename.

open_result

bool

default:"True"

Whether to automatically open the PDF after creation.

Returns

This function does not return a value.

Behavior

Temp folder setup: Creates/cleans temporary download folder
Batch downloading: Downloads images in chunks defined by BATCH_SIZE
Cancellation check: Checks for user cancellation between batches
Progress updates: Calls progress callback after each batch
PDF creation: Compiles downloaded images into PDF
Cleanup: Removes temporary folder
Completion logging: Emits a "[DONE] Finished." message
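The batching, cancellation, and progress steps above follow a common asyncio pattern, sketched below. This is not the module's implementation: `run_in_batches`, the `worker` parameter, and the `BATCH_SIZE` value of 2 are assumptions chosen for illustration (the real `BATCH_SIZE` comes from the project's configuration), and this sketch checks for cancellation before every batch, including the first.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional

BATCH_SIZE = 2  # hypothetical value; the real constant lives in the project config


async def run_in_batches(
    items: List[Any],
    worker: Callable[[Any], Awaitable[Any]],
    check_cancel: Callable[[], bool],
    progress_callback: Optional[Callable[[int, int], None]] = None,
    batch_size: int = BATCH_SIZE,
) -> List[Any]:
    """Run worker over items in concurrent chunks, stopping early on cancel."""
    results: List[Any] = []
    total = len(items)
    for start in range(0, total, batch_size):
        if check_cancel():
            break  # stop before starting another batch
        batch = items[start:start + batch_size]
        # Items within a batch run concurrently; batches run one after another.
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
        if progress_callback:
            progress_callback(min(start + batch_size, total), total)
    return results
```

Limiting concurrency to one batch at a time bounds the number of simultaneous requests, and checking `check_cancel()` between batches gives the user a cancellation point without interrupting downloads already in flight.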

Usage example

import asyncio

from core.utils import download_and_make_pdf
from core.config import HEADERS_TMO

image_urls = [
    "https://example.com/page1.jpg",
    "https://example.com/page2.jpg",
    "https://example.com/page3.jpg"
]

def log(msg):
    print(msg)

def check_cancelled():
    return False  # Check your cancellation state

def update_progress(current, total):
    print(f"Downloaded {current}/{total} images")

async def main():
    await download_and_make_pdf(
        image_urls=image_urls,
        output_name="manga_chapter.pdf",
        headers=HEADERS_TMO,
        log_callback=log,
        check_cancel=check_cancelled,
        progress_callback=update_progress,
        open_result=True
    )

# await is only valid inside a coroutine, so start the event loop explicitly
asyncio.run(main())

Complete workflow example

Here’s how the utility functions work together in a typical download scenario:

import asyncio
from core.utils import clean_filename, download_and_make_pdf
from core.config import HEADERS_TMO, OPEN_RESULT_ON_FINISH

async def download_manga_chapter(raw_title: str, image_urls: list):
    # Sanitize the title for use as filename
    safe_title = clean_filename(raw_title)
    pdf_filename = f"{safe_title}.pdf"
    
    # Download all images and create PDF
    await download_and_make_pdf(
        image_urls=image_urls,
        output_name=pdf_filename,
        headers=HEADERS_TMO,
        log_callback=print,
        check_cancel=lambda: False,
        progress_callback=lambda c, t: print(f"{c}/{t}"),
        open_result=OPEN_RESULT_ON_FINISH
    )

# Run the download (asyncio.run starts the event loop for the coroutine)
asyncio.run(download_manga_chapter(
    raw_title="<div>My Manga: Chapter 1</div>",
    image_urls=[...]  # List of image URLs
))
