Syntax
Description
The file command identifies file types by examining file contents (magic bytes) and file extensions. It’s useful for determining what kind of data a file contains, especially when the extension is missing or misleading.
Nash’s file implementation uses a combination of magic byte detection (file signatures) and extension-based heuristics as a fallback.
Parameters
One or more files to identify. Supports multiple files in a single command.
Behavior
- Magic byte detection: Checks file headers for known signatures (ELF, PNG, JPEG, PDF, etc.)
- Shebang detection: Identifies script files by their #! line
- Extension fallback: Uses file extension when magic bytes don’t match
- UTF-8 heuristic: Classifies valid UTF-8 as “ASCII text”
- Binary data: Anything that’s not valid UTF-8 is “binary data”
- Directories: Identified as “directory”
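The magic byte check in the first bullet can be sketched as a prefix match against a small table. This is a minimal illustration, not Nash's actual code: the byte prefixes are the ones listed under Magic Byte Detection, while `SIGNATURES`, `match_magic`, and the short labels are hypothetical names chosen for this example.

```rust
/// Hypothetical signature table: (leading bytes, format label).
/// The prefixes come from this page; the labels are placeholders,
/// not the exact strings Nash prints.
const SIGNATURES: &[(&[u8], &str)] = &[
    (b"\x7fELF", "ELF"),
    (b"\x89PNG", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF8", "GIF"),
    (b"PK\x03\x04", "Zip"),
    (b"%PDF", "PDF"),
];

/// Return the label of the first signature that prefixes `bytes`, if any.
fn match_magic(bytes: &[u8]) -> Option<&'static str> {
    SIGNATURES
        .iter()
        .find(|(sig, _)| bytes.starts_with(sig))
        .map(|(_, label)| *label)
}
```

A table keeps the check data-driven: supporting another format would only mean adding a row, which fits the small, hardcoded signature set described on this page.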
Supported File Types
Magic Byte Detection
- ELF: Files starting with \x7fELF
- PNG: Files starting with \x89PNG
- JPEG: Files starting with \xff\xd8\xff
- GIF: Files starting with GIF8
- Zip: Files starting with PK\x03\x04
- PDF: Files starting with %PDF
- Script: Files starting with #! (shebang), shows interpreter path
Extension-Based Detection
| Extension | Identified Type |
|---|---|
| .rs | Rust source code |
| .toml | TOML configuration |
| .json | JSON data |
| .yaml, .yml | YAML data |
| .md | Markdown document |
| .txt | ASCII text |
| .sh | shell script |
| .py | Python script |
| .js | JavaScript source |
| .html | HTML document |
| .css | CSS stylesheet |
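
The table above amounts to a case-insensitive lookup on the text after the last dot. The sketch below shows one way to write it; `type_from_extension` is a hypothetical name, and treating a name without any dot as "no extension" is an assumption of this example rather than documented Nash behavior.

```rust
/// Map a file name to a type string using the extension table above.
/// Hypothetical helper, not taken from Nash's source.
fn type_from_extension(name: &str) -> Option<&'static str> {
    // Assumption: a name without a dot has no extension to match.
    if !name.contains('.') {
        return None;
    }
    // rsplit('.') yields the text after the last dot first.
    let ext = name.rsplit('.').next()?.to_lowercase();
    let kind = match ext.as_str() {
        "rs" => "Rust source code",
        "toml" => "TOML configuration",
        "json" => "JSON data",
        "yaml" | "yml" => "YAML data",
        "md" => "Markdown document",
        "txt" => "ASCII text",
        "sh" => "shell script",
        "py" => "Python script",
        "js" => "JavaScript source",
        "html" => "HTML document",
        "css" => "CSS stylesheet",
        _ => return None,
    };
    Some(kind)
}
```

Lowercasing before the match is what lets a name such as NOTES.MD resolve to the same entry as notes.md.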
Examples
Identify a text file
Identify by extension
Identify a script with shebang
Identify multiple files
Identify a directory
Identify binary data
Check file types in directory
Detection Priority
- Directory check: If path is a directory, return “directory”
- Magic bytes: Check file header for known signatures
- Shebang: If file starts with #!, identify as script
- Extension: Match file extension against known types
- UTF-8 validation: If content is valid UTF-8, return “ASCII text”
- Fallback: Return “binary data”
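The ordering above can be read as a chain of early returns, where the first matching step wins. The following sketch assumes the caller already knows whether the path is a directory and has the file's bytes in memory; `classify` is a hypothetical function, the magic and extension steps are cut down to two entries each, and the "PNG", "PDF", and "script" labels are placeholders rather than Nash's exact output.

```rust
/// Hypothetical stand-in for the detection chain described above.
fn classify(is_dir: bool, name: &str, bytes: &[u8]) -> &'static str {
    // 1. Directory check.
    if is_dir {
        return "directory";
    }
    // 2. Magic bytes (only two signatures, to keep the sketch short).
    if bytes.starts_with(b"\x89PNG") {
        return "PNG";
    }
    if bytes.starts_with(b"%PDF") {
        return "PDF";
    }
    // 3. Shebang.
    if bytes.starts_with(b"#!") {
        return "script";
    }
    // 4. Extension (only two entries from the extension table).
    if let Some((_, ext)) = name.rsplit_once('.') {
        match ext.to_lowercase().as_str() {
            "rs" => return "Rust source code",
            "json" => return "JSON data",
            _ => {}
        }
    }
    // 5. UTF-8 validation.
    if std::str::from_utf8(bytes).is_ok() {
        return "ASCII text";
    }
    // 6. Fallback.
    "binary data"
}
```

Because content checks run before the extension step, a PDF saved as notes.json is still reported by its signature, which is the misleading-extension case this command is meant to handle.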
Error Handling
Missing operand
1
File doesn’t exist
Cannot read file
If a file exists but cannot be read, the VFS error is caught and reported.
Implementation Details
- Source: src/builtins/file.rs
- Uses vfs.exists() and vfs.is_dir() for basic checks
- Reads file bytes with vfs.read() for magic byte detection
- Function detect_type() contains detection logic
- Checks up to first 64 bytes for shebang lines
- Uses Rust’s std::str::from_utf8() for UTF-8 validation
- Extension extracted using rsplit('.') and converted to lowercase
The magic byte detection is limited to common formats. Specialized or less common file types may fall back to extension-based detection or “binary data”.
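As an illustration of the 64-byte shebang window and the std::str::from_utf8() call mentioned above, the sketch below pulls the interpreter path out of a #! line. `shebang_interpreter` is a hypothetical helper, and taking the first whitespace-separated token as the interpreter path is an assumption of this example; the real detect_type() may parse the line differently.

```rust
/// Extract the interpreter path from a shebang line, looking only at
/// the first 64 bytes. Hypothetical helper, not Nash's actual code.
fn shebang_interpreter(bytes: &[u8]) -> Option<&str> {
    // Limit the scan to the first 64 bytes of the file.
    let head = &bytes[..bytes.len().min(64)];
    if !head.starts_with(b"#!") {
        return None;
    }
    let rest = &head[2..];
    // The line ends at the first newline, or at the end of the window.
    let end = rest.iter().position(|&b| b == b'\n').unwrap_or(rest.len());
    // Non-UTF-8 content in the line is treated as "no interpreter found".
    let line = std::str::from_utf8(&rest[..end]).ok()?;
    // Assumption: the first whitespace-separated token is the path.
    line.split_whitespace().next()
}
```

With this reading, a script starting with #!/usr/bin/env python3 yields /usr/bin/env, since the interpreter arguments are not part of the path.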
Differences from Unix file
| Feature | Unix file | Nash file |
|---|---|---|
| Magic bytes | 1000+ types | ~7 types |
| MIME types | ✅ With -i | ❌ Not supported |
| Compressed files | Looks inside | Detects as Zip |
| Detailed info | Very detailed | Basic type |
| Database | Uses magic.mgc | Hardcoded |
| Symlinks | Follows/shows | VFS has no symlinks |
| -b (brief) | ✅ Supported | Always brief |
| -i (MIME) | ✅ Supported | ❌ Not supported |
