
Syntax

file FILE...

Description

The file command identifies file types by examining file contents (magic bytes) and file extensions. It’s useful for determining what kind of data a file contains, especially when the extension is missing or misleading.
Nash’s implementation checks file signatures (magic bytes) first and falls back to extension-based heuristics when no signature matches.

Parameters

FILE (string, required)
One or more files to identify. Supports multiple files in a single command.

Behavior

  • Magic byte detection: Checks file headers for known signatures (ELF, PNG, JPEG, PDF, etc.)
  • Shebang detection: Identifies script files by their #! line
  • Extension fallback: Uses file extension when magic bytes don’t match
  • UTF-8 heuristic: Classifies valid UTF-8 as “ASCII text”
  • Binary data: Anything that’s not valid UTF-8 is “binary data”
  • Directories: Identified as “directory”

Supported File Types

Magic Byte Detection

| Reported type | Detected when |
| --- | --- |
| ELF executable | Files starting with \x7fELF |
| PNG image data | Files starting with \x89PNG |
| JPEG image data | Files starting with \xff\xd8\xff |
| GIF image data | Files starting with GIF8 |
| Zip archive data | Files starting with PK\x03\x04 |
| PDF document | Files starting with %PDF |
| script | Files starting with #! (shebang); the output shows the interpreter path |
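
The signature checks above amount to a prefix match against a small table. The following is a minimal sketch of that idea, not Nash’s actual code: the names SIGNATURES and detect_magic are hypothetical, and only the byte sequences and labels come from the list above. Shebang handling is left out here.

```rust
/// Signature table built from the documented list: (leading bytes, reported type).
/// The constant name is hypothetical.
const SIGNATURES: &[(&[u8], &str)] = &[
    (b"\x7fELF", "ELF executable"),
    (b"\x89PNG", "PNG image data"),
    (b"\xff\xd8\xff", "JPEG image data"),
    (b"GIF8", "GIF image data"),
    (b"PK\x03\x04", "Zip archive data"),
    (b"%PDF", "PDF document"),
];

/// Hypothetical helper: returns the label of the first signature
/// that `bytes` starts with, or None if nothing matches.
fn detect_magic(bytes: &[u8]) -> Option<&'static str> {
    SIGNATURES
        .iter()
        .find(|(sig, _)| bytes.starts_with(sig))
        .map(|(_, label)| *label)
}

fn main() {
    // A PNG header is recognized; plain text is not.
    println!("{:?}", detect_magic(b"\x89PNG\r\n\x1a\n"));
    println!("{:?}", detect_magic(b"hello"));
}
```

Keeping the signatures in one table means supporting another format is a one-line addition, which fits the hardcoded approach the documentation describes.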

Extension-Based Detection

| Extension | Identified Type |
| --- | --- |
| .rs | Rust source code |
| .toml | TOML configuration |
| .json | JSON data |
| .yaml, .yml | YAML data |
| .md | Markdown document |
| .txt | ASCII text |
| .sh | shell script |
| .py | Python script |
| .js | JavaScript source |
| .html | HTML document |
| .css | CSS stylesheet |
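
As a rough illustration of the extension lookup, the sketch below takes the text after the last dot with rsplit('.'), lowercases it, and maps it to the labels in the table. The function name type_from_extension is hypothetical, and treating a name with no dot as having no extension is an assumption; Nash’s real behavior for that case is not documented here.

```rust
/// Hypothetical helper: maps a file name to the extension-based label, if any.
fn type_from_extension(name: &str) -> Option<&'static str> {
    // rsplit('.') yields the text after the last dot first.
    let mut parts = name.rsplit('.');
    let ext = parts.next()?;
    // Assumption: if there is no second piece, the name has no dot,
    // so it is treated as having no extension.
    parts.next()?;
    let label = match ext.to_lowercase().as_str() {
        "rs" => "Rust source code",
        "toml" => "TOML configuration",
        "json" => "JSON data",
        "yaml" | "yml" => "YAML data",
        "md" => "Markdown document",
        "txt" => "ASCII text",
        "sh" => "shell script",
        "py" => "Python script",
        "js" => "JavaScript source",
        "html" => "HTML document",
        "css" => "CSS stylesheet",
        _ => return None,
    };
    Some(label)
}

fn main() {
    // Matching is case-insensitive because the extension is lowercased first.
    println!("{:?}", type_from_extension("Config.JSON"));
    println!("{:?}", type_from_extension("Makefile"));
}
```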

Examples

Identify a text file

file welcome.txt
welcome.txt: ASCII text

Identify by extension

file config.json
config.json: JSON data

Identify a script with shebang

echo "#!/bin/nash" > script.sh
echo "echo hello" >> script.sh
file script.sh
script.sh: script (#!/bin/nash)

Identify multiple files

file report.txt data.json image.png
report.txt: ASCII text
data.json: JSON data
image.png: PNG image data

Identify a directory

file Documents/
Documents/: directory

Identify binary data

# unknown.bin holds bytes that are not valid UTF-8 and match no known signature
file unknown.bin
unknown.bin: binary data

Check file types in directory

ls | while read f; do file "$f"; done
Desktop: directory
Documents: directory
config.json: JSON data
welcome.txt: ASCII text

Detection Priority

  1. Directory check: If path is a directory, return “directory”
  2. Magic bytes: Check file header for known signatures
  3. Shebang: If file starts with #!, identify as script
  4. Extension: Match file extension against known types
  5. UTF-8 validation: If content is valid UTF-8, return “ASCII text”
  6. Fallback: Return “binary data”
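
The ordering above can be sketched as a chain of early returns. This is an illustrative outline only: the documentation names a detect_type() function but does not show its signature, so the parameters here (name, is_dir, bytes) are assumptions, and the signature and extension tables are abbreviated to two entries each.

```rust
/// Sketch of the documented priority order. The signature is assumed,
/// and the lookup tables are deliberately abbreviated.
fn detect_type(name: &str, is_dir: bool, bytes: &[u8]) -> String {
    // 1. Directory check
    if is_dir {
        return "directory".to_string();
    }
    // 2. Magic bytes (two sample signatures only)
    if bytes.starts_with(b"\x89PNG") {
        return "PNG image data".to_string();
    }
    if bytes.starts_with(b"%PDF") {
        return "PDF document".to_string();
    }
    // 3. Shebang: report the first line as the interpreter
    if bytes.starts_with(b"#!") {
        let line = bytes.split(|&b| b == b'\n').next().unwrap_or(bytes);
        return format!("script ({})", String::from_utf8_lossy(line).trim_end());
    }
    // 4. Extension (two sample extensions only)
    let ext = name.rsplit_once('.').map(|(_, ext)| ext.to_lowercase());
    match ext.as_deref() {
        Some("json") => return "JSON data".to_string(),
        Some("md") => return "Markdown document".to_string(),
        _ => {}
    }
    // 5. UTF-8 validation, then 6. fallback
    if std::str::from_utf8(bytes).is_ok() {
        "ASCII text".to_string()
    } else {
        "binary data".to_string()
    }
}

fn main() {
    // The shebang wins over the .sh extension because it is checked first.
    println!("{}", detect_type("script.sh", false, b"#!/bin/nash\necho hello\n"));
    println!("{}", detect_type("config.json", false, b"{}"));
}
```

Because each step returns as soon as it matches, a PNG file named photo.txt is reported as PNG image data rather than ASCII text.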

Error Handling

Missing operand

file
file: missing operand
Exit code: 1

File doesn’t exist

file nonexistent.txt
nonexistent.txt: cannot open (No such file or directory)

Cannot read file

If a file exists but cannot be read, the VFS error is caught and reported.

Implementation Details

  • Source: src/builtins/file.rs
  • Uses vfs.exists() and vfs.is_dir() for basic checks
  • Reads file bytes with vfs.read() for magic byte detection
  • Function detect_type() contains detection logic
  • Checks up to the first 64 bytes for shebang lines
  • Uses Rust’s std::str::from_utf8() for UTF-8 validation
  • Extension extracted using rsplit('.') and converted to lowercase

Note: The magic byte detection is limited to common formats. Specialized or less common file types may fall back to extension-based detection or “binary data”.
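
The shebang and UTF-8 checks listed under Implementation Details could look roughly like the sketch below. The helper names shebang_line and text_or_binary are hypothetical, and the way the 64-byte limit is applied (truncate to 64 bytes, then cut at the first newline) is an assumption; only the limit itself and the use of std::str::from_utf8() come from the list.

```rust
/// Hypothetical helper: returns the shebang line if `bytes` starts with "#!".
fn shebang_line(bytes: &[u8]) -> Option<String> {
    if !bytes.starts_with(b"#!") {
        return None;
    }
    // Assumption: inspect no more than the first 64 bytes.
    let window = &bytes[..bytes.len().min(64)];
    // Keep everything before the first newline inside that window.
    let line = window.split(|&b| b == b'\n').next().unwrap_or(window);
    Some(String::from_utf8_lossy(line).trim_end().to_string())
}

/// Hypothetical helper: valid UTF-8 is reported as text, anything else as binary.
fn text_or_binary(bytes: &[u8]) -> &'static str {
    if std::str::from_utf8(bytes).is_ok() {
        "ASCII text"
    } else {
        "binary data"
    }
}

fn main() {
    println!("{:?}", shebang_line(b"#!/bin/bash\necho \"Deploying...\"\n"));
    println!("{}", text_or_binary(&[0xff, 0xfe, 0x00]));
}
```

Note that under this rule any valid UTF-8 content, including non-ASCII characters, is still labeled “ASCII text”, which matches the UTF-8 heuristic described under Behavior.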

Differences from Unix file

| Feature | Unix file | Nash file |
| --- | --- | --- |
| Magic bytes | 1000+ types | ~7 types |
| MIME types | ✅ With -i | ❌ Not supported |
| Compressed files | Looks inside | Detects as Zip |
| Detailed info | Very detailed | Basic type |
| Database | Uses magic.mgc | Hardcoded |
| Symlinks | Follows/shows | VFS has no symlinks |
| -b (brief) | ✅ Supported | Always brief |
| -i (MIME) | ✅ Supported | ❌ Not supported |

Use Cases

Verify file before processing

if file data.txt | grep -q "ASCII text"; then
  grep ERROR data.txt
fi

Filter by file type

for f in *; do
  if file "$f" | grep -q "JSON data"; then
    echo "Processing JSON: $f"
    jq . "$f"
  fi
done

Identify scripts

find . -type f | while read f; do
  # Match only the reported type; a name like description.txt also contains "script"
  if file "$f" | cut -d: -f2 | grep -q "script"; then
    echo "Script found: $f"
  fi
done

Check uploaded file type

# nash --bind ./uploads:/uploads
file /uploads/new_file

Distinguish text from binary

if file document.pdf | grep -q "binary\|PDF"; then
  echo "Cannot cat binary file"
else
  cat document.pdf
fi

Common Patterns

Type-based processing

for file in *; do
  type=$(file "$file" | cut -d: -f2)
  echo "$file is $type"
done

Validate file types before import

#!/bin/nash
for upload in uploads/*; do
  # Keep only the reported type so the filename cannot trigger a match
  type=$(file "$upload" | cut -d: -f2)
  case "$type" in
    *"JSON"*)
      jq . "$upload" > "validated/$(basename "$upload")"
      ;;
    *"ASCII text"*)
      cat "$upload" > "validated/$(basename "$upload")"
      ;;
    *)
      echo "Unsupported type: $type"
      ;;
  esac
done

Find all images

find . -type f | while read f; do
  # Match only the reported type; a name like image_list.txt also contains "image"
  if file "$f" | cut -d: -f2 | grep -q "image"; then
    echo "$f"
  fi
done

Safe file operations

if test -f input.dat; then
  filetype=$(file input.dat | cut -d: -f2)
  echo "Processing file of type: $filetype"
fi

Detection Examples

Various file types

user@nash:/home/user$ file *
Documents: directory
config.json: JSON data
data.csv: ASCII text
image.png: PNG image data
script.sh: shell script
README.md: Markdown document
Cargo.toml: TOML configuration
main.rs: Rust source code
binary.dat: binary data

Shebang detection

user@nash:/home/user$ cat deploy.sh
#!/bin/bash
echo "Deploying..."

user@nash:/home/user$ file deploy.sh
deploy.sh: script (#!/bin/bash)

Related Commands

  • stat - Display file status and size
  • ls - List files
  • cat - Display file contents
  • grep - Search file contents
  • find - Find files by criteria
