Note: I built this tool to help me yank files off my CDN into Hugo Page Bundles.
A Rust utility that automatically downloads images referenced in text-based files like HTML, Markdown, and CSS documents.
- Processes individual files or entire directories recursively
- Supports multiple file formats:
  - HTML (.html, .htm)
  - Markdown (.md)
  - CSS (.css)
  - Plain text (.txt)
  - XML (.xml)
- Handles various image formats: JPG, JPEG, PNG, SVG, and WebP
- Supports both relative and absolute URLs
- Maintains original file structure
- Skips existing files to avoid duplicate downloads
- Follows symbolic links when scanning directories
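The extension lists above could be checked with a small helper like the one below. This is a minimal sketch, not the tool's actual code: the names `has_extension`, `is_supported_source`, and `is_supported_image` are hypothetical, and case-insensitive matching is an assumption.

```rust
use std::path::Path;

/// Source file extensions from the feature list.
const SOURCE_EXTENSIONS: [&str; 6] = ["html", "htm", "md", "css", "txt", "xml"];

/// Image extensions from the feature list.
const IMAGE_EXTENSIONS: [&str; 5] = ["jpg", "jpeg", "png", "svg", "webp"];

/// Hypothetical helper: true when `path` ends in one of `allowed`.
/// Case-insensitive matching is an assumption, not confirmed behavior.
fn has_extension(path: &Path, allowed: &[&str]) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| allowed.iter().any(|a| a.eq_ignore_ascii_case(ext)))
        .unwrap_or(false)
}

/// True for files the tool would scan for image references.
fn is_supported_source(path: &Path) -> bool {
    has_extension(path, &SOURCE_EXTENSIONS)
}

/// True for paths that look like a supported image.
fn is_supported_image(path: &Path) -> bool {
    has_extension(path, &IMAGE_EXTENSIONS)
}
```

Files without an extension, such as `README`, are rejected by both checks because `Path::extension` returns `None` for them.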
- Rust (latest stable version)
- Cargo package manager
[dependencies]
tokio = { version = "1.0", features = ["full"] }
reqwest = "0.11"
anyhow = "1.0"
regex = "1.0"
url = "2.0"
walkdir = "2.0"
- Clone the repository:
git clone [repository-url]
cd imgdown
- Build the project:
cargo build --release
The compiled binary will be available in target/release/.
The application can process either a single file or an entire directory:
# Process a single file
./imgdown path/to/file.html
# Process an entire directory
./imgdown path/to/directory
./imgdown ./docs/blog
This will:
- Scan all supported files in the ./docs/blog directory
- Find image references in these files
- Download the images to the same directory structure as their referencing files
- Skip any images that have already been downloaded
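One way to picture the last two steps is a function that maps an image URL to a path next to the referencing file, plus an existence check. This is only a sketch under assumptions: `destination_for` and `needs_download` are hypothetical names, and saving the image under the URL's final path segment is a guess at the naming scheme, not documented behavior.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: where an image referenced by `source_file` would be saved.
/// Assumes the image keeps the last path segment of its URL as its file name.
fn destination_for(source_file: &Path, image_url: &str) -> Option<PathBuf> {
    // Drop any query string or fragment before looking for the file name.
    let without_suffix = image_url.split(|c: char| c == '?' || c == '#').next()?;
    let file_name = without_suffix.rsplit('/').next()?;
    if file_name.is_empty() {
        return None;
    }
    // Place the image in the same directory as the file that references it.
    let dir = source_file.parent().unwrap_or_else(|| Path::new("."));
    Some(dir.join(file_name))
}

/// Hypothetical helper: a download is only needed when nothing exists at the destination.
fn needs_download(destination: &Path) -> bool {
    !destination.exists()
}
```

With this scheme, an image referenced from `docs/blog/post/index.md` would land in `docs/blog/post/`, which is the layout a Hugo Page Bundle expects.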
- The program accepts a file or directory path as input
- For directories, it recursively scans for supported file types
- For each file, it:
  - Reads the content
  - Uses regular expressions to find image references
  - Downloads images from valid URLs
  - Preserves the directory structure
  - Skips existing files
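The reference-finding step can be illustrated without any dependencies. The tool itself uses regular expressions. The sketch below deliberately swaps in plain string splitting so it compiles with the standard library alone, and the function name `find_image_refs` and the delimiter set are assumptions.

```rust
/// Image extensions the tool handles, with the leading dot.
const IMAGE_SUFFIXES: [&str; 5] = [".jpg", ".jpeg", ".png", ".svg", ".webp"];

/// Hypothetical helper: collect tokens in `content` that look like image references.
/// Splits on whitespace, quotes, parentheses, and angle brackets, which covers
/// HTML attributes, Markdown image syntax, and CSS `url(...)` values.
fn find_image_refs(content: &str) -> Vec<String> {
    let is_delimiter =
        |c: char| c.is_whitespace() || matches!(c, '"' | '\'' | '(' | ')' | '<' | '>');
    let mut found: Vec<String> = Vec::new();
    for token in content.split(is_delimiter) {
        let lower = token.to_ascii_lowercase();
        let is_image = IMAGE_SUFFIXES.iter().any(|ext| lower.ends_with(*ext));
        // Keep the first occurrence only, so each image is considered once.
        if is_image && !found.iter().any(|f| f.as_str() == token) {
            found.push(token.to_string());
        }
    }
    found
}
```

This simplified scanner misses references that carry a query string after the extension (for example `cat.png?v=2`), which is one reason a regular expression is the better fit in practice.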
- Invalid paths result in appropriate error messages
- Download failures are logged but don't stop the process
- File access issues are reported with detailed error messages
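The "log and continue" behavior for download failures might look roughly like the loop below. It is a sketch only: `Summary` and `download_all` are hypothetical names, the `download` closure stands in for the real HTTP call, and a plain `String` error is used here instead of the `anyhow` error type the project depends on.

```rust
/// Hypothetical tally of one run.
#[derive(Debug, PartialEq)]
struct Summary {
    downloaded: usize,
    failed: usize,
}

/// Calls `download` for every URL. A failure is logged to stderr and counted,
/// and the loop then moves on to the next URL instead of aborting.
fn download_all<F>(urls: &[&str], mut download: F) -> Summary
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut summary = Summary { downloaded: 0, failed: 0 };
    for &url in urls {
        match download(url) {
            Ok(()) => summary.downloaded += 1,
            Err(err) => {
                eprintln!("failed to download {}: {}", url, err);
                summary.failed += 1;
            }
        }
    }
    summary
}
```

Passing the download step in as a closure keeps the loop testable without network access. A stub that fails for one URL can confirm that the remaining URLs are still processed.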
- Only processes files with supported extensions
- Requires valid URL formatting in source files
- Does not validate image content
- Does not process JavaScript-generated image references
Contributions are welcome! Here are some ways you can contribute:
- Report bugs
- Suggest new features
- Add support for more file types
- Improve error handling
- Enhance documentation
Chris Short chrisshort@duck.com
- Created using Anthropic Claude 3.5 Sonnet