Downloads files from URLs over HTTP, HTTPS, FTP, or SFTP, with robust error handling, automatic retries, and support for resuming interrupted downloads, bandwidth limiting, and automatic extraction of compressed archives.

Usage

download_url(
  url,
  dest = basename(url),
  overwrite = FALSE,
  unzip = FALSE,
  verbose = TRUE,
  timeout = 600,
  headers = NULL,
  resume = FALSE,
  speed_limit = NULL,
  retries = 3
)

Arguments

url

Character string. Full URL to the file to download. Supports HTTP, HTTPS, FTP, and SFTP protocols.

dest

Character string. Destination file path. If not specified, uses the basename of the URL. Default: basename(url).

overwrite

Logical. Whether to overwrite existing files. Default: FALSE.

unzip

Logical. Whether to automatically extract compressed files after download. Supports .zip, .gz, and .tar.gz formats. Default: FALSE.

verbose

Logical. Whether to show download progress and status messages. Default: TRUE.

timeout

Numeric. Download timeout in seconds. Default: 600 (10 minutes).

headers

Named list. Custom HTTP headers for the request (e.g., list(Authorization = "Bearer token")). Default: NULL.

resume

Logical. Whether to attempt resuming interrupted downloads if a partial file exists. Default: FALSE.

speed_limit

Numeric. Bandwidth limit in bytes per second (e.g., 500000 = 500 KB/s). Default: NULL (no limit).

retries

Integer. Number of retry attempts on download failure. Default: 3.

Value

Invisibly returns a character string or character vector of file paths:

If unzip = FALSE

Path to the downloaded file

If unzip = TRUE

Vector of paths to extracted files
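
Because the result is returned invisibly, assign it if you need the path(s) afterwards. A minimal sketch using placeholder URLs (not run):

# Capture the invisible return value:
# csv_path <- download_url("https://example.com/data.csv")
# extracted <- download_url("https://example.com/archive.zip", unzip = TRUE)
# extracted  # character vector of paths to the extracted files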

Details

This function provides a comprehensive solution for downloading files:

Supported Protocols

Supports HTTP/HTTPS, FTP, and SFTP protocols.

Features

Includes a retry mechanism, resume support, bandwidth control, automatic extraction of compressed archives, progress tracking, and custom HTTP headers.
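
For example, a download over an unreliable connection might combine resume with extra retries and a bandwidth cap; a sketch with a placeholder URL (not run):

# Resume a partial download, retry up to 5 times, cap speed at ~250 KB/s:
# download_url(
#   url = "https://example.com/large-file.bin",
#   dest = file.path(tempdir(), "large-file.bin"),
#   resume = TRUE,
#   retries = 5,
#   speed_limit = 250000
# )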

Compression Support

Supports .zip, .gz, and .tar.gz formats.

Dependencies

Required packages: curl, cli, R.utils (automatically checked at runtime).
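
The runtime check presumably follows the standard requireNamespace() pattern; a hypothetical sketch, not the package's actual code:

# Hypothetical sketch of a runtime dependency check (not the package's code):
for (pkg in c("curl", "cli", "R.utils")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required but not installed.", call. = FALSE)
  }
}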

Examples

# Basic usage (commented to avoid network operations):
# download_url("https://example.com/data.csv")

# Advanced usage with custom settings:
# download_url(
#   url = "https://example.com/dataset.zip",
#   dest = file.path(tempdir(), "dataset.zip"),
#   unzip = TRUE,
#   resume = TRUE,
#   speed_limit = 1000000,
#   timeout = 1800
# )

# With authentication:
# download_url(
#   url = "https://api.example.com/data.json",
#   headers = list(Authorization = "Bearer token")
# )
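
# A further sketch covering the remaining arguments (overwrite, verbose,
# and retries), again with a placeholder URL:
# download_url(
#   url = "https://example.com/data.csv",
#   dest = file.path(tempdir(), "data.csv"),
#   overwrite = TRUE,
#   verbose = FALSE,
#   retries = 5
# )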