CLI tool to fetch a webpage's contents after it has finished loading
webpage-dl
Fetch a webpage’s contents via CLI after it has finished loading.
Scraping HTML from webpages can’t always be done via curl, as it is often useful to let the page’s JavaScript play out first. webpage-dl launches a WKWebView and only reads the source of the page once loading has completed. webpage-dl even allows you to provide your own JavaScript expression, so scraping specific HTML nodes has never been easier.
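As an illustration of scraping specific nodes, here is a minimal Python sketch that builds a JavaScript expression from a CSS selector. The helper name `text_of_all` is hypothetical and not part of webpage-dl. It also assumes that a string result from the expression is printed to stdout as-is.

```python
import json


def text_of_all(selector: str) -> str:
    """Build a JavaScript expression that returns the trimmed text of every
    node matching `selector`, joined with newlines.

    Hypothetical helper for illustration; not part of webpage-dl.
    """
    # json.dumps produces a double-quoted string literal that is also valid
    # JavaScript, so quotes inside the selector are escaped safely.
    quoted = json.dumps(selector)
    return (
        f"Array.from(document.querySelectorAll({quoted}))"
        ".map(node => node.textContent.trim())"
        ".join('\\n')"  # the backslash and 'n' reach JavaScript as a newline escape
    )
```

The returned string can be passed as the value of --expression, for example `text_of_all("h2 a")` to list the link text inside every second-level heading.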
Installation
- Clone this repo
- Run make install (requires Xcode 13+)
Usage
USAGE: webpage-dl [--width <width>] [--height <height>] [--timeout <timeout>] [--delay <delay>] [--expression <expression>] <url>

ARGUMENTS:
  <url>                   The URL of the webpage to download.

OPTIONS:
  -w, --width <width>     The width of the simulated web page. (default: 500)
  -h, --height <height>   The height of the simulated web page. (default: 500)
  --timeout <timeout>     The maximum number of seconds the program should run
                          for before exiting (in the event of slow loading).
                          (default: 20.0)
  -d, --delay <delay>     An additional time to wait before executing
                          'expression'.
  --expression <expression>
                          The JavaScript expression to run. The output from
                          this expression will be printed to stdout before the
                          program exits. (default:
                          document.documentElement.outerHTML.toString())
  -h, --help              Show help information.
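
For calling the tool from a script, the following Python sketch assembles the command line from the long options listed above and captures stdout. The function names `build_command` and `fetch` are hypothetical. It assumes webpage-dl is on your PATH after installation and that it exits with a non-zero status on failure, which the help text does not confirm.

```python
import subprocess
from typing import List, Optional


def build_command(
    url: str,
    *,
    width: Optional[int] = None,
    height: Optional[int] = None,
    timeout: Optional[float] = None,
    delay: Optional[float] = None,
    expression: Optional[str] = None,
) -> List[str]:
    """Return the argument list for a webpage-dl invocation.

    Options left as None are omitted, so the tool's own defaults apply.
    """
    options = {
        "--width": width,
        "--height": height,
        "--timeout": timeout,
        "--delay": delay,
        "--expression": expression,
    }
    args = ["webpage-dl"]
    for flag, value in options.items():
        if value is not None:
            args += [flag, str(value)]
    args.append(url)  # the URL is the single positional argument
    return args


def fetch(url: str, **options) -> str:
    """Run webpage-dl and return whatever it printed to stdout."""
    result = subprocess.run(
        build_command(url, **options),
        capture_output=True,
        text=True,
        check=True,  # assumption: failures are reported via a non-zero exit status
    )
    return result.stdout
```

For example, `fetch("https://example.com", width=1280, delay=2, expression="document.title")` would run the page in a 1280-wide view, wait two extra seconds, and return the page title. Passing the arguments as a list avoids shell quoting problems with the JavaScript expression.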