Under Linux, use wget to crawl the entire site


Last updated: 2024-04-29, edited by 888u

The basic command:

$ wget -r -p -np -k http://example.com/

-r: recursive download
-p, --page-requisites: download everything a page needs to display (images, CSS, and other page elements)
-np, --no-parent: do not ascend to the parent directory
-k, --convert-links: convert the links in the downloaded HTML pages to relative links, i.e. local links
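If you want the saved copy to open cleanly in a local browser, the command above can be combined with a few more options. The sketch below is only illustrative (the host, wait time, and rate limit are assumptions, not part of the original tip): -E (--adjust-extension) gives saved pages an .html extension even when the server URL ends in .php or .asp, --wait pauses between requests, and --limit-rate caps the download speed so the crawl stays polite.

$ wget -r -p -np -k -E --wait=1 --limit-rate=200k http://example.com/   # values are illustrative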

Here are a few more wget tips I have collected.

$ wget -r -np -nd http://example.com/packages/
This command downloads all files in the packages directory on http://example.com. -np keeps wget from traversing the parent directory, and -nd keeps it from recreating the remote directory structure on the local machine.

$ wget -r -np -nd --accept=iso http://example.com/centos-5/i386/
Similar to the previous command, but the added --accept=iso option tells wget to download only files with the iso extension in the i386 directory. You can also specify multiple extensions, separated by commas.

$ wget -i filename.txt
This command is often used for batch downloads: put the addresses of all files to be downloaded into filename.txt, and wget will download them all for you (a worked example follows below).

$ wget -c http://example.com/really-big-file.iso
The -c option resumes an interrupted download from where it left off.

$ wget -m -k (-H) http://www.example.com/
This command mirrors a website, and wget converts the links. If the site's images are hosted on another domain, add the -H option so wget will span hosts.
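As a worked example of the batch-download tip above, here is a hedged sketch; the file name and URLs are hypothetical, not taken from the original post. Each line of the list file holds one URL, and adding -c lets an interrupted batch pick up where it left off.

$ cat filename.txt          # hypothetical list file, one URL per line
http://example.com/centos-5/i386/disc1.iso
http://example.com/centos-5/i386/disc2.iso
$ wget -c -i filename.txt   # download every URL in the list, resuming any partial files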


All copyrights belong to 888u unless otherwise stated.