Use a Tumblr crawler with h5ai to create a private video library and photo gallery

888u

Last updated: 2024-04-30, edited by 888u

Tumblr is rich in content, especially pictures and videos. This guide uses a Tumblr crawler together with h5ai, a program that serves a directory listing straight from disk, to build a gallery; other image-hosting programs would also work. CentOS 7 ships with Python 2.7, so the steps below use CentOS 7 and Python 2.7 on a LAMP stack, with h5ai as the gallery front end.

1. Install and use h5ai

1. Install the LAMP one-click installation package

yum -y install wget screen
wget http://mirrors.linuxeye.com/lnmp-full.tar.gz
tar xzf lnmp-full.tar.gz
cd lnmp
screen -S lnmp
./install.sh

During installation, select Apache, PHP, and MySQL, and skip the other optional components.

2. Create a site

./vhost.sh

Follow the prompts. In this example the site created is t.sib8.net. For a detailed tutorial, see: OneinStack: lnmp, lamp, lnmpa one-click installation package (supports HHVM)

3. Install h5ai

Enter the t.sib8.net directory, then download and unpack h5ai:

cd /data/wwwroot/t.sib8.net/
wget https://down.us.com/code/h5ai-0.29-mod.zip
unzip h5ai-0.29-mod.zip

4. Modify the configuration file

vi /usr/local/apache/conf/vhost/t.sib8.net.conf

Change

DirectoryIndex index.html index.php

to

DirectoryIndex index.html index.php /_h5ai/server/php/index.php

Restart Apache

service httpd restart
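The edit in step 4 can also be scripted instead of made by hand in vi. The following is a minimal sketch, assuming the vhost file contains a DirectoryIndex line like the one shown above; the function name add_h5ai_index is hypothetical and not part of OneinStack or h5ai.

```shell
# Append the h5ai index script to the DirectoryIndex line of a vhost file.
# add_h5ai_index is a hypothetical helper name used only for illustration.
add_h5ai_index() (
    conf="$1"
    h5ai_index="/_h5ai/server/php/index.php"
    # Do nothing if the h5ai entry is already present, so reruns are harmless.
    if grep -q "DirectoryIndex.*${h5ai_index}" "$conf"; then
        exit 0
    fi
    # Keep a backup, then append the h5ai entry to each DirectoryIndex line.
    cp "$conf" "$conf.bak"
    sed "s|^\([[:space:]]*DirectoryIndex .*\)\$|\1 ${h5ai_index}|" "$conf.bak" > "$conf"
)
```

For the site in this guide it would be called as add_h5ai_index /usr/local/apache/conf/vhost/t.sib8.net.conf, followed by the Apache restart. The original file is kept next to it with a .bak suffix.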

5. Install ffmpeg to preview videos

Install the EPEL repository:

yum -y install epel-release

Install the RPM Fusion and Nux Dextop repositories:

su -c 'yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm'
rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-1.el7.nux.noarch.rpm

Install ffmpeg:

yum -y install ffmpeg ffmpeg-devel

2. Use the tumblr-crawler crawler

1. Install possible dependencies

yum install openssl-devel bzip2-devel expat-devel gdbm-devel readline-devel sqlite-devel
yum -y install gcc automake autoconf libtool make
yum install gcc gcc-c++
yum -y install readline-devel

2. Install tumblr-crawler

cd /data/wwwroot/t.sib8.net/
git clone https://github.com/dixudx/tumblr-crawler.git
cd tumblr-crawler
pip install -r requirements.txt
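The commands in this step rely on git and pip, which the dependency list above does not install. A small check like the following can show what is missing before the clone fails halfway. This is a sketch; missing_tools is a hypothetical helper name.

```shell
# Print the name of every required command that is not found on PATH.
# missing_tools is a hypothetical helper name used only for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools git python pip ffmpeg prints nothing when everything is available. If git or pip is listed, installing them with yum -y install git python-pip is one likely fix on CentOS 7, assuming the EPEL repository from the ffmpeg step is enabled.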

3. Use tumblr-crawler to download pictures and videos

a. Add the Tumblr blogs to sites.txt, using only the part of each address before .tumblr.com. For example, for wanimal1983.tumblr.com and ma-tro.tumblr.com the file contains:

wanimal1983,ma-tro

After saving, run

python tumblr-photo-video-ripper.py

b. Or pass the blog names directly on the command line

python tumblr-photo-video-ripper.py wanimal1983,ma-tro
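For option a, sites.txt can also be generated from the shell, which helps when the blog list is long or maintained elsewhere. A minimal sketch, assuming the comma-separated format shown above; write_sites is a hypothetical helper name.

```shell
# Write a comma-separated list of blog names to the given file.
# write_sites is a hypothetical helper name used only for illustration.
write_sites() (
    out="$1"
    shift
    list=""
    for name in "$@"; do
        # Accept either "blog" or "blog.tumblr.com" and keep only the blog name.
        name="${name%.tumblr.com}"
        # Join names with commas, without a leading comma on the first one.
        list="${list:+$list,}$name"
    done
    printf '%s\n' "$list" > "$out"
)
```

For example, write_sites sites.txt wanimal1983.tumblr.com ma-tro.tumblr.com produces a file containing wanimal1983,ma-tro.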

4. All pictures and videos are saved under the current path, in one folder per blog, named after the Tumblr blog
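To confirm that a run actually downloaded something, the per-blog folders can be summarized from the shell. The sketch below assumes that every non-hidden subfolder of the given directory is a blog folder; count_downloads is a hypothetical helper name.

```shell
# Print "<blog> <file count>" for each non-hidden subfolder of a directory.
# count_downloads is a hypothetical helper name used only for illustration.
count_downloads() (
    cd "$1" || exit 1
    for dir in */; do
        # Skip the unexpanded pattern when there are no subfolders at all.
        [ -d "$dir" ] || continue
        n=$(find "$dir" -type f | wc -l)
        # $((n)) strips the padding some wc implementations add.
        printf '%s %s\n' "${dir%/}" "$((n))"
    done
)
```

With the layout used in this guide it would be run as count_downloads /data/wwwroot/t.sib8.net/tumblr-crawler.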

5. Optionally, combine with tumblr_spider to collect additional blog names and video addresses

3. Use scheduled tasks to automatically update videos and pictures

crontab -e

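Add the following line:

0 */6 * * * cd /data/wwwroot/t.sib8.net/tumblr-crawler && python tumblr-photo-video-ripper.py

This runs the crawler every 6 hours. The cd is needed because downloads are saved relative to the current path, and a cron job does not start in the crawler directory.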
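Two details matter for a scheduled run: the crawler saves into the current path, so the job's working directory decides where files land, and a slow run can still be active when the next one starts. The wrapper below is a sketch of one way to handle both, using a lock directory; run_locked and the lock path are hypothetical names, not part of tumblr-crawler.

```shell
# Run a command from a given directory unless a previous run still holds the lock.
# run_locked is a hypothetical helper name used only for illustration.
run_locked() (
    workdir="$1"
    lockdir="$2"
    shift 2
    # mkdir is atomic: it fails if the lock directory already exists.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        exit 0
    fi
    status=0
    # Run the command from the working directory and remember its exit status.
    ( cd "$workdir" && "$@" ) || status=$?
    rmdir "$lockdir"
    exit "$status"
)
```

Saved in a script, the scheduled command would become run_locked /data/wwwroot/t.sib8.net/tumblr-crawler /tmp/tumblr-crawler.lock python tumblr-photo-video-ripper.py. If a run is killed before it finishes, the lock directory stays behind and has to be removed by hand.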


All copyrights belong to 888u unless otherwise stated.