Downloading the images from an HTML page saved locally

Date: 2020-09-12 21:25:38

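Original post by xdd1997
Page URL used: https://www.tool22.com/zb_tools/html/PCwallpaper/
Download link for the saved page used
How to save the page: in Firefox, open the menu at the top right and choose "Save Page As"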

# written by xdd1997  xdd2026@qq.com
# 2020-08-31
# encoding=utf-8

import os
import re
import time
import urllib.request

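# Read the saved page and collect every greedy match of "http://p ... .jpg"
p1 = re.compile(r'http://p.*\.jpg')
matches = []
with open('D:\\桌面\\爱情美图 - 在线壁纸.txt', mode='r', encoding='utf-8') as file:
    for line in file:
        # findall returns an empty list when nothing matches, so extend is always safe
        matches.extend(p1.findall(line))

# A greedy match can span many attributes, so split it on spaces and
# keep only the pieces that contain "__85"
piclist = []
for ii in matches:
    for jj in ii.split(' '):
        if "__85" in jj:
            piclist.append(jj)
print(piclist)

# Each piece is expected to look like attr="http://...", so the link is the
# text after the first double quote; pieces without one are skipped
picHttp = []
for ii in piclist:
    parts = ii.split('"')
    if len(parts) > 1 and parts[1]:
        picHttp.append(parts[1])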

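# ------------- create the output directory (recursively) -----------
path = r"D:\桌面\爬取的图片"
os.makedirs(path, exist_ok=True)
for idx, http in enumerate(picHttp, start=1):
    # Timestamp plus a running index, so that images saved within the same
    # second do not overwrite each other
    name = time.strftime("%Y%m%d%H%M%S", time.localtime()) + '_' + str(idx)
    filesavepath = os.path.join(path, name + '.jpg')
    print('Downloading ---')
    urllib.request.urlretrieve(http, filesavepath)
print('Download finished, saved to ' + path)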

This all comes down to my not knowing regular expressions; otherwise, extracting the image links above would take just one line......
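
For what it's worth, here is a minimal sketch of what that single regular expression might look like. It assumes, as the script above implies, that the wanted links start with http://p, contain "__85", and end in .jpg, and additionally that a link never contains quotes, whitespace, or angle brackets. The names LINK_PATTERN and extract_image_links, and the sample markup with its example.com host, are made up for illustration and are not taken from the real page.

```python
import re

# Assumption: wanted links start with "http://p", contain "__85", end in ".jpg",
# and never contain quotes, whitespace or angle brackets.
LINK_PATTERN = re.compile(r'http://p[^"\'\s<>]*__85[^"\'\s<>]*?\.jpg')


def extract_image_links(html):
    """Return matching image links in order of first appearance, without duplicates."""
    return list(dict.fromkeys(LINK_PATTERN.findall(html)))


# Made-up sample markup (example.com is a placeholder host)
sample = (
    '<img src="http://p1.example.com/bdr/__85/t01abc.jpg" alt="a"> '
    '<img src="http://p2.example.com/other/t02def.jpg"> '
    '<a href="http://p1.example.com/bdr/__85/t01abc.jpg">again</a>'
)
print(extract_image_links(sample))
```

Because the character class stops at quotes and whitespace, each match is a single link rather than one long greedy span, so no splitting on spaces or quotes is needed afterwards. The returned list could then be passed to the download loop in place of picHttp.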

That's it for now.