虫言虫语

Setting an HTTP Proxy for Scrapy

Posted in Python · Read (265323) · Please credit the source when reposting this article!

Create a .py file in your Scrapy project directory and fill it with the following:

# base64 is only needed if the proxy requires authentication
import base64


# Downloader middleware that routes every request through a proxy
class ProxyMiddleware(object):
    # Override process_request, which Scrapy calls for each outgoing request
    def process_request(self, request, spider):
        # Set the location of the proxy
        request.meta['proxy'] = "http://proxyIP:port"

        # Uncomment the following lines if your proxy requires authentication
        # proxy_user_pass = "USERNAME:PASSWORD"
        # Set up HTTP Basic authentication for the proxy.
        # base64.encodestring was removed in Python 3.9; b64encode takes bytes and adds no trailing newline
        # encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        # request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
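
The authentication step can be tried on its own before wiring it into the middleware. The following is a minimal sketch, assuming HTTP Basic authentication; the helper name basic_proxy_auth and the credentials "user" / "pass" are placeholders introduced here for illustration, not part of Scrapy.

```python
import base64


def basic_proxy_auth(username: str, password: str) -> bytes:
    """Build a Proxy-Authorization header value for HTTP Basic auth."""
    # Credentials are joined with a colon and encoded to bytes first,
    # because base64.b64encode only accepts bytes-like input.
    credentials = f"{username}:{password}".encode("utf-8")
    # b64encode, unlike the removed encodestring, adds no trailing newline.
    return b"Basic " + base64.b64encode(credentials)


# Placeholder credentials, for illustration only.
header_value = basic_proxy_auth("user", "pass")
print(header_value)  # b'Basic dXNlcjpwYXNz'
```

Inside process_request, the result could then be assigned with request.headers['Proxy-Authorization'] = basic_proxy_auth(...).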

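Add the following to settings.py (replace project_name with your own project name, followed by the module path of the .py file you created, and finally the class name):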

DOWNLOADER_MIDDLEWARES = {
    # scrapy.contrib.downloadermiddleware was renamed to scrapy.downloadermiddlewares in Scrapy 1.0
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
    'project_name.middlewares.ProxyMiddleware': 100,
}
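
The numbers in this setting decide the order in which middlewares run. Scrapy calls process_request in increasing order of these values, and a value of None disables a middleware. The sketch below only imitates that ordering in plain Python to show why ProxyMiddleware (100) runs before HttpProxyMiddleware (110). The function request_order is a hypothetical helper, not Scrapy's own code, and it ignores the built-in defaults that Scrapy merges in. The built-in middleware path shown is the one used by Scrapy 1.0 and later.

```python
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
    "project_name.middlewares.ProxyMiddleware": 100,
}


def request_order(middlewares):
    """Return enabled middleware paths in the order process_request runs."""
    # A priority of None means the middleware is disabled.
    enabled = {path: prio for path, prio in middlewares.items() if prio is not None}
    # Lower numbers are called first on the way to the downloader.
    return sorted(enabled, key=enabled.get)


order = request_order(DOWNLOADER_MIDDLEWARES)
for path in order:
    print(path)
```

With the values above, the custom middleware sets request.meta['proxy'] first, so the proxy is already in place when the built-in proxy middleware sees the request.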