Crawling with Scrapy through a GoAgent proxy

Assume GoAgent is listening at http://127.0.0.1:8087, that you have created a Scrapy project named myscrapy, and that your current working directory is myscrapy.
    # file: myscrapy/settings.py
    ...
    # The GoAgent proxy address is stored in USER_AGENT and read back by the middleware below.
    USER_AGENT = 'http://127.0.0.1:8087'

    DOWNLOADER_MIDDLEWARES = {
        'myscrapy.middlewares.MyProxyMiddleware': 100,
        'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
        # Disabled; otherwise the USER_AGENT value above would be sent as the User-Agent header.
        'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    }
    ...

    # file: myscrapy/middlewares.py
    from myscrapy.settings import USER_AGENT


    class MyProxyMiddleware(object):
        def process_request(self, request, spider):
            # Route every request through the GoAgent proxy.
            request.meta['proxy'] = USER_AGENT
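As a possible variation on the middleware above, the proxy address could live in its own setting instead of reusing USER_AGENT, so the User-Agent middleware would not need to be disabled. The sketch below is only an illustration: GOAGENT_PROXY and GoAgentProxyMiddleware are hypothetical names introduced here, not part of Scrapy or of the original snippet. It relies on Scrapy's from_crawler hook to read the setting, and falls back to the GoAgent address mentioned earlier.

```python
# Sketch: a proxy middleware that reads the proxy address from a dedicated setting.
# GOAGENT_PROXY is a hypothetical setting name; Scrapy does not define it.

DEFAULT_PROXY = "http://127.0.0.1:8087"  # GoAgent address used in this post


class GoAgentProxyMiddleware:
    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook when it builds the middleware.
        # settings.get returns the default when GOAGENT_PROXY is not set.
        return cls(crawler.settings.get("GOAGENT_PROXY", DEFAULT_PROXY))

    def process_request(self, request, spider):
        # Keep a proxy that was already chosen for this request, if any.
        request.meta.setdefault("proxy", self.proxy_url)
        return None  # let Scrapy continue handling the request
```

To try it, the class would be registered in DOWNLOADER_MIDDLEWARES under a path such as 'myscrapy.middlewares.GoAgentProxyMiddleware' (again an assumed location), with GOAGENT_PROXY set in settings.py and USER_AGENT left free for a real user agent string.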
