My spider has a serious memory leak. After 15 minutes of running, memory usage is at 5 GB, and Scrapy reports (using prefs()) that there are 900k Request objects alive, and that's all. What can be the reason for such a high number of living Request objects? The Request count only goes up and never goes down. All other objects are close to zero.

My spider looks like this:

    class ExternalLinkSpider(CrawlSpider):
        name = 'external_link_spider'
        allowed_domains = ['']
        start_urls = ['']

        rules = (Rule(LxmlLinkExtractor(allow=()), callback='parse_obj', follow=True),)

        def parse_obj(self, response):
            # Skip non-HTML responses (images, PDFs, etc.)
            if not isinstance(response, HtmlResponse):
                return
            # Extract only links that point outside the allowed domains
            for link in LxmlLinkExtractor(allow=(), deny=self.allowed_domains).extract_links(response):
                if not link.nofollow:
                    yield LinkCrawlItem(domain=link.url)

Here is the output of prefs():

    HtmlResponse            2   oldest: 0s ago
    ExternalLinkSpider      1   oldest: 3285s ago
    LinkCrawlItem           2   oldest: 0s ago
    Request ...