python - Save Scrapy output in individual JSON objects with while-loop


I am using a while-loop to scrape several fields on a webpage. I want to save the output of every iteration of the loop in an individual JSON object.

This works on my machine (Scrapy 0.24.6, Python 2.7.5), but not on the SSH server (Scrapy 1.0.1, Python 2.7.6). I want to write an item pipeline or item exporter to ensure that every iteration of the loop is saved as a single JSON object when running the script on the SSH server.

This is my Python code:

from scrapy.spiders import Spider
from blogtexts.items import BlogItem


class BlogText1Spider(Spider):
    name = "texts1"
    allowed_domains = ["blogger.ba"]
    start_urls = ["http://www.blogger.ba/profil/soko/blogovi/str1"]

    def parse(self, response):
        # XPath positions are 1-based.
        position = 1

        # Keep going while a link exists at the current position.
        while response.xpath(''.join(["//a[@class='blog'][", str(position), "]/@href"])).extract():
            item = BlogItem()
            item["blog"] = response.xpath(''.join(["//a[@class='blog'][", str(position), "]/@href"])).extract()
            item["blogfavoritemarkings"] = response.xpath(''.join(["//a[@class='broj'][", str(position), "]/text()"])).extract()
            # The blogger name is the third-to-last segment of the URL.
            item["blogger"] = response.url.split("/")[-3]
            yield item
            position = position + 1

I don't want the output to look like this:

{'blog': [u'http://emirnisic.blogger.ba', u'http://soko.blogger.ba'], 'blogfavoritemarkings': [u'180', u'128'], 'blogger': 'soko'} 

The output should instead look like this:

{'blog': [u'http://emirnisic.blogger.ba'], 'blogfavoritemarkings': [u'180'], 'blogger': 'soko'}
{'blog': [u'http://soko.blogger.ba'], 'blogfavoritemarkings': [u'128'], 'blogger': 'soko'}

Do you have any recommendations on how I can make sure the output looks the way I want? Should I use an item pipeline or an item exporter, or instead change the while-loop? Any help is appreciated.

Changing the while loop is the simple option. If it gets more complex, I would switch to a custom item exporter that writes the items in the expected format, leaving transparency between the spider and the result.
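A minimal sketch of the simple option, assuming the two XPath queries are run once without the positional index and return lists that line up entry by entry. The helper name `split_entries` is hypothetical and not part of Scrapy; it only shows how to turn the parallel lists into one item per entry.

```python
def split_entries(blogs, markings, blogger):
    """Yield one dict per blog entry instead of one dict holding every entry."""
    # zip() pairs the lists index by index and stops at the shorter one.
    for blog, marking in zip(blogs, markings):
        yield {
            "blog": [blog],
            "blogfavoritemarkings": [marking],
            "blogger": blogger,
        }


# Example input mirroring the combined output shown in the question.
items = list(split_entries(
    ["http://emirnisic.blogger.ba", "http://soko.blogger.ba"],
    ["180", "128"],
    "soko",
))
```

Inside `parse`, the two lists would come from `response.xpath("//a[@class='blog']/@href").extract()` and the matching `broj` query, and each yielded dict would be used to fill a fresh item.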

With this in mind (and preparing for future changes), I'd create my own item exporter and form the resulting JSON elements there, with the help of itertools.cycle.
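As a rough sketch of that idea, the class below mirrors the method names of a Scrapy item exporter (`start_exporting`, `export_item`, `finish_exporting`) but is a standalone, hypothetical class so that it runs without Scrapy. In a real project it would presumably subclass `scrapy.exporters.BaseItemExporter` and be registered through the `FEED_EXPORTERS` setting; that wiring is an assumption and is not shown here. It uses `itertools.cycle` to repeat scalar fields such as `blogger` alongside each list entry.

```python
import io
import json
from itertools import cycle


class PerEntryJsonLinesExporter(object):
    """Hypothetical exporter: writes one JSON object per list entry."""

    def __init__(self, stream):
        self.stream = stream  # any text-mode file-like object

    def start_exporting(self):
        pass  # nothing to write before the first item

    def finish_exporting(self):
        pass  # nothing to write after the last item

    def export_item(self, item):
        data = dict(item)
        list_keys = sorted(k for k, v in data.items() if isinstance(v, list))
        if not list_keys:
            # Nothing to split, so write the item unchanged.
            self._write(data)
            return
        scalar_keys = sorted(k for k in data if k not in list_keys)
        columns = [data[k] for k in list_keys]
        # cycle() repeats each scalar endlessly; zip() stops at the shortest list.
        columns += [cycle([data[k]]) for k in scalar_keys]
        for row in zip(*columns):
            entry = dict(zip(list_keys + scalar_keys, row))
            for key in list_keys:
                entry[key] = [entry[key]]  # keep the single-element list shape
            self._write(entry)

    def _write(self, entry):
        self.stream.write(json.dumps(entry, sort_keys=True) + u"\n")


# Usage with an in-memory buffer instead of a real feed file.
buffer = io.StringIO()
exporter = PerEntryJsonLinesExporter(buffer)
exporter.start_exporting()
exporter.export_item({
    "blog": ["http://emirnisic.blogger.ba", "http://soko.blogger.ba"],
    "blogfavoritemarkings": ["180", "128"],
    "blogger": "soko",
})
exporter.finish_exporting()
lines = buffer.getvalue().splitlines()
```

Doing the split in the exporter keeps the spider unchanged: it can keep yielding combined items, and only the output step decides how they are laid out.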

