#### spider

```
import scrapy

from scrapy_tencent.items import ScrapyTencentItem


class TencentSpider(scrapy.Spider):
    name = 'tencent'
    allowed_domains = ['tencent.com']
    start_urls = ['https://hr.tencent.com/position.php?&start=0#a']

    def parse(self, response):
        # Skip the header row and the trailing pagination rows
        tr_list = response.xpath("//tr")[1:-2]
        for tr in tr_list:
            item = ScrapyTencentItem()
            item["name"] = tr.xpath("./td[1]/a/text()").extract_first()
            item["detail_url"] = "https://hr.tencent.com/" + tr.xpath("./td[1]/a/@href").extract_first()
            item["category"] = tr.xpath("./td[2]/text()").extract_first()
            item["city"] = tr.xpath("./td[4]/text()").extract_first()
            # yield item
            yield scrapy.Request(
                item["detail_url"],
                callback=self.parse_detail,
                meta={"item1": item}
            )
        # Follow the next-page link until it turns into a javascript placeholder
        next_url = 'https://hr.tencent.com/' + response.xpath("//*[@id='next']/@href").extract_first()
        if 'javascript:;' not in next_url:
            yield scrapy.Request(url=next_url, callback=self.parse)

    def parse_detail(self, response):
        item = response.meta['item1']
        item['duty'] = response.xpath('//tr[3]/td/ul/li/text()').extract()[0]
        item['require'] = response.xpath('//tr[4]/td/ul/li/text()').extract()[0]
        # Return the item to the engine
        yield item
```

#### items

```
import scrapy


class ScrapyTencentItem(scrapy.Item):
    # define the fields for your item here like:
    name = scrapy.Field()
    category = scrapy.Field()
    city = scrapy.Field()
    detail_url = scrapy.Field()
    duty = scrapy.Field()
    require = scrapy.Field()
```

#### pipelines

```
import json

from scrapy_tencent.items import ScrapyTencentItem


class ScrapyTencentPipeline(object):
    def open_spider(self, spider):
        self.file = open("tencent.json", "w")

    def process_item(self, item, spider):
        # Save the item as a JSON line in the file
        if isinstance(item, ScrapyTencentItem):
            item_json = json.dumps(dict(item), ensure_ascii=False) + ',\n'
            self.file.write(item_json)
        return item

    def close_spider(self, spider):
        self.file.close()
```
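
#### settings

For the pipeline to receive items, it also has to be registered in the project's `settings.py`. A minimal sketch is shown below; the module path `scrapy_tencent.pipelines.ScrapyTencentPipeline` is an assumption based on the default Scrapy project layout implied by the imports above.

```
# settings.py -- register the pipeline so process_item() is called for every item
ITEM_PIPELINES = {
    'scrapy_tencent.pipelines.ScrapyTencentPipeline': 300,  # lower value = earlier in the pipeline order
}
```

With this in place the crawl can be started with `scrapy crawl tencent`, and the scraped records are written to `tencent.json`.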