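[TOC]

## 1 **Why use a thread pool**

A crawler is essentially a client that sends requests in bulk and collects the server's responses. If we have many URLs to crawl and process them serially on a single thread, each request has to finish before the next one can start, which is very inefficient. So how can we improve crawl performance?

1. Multiple processes
2. Multiple threads
3. Process pool
4. **Thread pool**
5. Coroutines

Of these, the thread pool is the one most recommended for beginners, for the following reasons:

1. Starting processes or threads ad hoc means they are created and destroyed over and over, which wastes resources
2. Threads are cheaper than processes, so when a thread pool will do, there is no need for a process pool
3. Coroutines are efficient, but they are more complex to implement

## 2 Implementing the thread pool

```python
# 1. Import modules; re is used to pull the video URL out of the JS on the detail page
import re

import requests
from lxml import etree
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API

# 2. Set the url, headers, etc.
url = 'https://www.pearvideo.com/category_2'
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'
}
video_list = []

# 3. Send the request and get the page data
res = requests.get(url=url, headers=headers).text
tree = etree.HTML(res)

# 4. Parse the listing page with xpath
li_list = tree.xpath('//*[@id="categoryList"]/li')
for li in li_list:
    video_name = li.xpath('./div/a/div[2]/text()')[0] + '.mp4'
    detail_url = 'https://www.pearvideo.com/' + li.xpath('./div/a/@href')[0]
    detail_res = requests.get(url=detail_url, headers=headers).text
    # 5. The video URL on the detail page sits inside JS, so xpath and bs4
    #    cannot select it; extract it with a regular expression instead
    video_url = re.findall('srcUrl="(.*?)",vdoUrl', detail_res)[0]
    dic = {
        'video_name': video_name,
        'video_url': video_url
    }
    video_list.append(dic)


# 6. Define the function that downloads and saves one video
def get_video(dic):
    url = dic['video_url']
    name = dic['video_name']
    print("Starting download: %s ....." % name)
    video_data = requests.get(url=url, headers=headers).content
    with open(name, 'wb') as f:
        f.write(video_data)


# 7. Create the thread pool and call its map method
pools = Pool(4)
pools.map(get_video, video_list)
pools.close()
pools.join()
```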
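
The efficiency argument in the first section can be checked without touching the network. The following is a minimal sketch, not part of the original tutorial: `fake_fetch` is a hypothetical stand-in for `requests.get` that only sleeps to mimic network latency, and the `example.com` URLs are placeholder labels that are never requested. It runs the same eight "requests" serially and then through a pool of four threads.

```python
import time
from multiprocessing.dummy import Pool  # thread-backed pool, as in the tutorial code


def fake_fetch(url):
    """Hypothetical stand-in for requests.get: sleeps to mimic network latency."""
    time.sleep(0.1)
    return f"response from {url}"


# Placeholder URLs used only as labels; nothing is actually requested.
urls = [f"https://example.com/page/{i}" for i in range(8)]

# Serial: each call has to finish before the next one starts.
start = time.perf_counter()
serial_results = [fake_fetch(u) for u in urls]
serial_seconds = time.perf_counter() - start

# Thread pool: four workers wait on their "requests" at the same time.
start = time.perf_counter()
with Pool(4) as pool:
    pooled_results = pool.map(fake_fetch, urls)
pooled_seconds = time.perf_counter() - start

print(f"serial: {serial_seconds:.2f}s, pool of 4: {pooled_seconds:.2f}s")
```

`map` returns results in the same order as the input list, so the two result lists match. Exact timings depend on the machine, but the pooled run should finish in a fraction of the serial time because the threads spend most of their time waiting rather than computing.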