## **Parsing Libraries: Basic XPath Usage (2)**

Introduction: The previous chapter covered the basics of XPath and its common usage patterns, ending with how to extract text. This chapter continues with the basic usage of XPath.

#### **1. Getting attributes**

We know that text() returns the text inside a node. How do we get a node's attributes? The @ symbol does the job here as well. For example, to get the href attribute of every a node under every li node:

~~~
from lxml import etree

text = '''
<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a>
</ul>
</div>
'''
html = etree.tostring(etree.HTML(text)).decode("utf8")  # Complete the markup, serialize it, and decode to a string
html = etree.HTML(html)  # Build the XPath parsing object
result = html.xpath('//li/a/@href')
print(result)

# Output:
# ['link1.html', 'link2.html', 'link3.html', 'link4.html', 'link5.html']
~~~

#### **2. Matching multi-valued attributes**

Sometimes an attribute on a node has more than one value, for example:

~~~
from lxml import etree

text = '<li class="li li-first"><a href="link.html">first item</a></li>'
html = etree.HTML(text)
result = html.xpath('//li[@class="li"]/a/text()')
print(result)

# Output:
# []
~~~

The class attribute here holds two values, li and li-first, so the exact comparison @class="li" matches nothing. This is where the contains() function comes in. The code can be rewritten as follows:

~~~
from lxml import etree

text = '<li class="li li-first"><a href="link.html">first item</a></li>'
html = etree.HTML(text)
result = html.xpath('//li[contains(@class,"li")]/a/text()')
print(result)

# Output:
# ['first item']
~~~

#### **3. Matching multiple attributes**

Another situation we may run into is identifying a node by several attributes, which means matching all of them at once. The and operator connects the conditions, as in this example:

~~~
from lxml import etree

text = '<li class="li li-first" name="item"><a href="link.html">first item</a></li>'
html = etree.HTML(text)
result = html.xpath('//li[contains(@class,"li") and @name="item"]/a/text()')
print(result)

# Output:
# ['first item']
~~~

Here, and is an XPath operator. There are many other operators as well, such as or and mod.

#### **4. Selecting by position**

Sometimes a selection matches several nodes at once, but we only want one of them, such as the second node or the last one. What should we do then?
~~~
from lxml import etree

text = '''
<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a>
</ul>
</div>
'''
html = etree.HTML(text)
result = html.xpath('//li[1]/a/text()')  # First li node (XPath positions start at 1)
print(result)
result = html.xpath('//li[last()]/a/text()')  # Last li node
print(result)
result = html.xpath('//li[position()<3]/a/text()')  # li nodes at positions 1 and 2
print(result)
result = html.xpath('//li[last()-2]/a/text()')  # Third li node from the end
print(result)

# Output:
# ['first item']
# ['fifth item']
# ['first item', 'second item']
# ['third item']
~~~

#### **5. Selecting by node axis**

|Axis|Description|
|:---- |:---|
| ancestor | Selects all ancestor nodes |
| attribute | Selects all attribute values of the node |
| child | Selects all direct child nodes |
| descendant | Selects all descendant nodes |
| following | Selects all nodes after the current node |
| following-sibling | Selects all sibling nodes after the current node |

An example:

~~~
from lxml import etree

text = '''
<div>
<ul>
<li class="item-0"><a href="link1.html">first <a>sss</a>item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a>
</ul>
</div>
'''
html = etree.HTML(text)
result = html.xpath('//li[1]/ancestor::*')  # All ancestors of the first li
print(result)
result = html.xpath('//li[1]/ancestor::div')  # Only the div ancestor
print(result)
result = html.xpath('//li[1]/attribute::*')  # All attribute values of the first li
print(result)
result = html.xpath('//li[1]/child::a[@href="link1.html"]')  # Direct a child with this href
print(result)
result = html.xpath('//li[1]/descendant::*')  # All descendants of the first li
print(result)
result = html.xpath('//li[1]/following::*[2]')  # Second node after the first li
print(result)
result = html.xpath('//li[1]/following-sibling::*')  # All later siblings of the first li
print(result)

# Output (the memory addresses differ on every run):
# [<Element html at 0x25e10bdb108>, <Element body at 0x25e10bdb0c8>, <Element div at 0x25e10bdb1c8>, <Element ul at 0x25e10bdb248>]
# [<Element div at 0x25e10bdb1c8>]
# ['item-0']
# [<Element a at 0x25e10bdb248>]
# [<Element a at 0x25e10bdb248>, <Element a at 0x25e10bdb0c8>]
# [<Element a at 0x25e10bdb308>]
# [<Element li at 0x25e10bdb1c8>, <Element li at 0x25e10bdb2c8>, <Element li at 0x25e10bdb288>, <Element li at 0x25e10bdb388>]
~~~

<span style="color:red">Conclusion:</span> We have now covered essentially all the XPath selectors you are likely to need. XPath is very powerful and has many built-in functions; once you are fluent with it, extracting information from HTML becomes far more efficient.
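
As a supplement to sections 2 and 3, here is a minimal sketch of the or and mod operators, which the text mentions without showing, together with a stricter variant of the contains() match. The variable names (`either`, `odd`, `snippet`, `loose`, `strict`) and the second li with class "list" are made up for illustration; the concat() and normalize-space() idiom is a common XPath 1.0 pattern, not something taken from the original article.

```python
from lxml import etree

text = '''
<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a></li>
</ul>
</div>
'''
html = etree.HTML(text)

# or: keep li nodes whose class equals either of the two values
either = html.xpath('//li[@class="item-1" or @class="item-inactive"]/a/text()')
print(either)  # ['second item', 'third item', 'fourth item']

# mod: keep li nodes at odd positions (1st, 3rd, 5th)
odd = html.xpath('//li[position() mod 2 = 1]/a/text()')
print(odd)  # ['first item', 'third item', 'fifth item']

# contains() is a substring test, so "li" also matches class="list".
# Hypothetical markup: the second li exists only to show the difference.
snippet = etree.HTML(
    '<ul>'
    '<li class="li li-first"><a>first item</a></li>'
    '<li class="list"><a>other item</a></li>'
    '</ul>'
)
loose = snippet.xpath('//li[contains(@class, "li")]/a/text()')
print(loose)  # ['first item', 'other item']

# Pad the class value with spaces and search for " li " to match whole tokens only
strict = snippet.xpath(
    '//li[contains(concat(" ", normalize-space(@class), " "), " li ")]/a/text()'
)
print(strict)  # ['first item']
```

The substring behaviour of contains() is harmless in the article's single-node example, but on real pages the whole-token form is usually the safer choice when class names share a prefix.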