
Title: [Help request] How can I use a regex in VBS to extract this line from a web page?

Author: batsealine    Time: 2013-5-27 15:37     Title: How can I use a regex in VBS to extract this line from a web page?

  Set regex = New RegExp
  Set fso = CreateObject("Scripting.FileSystemObject")  ' not used below
  Set http = CreateObject("Msxml2.XMLHTTP")
  url = "http://www.xiami.com/search/album?key=%E6%B5%AE%E8%BA%81"
  http.open "GET", url, False  ' False = synchronous request
  http.send
  html = http.responseText
  regex.IgnoreCase = True
  regex.Global = True
  regex.Pattern = """浮躁"""  ' matches the literal text "浮躁", quotes included
  Set matches = regex.Execute(html)
  For Each match In matches
      MsgBox match
  Next
My idea is to first use title="浮躁" and title="王菲" to isolate this block:
  <div class="album_item100_block">
  <p class="cover"><a class="CDcover100" href="/album/11943" title="浮躁">
  <img src="http://img.xiami.com/images/album/img77/2177/119431362392699_1.jpg" width="100" height="100" alt="" /></a>
  </p>
  <p class="name"><a href="/album/11943" title="浮躁"><b class="key_red">浮躁</b></a>
  <a class="singer" href="/artist/2177" title="王菲">王菲</a>
  </p>
  <p class="album_rank clearfix"><span style="width:48.5px;">总体评分</span><em>9.7</em></p>
  <p class="year">1996-08</p>
  </div>
and then extract "http://img.xiami.com/images/album/img77/2177/119431362392699_1.jpg" from it. I want the match to be precise, because another artist may also have an album titled "浮躁".
Author: apang    Time: 2013-5-27 22:21

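  Set http = CreateObject("Msxml2.XMLHTTP")
  url = "http://www.xiami.com/search/album?key=%E6%B5%AE%E8%BA%81"
  http.open "GET", url, False  ' False = synchronous request
  http.send()
  ' With a synchronous request, send returns only after the response has arrived,
  ' so this loop is just a safeguard and should exit at once.
  Do Until http.ReadyState = 4 : WScript.Sleep 100 : Loop
  html = http.responseText
  Set http = Nothing
  With New RegExp
      .Global = True
      .IgnoreCase = True
      ' From title="浮躁", cross exactly four CRLF line breaks, then require
      ' title="王菲" on the next line. Assumes the page uses CRLF line endings.
      .Pattern = "title=""浮躁""(.*\r\n){4}.*title=""王菲"""
      For Each match In .Execute(html)
          ' Second line of the match is the <img> line; the URL is its first quoted value.
          MsgBox Split(Split(match, vbCrLf)(1), Chr(34))(1)
      Next
  End With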
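The nested Split in the reply above is compact, so here is a minimal offline sketch of what it does. The variable names and the hard-coded `matched` string are illustrative stand-ins for one regex match, not part of the original reply; the text is copied from the HTML block quoted in the question.

```vbscript
Option Explicit

Dim matched, imgLine

' Stand-in for the start of one regex match: the tail of the cover line,
' then the <img> line, then the closing </p>, joined with CRLF.
matched = "title=""浮躁"">" & vbCrLf & _
          "<img src=""http://img.xiami.com/images/album/img77/2177/119431362392699_1.jpg"" width=""100"" height=""100"" alt="""" /></a>" & vbCrLf & _
          "</p>"

' Index 1 after splitting on CRLF is the second line, i.e. the <img> line.
imgLine = Split(matched, vbCrLf)(1)

' Splitting that line on the double quote puts the src value at index 1.
WScript.Echo Split(imgLine, Chr(34))(1)
```

Both steps depend on the page layout: the pattern and the first Split assume CRLF line endings, and the second Split assumes src is the first quoted attribute on the image line.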

Author: apang    Time: 2013-5-31 19:09

It seems this would also work:
  Set http = CreateObject("Msxml2.XMLHTTP")
  url = "http://www.xiami.com/search/album?key=%E6%B5%AE%E8%BA%81"
  http.open "GET", url, False  ' False = synchronous request
  http.send()
  Do Until http.ReadyState = 4 : WScript.Sleep 100 : Loop
  html = http.responseText
  Set http = Nothing
  Set re = New RegExp
  re.Global = True
  re.IgnoreCase = True
  ' [\s\S]*? matches any characters, line breaks included, as few as possible;
  ' the parenthesized group captures the first .jpg URL after title="浮躁".
  re.Pattern = "title=""浮躁""[\s\S]*?(http://.*?\.jpg)[\s\S]*?title=""王菲"""
  For Each match In re.Execute(html)
      MsgBox match.SubMatches(0)
  Next
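One caveat on precision, which the question asked about: a lazy `[\s\S]*?` can still run across block boundaries. If another artist's "浮躁" album appears earlier on the page, the match can start in that block and end at title="王菲" in a later one, capturing the wrong image. A possible tightening is sketched below, assuming every result sits in its own `album_item100_block` div as in the quoted HTML. The `AlbumBlock` helper, the artist name 某歌手, the `#` links and the example.com URL are made up for this offline test.

```vbscript
Option Explicit

' Hypothetical helper: one search-result block shaped like the HTML in the question.
' Single quotes are swapped for double quotes at the end to keep the literals readable.
Function AlbumBlock(img, artist)
    Dim s
    s = "<div class='album_item100_block'>" & vbCrLf & _
        "<p class='cover'><a class='CDcover100' href='#' title='浮躁'>" & vbCrLf & _
        "<img src='" & img & "' width='100' height='100' alt='' /></a>" & vbCrLf & _
        "</p>" & vbCrLf & _
        "<p class='name'><a href='#' title='浮躁'><b class='key_red'>浮躁</b></a>" & vbCrLf & _
        "<a class='singer' href='#' title='" & artist & "'>" & artist & "</a>" & vbCrLf & _
        "</p>" & vbCrLf & _
        "</div>" & vbCrLf
    AlbumBlock = Replace(s, "'", Chr(34))
End Function

Dim html, re, stay, m

' Made-up page: another artist's "浮躁" first, then the album we actually want.
html = AlbumBlock("http://example.com/other.jpg", "某歌手") & _
       AlbumBlock("http://img.xiami.com/images/album/img77/2177/119431362392699_1.jpg", "王菲")

' Consume characters lazily, but refuse to step onto the class name that opens the next block.
stay = "(?:(?!album_item100_block)[\s\S])*?"

Set re = New RegExp
re.Global = True
re.IgnoreCase = True
re.Pattern = "title=""浮躁""" & stay & "(http://[^""]*\.jpg)" & stay & "title=""王菲"""

For Each m In re.Execute(html)
    WScript.Echo m.SubMatches(0)
Next
```

Run under cscript, this prints only the img.xiami.com URL from the second block; with the pattern from the reply above, the same input yields http://example.com/other.jpg instead. The negative lookahead `(?!...)` is supported by the VBScript RegExp object. As far as I know, Windows Script Host expects .vbs files in the ANSI code page or UTF-16 LE, so a script containing the Chinese literals should be saved accordingly.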





Welcome to 批处理之家 (http://www.bathome.net/) Powered by Discuz! 7.2