The code is as follows:
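# Note: this was run with Python 3.7.
# Import the required modules
import requests                  # HTTP request library
from bs4 import BeautifulSoup    # HTML parsing library
import pandas as pd

# Request the page
url = "http://q.stock.sohu.com/"   # the Sohu stock page to request
response = requests.get(url)       # send a GET request to Sohu and store the result in response
response.encoding = 'utf-8'        # decode the response as UTF-8
html = response.text               # the page's HTML source

# Parse the page
soup = BeautifulSoup(html, 'lxml')            # parse the HTML with the lxml parser
content = soup.find_all('a')                  # find every <a> tag
links = [aa.get('href') for aa in content]    # href of each <a> tag (None if the attribute is missing)
for link in links:                            # iterate over the extracted hrefs
    print(link)                               # print each one

# Save the data
df = pd.DataFrame(links, columns=["網址"])     # one column named "網址" (URL) holding the href strings
df.to_excel("搜索a標簽內容.xlsx")               # write df to 搜索a標簽內容.xlsx (requires an Excel writer such as openpyxl)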
The output is as follows:
/
//s.m.sohu.com/t/index.html
//q.stock.sohu.com/feedback.html
//q.stock.sohu.com/cn/mystock.shtml
//q.stock.sohu.com/cn/bk.shtml
//q.stock.sohu.com/cn/ph.shtml
//q.stock.sohu.com/cn/zs.shtml
//q.stock.sohu.com/fundflow/
/sdk/rank
//stock.sohu.com/ipo/
//q.stock.sohu.com/App2/bigdeal2.jsp
//q.stock.sohu.com/app2/rpsholder.up
//q.stock.sohu.com/app2/mpssTrade.up
//stock.sohu.com/s2011/jlp/
//q.fund.sohu.com/jzph/zxjz_date_up.shtml
//q.stock.sohu.com/us/zgg.html
javascript:void(0);
/sdk/transfer?page=callin
/sdk/transfer?page=callin
/sdk/transfer?page=callout
/sdk/transfer?page=cancel
/sdk/transfer?page=record
//mp.sohu.com
javascript:void(0);
javascript:void(0);
javascript:void(0);
//q.stock.sohu.com/cn/ph_m.shtml?type=sh_as&field=changerate&sort=up
//q.stock.sohu.com/cn/ph_m.shtml?type=sz_as&field=changerate&sort=up
//q.stock.sohu.com/cn/bk.shtml
//q.stock.sohu.com/cn/bk.shtml
//q.stock.sohu.com/cn/bk.shtml
//q.stock.sohu.com/cn/bk.shtml
javascript:void(0);
javascript:void(0);
/sdk/rank
//q.stock.sohu.com/cn/mystock.shtml
javascript:void(0);
//q.stock.sohu.com/fundflow/stock_inflow.html?name=NetVal&io=In
//q.stock.sohu.com/fundflow/stock_inflow.html?name=NetVal&io=Out
//q.stock.sohu.com/app2/mpssTrade.up
//q.stock.sohu.com/app2/mpssTrade.up
//q.stock.sohu.com/app2/bigdeal2.jsp
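The output above mixes protocol-relative URLs (`//q.stock.sohu.com/...`), site-relative paths (`/sdk/rank`), `javascript:` pseudo-links, and duplicates. Below is a minimal sketch of one way to clean such a list before saving it, using the standard library's `urllib.parse.urljoin`. The helper name `normalize_links` and the `sample` list are hypothetical and not part of the original code; the base URL is assumed to be the page requested above.

```python
from urllib.parse import urljoin

# Assumption: links are resolved against the page requested in the original code.
BASE_URL = "http://q.stock.sohu.com/"


def normalize_links(hrefs, base_url=BASE_URL):
    """Return absolute, de-duplicated URLs in first-seen order (hypothetical helper)."""
    seen = set()
    result = []
    for href in hrefs:
        # Skip <a> tags without an href and javascript: pseudo-links.
        if not href or href.strip().lower().startswith("javascript:"):
            continue
        # urljoin resolves both "//host/path" and "/path" against the base URL.
        absolute = urljoin(base_url, href.strip())
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# A few values taken from the output above, plus None for a missing href.
sample = [
    "/",
    "//q.stock.sohu.com/cn/bk.shtml",
    "/sdk/rank",
    "javascript:void(0);",
    "//q.stock.sohu.com/cn/bk.shtml",
    None,
]
for link in normalize_links(sample):
    print(link)
```

The cleaned list could then be passed to `pd.DataFrame` in place of the raw href values, so the spreadsheet holds only usable, absolute addresses.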
An example screenshot follows: