python - Python, Selenium: I need to collect URLs, but the elements contain no link tags

Good day, everyone. My task is to collect people's names and email addresses from this site: https://www.espeakers.com/s/nsas/search?available_on=&awards&budget=0%2C10&bureau_id=304&distance=1000&fee=false&items_per_page=3701&language=en&location=&norecord=false&nt=0&page=0&presenter_type=&q=%5B%5D&require&review=false&sort=speakername&video=false&virtual=false

I am using Selenium with Python to scrape it, but I am having trouble getting to each person's URL. A sample person card has this structure:

<div class="col-xs-12 col-sm-6 col-md-4 col-lg-3">
   <div class="speaker-tile" id="sid12026">
    <div class="speaker-thumb" style='background-image: url("https://streamer.espeakers.com/assets/6/12026/159445.jpg"); background-size: contain;'>
     <div class="row">
      <div class="col-xs-8 text-left">
      </div>
      <div class="col-xs-4 text-right speaker-top-actions">
       <i class="fa fa-ellipsis-h fa-fw">
       </i>
      </div>
     </div>
    </div>
    <div class="speaker-details">
     <div class="speaker-name">
      Alex Aanderud
     </div>
     <div class="row" style="margin-top: 15px;">
      <div class="col-xs-12 col-sm-12">
       <div class="speaker-location">
        <i class="fa fa-map-marker mp-tertiary-background">
        </i>
        AZ
        <span>
         ,
        </span>
        US
       </div>
      </div>
      <div class="col-sm-6 col-xs-12">
       <div class="speaker-awards">
       </div>
      </div>
     </div>
     <div class="speaker-oneline text-left">
      <p>
      </p>
      <div>
       Certified Trainer of Advanced Integrative Psychology and Certified John Maxwell Speaker, Trainer, Coach, will transform your organization and improve your results.
      </div>
     </div>
     <div class="speaker-assets">
      <div class="row">
      </div>
     </div>
     <div class="speaker-actions">
      <div class="row">
       <div class="text-center col-xs-12">
        <div class="btn btn-flat mp-primary btn-block">
         <span class="hidden-xs hidden-sm">
          View Profile
         </span>
         <span class="visible-xs visible-sm">
          Profile
         </span>
        </div>
       </div>
      </div>
     </div>
    </div>
   </div>
  </div>

When you click

<span class="hidden-xs hidden-sm">
      View Profile
</span>

it takes you to a page with that person's information, which I can then access. How can I do this with Selenium, or is there another solution that would help? Thanks!

Answer 1

If you look closely, all the profile URLs have the form

https://www.espeakers.com/s/nsas/profile/id

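where id is a 5-digit number, for example 27397. The same number appears in each tile's id attribute after the "sid" prefix (sid12026 in the sample card), so you only need to extract the id and append it to the base URL to get the profile URL.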

from selenium.webdriver.common.by import By

# drop the "sid" prefix from each tile id, e.g. "sid12026" -> "12026"
url = 'https://www.espeakers.com/s/nsas/profile/'
profile_urls = [url + el.get_attribute('id')[3:] for el in driver.find_elements(By.CSS_SELECTOR, '.speaker-tile')]
names = [el.text for el in driver.find_elements(By.CSS_SELECTOR, '.speaker-name')]

names is a list containing all the names, and profile_urls is a list containing the corresponding profile URLs.
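
The two lists above line up only if every tile contains exactly one name element. A minimal sketch of a variant that reads the name and the id from the same tile, so each pair stays together, might look like this. The names BASE_URL, profile_url_from_tile_id and collect_speakers are hypothetical helpers introduced here for illustration, and driver is assumed to be a Selenium WebDriver that has already loaded the search page.

```python
import re

# Base URL taken from the answer above; the helper names below are hypothetical.
BASE_URL = "https://www.espeakers.com/s/nsas/profile/"


def profile_url_from_tile_id(tile_id):
    """Turn a tile id such as 'sid12026' into a profile URL.

    Returns None when the id is missing or does not look like 'sid<digits>'.
    """
    match = re.fullmatch(r"sid(\d+)", tile_id or "")
    return BASE_URL + match.group(1) if match else None


def collect_speakers(driver):
    """Return a list of (name, profile_url) pairs, one per speaker tile.

    Assumes `driver` is a Selenium WebDriver already on the search page.
    """
    # Imported here so the pure helper above can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    speakers = []
    for tile in driver.find_elements(By.CSS_SELECTOR, ".speaker-tile"):
        # Look up the name inside this tile, not on the whole page.
        name = tile.find_element(By.CSS_SELECTOR, ".speaker-name").text.strip()
        speakers.append((name, profile_url_from_tile_id(tile.get_attribute("id"))))
    return speakers
```

Matching the id with a regular expression instead of slicing with [3:] means a tile whose id does not follow the sid pattern yields None rather than a malformed URL.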
