Q: How to scrape data from multiple web pages/URLs?

 

A:

 

You may sometimes run into compatibility issues between a website and Octoparse’s built-in browser. For example, clicking a next-page button may fail to trigger the hyperlink on the web page.

In other cases, Octoparse may not move on to the next step of the extraction because the URL takes a very long time to finish loading, even though the web content itself has already loaded completely.

For websites like these, we suggest using the “URL list” loop to extract information from multiple web pages with a similar layout.

First, check whether the URLs you want to scrape share the same characters or parameters, with only one part (such as a page number) changing from one URL to the next.

Sample URLs:

altex.ro/tv-video/televizoare/ultra-hd-4k/filtru/p/1

altex.ro/tv-video/televizoare/ultra-hd-4k/filtru/p/2

altex.ro/tv-video/televizoare/ultra-hd-4k/filtru/p/3
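
When the URLs differ only in a trailing number, as in the samples above, the list can be generated instead of typed by hand. The following is a minimal Python sketch of that idea. The helper name `build_page_urls` and the page range 1 to 3 are assumptions made for illustration; check on the site itself how many pages actually exist.

```python
def build_page_urls(base, first_page, last_page):
    """Return one URL per page number, from first_page to last_page inclusive."""
    return [f"{base}{page}" for page in range(first_page, last_page + 1)]


# Shared prefix copied from the sample URLs; only the trailing number changes.
BASE = "altex.ro/tv-video/televizoare/ultra-hd-4k/filtru/p/"

# Assumed page range for illustration (pages 1 to 3, matching the samples).
urls = build_page_urls(BASE, 1, 3)

# Print one URL per line so the output can be copied and pasted as a list.
print("\n".join(urls))
```

The printed output is one URL per line, which is a convenient form for pasting into a list of URLs.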

Using the “URL list” loop, Octoparse can scrape data from multiple web pages that share a similar layout, or from many URLs that follow a logical sequence.

It takes only four steps to scrape multiple URLs. See the picture below.

  1. Drag a Loop action into the workflow
  2. Choose the “List of URLs” mode
  3. Enter or paste the list of URLs you want to scrape into the text box
  4. Click the OK button, then the Save button

 

That’s it! The “Go to Web Page” action will be generated automatically.

 
