Q: How do I get the current page URL when scraping in Octoparse?

 

Description:

How do I add the current page's URL as one of my data fields when building a scraping task in Octoparse?

  

A:

The simplest method:

You can add the current page's URL when you are in the "Extract Data" action:

1. Click "Add Pre-defined Fields".

 

2. Choose "Add the current page URL".

 

3. The current page's URL is added automatically in Define Fields. You can rename the data field.

 

Another method:

You can add the current page's URL when you are in the "Extract Data" action:

1. Click anywhere on the web page (for example, a blank area) ➜ Choose "Extract text"; a data field is generated automatically ➜ Click "Save".

 

2. Select the "Customize Field" button ➜ Choose "Define data extracted" ➜ Choose "Extract page URL" under the "Extract data from browser" option ➜ Click "OK" ➜ Click "Save". The current page's URL now appears as an extracted field. You can rename the data field if necessary.
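
For readers who want to see the idea outside the Octoparse interface, here is a minimal Python sketch of what both methods accomplish: every extracted row gets one more field holding the URL of the page it came from. This is only an illustration, not Octoparse code. The function name `add_page_url`, the default field name `Page_URL`, and the example URL are all hypothetical.

```python
from typing import Dict, List


def add_page_url(
    rows: List[Dict[str, str]],
    page_url: str,
    field_name: str = "Page_URL",  # hypothetical field name; rename as needed
) -> List[Dict[str, str]]:
    """Return copies of the rows with the page URL attached as an extra field."""
    # Build new dicts so the original rows are left unchanged.
    return [{**row, field_name: page_url} for row in rows]


# Hypothetical rows extracted from a single page.
rows = [{"title": "Item A"}, {"title": "Item B"}]
result = add_page_url(rows, "https://example.com/list?page=1")
print(result)
```

As in Octoparse, the URL field can be renamed, here by passing a different `field_name`.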

 

 
