Q: How to Schedule a Crawler/Scraping Task on a Specific Date?
I want to collect the website data on a specific date in 2017, twice a day.
How do I set up Scheduled Cloud Extraction for my scraping task/crawler?
After you finish configuring your task, select the option “Schedule Cloud Extraction Settings” to begin the scheduling process.
In the “Schedule Cloud Extraction Settings” dialog box, you can set:
- Periods of Availability - The data extraction period, defined by a Start date and an End date.
- Run Mode - How often the task runs to collect data: Once, Weekly, Monthly, or Real Time.
For this case, choose the first type, Once, and select the days and the times of each day at which to collect the data.
Then click Save to save the configuration, or click Start to start the scheduled task.
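To picture what "a specific date, twice a day" amounts to, here is a minimal Python sketch that pairs one calendar day with two times of day. It is only an illustration and not part of Octoparse; the function name `run_times`, the date 15 March 2017, and the 09:00 and 21:00 times are hypothetical choices.

```python
from datetime import date, datetime, time


def run_times(day: date, times_of_day: list[time]) -> list[datetime]:
    """Combine one calendar day with each chosen time of day, earliest first."""
    return sorted(datetime.combine(day, t) for t in times_of_day)


# Hypothetical example: two collection runs on 15 March 2017.
runs = run_times(date(2017, 3, 15), [time(21, 0), time(9, 0)])
for r in runs:
    print(r.isoformat())
```

In the dialog box itself you make the same two choices by hand: the day, and the times on that day.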
Please check out this tutorial to collect data with different intervals:
Scheduled Data Extraction - Octoparse Cloud Web Scraping Service
When the Periods of Availability you set do not fall within the task's effective period, you will get a pop-up saying, "Schedule Cloud Extraction Failed - Cloud extraction failed for task is out of effective date, please renew the effective date for task."
In this case, you can extend the task's effective period first.
In the "Schedule Cloud Extraction Settings" window, change the Start time and End time under Periods of Availability to extend the task's effective period, then click Save. The task's effective period is updated when the setting is saved.
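The condition behind the pop-up can be pictured as a date-range containment check. The sketch below assumes the rule is full containment (the availability period must start and end inside the effective period). The exact check Octoparse applies is not documented here, and the function name and dates are hypothetical.

```python
from datetime import date


def within_effective_period(avail_start: date, avail_end: date,
                            eff_start: date, eff_end: date) -> bool:
    """Return True if [avail_start, avail_end] lies inside [eff_start, eff_end].

    Assumption: both ends are inclusive and full containment is required.
    """
    return eff_start <= avail_start and avail_end <= eff_end


# Hypothetical dates: the schedule ends after the task's effective period does.
ok = within_effective_period(date(2017, 3, 1), date(2017, 4, 30),
                             date(2017, 1, 1), date(2017, 3, 31))
print(ok)
```

If a check like this fails, extending the effective period so that it covers the whole availability period is what resolves the error.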