I have been studying nutch recently and want it to crawl our school's website automatically every day. cron is a tool that solves this problem.
cron is the standard Linux tool for running tasks on a schedule, with no manual intervention needed. The cron service itself can be started or stopped by hand with these commands:
/sbin/service crond start      start the service
/sbin/service crond stop       stop the service
/sbin/service crond restart    restart the service
/sbin/service crond reload     reload the configuration
A cron task entry is written in this format:
minute hour day month weekday [username] command
Description:
Field 1: minute, the minute of each hour at which the task runs. Range 0-59.
Field 2: hour, the hour of each day at which the task runs. Range 0-23.
Field 3: day, the day of each month on which the task runs. Range 1-31.
Field 4: month, the month of each year in which the task runs. Range 1-12.
Field 5: weekday, the day of each week on which the task runs. Range 0-6, where 0 is Sunday.
Field 6: user name, the user the command runs as. It is used in the system crontab (/etc/crontab) and is left out of per-user crontabs edited with crontab -e.
Field 7: the command to run, with its arguments.
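To make the field order concrete, here is a small sketch that splits one sample entry into its seven fields with the shell's read builtin. The entry itself is only an illustration (a 9:00 daily crawl run as a user named sunny), not something taken from a real crontab.

```shell
#!/bin/sh
# Hypothetical system-crontab entry: minute 0, hour 9, every day, every month,
# every weekday, run as user "sunny".
entry='0 9 * * * sunny /home/sunny/nutch/bin/nutch crawl urls -dir crawl'

# read splits on whitespace; the last variable receives the rest of the line,
# so the command keeps its own arguments.
read -r minute hour day month weekday user command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day=$day month=$month weekday=$weekday"
echo "user=$user"
echo "command=$command"
```

The asterisks in the day, month and weekday fields mean "every value", so this entry would fire once a day.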
Under /etc there are the directories cron.daily, cron.hourly, cron.monthly and cron.weekly. Placing an executable shell script in the appropriate directory is enough to have it run automatically on that schedule.
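These directories are driven by ordinary entries in /etc/crontab. The fragment below shows what those entries typically look like on older Red Hat-style systems. The exact times are an assumption: they differ between distributions, and newer systems may hand the daily, weekly and monthly runs to anacron instead, so check the file on your own machine.

```
# minute hour day month weekday user command
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
```

With entries like these, changing "02 4" on the cron.daily line to "0 9" would move the daily run to 9:00.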
If I want nutch to start crawling automatically at 9:00 every day, and cron.daily is scheduled for 9:00 in /etc/crontab, it can be set up like this:
[root@localhost cron.daily]# touch autonutch.sh
[root@localhost cron.daily]# chmod 755 autonutch.sh    /* change autonutch.sh permissions */
[root@localhost cron.daily]# echo "/home/sunny/nutch/bin/nutch crawl urls -dir crawl" > autonutch.sh
[root@localhost cron.daily]# more autonutch.sh
/home/sunny/nutch/bin/nutch crawl urls -dir crawl
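The one-line script works, but cron starts jobs with a minimal environment and no terminal, so a slightly fuller script can be easier to debug. The sketch below is one possible shape, not the only way to do it: the function name run_crawl and the log file argument are my own additions, and it assumes the urls directory lives under the nutch home directory.

```shell
#!/bin/sh
# run_crawl NUTCH_HOME LOG_FILE (hypothetical helper)
# Runs the crawl from NUTCH_HOME and appends all output to LOG_FILE.
# LOG_FILE should be an absolute path because the function changes directory.
run_crawl() {
    nutch_home=$1
    log_file=$2
    (
        # Work from a known directory so the relative "urls" and "crawl" paths resolve.
        cd "$nutch_home" || exit 1
        echo "crawl started: $(date)" >> "$log_file"
        # Keep stdout and stderr, since cron gives the job no terminal to print to.
        bin/nutch crawl urls -dir crawl >> "$log_file" 2>&1
    )
}
```

The last line of autonutch.sh would then be a call such as run_crawl /home/sunny/nutch /tmp/autonutch.log, where the log location is again only an example.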