Do you need to run a script regularly but don’t want to remember to launch it manually? Or maybe you need to execute a command at a specific time or interval but don’t want the process to monopolize your CPU or memory. In either case, cron jobs are perfect for the task. Let’s look at what they are, how to set them up, and some of the things you can do with them.
There are times when there’s a need to run a group of tasks automatically at certain times in the future. These tasks are usually administrative but could be anything – from making database backups to downloading emails when everyone is asleep.
Cron is a time-based job scheduler in Unix-like operating systems, which triggers certain tasks in the future. The name originates from the Greek word χρόνος (chronos), which means time.
The most commonly used version of Cron is known as Vixie Cron. Paul Vixie originally developed it in 1987.
Cron Job Terminology
- Job: a unit of work, a series of steps to do something. For example, sending an email to a group of users. This article uses the terms task, job, cron job, and event interchangeably.
- Daemon: a computer program that runs in the background, serving different purposes. Daemons often start at boot time. A web server is a daemon serving HTTP requests. Cron is a daemon for running scheduled tasks.
- Cron Job: a cron job is a scheduled job. The daemon runs the job when it’s due.
- Webcron: a time-based job scheduler that runs within the server environment. It’s an alternative to the standard Cron, often on shared web hosts that do not provide shell access.
Getting Started with Cron Jobs
If we take a look inside the /etc directory, we can see directories like cron.hourly, cron.daily, cron.weekly and cron.monthly, each corresponding to a certain frequency of execution.
One way to schedule our tasks is to place our scripts in the proper directory. For example, to run db_backup.php on a daily basis, we put it inside cron.daily. If the folder for a given frequency is missing, we would need to create it first.
Note: This approach uses run-parts, a command that runs every executable file it finds in the specified directory.
This is the simplest way to schedule a task. However, if we need more flexibility, we should use Crontab.
Crontab Files
Cron uses special configuration files called crontab files, which contain a list of jobs to be done. Crontab stands for Cron Table. Each line in the crontab file is called a cron job and consists of a set of fields separated by spaces or tabs. Each line specifies when and how often Cron should execute a certain command or script.
In a crontab file, blank lines are ignored, as are leading spaces and tabs. Lines whose first non-space character is # are considered comments.
Active lines in a crontab are either the declaration of an environment variable or a cron job. Crontab does not allow comments on active lines.
Below is an example of a crontab file with just one entry:
0 0 * * * /var/www/sites/db_backup.sh
The first part, 0 0 * * *, is the cron expression, which specifies the frequency of execution. The above cron job will run once a day, at midnight.
Users can have their own crontab files named after their username as registered in the /etc/passwd file. All user-level crontab files reside in Cron’s spool area. We should not edit these files directly. Instead, we edit them using the crontab command-line utility.
Note: The spool directory varies across different distributions of Linux. On Ubuntu it’s /var/spool/cron/crontabs, while on CentOS it’s /var/spool/cron.
To edit our own crontab file:
crontab -e
The above command will automatically open up the crontab file which belongs to our user. If you haven’t chosen a default editor for the crontab before, you’ll see a selection of installed editors to pick from. We can also explicitly choose or change our desired editor for editing the crontab file:
export VISUAL=nano; crontab -e
After we save the file and exit the editor, the crontab will be checked for accuracy. If everything is set properly, the file will be saved to the spool directory.
Note: Each command in the crontab file runs as the user who owns the crontab. If a command needs to run as root, we cannot define it in our own user’s crontab. It belongs in the root user’s crontab, which we can only edit as root (or with sudo).
To list the installed cron jobs belonging to our own user:
crontab -l
We can also write our cron jobs in a file and send its contents to the crontab file like so:
crontab /path/to/the/file/containing/cronjobs.txt
The preceding command will overwrite the existing crontab file with the contents of /path/to/the/file/containing/cronjobs.txt.
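Because installing a file replaces the whole crontab, adding a single job to an existing table takes a read, modify and write step. The following is a minimal sketch of that idea rather than a definitive recipe. The helper name merge_job and the job line in the usage comment are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# merge_job LINE
# Read an existing crontab on standard input and print it with LINE appended,
# unless an identical line is already present.
# The line is passed through the environment so that awk does not interpret
# backslashes in it.
merge_job() {
    JOB=$1 awk '
        BEGIN { job = ENVIRON["JOB"] }
        { print }
        $0 == job { found = 1 }
        END { if (!found) print job }
    '
}

# Usage (commented out so that this sketch does not touch a real crontab):
#   crontab -l 2>/dev/null | merge_job '0 0 * * * /var/www/sites/db_backup.sh' | crontab -
```

The trailing "crontab -" reads the new table from standard input, so the existing entries are preserved and the job is added only once.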
To remove the crontab, we use the -r option:
crontab -r
Anatomy of a Crontab Entry
A user-level crontab entry consists of five time fields followed by the command to run.
The first two fields specify the time (minute and hour) at which the task will run. The next two fields specify the day of the month and the month. The fifth field specifies the day of the week.
Cron will execute the command when the minute, hour, month, and either day of month or day of week match the current time.
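The field layout described above can be sketched as a commented crontab fragment. The value ranges shown are those commonly accepted by Vixie-style cron implementations, and /path/to/command is a placeholder.

```crontab
# .---------------- minute (0-59)
# |  .------------- hour (0-23)
# |  |  .---------- day of month (1-31)
# |  |  |  .------- month (1-12, or jan-dec)
# |  |  |  |  .---- day of week (0-7, where 0 and 7 are Sunday, or sun-sat)
# |  |  |  |  |
# *  *  *  *  *  /path/to/command
```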
If both day of week and day of month have certain values, the event will run when either field matches the current time. Consider the following expression:
0 0 5-20/5 Feb 2 /path/to/command
The preceding cron job will run at midnight on the 5th, 10th, 15th and 20th of February, plus every Tuesday in February.
Important: When both day of month and day of week are restricted (neither is an asterisk), they form an OR condition: the job runs on any day that matches either field.
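Because the two day fields are combined with OR, an AND condition has to be built another way. One common workaround, shown here as a sketch, restricts only the day of month in the cron expression and tests the weekday in the command itself. It assumes the date command supports the %u format (1 for Monday through 7 for Sunday), and /path/to/command is a placeholder. The percent sign is escaped because an unescaped % has a special meaning in crontab commands.

```crontab
# Run at midnight on the first Monday of each month:
# the day-of-month field limits the job to days 1-7, and the test lets the
# command run only when that day is a Monday.
0 0 1-7 * * [ "$(date +\%u)" = 1 ] && /path/to/command
```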
The syntax of system crontabs (/etc/crontab) differs slightly from that of user-level crontabs. The difference is that the sixth field is not the command but the user we want to run the job as. The command follows in the seventh field.
* * * * * testuser /path/to/command
It’s not recommended to put system-wide cron jobs in /etc/crontab, as this file might be modified in future system updates. Instead, we put these cron jobs in the /etc/cron.d directory.
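As an illustration, a file in /etc/cron.d might look like the sketch below. The file name db-backup and the schedule are hypothetical. The entry uses the system crontab syntax with the extra user field. On Debian-based systems, file names in this directory should contain only letters, digits, underscores and hyphens, so a name with a dot in it may be skipped.

```crontab
# /etc/cron.d/db-backup (hypothetical file name)
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# minute hour day-of-month month day-of-week user command
0 2 * * * root /var/www/sites/db_backup.sh
```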
Editing Other Users’ Crontab
We might need to edit other users’ crontab files. To do this, we use the -u option as below:
crontab -u username -e
Note: We can only edit other users’ crontab files as the root user, or as a user with administrative privileges.
Some tasks require superuser privileges. We should add them to the root user’s crontab file:
sudo crontab -e
Note: Using sudo with crontab -e will edit the root user’s crontab file. If we need to edit another user’s crontab while using sudo, we should use the -u option to specify the crontab owner.
To learn more about the crontab command:
man crontab
Standard and Non-Standard Crontab Values
Crontab fields accept numbers as values. However, we can also use ranges, lists, steps and names in these fields.
Ranges
We can pass in ranges of numbers:
0 6-18 1-15 * * /path/to/command
The above cron job will run at the top of every hour from 6 am to 6 pm, on the 1st through the 15th of each month. Note that the specified range is inclusive, so 1-5 means 1,2,3,4,5.
Lists
A list is a group of comma-separated values. We can have lists as field values:
0 1,4,5,7 * * * /path/to/command
The above syntax will run the cron job at 1 am, 4 am, 5 am and 7 am every day.
Steps
Steps can be used with ranges or the asterisk character (*). When used with a range, the step specifies how many values to advance at a time while moving through the range. Steps are defined with a / character after the range, followed by a number. Consider the following syntax:
0 6-18/2 * * * /path/to/command
The above cron job will run every two hours from 6 am to 6 pm.
When steps are used with an asterisk, they simply specify the frequency of that particular field. As an example, if we set the minute to */5, it simply means every five minutes.
We can combine lists, ranges, and steps together to have more flexible event scheduling:
0 0-10/5,14,15,18-23/3 1 1 * /path/to/command
The above event will run on January 1st at midnight, 5 am and 10 am (every five hours from midnight to 10 am), at 2 pm and 3 pm, and at 6 pm and 9 pm (every three hours from 6 pm to 11 pm).
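To see how lists, ranges and steps combine, it can help to expand a field into the concrete values it matches. The function below is a simplified model written for this article, not the parser of any particular cron implementation. The name expand_field is hypothetical, and the sketch handles numeric values only (no month or day names).

```shell
#!/bin/sh
# expand_field FIELD MIN MAX
# Print the values matched by one numeric crontab field, separated by spaces.
# The function body runs in a subshell, so its variables and shell options
# do not leak into the caller.
expand_field() (
    field=$1
    min=$2
    max=$3
    out=

    set -f              # keep "*" literal while splitting
    old_ifs=$IFS
    IFS=,
    set -- $field       # split the list on commas
    IFS=$old_ifs

    for part in "$@"; do
        # Separate an optional "/step" suffix from the range
        case $part in
            */*) range=${part%%/*}; step=${part##*/} ;;
            *)   range=$part; step=1 ;;
        esac
        # Work out the first and last value of the range
        case $range in
            '*') lo=$min; hi=$max ;;
            *-*) lo=${range%%-*}; hi=${range##*-} ;;
            *)   lo=$range; hi=$range ;;
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            out="$out${out:+ }$v"
            v=$((v + step))
        done
    done
    echo "$out"
)

expand_field '0-10/5,14,15,18-23/3' 0 23   # prints: 0 5 10 14 15 18 21
expand_field '*/15' 0 59                   # prints: 0 15 30 45
```

The first call expands the hour field of the combined example, which makes it easy to check a schedule before installing it.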
Names
For the month and day of week fields, we can use the first three letters of a particular day or month, like Sat, sun, Feb, Sep, etc.
* * * Feb,mar sat,sun /path/to/command
The preceding cron job will run every minute, but only on Saturdays and Sundays in February and March.
Please note that the names are not case-sensitive. Ranges are not allowed when using names.
Predefined Definitions
Some cron implementations may support some special strings. These strings are used instead of the first five fields, each specifying a certain frequency:
- @yearly, @annually: Run once a year at midnight of January 1 (0 0 1 1 *)
- @monthly: Run once a month, at midnight of the first day of the month (0 0 1 * *)
- @weekly: Run once a week at midnight of Sunday (0 0 * * 0)
- @daily: Run once a day at midnight (0 0 * * *)
- @hourly: Run at the beginning of every hour (0 * * * *)
- @reboot: Run once at startup
Multiple Commands in the Same Cron Job
We can run several commands in the same cron job by separating them with a semicolon (;).
* * * * * /path/to/command-1; /path/to/command-2
If the commands depend on each other, we can use a double ampersand (&&) between them. As a result, the second command will not run if the first one fails.
* * * * * /path/to/command-1 && /path/to/command-2
Environment Variables
Environment variables in crontab files are in the form of VARIABLE_NAME = VALUE (the white space around the equal sign is optional). Cron does not source any startup files from the user’s home directory (when it’s running user-level crons). This means we should manually set any user-specific settings required by our tasks.
The Cron daemon automatically sets some environment variables when it starts. HOME and LOGNAME are set from the crontab owner’s information in /etc/passwd. However, we can override these values in our crontab file if there’s a need for this.
There are also a few more variables like SHELL, specifying the shell which runs the commands. It is /bin/sh by default. We can also set the PATH in which to look for programs. Note that the directories in PATH are separated by colons:
PATH = /usr/bin:/usr/local/bin
Important: We should wrap the value in quotation marks when there’s a space in the value. Please note that values are ordinary strings. They will not be interpreted or parsed in any way.
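Put together, the top of a crontab might look like the sketch below. The shell, the directories and the script name db_backup.sh are assumptions for illustration. Because values are plain strings, a line such as PATH=$PATH:/opt/bin would not be expanded, so the full list of directories has to be written out.

```crontab
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin

# db_backup.sh (hypothetical) is found through the PATH set above
0 0 * * * db_backup.sh
```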
Different Time Zones
Cron uses the system’s time zone setting when evaluating crontab entries. This might cause problems for multiuser systems with users based in different time zones. To work around this problem, we can add an environment variable named CRON_TZ to our crontab file, provided our cron implementation supports it. As a result, all entries in that crontab will be evaluated based on the specified time zone.
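A minimal sketch of such a crontab follows. The time zone name and the schedule are examples only, and not every cron implementation honors CRON_TZ (cronie does, for instance), so it is worth checking the crontab manual page on the target system first.

```crontab
# Evaluate the schedules in this crontab in Berlin time (example value)
CRON_TZ=Europe/Berlin

# 9 am Berlin time, Monday to Friday
0 9 * * 1-5 /path/to/command
```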
How Cron Interprets Crontab Files
After Cron starts, it searches its spool area to find and load crontab files into memory. It additionally checks /etc/crontab and the /etc/cron.d directory for system crontabs.
After loading the crontabs into memory, Cron checks the loaded crontabs on a minute-by-minute basis, running the events which are due.
In addition to this, Cron regularly (every minute) checks if the spool directory’s modtime (modification time) has changed. If so, it checks the modtime of all the loaded crontabs and reloads those which have changed. That’s why we don’t have to restart the daemon when installing a new cron job.
Cron Permissions
We can specify which users may use Cron and which may not. Two files play an important role when it comes to cron permissions: /etc/cron.allow and /etc/cron.deny.
If /etc/cron.allow exists, then our username must be listed in this file in order to use crontab. If /etc/cron.allow does not exist but /etc/cron.deny does, it shouldn’t contain our username. If neither of these files exists, then based on the site-dependent configuration parameters, either only the superuser or all users will be able to use the crontab command. For example, in Ubuntu, if neither file exists, all users can use crontab by default.
We can put ALL in the /etc/cron.deny file to prevent all users from using cron:
echo ALL > /etc/cron.deny
Note: If we create an /etc/cron.allow file, there’s no need to create a /etc/cron.deny file. Only the users listed in /etc/cron.allow can then use crontab, which has the same effect as a /etc/cron.deny file with ALL in it.
Redirecting Output
We can redirect the output of our cron job to a file if the command (or script) has any output:
* * * * * /path/to/php /path/to/the/command >> /var/log/cron.log
We can redirect only the standard output to /dev/null. Cron then sends no email for normal output, but still emails anything written to standard error:
* * * * * /path/to/php /path/to/the/command > /dev/null
To prevent Cron from sending any emails to us, we change the respective crontab entry as below:
* * * * * /path/to/php /path/to/the/command > /dev/null 2>&1
This means “send both the standard output and the error output into oblivion.”
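When several jobs share one log file, a small wrapper can make the log easier to read by recording when each run started and how it ended. The function below is a sketch of that idea. The name run_logged, the log format and the paths in the usage comment are assumptions, not part of Cron.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND, appending its standard output and standard error to LOGFILE
# together with a start line and the exit status. Returns COMMAND's status.
# Note: log and status are plain (global) shell variables.
run_logged() {
    log=$1
    shift
    {
        printf '[%s] start: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
        status=$?
        printf '[%s] exit status: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status"
    } >> "$log" 2>&1
    return "$status"
}

# Usage inside a script called from the crontab (paths are placeholders):
#   run_logged /var/log/cron.log /path/to/php /path/to/the/command
```

Since the wrapper redirects everything to the log file, Cron has no output left to email, which matches the effect of the redirections above while keeping a record of each run.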
Email the Output
The output is mailed to the owner of the crontab or to the email address(es) specified in the MAILTO environment variable (unless the standard output and standard error are redirected as above).
If MAILTO is set to empty, no email will be sent out as the result of the cron job.
We can set several email addresses by separating them with commas:
MAILTO=admin@example.com,dev@example.com
* * * * * /path/to/command
Cron and PHP
We usually run our PHP command line scripts using the PHP executable.
php script.php
Alternatively, we can use a shebang at the beginning of the script, and point to the PHP executable:
#! /usr/bin/php
<?php
// PHP code here
As a result, we can execute the file by calling it by name. However, we need to make sure we have permission to execute it.
To have more robust PHP command-line scripts, we can use third-party components for creating console applications, like the Symfony Console component or Laravel Artisan.
Task Overlaps
There are times when scheduled tasks take much longer than expected. This will cause overlaps, meaning some tasks might be running at the same time. This might not cause a problem in some cases, but when they are modifying the same data in a database, we’ll have a problem. We can reduce the risk by running the tasks less frequently, so that each run has more time to finish. Still, there’s no guarantee that these overlaps won’t happen again.
We have several options to prevent cron jobs from overlapping.
Using Flock
Flock is a nice tool to manage lock files from within shell scripts or the command line. These lock files are useful for knowing whether or not a script is running.
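When used in conjunction with Cron, the respective cron job does not start while another instance still holds the lock. On most Linux distributions, flock is part of the util-linux package and is usually installed already. If it is missing, we can install that package using apt-get or yum, depending on the Linux distribution.
apt-get install util-linux
Or:
yum install util-linux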
Consider the following crontab entry:
* * * * * /usr/bin/flock --timeout=1 /path/to/cron.lock /usr/bin/php /path/to/scripts.php
In the preceding example, flock tries to acquire a lock on /path/to/cron.lock. If the lock is acquired within one second, it runs the script. Otherwise, it fails with an exit code of 1.
Using a Locking Mechanism in the Scripts
If the cron job executes a script, we can implement a locking mechanism in the script. Consider the following PHP script:
<?php
// One lock file per script, named after the MD5 hash of the script's path
$lockfile = sys_get_temp_dir() . '/' . md5(__FILE__) . '.lock';

// Read the PID stored by the previous run, if any
$pid = file_exists($lockfile) ? trim(file_get_contents($lockfile)) : null;

if (is_null($pid) || posix_getsid($pid) === false) {
    // Record our own PID before starting, so a later run can see we are busy
    file_put_contents($lockfile, getmypid());

    // Do something here
} else {
    exit('Another instance of the script is already running.');
}
In the preceding code, we keep the pid of the current PHP process in a file, which is located in the system’s temp directory. Each PHP script has its own lock file, named after the MD5 hash of the script’s path.
First, we check if the lock file exists, and if so we read its content, which is the process ID of the last running instance of the script. Then we pass the pid to the posix_getsid PHP function, which returns the session ID of the process. If posix_getsid returns false, it means the process is not running anymore and we can safely start a new instance.
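For jobs that are shell scripts rather than PHP scripts, a similar guard can be sketched in plain shell. Note that this swaps the PID file for a lock directory: mkdir either creates the directory or fails, and it does so atomically, which makes it a convenient test-and-set operation. The name with_lock and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# with_lock LOCKDIR COMMAND [ARGS...]
# Run COMMAND only if LOCKDIR can be created. The directory acts as the lock
# and is removed again when COMMAND finishes. Returns 1 if the lock is held,
# otherwise COMMAND's exit status.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Another instance is already running." >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Usage inside a script called from the crontab (paths are placeholders):
#   with_lock /tmp/db_backup.lock /var/www/sites/db_backup.sh
```

One limitation of this sketch is that a process killed before it reaches rmdir leaves a stale lock behind, which the PID check in the PHP version avoids.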
Anacron
One of the problems with Cron is that it assumes the system is running continuously (24 hours a day). This causes problems for machines that are not running all day long (like personal computers). If the system goes offline during a scheduled task time, Cron will not run that task retroactively.
Anacron is not a replacement for Cron, but it solves this problem. It runs the commands once a day, week, or month but not on a minute-by-minute or hourly basis as Cron does. It does, however, guarantee that the task will run even if the system goes off for an unanticipated period of time.
Only root or a user with administrative privileges can manage Anacron tasks. Anacron does not run in the background like a daemon, but only once, executing the tasks which are due.
Anacron uses a configuration file (just like crontab) named anacrontab. This file is located in the /etc directory.
The content of this file looks like this:
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
RANDOM_DELAY=45
START_HOURS_RANGE=3-22
1 5 cron.daily nice run-parts /etc/cron.daily
7 25 cron.weekly nice run-parts /etc/cron.weekly
@monthly 45 cron.monthly nice run-parts /etc/cron.monthly
In an anacrontab file, we can only set the frequencies with a period of n days, followed by the delay time in minutes. This delay time is just to make sure the tasks do not run at the same time.
The third column is a unique name, which identifies the task in the Anacron log files.
The fourth column is the actual command to run.
Consider the following entry:
1 5 cron.daily nice run-parts /etc/cron.daily
This task runs daily, five minutes after Anacron starts. It uses run-parts to execute all the scripts within /etc/cron.daily.
The second entry in the list above runs every 7 days (weekly), with a 25-minute delay.
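Following the same four-column layout, a custom entry could look like the sketch below. The three-day period, the ten-minute delay and the job identifier db.backup are made-up values for illustration.

```crontab
# period (days)  delay (minutes)  job identifier  command
3                10               db.backup       /var/www/sites/db_backup.sh
```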
Collision Between Cron and Anacron
As you have probably noticed, Cron is also set to execute the scripts inside the /etc/cron.* directories. Different flavors of Linux handle this sort of possible collision with Anacron differently. In Ubuntu, Cron checks if Anacron is present in the system and if it is so, it won’t execute the scripts within the /etc/cron.* directories.
In other flavors of Linux, Cron updates the Anacron timestamps when it runs the tasks. Anacron won’t execute them if Cron has already run them.
Quick Troubleshooting
Absolute Path to the Commands
It’s a good habit to use the absolute paths to all the executables we use in a crontab file.
* * * * * /usr/local/bin/php /absolute/path/to/the/command
Make Sure Cron Daemon Is Running
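If our tasks are not running at all, first we need to make sure the Cron daemon is running. The daemon is called crond on CentOS and similar systems, and cron on Debian and Ubuntu, so the search pattern may need adjusting:
ps aux | grep crond
The output should be similar to this: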
root 7481 0.0 0.0 116860 1180 ? Ss 2015 0:49 crond
Check /etc/cron.allow and /etc/cron.deny Files
When cron jobs are not running, we need to check if /etc/cron.allow exists. If it does, we need to make sure we list our username in this file. And if /etc/cron.deny exists, we need to make sure our username is not listed in this file.
If we edited a user’s crontab file while the user was not yet listed in /etc/cron.allow, adding the user to /etc/cron.allow afterwards won’t make the jobs run until we re-edit the crontab file.
Execute Permission
We need to make sure that the owner of the crontab has execute permission for all the commands and scripts in the crontab file. Otherwise, the cron job will not run. We can add execute permission to a file with:
chmod +x /some/file.php
New Line Character
Every entry in the crontab should end with a newline character. This means the last line of the file must also be terminated with a newline (for example, by leaving an empty line after the last entry), or the last cron job will never run.
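A quick way to check a crontab file for this problem before installing it is to look at its last byte. The helper below is a small sketch, and the name ends_with_newline is hypothetical. It relies on the fact that command substitution strips trailing newlines, so the captured last byte is empty exactly when that byte is a newline.

```shell
#!/bin/sh
# ends_with_newline FILE
# Succeed if FILE is non-empty and its last byte is a newline character.
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Usage (the path is a placeholder):
#   ends_with_newline /path/to/the/file/containing/cronjobs.txt \
#       || echo "missing newline at end of file" >&2
```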
Wrapping Up
Cron is a daemon, running a list of events scheduled to take place in the future. We define these jobs in special configuration files called crontab files. Users can have their own crontab file if they are allowed to use Cron, based on the /etc/cron.allow or /etc/cron.deny files. In addition to user-level cron jobs, Cron also loads the system-wide cron jobs which are slightly different in syntax.
Our tasks are commonly PHP scripts or command-line utilities. In systems that are not running all the time, we can use Anacron to run the events which happen in the period of n days.
When working with Cron, we should also be aware of the tasks overlapping each other, to prevent data loss. After a cron job is finished, the output will be sent to the owner of the crontab or to the email address(es) specified in the MAILTO environment variable.