Once a user has submitted a job into the incoming directory it is picked up by the incoming queue processor. The submitted job goes through a number of stages of processing and validation before tasks are registered for each target platform and architecture required.
Typically the incoming queue processor runs as a single daemon on a server usually referred to as the PkgForge master. The daemon will scan the incoming job directory and form a queue from all the jobs found within. It will process each of these jobs in turn; the processing sequence is based on the order in which the readdir function returns them (usually alphabetical, depending on your locale). Once all discovered jobs have been processed it will wait for a certain amount of time (controlled by the poll option) before doing another scan.
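As a rough illustration, the main loop of the queue processor might look something like the following sketch. The incoming directory location and the process_job helper are invented for the example; only the readdir-based ordering and the poll-controlled sleep are taken from the behaviour described above.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $incoming = '/var/lib/pkgforge/incoming';    # assumed location
    my $poll     = 60;                              # seconds, the "poll" option

    # Stand-in for the real per-job processing stages described below.
    sub process_job {
        my ($dir) = @_;
        print "processing job directory $dir\n";
    }

    while (1) {
        opendir my $dh, $incoming or die "Cannot open $incoming: $!";

        # Jobs are handled in whatever order readdir returns them; anything
        # which is not a directory is discarded when the queue is formed.
        my @queue = grep { !/^\.\.?$/ && -d "$incoming/$_" } readdir $dh;
        closedir $dh;

        process_job("$incoming/$_") for @queue;

        sleep $poll;    # wait before doing another scan
    }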
The processing of a submitted job can be broken down into a number of basic stages:
In the first stage any directory which is discovered within the incoming job directory is considered to contain a new job. Anything which is not a directory will have been discarded when the queue of jobs was formed.
An attempt will be made to load each potential job directory into a PkgForge::Job object. If the loading fails this will not immediately be considered a complete failure. Instead a soft failure will have occurred and the job will be allowed to remain in the incoming directory, in an unloadable form, for up to 5 minutes (controlled by the wait_for_job option). This waiting is done because new jobs are typically submitted over a network filesystem and it will take a finite amount of time for all the necessary files to become available.
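The soft-failure handling could be sketched along the following lines. The load_job helper and its metadata check are purely illustrative stand-ins for the construction of the real PkgForge::Job object; the point being shown is that a failed load only becomes fatal once the directory is older than the wait_for_job limit.

    use strict;
    use warnings;
    use File::stat;

    my $wait_for_job = 5 * 60;    # seconds, the "wait_for_job" option

    # Hypothetical loader standing in for building the PkgForge::Job object.
    sub load_job {
        my ($dir) = @_;
        die "no metadata file yet\n" unless -f "$dir/metadata";
        return { directory => $dir };
    }

    sub handle_incoming_dir {
        my ($dir) = @_;

        my $job = eval { load_job($dir) };
        return $job if defined $job;

        # Loading failed: the files may still be arriving over the network
        # filesystem, so this remains a soft failure until the directory has
        # been around for longer than wait_for_job.
        my $age = time() - stat($dir)->mtime;
        if ( $age > $wait_for_job ) {
            warn "Giving up on $dir after $age seconds\n";    # hard failure
            # ... at this point the job would move to the Clean-Up stage
        }
        return undef;
    }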
Once a job has been successfully loaded into a PkgForge::Job object the next stage is to check that the identifier string has not previously been used in the job registry. If it has not been seen previously then the new job will be added to the registry database. An entry is added to the job table for the new identifier, along with a subset of the information specified in the job metadata. Only the information necessary to schedule the job is stored (e.g. submission time, submitter name, job size); it is not intended to be a complete copy of the job metadata.
Once a new job has been successfully added into the registry it is in the incoming state.
If the job identifier had been previously seen or it was not possible to add the new job to the registry then an immediate hard failure will occur and there will be no further attempts to process the job. In that situation the job will then move immediately to the Clean-Up stage.
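A simplified version of the duplicate check and the minimal registry entry might look as follows. The use of DBI with PostgreSQL, and the table and column names, are assumptions made for the example rather than the real registry schema; only the idea of storing just the scheduling-related subset of the metadata is taken from the description above.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'dbi:Pg:dbname=pkgforge', 'pkgforge', '',
                            { RaiseError => 1, AutoCommit => 1 } );

    sub register_job {
        my ($job) = @_;

        my ($seen) = $dbh->selectrow_array(
            'SELECT count(*) FROM job WHERE uuid = ?', undef, $job->{id} );

        if ($seen) {
            # Identifier already used: immediate hard failure, the job
            # moves straight to the Clean-Up stage.
            return 0;
        }

        # Only what is needed for scheduling is stored, not the full metadata.
        $dbh->do(
            'INSERT INTO job (uuid, submitter, submitted_at, size)
             VALUES (?, ?, ?, ?)',
            undef,
            $job->{id}, $job->{submitter}, $job->{subtime}, $job->{size} );

        return 1;    # the job is now in the "incoming" state
    }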
Once a new job has been loaded it is necessary to validate the associated payload. This is done by calling the validate method on the PkgForge::Job object. Firstly this method checks that a file is present for each source package listed in the job metadata and that its SHA1 sum matches the value given at submission time; each source package is then validated in its own right, as described below.
Presently the checking of the SHA1 sum for each source package is purely to ensure that the file is the same as that submitted by the user. This ensures that it is not still in transit and has not been corrupted. The system has been designed to make it possible to add support for the user to digitally sign the job manifest. Currently it would be possible to alter both a source package and the associated manifest after the user has completed their submission. A digitally-signed manifest could be used to guarantee that no tampering has occurred.
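The checksum comparison itself is straightforward. A sketch, assuming the manifest simply maps each source file name to its expected SHA1 sum, might be:

    use strict;
    use warnings;
    use Digest::SHA;

    sub sources_match_manifest {
        my ( $dir, $manifest ) = @_;    # $manifest: { filename => sha1hex, ... }

        for my $file ( keys %{$manifest} ) {
            my $path = "$dir/$file";
            return 0 unless -f $path;    # still in transit or missing

            my $sha1 = Digest::SHA->new(1)->addfile($path)->hexdigest;
            return 0 unless $sha1 eq $manifest->{$file};    # altered or corrupted
        }
        return 1;
    }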
Each source package is represented by a Perl class which implements the PkgForge::Source Moose role. The role requires that the class implements a validate method which returns true or false to indicate whether or not the source package is valid. Currently only the SRPM file type is supported. For that class the validity checks include ensuring that the file name has a .src.rpm suffix and that the package contains a spec file with a .spec suffix.
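The following is a minimal Moose sketch of that arrangement, using stand-in package names rather than the real PkgForge classes: a role which requires a validate method, and a consuming class for SRPM files which performs a simplified suffix check.

    package My::Source;    # stand-in for the PkgForge::Source role
    use Moose::Role;

    requires 'validate';   # every source class must be able to say if it is valid

    has 'file' => ( is => 'ro', isa => 'Str', required => 1 );

    no Moose::Role;

    package My::Source::SRPM;    # stand-in for the SRPM source class
    use Moose;

    with 'My::Source';

    # Simplified validity check: only the file name suffix is examined here.
    sub validate {
        my ($self) = @_;
        return 0 unless -f $self->file;
        return 0 unless $self->file =~ /\.src\.rpm$/;
        return 1;
    }

    no Moose;
    __PACKAGE__->meta->make_immutable;

    package main;
    use strict;
    use warnings;

    my $source = My::Source::SRPM->new( file => 'example-1.0-1.src.rpm' );
    if ( $source->validate ) {
        print "source package is valid\n";
    }
    else {
        print "source package is invalid\n";
    }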
If a new job has passed all the validation checks then it will be marked as valid in the registry database. If the new job has failed any checks then, as with the loading stage, this will initially be considered a soft failure and the job will be left in the incoming queue for up to the time specified in the wait_for_job option. With each queue run it will be reconsidered to see if it has become valid. This allows time for the complete submission of files over a slow network when a network filesystem is being used. Once that time limit is exceeded a hard failure will have occurred and the job will be marked in the registry database as invalid. The job will then move immediately to the Clean-Up stage.
Once the job has been validated it is copied to the directory where accepted jobs are stored. Throughout the copying process the PkgForge::Job object for the recently validated job is kept in memory. Once the copying is complete this object is used to check, once more, the SHA1 sums of the copied source files to ensure that no tampering or corruption has occurred. This object is also used to write out a new job metadata file into the accepted job directory.
The intention is that, for security, only the user which the incoming queue processor runs as is permitted to write into the accepted job directory. If a standard local unix filesystem is being used and everything else runs as the same user then this probably does not give much extra confidence. However, if something like AFS is being used then there can be a high level of trust in the data integrity of the accepted job directory provided the write/insert access is highly restricted.
If, for any reason, the transfer fails, then the processing of this job will be considered to have failed. It will then be marked in the registry database as being in the failed state and it will be moved immediately to the Clean-Up stage.
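A simplified sketch of the transfer stage is shown below. The accepted directory layout and the metadata writer are assumptions; the points carried over from the description above are that the in-memory job object drives the copy, that the SHA1 sums are verified again on the copied files, and that a fresh metadata file is written into the accepted job directory.

    use strict;
    use warnings;
    use File::Copy qw(copy);
    use File::Path qw(make_path);
    use Digest::SHA;

    sub transfer_job {
        my ( $job, $accepted_root ) = @_;

        my $target = "$accepted_root/$job->{id}";
        make_path($target);

        for my $file ( keys %{ $job->{sources} } ) {
            copy( "$job->{directory}/$file", "$target/$file" )
                or return 0;    # transfer failure: job is marked as failed

            # Verify the copy against the checksum held in memory.
            my $sha1 = Digest::SHA->new(1)->addfile("$target/$file")->hexdigest;
            return 0 unless $sha1 eq $job->{sources}{$file};
        }

        write_metadata( $job, $target );    # hypothetical metadata writer
        return 1;
    }

    sub write_metadata {
        my ( $job, $dir ) = @_;
        open my $fh, '>', "$dir/metadata" or die "Cannot write metadata: $!";
        print {$fh} "id: $job->{id}\n";     # stand-in for the real format
        close $fh;
    }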
Once a job has been validated and accepted, tasks can be registered for each platform. To avoid confusion, the following terminology is used: a task is purely a job which has been registered for a specific platform/architecture combination. It should be noted that a job can be considered completely valid yet not result in the registration of any tasks; in that case nothing will actually be done with the submitted job and its source packages.
As part of the processing instructions specified by the user, each submitted job has a list of applicable platforms and a list of applicable architectures. Typically these might just be the special "all" string in both cases which, as expected, signifies that tasks should be registered for all platforms and/or architectures. There is plenty of scope for a user to specify and restrict the sets of platforms and architectures for which tasks should be registered; this is fully described in the job documentation. The sets of target platforms and architectures are computed by examining the list of available, active platforms in the registry database and applying the filters specified by the user for the new job.
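The filtering could be pictured along the following lines, with the data structures invented for the example: the active platform/architecture pairs from the registry are kept only if they pass the user's platform and architecture filters, with the special "all" value matching everything.

    use strict;
    use warnings;

    sub select_targets {
        my ( $active, $platforms, $archs ) = @_;    # $active: [ [name, arch], ... ]

        my %want_platform = map { $_ => 1 } @{$platforms};
        my %want_arch     = map { $_ => 1 } @{$archs};

        my @targets;
        for my $pair ( @{$active} ) {
            my ( $name, $arch ) = @{$pair};
            next unless $want_platform{all} || $want_platform{$name};
            next unless $want_arch{all}     || $want_arch{$arch};
            push @targets, $pair;
        }
        return @targets;
    }

    # Example: register tasks for every active platform but only for x86_64.
    my @active  = ( [ 'sl6', 'i386' ], [ 'sl6', 'x86_64' ], [ 'f13', 'x86_64' ] );
    my @targets = select_targets( \@active, ['all'], ['x86_64'] );
    print "@{$_}\n" for @targets;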
Note that having a task registered for an active platform does not guarantee that a build daemon is currently available for that platform. It is perfectly acceptable to queue tasks for a platform which currently has no build daemons registered. It may also, of course, be the case that a build daemon is busy or currently unavailable due to maintenance. It is also worth noting that a platform may have multiple build daemons. It is not possible to guarantee which build daemon will accept a particular task but, as they should all have identical build environments, this should not cause problems. Full details of the task scheduling are available in the build daemon documentation.
Once tasks have been successfully registered the new job will be marked in the registry database as being in the registered state. If, for any reason, the task registration fails, then the processing of this job will be considered to have failed. It will then be marked in the registry database as being in the failed state and it will be moved immediately to the Clean-Up stage.
The final stage, no matter which final state a submitted job has reached, is the clean-up of the incoming queue directory. The directory for the submitted job, and all of its contents, will be removed.
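A minimal sketch of that clean-up, assuming a helper from File::Path and an invented job directory path, would be:

    use strict;
    use warnings;
    use File::Path qw(remove_tree);

    sub cleanup_job {
        my ($jobdir) = @_;
        my $err;
        # Remove the submitted job directory and everything beneath it.
        remove_tree( $jobdir, { error => \$err } );
        warn "Failed to remove $jobdir\n" if $err && @{$err};
    }

    cleanup_job('/var/lib/pkgforge/incoming/example-job');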