fcsArchiver.py (CLI)

Final Cut Server is deeply integrated with PresStore archival software to provide seamless access to media assets while preserving available disk space. There are two distinct components involved in the actual archive and restore process: Final Cut Server, and Archiware PresStore archival software. The integration between these two systems is provided mainly by the Python script /usr/local/bin/fcsArchiver.py. This script takes a number of options, though the primary usage will be through the -p flag, for process queues:

>>> /usr/local/bin/fcsArchiver.py -p

When run in this form, fcsArchiver.py processes the restore queues first, followed by the archive queues. These queues are stored in a SQLite3 database named backupHistory.db, found at the path designated by the supportPath attribute in our settings file. For any previously submitted jobs, fcsArchiver.py will check their status with PresStore. For any jobs that have completed, fcsArchiver.py will remove their entry from the queue and create a record in the archiveHistory table. For any failed or cancelled jobs, fcsArchiver.py will resubmit the job to PresStore for reprocessing. For more information on the backupHistory.db database, see The fcsArchiver Database.

Files are added to the queue via two command line scripts, addToArchiveQueue.sh and addToRestoreQueue.sh, each located by default at /usr/local/bin. Both commands take a single argument: the path of a file to add to the archive or restore queue, respectively. Each script has a corresponding queue in the form of a plain text file, filesToArchive and filesToRestore respectively, located in the support folder designated by our configuration parameter supportPath.

When either addToArchiveQueue.sh or addToRestoreQueue.sh is called, it consults its plain text queue file and, if the provided file path is not already cataloged there, appends the new path. When fcsArchiver.py is run with the -p flag, it first processes the queues loaded into the SQLite3 database. After this is done, it consults each of the flat file queues and adds the new file paths into the system. This is when MD5 checksums are calculated and compared against the archive history; if the file has never been archived in its current form, it is sent to PresStore for archiving. When restore queues are processed at this point, fcsArchiver.py first ensures that the file does not already exist on disk, either in its online location or at its designated location on the archive device. If the file does not exist in either location, fcsArchiver.py will restore the asset from tape. However, a restore job is only submitted to PresStore once the tape is available in the library; if the tape is not available, fcsArchiver.py instead sends an offline media notification, and the file is reprocessed during the next execution. This continues until the appropriate restore media is placed inside the library.
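
The flat file queue behavior described above can be sketched in Python. This is an illustration only, not the code of addToArchiveQueue.sh or fcsArchiver.py: the function names add_to_queue and md5_of_file are hypothetical, the queue file is assumed to hold one path per line, and the sketch targets Python 3.

```python
import hashlib
import os


def add_to_queue(queue_file, file_path):
    """Append file_path to queue_file unless it is already listed.

    Returns True if the path was appended, False if it was already queued.
    Assumes one path per line (an assumption, not confirmed by the source).
    """
    queued = set()
    if os.path.exists(queue_file):
        with open(queue_file) as f:
            queued = {line.rstrip("\n") for line in f}
    if file_path in queued:
        return False
    with open(queue_file, "a") as f:
        f.write(file_path + "\n")
    return True


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading the file in chunks keeps memory use flat for large media assets; the resulting digest is the kind of value that would be compared against the checksum stored in the archive history.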

Syntax

The fcsArchiver.py script has the following usage:

FCS Archiver
  Version: 1.0b Build: 2011040702
  Framework Version: 1.0b Build: 2011041301

Copyright (C) 2009-2011 Beau Hunter, 318 Inc.

Usage:

fcsArchiver.py [option]

Options:
  -h, --help                   Displays this help message
  -v, --version                Display version number
  -f configfilepath,           Use specified config file
    --configFile=configfilepath
  -p, --processQueue           Process archive and restore queues
      --processRestoreQueue    Process restore queues
      --processArchiveQueue    Process archive queues

  --getVolumeBarcode           Lists volume barcode for the requested file
  --getVolumeLabel             (must be used with --file option)
  --file='/path/to/file'

  --getVolumeBarcodeForFile=   Outputs barcode for specified file
  --getVolumeLabelForFile=     Outputs label for specified file
  --getVolumeBarcodeForLabel=  Outputs the barcode for the specified label

Examples:
  fcsArchiver.py --processArchiveQueue
  fcsArchiver.py --getVolumeBarcode --file='/myfile.txt'
  fcsArchiver.py --getVolumeBarcodeForFile='/myfile.txt'
  fcsArchiver.py --getVolumeBarcodeForLabel=10001

Configuration

By default, the fcsArchiver.py script uses the configuration file found at /usr/local/etc/fcsArchiver.conf. fcsArchiver.py keys off of a number of parameters in this file, grouped under the [GLOBAL], [BACKUP], and [NOTIFICATIONS] sections.

The following shows an example fcsArchiver configuration:

[GLOBAL]
archivePath=/Users/Shared/FCSStore/Archive
supportPath=/Users/Shared/FCSStore/Support/Archive
debug=False


[BACKUP]
useOffsitePlan=True
archivePlan=10001
offsiteArchivePlan=10002
backupSystem=PresStore
nsdchatpath=/usr/local/aw/bin/nsdchat
nsdchatUseSSL=True
nsdchatUseSudo=False
remoteSSLHost=hax.lbc
remoteSSLUserName=root
trustRestoreChecksumMismatch=True
preventArchiveDuplicates=True

[NOTIFICATIONS]
SMTPServer=hax.lbc
SMTPPort=25
SMTPUser=''
SMTPPassword=''
emailToNotify=hunterbj@hax.lbc
emailFromAddress=fcs@hax.lbc

As shown above, the settings we read in from this file are broken into three sections:

[GLOBAL]
archivePath
The full path to the Final Cut Server archive device root
supportPath
The full path to a support folder which contains our sqlite3 database and queue files
debug
Specify whether we run in debug mode
[BACKUP]
archivePlan
(str) – The name of the archive plan to utilize (e.g. ‘FCSOnsitePlan’)
useOffsitePlan
(bool) – Specifies whether to duplicate archive jobs to a separate offsite archive plan
offsiteArchivePlan
(str) – The name of the offsite archive plan to use if useOffsitePlan is True
backupSystem
(str) – The name of the backup system
nsdchatpath
(str) – The filesystem path to nsdchat binary
nsdchatUseSSL
(bool) – Specifies whether we use SSH to a remote host for nsdchat calls. If True, we will reference ‘remoteSSLHost’ and ‘remoteSSLUserName’ for connection information.
remoteSSLHost
(str) – The IP or DNS name of remote host to call for nsdchat
remoteSSLUserName
(str) – The username on the remote host to use for nsdchat calls
trustRestoreChecksumMismatch
(bool) – Specifies whether we trust an on-disk asset whose checksum does not match during restores. If set to False and an asset exists on disk with a differing checksum, we will replace it when restoring. If set to True, we will forego the restore from tape.
preventArchiveDuplicates
(bool) – Specifies whether we trust checksums to skip tape archives. If set to True and the asset being archived already has an entry in our archiveHistory table with an identical checksum, we will forego the archive to tape and remove the asset from disk. If set to False, we will re-archive the asset.
[NOTIFICATIONS]
SMTPServer
(str) – The IP or DNS name of remote host to utilize for email notifications.
SMTPUser
(str) – The username to utilize for authenticated SMTP email notifications. This value should be omitted if unauthenticated SMTP is desired.
SMTPPassword
(str) – The password to utilize for authenticated SMTP email notifications. This value should be omitted if unauthenticated SMTP is desired.
emailToNotify
(str) – The email address that notifications are sent to.
emailFromAddress
(str) – The From address used by email notifications
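
As a rough illustration of how settings like these can be read, the following sketch parses a trimmed copy of the example configuration with Python 3's standard configparser module. It is not fcsArchiver's own loading code; the load_settings function and the dictionary it returns are hypothetical.

```python
import configparser

# Trimmed copy of the example configuration shown earlier.
SAMPLE_CONF = """\
[GLOBAL]
archivePath=/Users/Shared/FCSStore/Archive
supportPath=/Users/Shared/FCSStore/Support/Archive
debug=False

[BACKUP]
useOffsitePlan=True
archivePlan=10001
offsiteArchivePlan=10002
"""


def load_settings(conf_text):
    """Return a dict of a few settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    settings = {
        "supportPath": parser.get("GLOBAL", "supportPath"),
        "debug": parser.getboolean("GLOBAL", "debug"),
        "archivePlan": parser.get("BACKUP", "archivePlan"),
        "useOffsitePlan": parser.getboolean("BACKUP", "useOffsitePlan"),
    }
    # The offsite plan is only consulted when useOffsitePlan is True.
    if settings["useOffsitePlan"]:
        settings["offsiteArchivePlan"] = parser.get("BACKUP", "offsiteArchivePlan")
    return settings
```

Note that configparser returns plain strings unless a typed getter such as getboolean is used, so a plan name like 10001 comes back as the string '10001'.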

Example Usage

The fcsArchiver.py script has a fairly limited set of command line options. In its typical usage, we simply have it process both restore and archive queues (in that order). To accomplish this, we use the -p flag:

>>> fcsArchiver.py -p
Apr 15 02:56:11:  INFO   :   Processing Restore Queues...
Apr 15 02:56:11:  INFO   :     Found 0 running restore jobs.
Apr 15 02:56:11:  INFO   :   Checking for new restore files...
Apr 15 02:56:11:  INFO   :     Restore Queue is empty.
Apr 15 02:56:11:  INFO   :   Finished processing all restore queues.
Apr 15 02:56:11:  INFO   :   Processing Archive Queues...
Apr 15 02:56:11:  INFO   :     Found 0 running archive jobs.
Apr 15 02:56:11:  INFO   :   Checking for new archive files...
Apr 15 02:56:11:  INFO   :     Archive Queue is empty.
Apr 15 02:56:11:  INFO   :   Finished processing all archive queues..

We can also process solely archive or restore queues by using the flags --processArchiveQueue or --processRestoreQueue, respectively:

>>> fcsArchiver.py --processRestoreQueue
Apr 15 02:58:23:  INFO   :   Processing Restore Queues...
Apr 15 02:58:23:  INFO   :     Found 0 running restore jobs.
Apr 15 02:58:23:  INFO   :   Checking for new restore files...
Apr 15 02:58:23:  INFO   :     Restore Queue is empty.
Apr 15 02:58:23:  INFO   :   Finished processing all restore queues.

The fcsArchiver.py script can also be utilized to query PresStore to determine the tape that a particular file has been archived to:

>>> fcsArchiver.py --getVolumeLabelForFile='/my/archive/device/myfile.mov' --tapeSet=onsite
LABEL: 10001

>>> fcsArchiver.py --getVolumeBarcodeForFile='/my/archive/device/myfile.mov' --tapeSet=onsite
BARCODE: A00001

>>> fcsArchiver.py --getVolumeBarcodeForFile='/my/archive/device/myfile.mov' --tapeSet=offsite
BARCODE: B00001

The Scheduler

fcsArchiver.py is routinely fired via a launchd plist, located at /Library/LaunchDaemons/com.318.fcsarchiver.plist. This plist has a few notable declarations. First and foremost, it will execute the fcsArchiver.py script every 15 minutes with the following syntax:

>>> /usr/local/bin/fcsArchiver.py -p

This launchd plist will also redirect stdout and stderr at runtime to the file located at /var/log/transmogrifier/fcsArchiver.log. This log can be consulted to determine what the script is currently doing.

Starting or stopping the automatic schedule for fcsArchiver is achieved using the standard launchctl CLI tool. To schedule fcsArchiver to run every 15 minutes, the following command can be used:

>>> sudo launchctl load -w /Library/LaunchDaemons/com.318.fcsarchiver.plist

To stop the automation, we simply substitute ‘unload’:

>>> sudo launchctl unload -w /Library/LaunchDaemons/com.318.fcsarchiver.plist

It’s a good idea to stop fcsArchiver.py in the event that the backup server is taken down, to prevent unnecessary processing cycles.

The following shows the contents of the com.318.fcsarchiver.plist launch daemon:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.318.fcsArchiver</string>
        <key>UserName</key>
        <string>admin</string>
        <key>ProgramArguments</key>
        <array>
                <string>/usr/local/bin/fcsArchiver.py</string>
                <string>-p</string>
        </array>
        <key>StartInterval</key>
        <integer>900</integer>
        <key>StandardOutPath</key>
        <string>/var/log/transmogrifier/fcsArchiver.log</string>
        <key>StandardErrorPath</key>
        <string>/var/log/transmogrifier/fcsArchiver.log</string>
        <key>RunAtLoad</key>
        <true/>
</dict>
</plist>

When configuring, it is important to ensure that the value of the UserName key is set to the user under which Final Cut Server was installed. It is also important to ensure that the log file at /var/log/transmogrifier/fcsArchiver.log is writable by that user.
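
One way to double-check these two values is to read them back out of the plist. The following Python 3 sketch does so with the standard plistlib module, using a trimmed copy of the plist above; the daemon_summary helper is hypothetical and is not part of fcsArchiver.

```python
import plistlib

# Trimmed copy of the launch daemon shown above; only the keys inspected here.
PLIST_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.318.fcsArchiver</string>
        <key>UserName</key>
        <string>admin</string>
        <key>StandardOutPath</key>
        <string>/var/log/transmogrifier/fcsArchiver.log</string>
        <key>StandardErrorPath</key>
        <string>/var/log/transmogrifier/fcsArchiver.log</string>
</dict>
</plist>
"""


def daemon_summary(plist_bytes):
    """Return the run-as user and the distinct log files named in the plist."""
    plist = plistlib.loads(plist_bytes)
    log_paths = {
        plist[key]
        for key in ("StandardOutPath", "StandardErrorPath")
        if key in plist
    }
    return {"user": plist.get("UserName"), "log_paths": sorted(log_paths)}
```

The returned user and log paths can then be compared against the account Final Cut Server runs under and the ownership of the log file.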

The fcsArchiver Database

Upon each operation, fcsArchiver.py will consult its queue database to keep track of file archive operations and requests. These queues are stored in a SQLite3 database, located at the root of the fcsArchiver support path in a file named backupHistory.db. This database file contains three tables: archiveQueue, restoreQueue, and archiveHistory. The first table, archiveQueue, holds all records for active archive requests and has the following schema:

CREATE TABLE archiveQueue (fcsID,filePath,checksum,archiveSet,tapeSet,jobID,jobSubmitDate,retryCount,status);

There are a few notable fields. The fcsID field is used for communication back with the Final Cut Server asset. The filePath field is the full path to the asset as it resides on the FCS archive device. The checksum field is an MD5 checksum of the file, used to detect changes and prevent duplication for assets which have been restored and re-archived without changes. The archiveSet field designates the batch name in which the file was processed; this will typically be a date and time stamped identifier. The jobID field stores the PresStore job ID for the job in which the file was submitted to PresStore. The tapeSet field helps us keep track of the asset should it need to pass through multiple tape sets (e.g. onsite, followed by offsite). The status field specifies where in the process the file is, e.g. ‘archiveQueued’, ‘archiveRunning’, ‘archiveFailed’, etc.
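
To make the role of these fields concrete, the following Python 3 sketch creates the archiveQueue table in an in-memory SQLite database and looks up entries by status. The helper functions and the sample values are hypothetical; only the schema and the ‘archiveQueued’ status name come from this page.

```python
import sqlite3

ARCHIVE_QUEUE_SCHEMA = (
    "CREATE TABLE archiveQueue (fcsID,filePath,checksum,archiveSet,tapeSet,"
    "jobID,jobSubmitDate,retryCount,status)"
)


def open_queue_db():
    """Return an in-memory database holding an empty archiveQueue table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ARCHIVE_QUEUE_SCHEMA)
    return conn


def enqueue(conn, fcs_id, file_path, checksum, tape_set):
    """Insert a new request in the 'archiveQueued' state (hypothetical helper)."""
    conn.execute(
        "INSERT INTO archiveQueue (fcsID, filePath, checksum, tapeSet, "
        "retryCount, status) VALUES (?, ?, ?, ?, 0, 'archiveQueued')",
        (fcs_id, file_path, checksum, tape_set),
    )


def paths_with_status(conn, status):
    """Return the file paths of all queue entries with the given status."""
    rows = conn.execute(
        "SELECT filePath FROM archiveQueue WHERE status = ? ORDER BY filePath",
        (status,),
    )
    return [row[0] for row in rows]
```

Against a real installation, the same SELECT could be run on backupHistory.db in the support folder to see which files are waiting at each stage.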

The restoreQueue table is structured almost identically to the archiveQueue table, and functions in the same way. The only major difference is the removal of the checksum field and the addition of a barcode field, which is used to designate the tape from which the asset will be restored. It has the following schema:

CREATE TABLE restoreQueue (fcsID,filePath,archiveSet,tapeSet,barcode,jobID,jobSubmitDate,retryCount,status);

The archiveHistory table represents both archive and restore jobs which have either completed successfully or terminated fatally. It has the following schema:

CREATE TABLE archiveHistory (fcsID,filePath,checksum,barcode,tapeSet,archiveSet,jobID,completionDate,status);

This table is structured a bit differently than either queue table, and records both the checksum and barcode for finished jobs. When new assets are archived and preventArchiveDuplicates is True, their checksums are compared to the checksum of any previous record in the archiveHistory table; if a record with an identical checksum is found, the asset is simply removed from storage rather than re-archived.
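
The duplicate check described above might look something like the following Python 3 sketch. It is an assumption-laden illustration rather than fcsArchiver's actual query: the already_archived function is hypothetical, and matching on filePath plus checksum is a guess (the real code may match on fcsID or filter on status).

```python
import sqlite3

ARCHIVE_HISTORY_SCHEMA = (
    "CREATE TABLE archiveHistory (fcsID,filePath,checksum,barcode,tapeSet,"
    "archiveSet,jobID,completionDate,status)"
)


def already_archived(conn, file_path, checksum):
    """Return True if the history holds this path with an identical checksum.

    Assumption: a match on filePath and checksum is enough. The history also
    records jobs that terminated fatally, so a real check would presumably
    filter on status as well; those status values are not documented here.
    """
    row = conn.execute(
        "SELECT 1 FROM archiveHistory WHERE filePath = ? AND checksum = ? LIMIT 1",
        (file_path, checksum),
    ).fetchone()
    return row is not None
```

With a check like this, an asset that was restored and sent back to the archive unchanged would be recognized and skipped, while an edited asset (a different checksum) would be archived again.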
