File ingest automation using "file manifests"

by Serban Simu
 

File ingest workflows can be based on individual files or on file "packages". The challenge in processing a file "package" is determining what the "package" contains, that is, which files belong to it. If a transfer fails, the client sending the "package" will likely retry, and the "package" should be processed only once it is complete.

In order to process file "packages", I suggest using "file manifests". An Aspera server can be configured to create a file manifest for every received transfer, containing a list of all files that belong to that transfer session.

How to turn on file manifests

On the server that receives the file "packages", edit aspera.conf (or use the Aspera Console or Aspera Enterprise Server GUI) to add the following to the "default", "group" or "user" sections:

<file_system> 
...
<file_manifest>text</file_manifest>
<file_manifest_path>/data/asperamanifests</file_manifest_path>
</file_system>


Implementation guidelines

  1. Set up a post-processing script to move the manifest file into a "final" directory upon session completion. This makes it easy to tell which file manifests correspond to completed sessions, failed sessions, or sessions still in progress. See the Aspera Enterprise Server Administrator's Guide for more information on the post-processing script.

    Example post-processing script (Linux, /opt/aspera/var/aspera-prepost):

    #!/bin/bash
    # Uncomment for debugging
    # echo "$SESSIONID $TYPE $STARTSTOP $STATE" >> /tmp/my-post-process.log
    # Move the file manifest when the session stops.
    # The manifests directory must match <file_manifest_path> in aspera.conf.
    if [ "$TYPE" == "Session" ] && [ "$STARTSTOP" == "Stop" ]; then
        if [ "$STATE" == "success" ]; then
            dest=/home/serban1/complete
        else
            dest=/home/serban1/failed
        fi
        # Log mv errors (stderr) as well as stdout
        mv /home/serban1/manifests/aspera-transfer-"$SESSIONID"* "$dest" >>/tmp/my-post-process.log 2>&1
    fi

    Note: In a future version, the Aspera server will name the file manifest with a .aspera-partial extension while the session is in progress, and will rename it to the .manifest.txt extension once the session completes. You will then no longer need to move the manifest to a "final" directory using the post-processing script.

  2. Create a "processing" script, to be run as a cron job, that does the following:
    • for all file manifests in the "complete" directory
      • process ("ingest") the file package (all files are listed in the manifest)
      • optionally notify the sender (the user account is listed in the manifest)
      • remove all manifests in the "failed" directory that share the same source, destination, or cookie (these sessions failed, but the client retried and eventually completed)
    • for all file manifests in the "failed" directory
      • if the manifest is older than a desired timeout (for example, 2 hours), notify the sender of the failure and delete the file manifest
    • for all manifests in the original manifest directory
      • if the manifest is older than a desired timeout (for example, 1 day), the session was orphaned: it did not complete and, for some reason, the manifest was never moved to the "failed" directory. Delete these manifests and optionally notify the sender.
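
The processing steps above can be sketched as a small shell script. This is only an illustration under stated assumptions, not a finished implementation: the directory paths reuse the ones from the post-processing example, the timeouts are the example values given above, and ingest_package and notify_sender are hypothetical placeholder hooks that you would replace with your own logic. The cleanup of "failed" manifests that were later retried successfully is left out, because how to match two sessions (source, destination, or cookie) depends on your workflow. The sketch relies on the -mmin and -maxdepth options of find, which are available on Linux.

```shell
#!/bin/bash
# Sketch of one cron-driven ingest pass.
# ASSUMPTIONS: all paths, timeouts, and the two hook functions below are
# illustrative placeholders, not part of the Aspera product.

MANIFEST_DIR=${MANIFEST_DIR:-/home/serban1/manifests}
COMPLETE_DIR=${COMPLETE_DIR:-/home/serban1/complete}
FAILED_DIR=${FAILED_DIR:-/home/serban1/failed}
FAILED_TIMEOUT_MIN=${FAILED_TIMEOUT_MIN:-120}    # 2 hours
ORPHAN_TIMEOUT_MIN=${ORPHAN_TIMEOUT_MIN:-1440}   # 1 day

# Hypothetical hooks: replace with real ingest and notification logic.
ingest_package() { echo "ingest: $1"; }
notify_sender()  { echo "notify: $1 ($2)"; }

# Ingest each manifest in the "complete" directory, then delete it so
# that the same package is not processed twice.
process_completed() {
    for m in "$COMPLETE_DIR"/aspera-transfer-*; do
        [ -f "$m" ] || continue          # glob matched nothing
        ingest_package "$m" && rm -f "$m"
    done
}

# Notify about and delete manifests older than a timeout.
# $1 = directory, $2 = age in minutes, $3 = reason passed to the hook
expire_old() {
    find "$1" -maxdepth 1 -type f -name 'aspera-transfer-*' -mmin +"$2" |
    while IFS= read -r m; do
        notify_sender "$m" "$3"
        rm -f "$m"
    done
}

# One full pass: completed packages, stale failures, orphaned sessions.
run_ingest_pass() {
    process_completed
    expire_old "$FAILED_DIR" "$FAILED_TIMEOUT_MIN" "transfer failed"
    expire_old "$MANIFEST_DIR" "$ORPHAN_TIMEOUT_MIN" "session orphaned"
}
```

To use the sketch, add a call to run_ingest_pass as the last line of the script and schedule the script with cron at whatever interval suits your ingest latency. A manifest is deleted from the "complete" directory only if the ingest hook succeeds, so a failed ingest is retried on the next pass.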

Example file manifest

## Transfer manifestion (FASP ver 2.5.22240) 
## Name: DSCN1316.JPG(+5)
## UUID: 88a15a66-3daa-4141-a86c-907fed747c0d
## Client: 172.16.122.1:49780
## Server: 172.16.122.137:33001
## Recipient: asp1@172.16.122.137
## Checksum: NONE
## Cookie: NONE
## Token: NONE
## UserStr: NONE
## Start: 2009-10-22 23:14:00

"/home/asp1/DSCN1316.JPG" 645664B 645664B completed
"/home/asp1/DSCN1317.JPG" 589416B 589416B completed
"/home/asp1/DSCN1318.JPG" 634387B 634387B completed
"/home/asp1/DSCN1319.JPG" 647251B 647251B completed
"/home/asp1/DSCN1320.JPG" 667803B 667803B completed
"/home/asp1/DSCN1321.JPG" 638747B 638747B completed

===========================================
Total number of sources: 6
--Total sources scanned: 6
Total paths scan attempted: 6
--Total paths scan failed: 0
--Total paths scan skipped: 0
--Total paths scan excluded: 0
--Total paths scan completed: 6
Total dir transfer attempted: 0
--Total dir transfer failed: 0
--Total dir transfer passed: 0
Total file transfer attempted: 6
--Total file transfer failed: 0
--Total file transfer passed: 6
--Total file transfer skipped: 0
===========================================
Total elapsed: 3.3s
Total transferred bytes: 3823268
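
A processing script needs to read a few things out of a manifest like the one above: a header value (such as the recipient account), the list of transferred files, and whether any file transfer failed. The helpers below are a minimal sketch of that, assuming the manifest layout matches this example exactly; the function names are my own, and the format may differ between Aspera versions, so check them against a manifest from your own server.

```shell
#!/bin/bash
# Sketch: helpers for reading a text file manifest.
# ASSUMPTION: the layout is the one shown in the example manifest
# ("## Key: value" headers, quoted path + two byte counts + status).

# Print the value of a "## Key: value" header line.
# $1 = manifest file, $2 = header name (for example Recipient)
manifest_header() {
    sed -n "s/^## $2: //p" "$1" | head -n 1
}

# Print the path of every file entry whose status is "completed".
# Entry lines look like: "/path/to/file" <bytes>B <bytes>B <status>
manifest_completed_files() {
    sed -n 's/^"\(.*\)" [0-9]*B [0-9]*B completed$/\1/p' "$1"
}

# Succeed only if the summary reports zero failed file transfers.
manifest_is_clean() {
    failed=$(sed -n 's/^--Total file transfer failed: //p' "$1" | head -n 1)
    [ "$failed" = "0" ]
}
```

For the example manifest, manifest_header with the name Recipient prints asp1@172.16.122.137, manifest_completed_files prints the six /home/asp1/DSCN13xx.JPG paths one per line, and manifest_is_clean succeeds because the summary reports 0 failed file transfers.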