Temporarily reduce space taken by Console database

Description

If you run out of space on a Console server but do not have additional storage immediately available, you can still remain operational and preserve data by temporarily reducing the space taken by the Console database. This can be done by clearing Console’s reporting tables to free up disk space, since in this situation it may be impossible to run purges, or back up or rebuild the entire database.

Console keeps two copies of transfer data. The first is for displaying, monitoring, and email notifications. It consists of the raw data logged by nodes (to the tables fasp_sessions, fasp_files, and fasp_rates) plus additional tables to tie them together for display (tables jobs, job_steps).

The second (canonicalized) copy is for generating reports and is itself generated based on the raw data. This is done by the Data Canonicalizer background process. The canonicalizer cleans up and cross-references the raw data to make report generation easier. This data set is contained in the following tables:

  • transfers
  • transfer_sessions
  • transfer_session_sides
  • transfer_files
  • transfer_session_files
  • transfer_session_file_sides

In addition, there is a utility table (canonicalizer_max_ids) which keeps track of how much of the raw data has been digested (in other words, canonicalized and copied to the reporting tables).

You can safely delete the reporting data and have the canonicalizer reconstruct it later when there is additional storage available, given the following conditions:

  1. The amount of historical data in the console is reasonable (rough rule of thumb: less than a million sessions, less than 10 million files)
  2. No managed nodes have been deleted within the time window covered by historical data
  3. You do not need to generate reports for a while

Condition #1 is to ensure that the reporting data can be reconstructed in a reasonable amount of time (hours or days as opposed to weeks).
Condition #2 is due to the Canonicalizer’s reliance on the managed node information being available in order to pull in node name and cross reference transfer data. If the node record does not exist any more when Canonicalizer reprocesses the historical data then the transfer will not be properly assigned to a node.

Procedure to drop historical data

1. First, you will need at least some disk space available in order for MySQL to be able to run properly.

Shut down Console by running the following command:

asctl console:stop

 

Delete or compress log files (Console, Apache, or OS logs) if possible to free up some space.

2. Once space is freed and MySQL is running, start Console by running the following:

asctl console:start


Go to your Console web interface. Click Configuration on the main navigation menu, then select the Background tab. Stop the following three background processes:

  • DataCanonicalizer
  • CustomField
  • Report


3. One the 3 background processes are stopped, click edit for each and set Startup type to disabled. This will prevent them from starting if the server is rebooted or Console is restarted.

4. Log in to MySQL command line client.

You can do this by using the following command, where SECRET is your mysql root password:

/opt/aspera/common/mysql/bin/mysql --password=SECRET


5. Check to make sure that the canonical tables are not in use.

You can do this with the following command:

SHOW PROCESSLIST;

Make sure you do not see any queries that are working with the six tables listed above (named "transfer_*"). It is possible (although unlikely) that there are some queries left running. If you see any, kill those queries.

You can kill queries with the following command, where processlist_ID is the ID of the process listed:

kill processlist_ID


6. Execute the following SQL statements:

use aspera_console;
 
set foreign_key_checks = 0;
 
truncate transfers;
truncate transfer_sessions;
truncate transfer_session_sides;
truncate transfer_files;
truncate transfer_session_files;
truncate transfer_session_file_sides;
 
optimize table transfers;
optimize table transfer_sessions;
optimize table transfer_session_sides;
optimize table transfer_files;
optimize table transfer_session_files;
optimize table transfer_session_file_sides;
 
set foreign_key_checks = 1;
update cf_fields set recalc_status="requested";
update canonicalizer_max_ids set max_fasp_file_id=0, prev_max_fasp_file_id=0, max_fasp_session_id=0;

 

7. Check disk space. You should now have much more free space.
After more storage has been allocated (or after a scheduled purge has been set up to keep historical data trimmed to a shorter window, and AFTER this scheduled purge has run at least once) then the three background processes that support reporting can be reenabled.

You can do this by going back to the Background page on the Console interface and editing each Startup type to automatic. Then either restart the Console or manually start the three processes.

8. Edit the DataCanonicalizer background process and set Maximum Heartbeat Interval to something large (such as 3600) and the batch size to something large as well (at least 10,000). This will help with processing the large backlog of historical data.

You can keep track of canonicalizer progress in reconstructing the reporting copy of data with the following query:

select utc_timestamp() as current_utc_time,
  (select started_at from fasp_sessions where id=
    (select max(foreign_key_id) from transfer_session_sides)) as last_session_processed,
  (select started_at from fasp_files where id=
    (select max(foreign_key_id) from transfer_session_file_sides)) as last_file_processed;


This will produce output like the following:

+---------------------+------------------------+---------------------+
| current_utc_time    | last_session_processed | last_file_processed |
+---------------------+------------------------+---------------------+
| 2014-02-12 18:11:27 | 2014-02-10 19:51:56    | 2014-02-10 20:06:40 |
+---------------------+------------------------+---------------------+
1 row in set (0.00 sec)


The second and third columns show how far the canonicalizer has progressed in digesting session and file data. By comparing against the first column you can judge how far behind you are.
In the above example Canonicalizer has about 48 hours worth of data still to process (or alternatively, there has not been a transfer started in the last 48 hours).

Have more questions? Submit a request

0 Comments

Article is closed for comments.
Powered by Zendesk