Overview
Starting with v2.4.5, Resilio Connect is high-availability ready. This article outlines how to deploy the Resilio Connect Management Console (MC) and configure jobs to ensure high availability (HA) of your whole setup.
Management Console
Setting up the Management Console availability watchdog includes the following steps:
- Install two Management Consoles. Consider one to be primary and the other secondary. The secondary MC should NOT be running by default.
- Set up a DNS CNAME for your MC and ensure your local DNS server points to the primary MC address by default.
- Ensure you always put the DNS name of the MC into the Agents' configuration file.
- Edit the primary MC's configuration file fields (root).backup.path and (root).backup.schedule:
  - Point the backup location to either a shared network drive / SAN (see note), OR
  - Point the backup location to the secondary MC over any preferred network protocol (see note).
  - Set the backup frequency to 1 hour using the provided Cron syntax.
- Set up the crossover procedure, as described below.
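As a rough illustration of the backup fields from the steps above, the snippet below builds them and prints the result as JSON. Several things here are assumptions rather than documented facts: that the MC configuration file is JSON, that the fields nest as a `backup` object with `path` and `schedule` keys (inferred from the field names), that "every hour" is written as the standard five-field cron expression `0 * * * *`, and the share name itself. Check the syntax your MC version expects before copying anything.

```python
import json

def backup_settings(path, schedule="0 * * * *"):
    """Return the backup part of the MC config as a dict.

    The key layout is inferred from (root).backup.path and
    (root).backup.schedule; the default schedule assumes standard
    five-field cron syntax (minute 0 of every hour).
    """
    return {"backup": {"path": path, "schedule": schedule}}

# Hypothetical UNC share; a raw string keeps the backslashes literal in Python.
settings = backup_settings(r"\\Computer\Backups")

# json.dumps doubles every backslash, which is the escaped form a JSON file needs.
print(json.dumps(settings, indent=2))
```

Generating the fragment with a JSON library rather than typing it by hand avoids miscounting backslashes in UNC paths.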
Note
When using an SMB mapped location, always use a UNC path (like \\Computer\Backups) instead of the mapped drive letter. Escape the backslashes on Windows (\\\\Computer\\Backups). Also, ensure that the user running the Agent has read-write access to the share.
Management Console crossover procedure:
1) Install python3 & pip3
Install python3 and pip3 (the scripts are tested with python3.6, so python3 should point to python3.6), plus the requests and simplejson libraries:
- Linux:
sudo apt-get install python3 python3-pip
sudo python3 -m pip install requests simplejson
- Windows:
Download and install python3.6 and python3-pip from the official site, then install the requests and simplejson libraries:
python3 -m pip install requests simplejson
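To confirm the prerequisites before going further, a small check like the following can be run with the same python3 interpreter. It is only a sketch: the function names are made up for this example, and it verifies nothing beyond the interpreter version and whether the two libraries can be located.

```python
import importlib.util
import sys

# Libraries the article asks you to install with pip.
REQUIRED_MODULES = ("requests", "simplejson")

def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

def python_is_supported(version=sys.version_info, minimum=(3, 6)):
    """True if the interpreter is at least the version the scripts were tested on."""
    return tuple(version[:2]) >= minimum

if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules() or "none")
```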
2) Download & unpack archive with scripts on server with secondary MC
On Windows, unpack them into the same directory where the secondary MC is installed (where node.exe and srvctrl.cmd are located).
3) Adjust Config File
Adjust config.json as necessary:
- PRIMARY_SERVER_ADDRESS: URL (protocol + host + port) of the primary management console server.
- BACKUPS_PATH: Path to the primary management console backups location (should be a regular path in the file system).
- SECONDARY_SERVER_DIR_PATH: Path where the secondary Management Console (srvctl) is installed. On Windows, escape the backslash: "C:\\foo\\bar".
- SECONDARY_SERVER_START_TIMEOUT: Secondary server start timeout in seconds.
- PULL_BACKUP_SCRIPT_TIMEOUT: Timeout in seconds for PULL_BACKUP_SCRIPT execution.
- PULL_LATEST_BACKUP_INTERVAL: Time interval in seconds between pulls of the latest backup.
- CUSTOM_PRE_SCRIPT: Command to execute a custom script before running the secondary server; not compulsory.
- CUSTOM_POST_SCRIPT: Command to execute a custom script after running the secondary server; not compulsory.
- CUSTOM_SCRIPT_TIMEOUT: Timeout in seconds for CUSTOM_SCRIPT execution.
- NOTIFICATOR_SCRIPT: Command to execute the script that sends notifications. Important: the script should accept the message to be sent as the first argument, for example /bin/bash /path/to/script.sh "Some message".
- NOTIFICATOR_SCRIPT_TIMEOUT: Timeout in seconds for NOTIFICATOR_SCRIPT execution.
- MAX_RETRIES_NUMBER: Number of retries made by the @utils.retrier decorator if the decorated function returned False.
- FAILURE_RETRY_INTERVAL: Sleep time in seconds until the next attempt in the @utils.retrier decorator.
- HEALTH_CHECK_INTERVAL: Sleep time in seconds until the next health check.
- HTTP_REQUEST_TIMEOUT: Timeout in seconds for an HTTP request.
- FAILURE_GRACE_PERIOD: Time in seconds after a failure during which a False check status isn't returned.
- WINDOWS_APP_DATA_DIR_PATH: Storage path of the secondary MC, for example C:\\Windows\\System32\\config\\systemprofile\\AppData\\Roaming\\resilio-connect-server (for Windows only).
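Before launching anything, it can help to check that config.json parses and contains the parameters described above. The sketch below assumes the file is a flat JSON object keyed by those parameter names; which keys are treated as required is also an assumption (here, everything not marked optional or Windows-only), and the sample values are hypothetical.

```python
import json

# Parameter names come from the list above. Treating these as required is an
# assumption; the custom scripts and the Windows-only path are left out.
REQUIRED_KEYS = {
    "PRIMARY_SERVER_ADDRESS", "BACKUPS_PATH", "SECONDARY_SERVER_DIR_PATH",
    "SECONDARY_SERVER_START_TIMEOUT", "PULL_BACKUP_SCRIPT_TIMEOUT",
    "PULL_LATEST_BACKUP_INTERVAL", "CUSTOM_SCRIPT_TIMEOUT",
    "NOTIFICATOR_SCRIPT", "NOTIFICATOR_SCRIPT_TIMEOUT", "MAX_RETRIES_NUMBER",
    "FAILURE_RETRY_INTERVAL", "HEALTH_CHECK_INTERVAL", "HTTP_REQUEST_TIMEOUT",
    "FAILURE_GRACE_PERIOD",
}

def missing_keys(config_text):
    """Parse the JSON text and return the required keys it lacks, sorted."""
    config = json.loads(config_text)
    return sorted(REQUIRED_KEYS - set(config))

# Hypothetical, deliberately incomplete config. Note the doubled backslashes:
# inside JSON, "C:\\foo\\bar" decodes to the Windows path C:\foo\bar.
sample = r'''
{
  "PRIMARY_SERVER_ADDRESS": "https://mc.example.com:8443",
  "SECONDARY_SERVER_DIR_PATH": "C:\\foo\\bar"
}
'''

print(missing_keys(sample))
```

A path written with single backslashes would either fail to parse or silently decode to the wrong characters, which is why the escaping matters.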
4) Place the code to be executed into the *.sh files if you're on Linux, or replace those files with cmd scripts if you're on Windows.
Be sure to adjust the corresponding parameters in the config: NOTIFICATOR_SCRIPT, CUSTOM_POST_SCRIPT, CUSTOM_PRE_SCRIPT.
The pre script is executed when failover logic is triggered but before the secondary management console is launched; the post script is executed after the attempt to launch the secondary management console. The notificator script is executed to notify the administrator about important events. It should contain custom code that accepts a message as the first argument and sends this message somewhere.
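The article expects shell or cmd scripts here, but since NOTIFICATOR_SCRIPT is described as a command, a Python script invoked through python3 should presumably work as well. The following is a minimal sketch of a notifier that honors the one stated contract (the message arrives as the first argument). The file name notify.py and the delivery method (printing to stderr) are placeholders, not part of the product.

```python
import sys
from datetime import datetime, timezone

def format_notification(message):
    """Prefix the message with a UTC timestamp."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return f"[{stamp}] {message}"

def main(argv):
    """Return 0 on success, 2 if no message was passed."""
    # The failover script passes the message as the first argument.
    if len(argv) < 2:
        print("usage: notify.py MESSAGE", file=sys.stderr)
        return 2
    # Placeholder delivery: replace with e-mail, a chat webhook, and so on.
    print(format_notification(argv[1]), file=sys.stderr)
    return 0

if __name__ == "__main__":
    main(sys.argv)
```

With this file in place, NOTIFICATOR_SCRIPT could be set to something like python3 /path/to/notify.py, mirroring the /bin/bash /path/to/script.sh form shown in the config description.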
5) Navigate to the directory where the scripts are located & launch the script:
python3 main.py --config /path/to/config.json --logging debug
For example, if config.json is in the same folder as the main.py script:
python3 main.py --config config.json --logging debug
This is the main executable script; it monitors the primary MC and launches the secondary if necessary. Logging (--logging) prints events to stdout. Enabling debug logging is not compulsory, but highly advisable.
6) Add this script to system startup:
The script polls the primary server at the interval set by REQUESTER_LOOP_INTERVAL. If valid JSON is not returned, the script jumps to the failover part and starts the secondary MC. After launching the secondary server, the script starts requesting the secondary server's status; if the secondary server doesn't respond, the script triggers notificator.sh to notify the admin and exits.
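The control flow described above can be sketched roughly as follows. This is not the code shipped in the archive: `retrier` is a simplified stand-in for the @utils.retrier decorator mentioned in the config description, `watch` and its callables are hypothetical names, and the network calls are injected as plain functions so the logic can be exercised without a server. What the real script does after a successful failover is not stated in the article; this sketch simply returns.

```python
import functools
import json
import time

def retrier(max_retries, interval, sleep=time.sleep):
    """Retry the decorated function while it returns a falsy value."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                if func(*args, **kwargs):
                    return True
                if attempt < max_retries - 1:
                    sleep(interval)  # wait before the next attempt
            return False
        return wrapper
    return decorate

def is_valid_json(text):
    """The health check passes only if the response body parses as JSON."""
    try:
        json.loads(text)
        return True
    except (TypeError, ValueError):
        return False

def watch(fetch_primary, start_secondary, fetch_secondary, notify,
          max_retries=3, retry_interval=0, poll_interval=0, sleep=time.sleep):
    """Poll the primary; on failure start the secondary and verify it."""
    @retrier(max_retries, retry_interval, sleep)
    def primary_ok():
        return is_valid_json(fetch_primary())

    @retrier(max_retries, retry_interval, sleep)
    def secondary_ok():
        return is_valid_json(fetch_secondary())

    while primary_ok():
        sleep(poll_interval)
    start_secondary()  # failover part
    if not secondary_ok():
        notify("Secondary MC did not respond after failover")
        return "failed"
    return "failed_over"

# Demo with fake servers: the primary answers once, then returns HTML.
responses = iter(['{"status": "ok"}', "<html>", "<html>", "<html>"])
events = []
result = watch(lambda: next(responses), lambda: events.append("start"),
               lambda: '{"ok": true}', events.append, sleep=lambda s: None)
print(result, events)
```

In a real deployment the fetch callables would issue HTTP requests bounded by HTTP_REQUEST_TIMEOUT, and the retry and polling intervals would come from the config values described earlier.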
Note
This script doesn't automate switching agents to the secondary MC. You need to implement the DNS switch logic yourself, for example in custom_pre_script.sh.
7) Tracker Server
- Install an additional Tracker Server on a machine other than the primary MC.
- Set the following Default Agent Profile parameters:
  - Custom trackers: the tracker server which runs on the primary MC and your spare tracker server.
  - Custom tracker mode: High Availability
8) Agents & Jobs HA
Using Management Console agent in HA setup
Both Management Consoles have a local agent by default. Whenever a failover happens, this agent's unique identity is not preserved. The secondary console will try to run the jobs as if its local agent were the same agent that the primary console was using.
If you're planning to implement a High Availability solution, it is not recommended to add Management Console agents to any jobs. If you still want to sync files on either the primary or the secondary Management Console host machine, install a new independent agent service there and set up jobs for this agent instead.
While deploying agents, be sure to deliver a sync.conf file which contains the DNS name of your MC, not its IP address. Each job's HA depends on the type of job you want to make highly available.
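A quick way to catch an IP address that slipped into a deployment is to test the MC host value before writing it into sync.conf. The sketch below only classifies a host string; it does not read sync.conf, whose layout is not covered here, and the host names are examples.

```python
import ipaddress

def is_ip_literal(host):
    """True if host is an IPv4/IPv6 address rather than a DNS name."""
    try:
        # Brackets are stripped so that "[2001:db8::1]" is recognized too.
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False

for host in ("mc.example.com", "192.0.2.10", "[2001:db8::1]"):
    print(host, "-> IP address" if is_ip_literal(host) else "-> DNS name")
```

Only the DNS name keeps working after failover, because agents configured with an IP address would keep contacting the primary MC's address.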
About Jobs & HA
- Synchronization job
HA is available only if you have a limited (1-3) number of RW agents. As an HA measure, just set up one more RW agent on a separate machine.
- Consolidation job
Set up one more agent as "Destination" on a separate machine. Please note that destination agents MUST NOT save data from sources to the same physical location (i.e. data cannot be saved to the same network or SAN location).
- Distribution job
Set up one more agent as "Destination" on a separate machine. Ensure it has the fastest possible connection to the source and has the data pre-seeded. Alternatively, the reserve destination machine may point to the same data location as the source does (i.e. the same network / SAN location).
Note that in either case the reserve agent will still need some time to receive metadata from the source. The exact time depends on data size, CPU power and storage speed. Once the reserve agent has the metadata, it will keep seeding data to all other destinations even if the source dies.