Starting from v2.4.5, Resilio Connect is high-availability ready. This article outlines how to deploy the Resilio Connect Management Console (MC) and configure jobs to ensure high availability (HA) of your whole setup.
Setting up the Management Console availability watchdog includes the following steps:
Install two Management Consoles. Consider one to be primary and the other to be secondary. The secondary MC should NOT be running by default.
Set up a DNS CNAME for your MC and ensure your local DNS server points to the primary MC address by default.
Ensure you always put the DNS name of the MC into the Agents' configuration file.
Edit the following fields in the primary MC's configuration file:
- Point the backup location to a shared network drive / SAN (see note), OR
- Point the backup location to the secondary MC over any preferred network protocol (see note).
- Set the backup frequency to 1 hour using the provided Cron syntax.
Set up the crossover procedure, as described below.
Note: When using an SMB mapped location, always use the UNC path (like \\Computer\Backups) instead of the mapped drive letter. Escape the backslashes on Windows (\\\\Computer\\Backups). Also, ensure that the user running the Agent has read-write access to the share.
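To illustrate the escaping rule from the note, here is a minimal sketch. It assumes the configuration file is JSON (an assumption, not confirmed here) and uses the example share name from the note. Python's json module produces the escaped form and confirms that it decodes back to the original UNC path.

```python
import json

# Example UNC path from the note (raw string: two leading backslashes).
unc_path = r"\\Computer\Backups"

# json.dumps doubles every backslash, which is the form a JSON file expects.
escaped = json.dumps(unc_path)
print(escaped)

# Decoding the escaped form gives back the original path.
assert json.loads(escaped) == unc_path
```

If the decoded value differs from the path you intended, the backslashes in the file were most likely not doubled.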
Management Console crossover procedure:
1) Install python3 and pip3 (the scripts are tested with python3.6, so python3 should point to python3.6), plus the requests and simplejson libraries:
On Linux:
sudo apt-get install python3 python3-pip
sudo python3 -m pip install requests simplejson
On Windows, download and install python3.6 and python3-pip from the official site, then install the requests and simplejson libraries:
python3 -m pip install requests simplejson
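Before moving on, it can help to confirm that the interpreter and both libraries are actually available. The following is a minimal sketch of such a check; the helper names are made up for illustration and are not part of the shipped scripts.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_recent_enough(minimum=(3, 6)):
    """Return True if the running interpreter is at least version `minimum`."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python 3.6 or newer:", python_is_recent_enough())
    print("Missing libraries:", missing_modules(["requests", "simplejson"]))
```

Run it with the same python3 command that will later launch the watchdog, so that the check covers the right interpreter.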
2) Download and unpack the archive with scripts on the server with the secondary MC.
On Windows, unpack them into the same directory where the secondary MC is installed (where srvctrl.cmd is located).
3) Adjust the config file
Adjust config.json as necessary:
- URL (protocol + host + port) of the primary Management Console server
- Path to the primary Management Console backups location (should be a regular path in the file system)
- Path where the secondary Management Console (srvctl) is installed. On Windows, escape the backslashes: "C:\\foo\\bar"
- Secondary server start timeout in seconds
- Timeout in seconds for PULL_BACKUP_SCRIPT execution
- Time interval in seconds between pulling backup from
- Command to execute a custom script before running the secondary server (optional)
- Command to execute a custom script after running the secondary server (optional)
- Timeout in seconds for CUSTOM_SCRIPT execution
- Command to execute the script that sends notifications. Important: the script should accept the message to be sent as its first argument, for example:
/bin/bash /path/to/script.sh "Some message"
- Timeout in seconds for NOTIFICATOR_SCRIPT execution
- Number of retries of the @utils.retrier decorator if the decorated function returned False
- Sleep time in seconds until the next attempt in the @utils.retrier decorator
- Sleep time in seconds until the next health check
- Timeout in seconds for an HTTP request
- Time in seconds after a failure during which a False check status isn't returned
- Storage path of the secondary MC (for Windows only), for example:
C:\\Windows\\System32\\config\\systemprofile\\AppData\\Roaming\\resilio-connect-server
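The two retry settings in the list refer to the @utils.retrier decorator. The sketch below shows one way such a decorator could behave; it is an illustration only, not the code shipped in the archive. It assumes that "number of retries" means extra attempts after the first call, and the parameter names retries and sleep_seconds are made up.

```python
import functools
import time


def retrier(retries, sleep_seconds, sleep=time.sleep):
    """Re-run the decorated function while it returns False.

    Assumption: one initial attempt plus up to `retries` further attempts,
    pausing `sleep_seconds` between attempts.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = False
            for attempt in range(retries + 1):
                result = func(*args, **kwargs)
                if result is not False:
                    return result
                if attempt < retries:
                    sleep(sleep_seconds)
            return result
        return wrapper
    return decorator


calls = []


@retrier(retries=2, sleep_seconds=0)
def flaky_check():
    """Fail twice, then succeed, to show the retry behaviour."""
    calls.append(1)
    return len(calls) >= 3


result = flaky_check()
print(result, len(calls))
```

With a larger retry count or sleep time, a short outage of the primary server is less likely to trigger a failover, at the cost of a slower reaction to a real failure.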
4) Place the code to be executed into the *.sh files if you're on Linux, or replace those files with cmd scripts if you're on Windows.
Be sure to adjust the corresponding parameters in the config (the commands for the pre script, the post script and the notificator script).
The pre script is executed when the failover logic is triggered, before the secondary Management Console is launched; the post script is executed after the attempt to launch the secondary Management Console. The notificator script is executed to notify the administrator about important events. It should contain custom code that accepts a message as the first argument and sends that message somewhere.
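As an illustration of the "message as the first argument" contract, here is a minimal notificator sketch written in Python rather than shell. It assumes the notification command in the config may point at any executable, which is not confirmed here. The file name notify.py, the message prefix and the delivery to stderr are placeholders.

```python
import sys
from datetime import datetime, timezone


def format_notification(argv):
    """Build the line to send; argv[1] is the message passed by the watchdog."""
    if len(argv) < 2 or not argv[1].strip():
        raise ValueError('usage: notify.py "Some message"')
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"[{timestamp}] MC watchdog: {argv[1]}"


def send(line, stream=sys.stderr):
    """Placeholder delivery: write to stderr. Swap in mail, chat, and so on."""
    print(line, file=stream)


if __name__ == "__main__":
    try:
        send(format_notification(sys.argv))
    except ValueError as err:
        print(err, file=sys.stderr)
```

Called as python3 notify.py "Some message", it writes one timestamped line; only the send function needs to change to deliver the message elsewhere.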
5) Navigate to the directory where the scripts are located and launch the script:
python3 main.py --config /path/to/config.json --logging debug
For example, if config.json is in the same folder as the script:
python3 main.py --config config.json --logging debug
This is the main executable script: it monitors the primary MC and, if necessary, launches the secondary one. Logging (--logging) prints events to stdout. Enabling debug logging is not compulsory, but it is highly advisable.
6) Add this script to system startup.
The script polls the primary server at the interval set by REQUESTER_LOOP_INTERVAL. If valid JSON is not returned, the script jumps to the failover part and starts the secondary MC. After launching the secondary server, the script starts requesting the secondary server's status; if the secondary server doesn't respond, the script triggers notificator.sh to notify the admin and exits.
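The polling behaviour described in this step can be sketched as follows. This is a simplified illustration, not the logic of main.py: it leaves out retries, the grace period after a failure and the status checks on the secondary server. The callables fetch_primary and start_secondary are hypothetical stand-ins for the HTTP request and the srvctl launch.

```python
import json
import time


def is_healthy(body):
    """Return True if `body` (str or bytes) parses as JSON, else False."""
    try:
        json.loads(body)
    except (TypeError, ValueError):
        return False
    return True


def watch(fetch_primary, start_secondary, interval, sleep=time.sleep):
    """Poll the primary until a check fails, then trigger failover.

    fetch_primary: returns the response body, or raises OSError when the
                   server is unreachable (hypothetical helper).
    start_secondary: launches the secondary MC (hypothetical helper).
    Returns the number of successful checks seen before failover.
    """
    healthy_checks = 0
    while True:
        try:
            body = fetch_primary()
        except OSError:
            body = None  # an unreachable server counts as a failed check
        if not is_healthy(body):
            start_secondary()
            return healthy_checks
        healthy_checks += 1
        sleep(interval)
```

Passing the sleep function as a parameter keeps the loop testable without real delays; in production the default time.sleep would be used with the configured interval.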
Note: This script doesn't automate switching agents to the secondary MC. You need to implement the DNS switch logic yourself, for example in custom_pre_script.sh.
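One possible shape for the DNS switch is sketched below, assuming your DNS server accepts dynamic updates through the BIND nsupdate tool. That is an assumption about your environment, not a requirement of Resilio Connect. The sketch only builds the command batch, and every host name in it is hypothetical.

```python
def build_nsupdate_batch(dns_server, record, new_target, ttl=60):
    """Return nsupdate commands that repoint the CNAME `record` to `new_target`."""
    commands = [
        f"server {dns_server}",
        f"update delete {record} CNAME",
        f"update add {record} {ttl} CNAME {new_target}",
        "send",
    ]
    return "\n".join(commands) + "\n"


# Hypothetical names: replace with your DNS server and MC host names.
batch = build_nsupdate_batch(
    dns_server="ns1.example.com",
    record="mc.example.com.",
    new_target="mc-secondary.example.com.",
)
print(batch, end="")
```

The resulting text could be fed to nsupdate on standard input from the pre script. Authentication (for example a TSIG key) and the zone layout depend on your DNS setup and are not covered here.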
7) Tracker Server
- Install an additional Tracker Server on a machine other than the primary MC
- Set the following Default Agent Profile parameters:
- Custom trackers: the tracker server that runs on the primary MC and your spare tracker server.
- Custom tracker mode: High Availability
8) Agents & Jobs HA
Using Management Console agent in HA setup
Both Management Consoles have a local agent by default. Whenever a failover happens, this agent's unique identity is not preserved. The secondary console will try to run the jobs as if its local agent were the same agent that the primary console was using.
If you're planning to implement a High Availability solution, it is not recommended to add Management Console agents to any jobs. If you still want to sync files on either the primary or secondary Management Console host machine, install a new independent agent service there and set up jobs for this agent instead.
When deploying agents, make sure to deliver a sync.conf file that contains the DNS name of your MC, not its IP address. Each job's HA setup depends on the type of job you want to make highly available.
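A quick way to catch an IP address that slipped into an agent configuration is to test the host value before deployment. The helper below is a minimal sketch. How the host is read from sync.conf is left out, because the file's layout is not described here, and the function name is made up.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if `host` (without a port) is an IPv4 or IPv6 address.

    Square brackets around an IPv6 address, as used in URLs, are tolerated.
    """
    try:
        ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return False
    return True


for candidate in ("mc.example.com", "192.168.1.10", "[::1]"):
    verdict = "IP address, use a DNS name" if is_ip_literal(candidate) else "ok"
    print(candidate, "->", verdict)
```

Strip any port suffix before calling the function, because a value such as host:port is not a valid address and would be reported as a name.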
About Jobs & HA
- Synchronization job
HA is available only if you have a limited number (1-3) of RW agents. As an HA measure, just set up one more RW agent on a separate machine.
- Consolidation job
Set up one more agent as "Destination" on a separate machine. Please note that destination agents MUST NOT save data from sources to the same physical location (i.e. data cannot be saved to the same network or SAN location).
- Distribution job
Set up one more agent as "Destination" on a separate machine. Ensure it has the fastest possible connection to the source and has the data pre-seeded. Alternatively, the reserve destination machine may point to the same data location as the source does (i.e. the same network / SAN location).
Note that in either case the reserve agent will still need some time to receive metadata from the source. The exact time depends on data size, CPU power and storage speed. Once the reserve agent has received the metadata, it will keep seeding data to all other destinations even if the source dies.