Resilio Connect consumes approximately 2 KB of RAM for every file/folder entry (4 KB if you are syncing POSIX or NTFS permissions).
This article covers the following use case (the examples below assume that you are not syncing file permissions):
- You need to deliver millions of files and don't have enough available RAM
- Files on the source Agent may change and new files may be added, but nothing is removed
- You need to keep all Agents synchronized
- The source and target systems don't have enough RAM to keep the entire file tree in memory
This can be achieved in two steps:
1. Deliver the whole array in batches (using the transfer_job_files_limit parameter) to the remote Agent without consuming all of the RAM.
2. Keep the delivered files in sync, synchronizing only newly added files.
Acknowledge the following limitations and peculiarities:
1. Only 1-to-1 transfer. The initial transfer with the transfer_job_files_limit parameter only supports 1-to-1 transfer. Adding multiple destinations may cause the job to stall and never finish.
2. Only supported by Agents 2.10.2 and newer.
3. The batch limit value must be greater than the nesting depth of the whole data array. For example, setting transfer_job_files_limit: 10 won't work if there are 15 levels of nested folders.
4. The batch limit can't be smaller than the number of folders in the whole data array.
5. If the destination Agent has the same files and they are newer than the files on the source, the files will be overwritten.
6. Data transfer may be slow for a flat folder structure (e.g. 2M files on the same level).
Step 1 - Initial file synchronization
- Calculate the number of subfolders (recursively) in the folder you are going to synchronize.
- Create two Agent profiles in advance: one for the source Agent, the other for the destination Agent.
- Add the following custom parameters to these two Agent profiles:
In both profiles set
"Allow to copy local files" ->
- To ensure Agents don't abort the distribution job when they encounter file changes, file locks, and other events that normally abort the job, add
"Skip file errors (Transfer jobs)" ->
True (if for some reason either Agent requires this setting to be False for other jobs, instead add the custom parameter "
true" and it will be applied only to this current job)
For the destination Agent profile:
For the source Agent profile:
transfer_job_files_limit: XXX, where the "XXX" value indicates the number of file/folder entries to be synced per iteration. It is limited by your RAM capacity: if your computer has 4 GB of RAM, we recommend setting this value to a maximum of 1000000 (2 GB for files/folders, 1 GB for networking buffers, and 1 GB of RAM reserved for the OS).
- Configure and start the distribution job (mark down the distribution job start time)
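As a rough sizing aid for the transfer_job_files_limit value, the per-entry figures from the top of this article (about 2 KB per file/folder entry, 4 KB with permissions) can be turned into a quick estimate. The helper below is an illustrative sketch based on those figures, not an official formula.

```python
def batch_ram_bytes(entries, sync_permissions=False):
    """Approximate Agent RAM used by one batch: ~2 KB per file/folder entry,
    ~4 KB when POSIX/NTFS permissions are synced (figures from this article)."""
    per_entry = 4096 if sync_permissions else 2048
    return entries * per_entry

# The article's recommended ceiling of 1,000,000 entries on a 4 GB machine
# keeps the file tree around 2 GB:
print(batch_ram_bytes(1_000_000) / 2**30)  # → 1.9073486328125 (≈ 1.9 GiB)
```

This lines up with the 4 GB breakdown above: roughly 2 GB for the file tree, 1 GB for networking buffers, and 1 GB left for the OS.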
Once the job has completed the initial transfer, proceed to the second step
Step 2 - Keeping the array of data synchronized
- Edit the parameter "Max file age for scanning (seconds)" in the Job profile.
- This parameter tells the Agent to sync only files that have changed during the last X seconds. Setting it to 86400, for example, makes the Agent sync only files changed during the last 24 hours. Ensure that this value covers the timestamp you marked down in Step 1.
- Set up a Sync job.
Agents will only recheck files that fall into the configured time range. This speeds up syncing, as Agents won't have to recheck all those millions of files.
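To pick a "Max file age for scanning (seconds)" value that safely covers the job start time you marked down in Step 1, you can compute the elapsed time and pad it. The helper and its one-hour margin below are illustrative assumptions, not a Resilio-prescribed calculation.

```python
import time

def max_file_age_covering(job_start_epoch, now=None, margin=3600):
    """Smallest 'Max file age for scanning (seconds)' value that still covers
    files changed since the distribution job started, plus a safety margin
    (the one-hour default margin is an arbitrary assumption)."""
    if now is None:
        now = time.time()
    return int(now - job_start_epoch) + margin

# e.g. if the distribution job started ~20 hours ago, the result stays
# under 86400, so a 24-hour window is sufficient.
```

If the result exceeds the value you configured, widen the window before creating the Sync job.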
Enforcing absolute time window for files to be synced
In other, more complex setups, the administrator might need to force the Agent in Step 2 to scan files using an absolute time window rather than a sliding one. This can be done with two custom parameters:
- Min file age for scanning (seconds) - starting time of the window in UNIXTIME format
- Max file age for scanning (seconds) - ending time of the window in UNIXTIME format
For example, to sync only files from May 25 2019 9:00am (UTC) to May 29 2019 9:00am (UTC), set
- Max file age for scanning (seconds) = 1559120400
- Min file age for scanning (seconds) = 1558774800
You can use any online epoch converter for UNIXTIME conversion.
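A couple of lines of Python can also do the UNIXTIME conversion locally; this snippet reproduces the example values above.

```python
from datetime import datetime, timezone

def to_unixtime(year, month, day, hour=0, minute=0):
    """Convert a UTC date/time to a UNIXTIME (epoch seconds) value."""
    return int(datetime(year, month, day, hour, minute,
                        tzinfo=timezone.utc).timestamp())

print(to_unixtime(2019, 5, 25, 9))  # → 1558774800 (Min file age for scanning)
print(to_unixtime(2019, 5, 29, 9))  # → 1559120400 (Max file age for scanning)
```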