File policy is a set of rules to manage a cache of files in File cache and Hybrid work dataflows. It is available starting with Resilio Active Everywhere Platform v4.0.
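Cache cleanup relies on file access times, so access time updates must be enabled on the storage. On Windows, enable them with:
fsutil behavior set disablelastaccess 0
If access time updates are not enabled, the Agent reports an error.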
1. Managing the cache size
2. File download priority
3. Managing the file rules
4. Policy checker
5. Changing the File policy
6. Limitations and peculiarities
Create a new file policy and use it in the job for Cache Servers or End-user Agents.
Different cache servers in the job may have different file policies depending on the use case needs.
The File policy applies to files and subfolders in the job. For simplicity, the guide below refers to both as 'files'.
Managing the cache size
Define the maximum cache size that can be taken on the storage. Cache size can be defined in absolute values or as a percentage of the storage. The critical point to note is that only the actual cache size matters when determining if the defined threshold has been exceeded. The absolute values or storage percentage specified are effectively reserved for the cache. In practical terms, this means that if other files already occupy space on the storage (outside of the cache), the Agent may not be able to utilize the full allocated cache size. In such cases, instead of cleaning up the cache, the system will report a "no free space" error.
For example, consider a storage with 1 TB capacity and the setting "dehydrate file when size exceeds 80% of storage", where other files already occupy 700 GB. In this scenario, the system will try to allocate 800 GB for the cache (80% of 1 TB). However, since 700 GB is already occupied by other files, the cache can only occupy the remaining 300 GB. As a result, the storage will be completely filled, triggering a "no free space" error. Cache cleanup will not occur until the actual cache size exceeds 800 GB.
For refined configuration, use high and low watermarks. Continuing with our example, you might set a high watermark at 90% and a low watermark at 70% of the allocated 800 GB. When the cache size exceeds 720 GB (90% of 800 GB), the system will begin cleanup until the cache is reduced to 560 GB (70% of 800 GB).
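The arithmetic of this example can be checked with a short sketch. The function and variable names are illustrative only and are not part of any Resilio API; sizes are in decimal GB.

```python
def cache_thresholds(capacity_gb, quota_pct, high_pct, low_pct):
    """Return (quota, high watermark, low watermark) in GB."""
    quota = capacity_gb * quota_pct // 100
    return quota, quota * high_pct // 100, quota * low_pct // 100


capacity_gb = 1000      # 1 TB storage
other_files_gb = 700    # space already taken by files outside the cache

quota, high, low = cache_thresholds(capacity_gb, 80, 90, 70)

# The cache can never grow beyond the space left by the other files.
max_reachable_cache = capacity_gb - other_files_gb

# Cleanup starts only when the actual cache size exceeds the high watermark.
cleanup_can_trigger = max_reachable_cache > high

print(quota, high, low)                          # 800 720 560
print(max_reachable_cache, cleanup_can_trigger)  # 300 False
```

Because the reachable cache size (300 GB) stays below the high watermark (720 GB), the storage fills up before any cleanup starts, which is the "no free space" case described above.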
The cache server will use all of the configured storage quota. For multiple jobs on a single storage, you need to split the available storage space across the jobs. One way is to split it evenly; however, in this case each cache will use only a smaller part of the storage. The other way is to set a larger size for the cache that is going to be accessed most actively.
If the files present in the cache don't fit into the configured size, the least recently accessed files, based on their access time, are cleared from the cache (dehydrated). By default, files are cleared until the cache is 70% occupied.
The calculated cache size includes not only the user files, but also Archive and service files inside the .sync folder. Partly downloaded files (.!sync type) are not counted in the cache size.
Files won’t be cleared from cache, if:
- they have not yet been uploaded to the selected Priority Agent. This also applies to files that were added to the cache locally by end-users. If the Priority peer is not configured in the job, the file will be cleared from the cache.
- they cannot be accessed by the Resilio Agent at the moment. For example, they’re in use by another application or are locked.
- they are in 'pinned' state, match the Pinning rule, or are pinned locally by a user.
- they are partly downloaded files in the cache.
If the files don't fit the cache size, even if it's a single file in the job, and there are no candidates for cleanup, the Agent reports an error.
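The cleanup behaviour described above can be sketched roughly as follows. This is a simplified model, not the Agent's actual implementation: the `CachedFile` fields, the `plan_cleanup` function, and the error text are hypothetical, and the 70% target is the default mentioned above.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    path: str
    size: int
    atime: float            # last access time
    uploaded: bool = True   # already on the Priority Agent (or none configured)
    locked: bool = False    # in use by another application
    pinned: bool = False    # pinned by rule or by a user
    partial: bool = False   # partly downloaded


def is_candidate(f):
    """A file may be dehydrated only if none of the exemptions apply."""
    return f.uploaded and not (f.locked or f.pinned or f.partial)


def plan_cleanup(files, quota, low_pct=70):
    """Return paths to dehydrate, least recently accessed first."""
    # Partly downloaded files are not counted in the cache size.
    used = sum(f.size for f in files if not f.partial)
    if used <= quota:
        return []
    target = quota * low_pct // 100
    plan = []
    for f in sorted(filter(is_candidate, files), key=lambda f: f.atime):
        if used <= target:
            break
        plan.append(f.path)
        used -= f.size
    if used > quota:
        # No candidates left and the files still do not fit.
        raise RuntimeError("not enough free space in cache")
    return plan


files = [
    CachedFile("a.bin", 40, atime=1),
    CachedFile("b.bin", 40, atime=2, pinned=True),
    CachedFile("c.bin", 40, atime=3),
]
print(plan_cleanup(files, quota=100))  # ['a.bin', 'c.bin']
```

In this example the pinned file is skipped even though it is older than `c.bin`, and cleanup continues until usage drops to the 70% target or below.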
Cache cleanup does not happen immediately. The folder rescan is triggered for the Agent to discover the files that can be cleaned up.
File download priority
Prepopulation of the cache can be prioritized by file timestamp or by file name.
Managing the file rules
These rules define exactly which files and folders, originating on Primary storages or other cache servers, will or will not be populated on a given cache server automatically. Rules are based on file and folder names and use ECMAScript regular expression syntax (ECMA-262 standard).
Two modes are supported: case sensitive and case insensitive. Case-insensitive rules work slower.
Backslash \ is used for escaping special characters. In a Windows path, the backslash itself must also be escaped (use a double backslash, e.g. D:\\test)
Forward slash / is accepted as a delimiter for Windows paths as well, e.g. c:/windows and c:\\windows are equivalent.
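The effect of escaping the backslash can be seen in a small sketch. It uses Python's `re` module as a stand-in for the ECMAScript engine (the two agree on these simple patterns); the sample path is made up, and the forward-slash equivalence is product behaviour that is not modelled here.

```python
import re

# Rule text as typed into the policy: D:\\test (two backslash characters).
escaped_rule = r"D:\\test"
# The same rule with a single backslash: the regex engine reads \t as a tab.
unescaped_rule = r"D:\test"

windows_path = "D:\\test\\report.txt"  # Python literal for D:\test\report.txt

print(bool(re.search(escaped_rule, windows_path)))    # True
print(bool(re.search(unescaped_rule, windows_path)))  # False
print(bool(re.search(unescaped_rule, "D:\test")))     # True: "D:" + tab + "est"
```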
Also avoid using `.+` or `.*` at the beginning of a rule.
Use only local paths in rules, for example:
a) to pre-hydrate all txt files in a subfolder (for example `/mnt/storage/subfolder/`) use `subfolder/.+\.txt`, not `/mnt/storage/subfolder/.+\.txt` (will not work) and not `./subfolder/.+\.txt` (will work unexpectedly)
b) the rule `.+\.0$` is interpreted the following way: any character one or more times (`.+`) + the string `.` (written as `\.`; here `.` is not a token, because it is escaped by `\`) + the string `0` + end of string (`$`, a token).
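The two examples above can be tried out with a short sketch. It again uses Python's `re` as a stand-in for the ECMAScript engine and assumes that a rule is searched for anywhere in the path relative to the job folder; that matching mode is an assumption, and the helper `matches` and the sample file names are hypothetical.

```python
import re


def matches(rule, relative_path):
    """True if the rule is found in the path relative to the job folder."""
    return re.search(rule, relative_path) is not None


# a) only the local path works; paths seen by rules are relative to the job.
print(matches(r"subfolder/.+\.txt", "subfolder/notes.txt"))               # True
print(matches(r"/mnt/storage/subfolder/.+\.txt", "subfolder/notes.txt"))  # False
# A leading "." is a token meaning "any character", not "current folder".
print(matches(r"./subfolder/.+\.txt", "subfolder/notes.txt"))             # False
print(matches(r"./subfolder/.+\.txt", "a/subfolder/notes.txt"))           # True

# b) .+\.0$ needs at least one character, then a literal ".0" at the end.
print(matches(r".+\.0$", "frame.0"))   # True
print(matches(r".+\.0$", "frame.01"))  # False
print(matches(r".+\.0$", ".0"))        # False
```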
While there is no hard-coded limit on the number of rules you can have, for best performance it is advisable not to have more than 100 rules in total.
Exclusion policy
This is the highest-priority rule: files that match it are excluded from caching. They do not appear in the cache even if they match some other file rule. All changes to such files are ignored by the Cache Server, whether the file is updated on the Primary storage, by the user, or by the cache server.
Files counter discrepancy: files excluded from the cache are not counted as total/local files in the job run details by the cache server.
!! This is not equivalent to the File and folder filter! While the File and folder filter basically ignores the files and does not even track their state on all Agents in the job, the Exclude policy simply excludes them from showing up on the cache server where the policy with this rule is applied. The Exclude list applies only to the cache server. If the files are added to or updated on the Primary Storage, it tracks these changes locally.
Pinning policy
Files that match this rule are always kept in cache and are not cleared automatically, provided cache size is big enough to keep them.
Users cannot clear such files from the storage using the "Free up space" context menu or a CLI command.
Hydration policy
Files that match this rule are automatically downloaded in cache once they appear in the job on Primary storage or other cache servers. The rule applies to the new files in the folder or if the file is updated.
If such a file is cleared from the cache (either manually or by the cache size policy), it is no longer downloaded to the cache unless it is requested manually or is updated on a remote Primary storage or Cache server.
Files that are fetched to cache by user request (not by a rule in file policy) don't automatically download file updates from remote servers.
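The hydration behaviour described above can be summarised as a small decision function. This is only a reading of the text, not the Agent's logic: the event names and the function `should_download` are hypothetical.

```python
def should_download(event, matches_hydration_rule):
    """Decide whether a file is fetched into the cache after an event.

    event is one of the hypothetical names:
      "new_file"           - the file appeared in the job on a remote server
      "remote_update"      - the file was updated on a remote server
      "cleared_from_cache" - the file was dehydrated manually or by size policy
      "user_request"       - a user asked for the file
    """
    if event == "user_request":
        return True
    if event in ("new_file", "remote_update"):
        # Only files matching a Hydration rule follow remote changes.
        return matches_hydration_rule
    # Clearing a file never triggers a re-download by itself.
    return False


print(should_download("new_file", True))             # True
print(should_download("cleared_from_cache", True))   # False
print(should_download("remote_update", False))       # False
```

The last call models a file that was fetched on user request only: it does not match a Hydration rule, so its remote updates are not downloaded automatically.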
Policy checker
Policy checker lets you compare an entered file name against the configured file rules and file priority by name. It shows which rule the given file name will match, which may help to find misconfigured or contradicting rules.
In order to check a part of the path, use relative path (relative to the job folder), not the absolute path on the storage.
Try to avoid rule conflicts. If some rules contradict each other, the resulting behavior is undefined. For example, if a subfolder matches the Exclude rule, but files inside it match the Pinning rule, the Agent will give an error about lack of free space and the job will stall.
If a file does not match any rule, the file will be available for access from cache, and won't be fetched in cache until requested by a user.
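The kind of check the Policy checker performs can be approximated offline with a sketch like the one below. It is not the product's checker: the `check_policy` function, the policy dictionary layout, and the sample rules are hypothetical, Python's `re` stands in for the ECMAScript engine, and rules are assumed to be searched for anywhere in the relative path.

```python
import re

RULE_KINDS = ("exclude", "pinning", "hydration")


def check_policy(relative_path, policy, case_sensitive=True):
    """Return (kinds of rules that match, resulting outcome)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    matched = [
        kind for kind in RULE_KINDS
        if any(re.search(rule, relative_path, flags)
               for rule in policy.get(kind, []))
    ]
    if "exclude" in matched:
        outcome = "excluded"    # Exclusion has the highest priority
    elif matched:
        outcome = "matched"
    else:
        outcome = "on demand"   # fetched only when a user requests it
    return matched, outcome


policy = {
    "exclude": [r"tmp/"],
    "pinning": [r"\.cfg$"],
    "hydration": [r"subfolder/.+\.txt"],
}

print(check_policy("subfolder/a.txt", policy))  # (['hydration'], 'matched')
print(check_policy("tmp/app.cfg", policy))      # (['exclude', 'pinning'], 'excluded')
print(check_policy("video.mov", policy))        # ([], 'on demand')
print(check_policy("SUBFOLDER/A.TXT", policy, case_sensitive=False))
```

A result that lists both `exclude` and `pinning`, as in the second call, points at exactly the kind of conflict that should be removed from the policy.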
Changing the File policy
All the parameters in the File policy can be changed at runtime. The changes are applied immediately and in most cases do not require restarting the Agent or rescanning the folder.
Limitations and peculiarities
The configured cache size may seem to be exceeded because of partly downloaded files in the .sync service folder.
A Linux cache server is unable to download a file matching a case-sensitive policy.
The local whitelist rule has a lower priority than the global file exclusion policy. I.e. if a file is whitelisted in .sync/Ignorelist on an agent, but is excluded in Exclude policy, it will be excluded from the cache.
Backreferences in regular expressions are not supported. If they are used, the Agents will report the error "Not enough free space on the cache".