$Id: FAQ.Capacity_Planning.txt,v 1.14 2025/09/12 12:52:23 gilles Exp gilles $

This documentation is also available online at
https://imapsync.lamiral.info/FAQ.d/
https://imapsync.lamiral.info/FAQ.d/FAQ.Capacity_Planning.txt


======================================================================
            Imapsync tips for Capacity Planning.
======================================================================


I planned to move the online service to a distributed architecture.
I did: it has been in place since May 2022, using proximapsync as a
front-end to handle the requests and distribute the jobs to a pool of
hosts.

Let's do some capacity planning. Imapsync consumes memory, cpu and
bandwidth in relatively deterministic amounts.

My current question is: shall I use
* N   2GB hosts
* N/2 4GB hosts
* N/4 8GB hosts

or something else? My goal is to minimize the bill, the money spent.



Let's do some observations and calculations.

The observations are made on the standalone imapsync online host,
whose characteristics are:

CPU: Intel i5-2300 with 4 cores

RAM: 16 GB

NET: 100 Mbps symmetrical, 12.5 MBytes/s symmetrical,
     so 25 MBytes/s max for the global rate, rx and tx combined.

Disks: I don't know.

System: FreeBSD 11.4 (Now Debian)

===== CPU =====

The CPU can be an issue. On average, an imapsync run takes 5% of the
overall cpu time on an Intel i5-2300 with 4 cores.  It implies that 20
imapsync runs are ok on the current online host before the cpus become
the bottleneck.  As a rule of thumb, imapsync takes 20% of a cpu core.

On the Intel i5-2300 with 4 cores, so far, the maximum number of
imapsync processes has been 68, which is more than 3 times what the
system should support in a standard imapsync workload. Under this
load, imapsync performance was not good: the server could not handle
the load and was even out of order for a while. With 40 imapsync
processes, performance is ok.

===== RAM =====

The RAM can be an issue. On average, an imapsync run takes 250 MB.  So
4 imapsync processes per GB is the limit before swapping to disk,
which is the usual sign that memory has become the bottleneck.
16 GB allows 64 imapsync processes.

On the 16 GB host, so far, the maximum memory used by imapsync
processes was 13 GB. At that point, the one minute load was 10 or
more, the number of imapsync processes was around 40, and the total
bandwidth was around 19 MiB/s.

===== LINK =====

The bandwidth I/O can be an issue.  The "Average bandwidth rate" value
given by imapsync at the end of a transfer, and also the bandwidth
rate given during the sync on the ETA line, is the total size of all
messages copied divided by the elapsed time. If imapsync runs between
two remote imap servers, then the total size transferred on the
network link is twice this value: once when getting the message
from host1 (rx) and once when sending it to host2 (tx).
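
As an illustration of the paragraph above, here is a minimal Python
sketch that turns the rate reported by imapsync into the traffic seen
on the link of the host running it. The function name is hypothetical,
and the sketch assumes both imap servers are remote and ignores IMAP
protocol overhead.

```python
def link_traffic_mbps(reported_rate_mbps):
    """Traffic on the imapsync host's link for a given reported rate.

    Assumption: host1 and host2 are both remote, so every message is
    received once (rx) and sent once (tx). Protocol overhead is ignored.
    """
    rx = reported_rate_mbps   # messages fetched from host1
    tx = reported_rate_mbps   # the same messages pushed to host2
    return {"rx": rx, "tx": tx, "total": rx + tx}


print(link_traffic_mbps(3))
```

A sync reported at 3 Mbps therefore loads the link with 3 Mbps in each
direction, 6 Mbps in total.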

On average, an imapsync run transfers at 3 Mbps each way, rx and tx,
so 6 Mbps in total.  So a 100 Mbps symmetrical link allows 33 imapsync
processes before the link becomes the bottleneck.

The best minute observed so far is a global rate of 187 Mbps (~23 MiB/s)
on a 100 Mbps symmetrical link, achieved by 28 imapsync processes in
parallel, 6.7 Mbps per process at that minute.  The best minute
observed for a single imapsync process is 103 Mbps (~13 MiB/s), the
rx/tx link limit.
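
The three averages quoted in the CPU, RAM and LINK sections can be put
side by side to see which resource runs out first on the reference
host. The Python sketch below only restates that arithmetic; the
constant and function names are hypothetical, and 1 GB is taken as
1000 MB.

```python
# Averages quoted in the CPU, RAM and LINK sections.
CPU_PERCENT_PER_RUN = 5     # percent of the whole 4-core i5-2300
RAM_MB_PER_RUN = 250
LINK_MBPS_PER_RUN = 3       # each way, rx and tx


def reference_host_limits(ram_gb=16, link_mbps=100):
    """Nominal number of imapsync runs each resource can sustain."""
    return {
        "cpu": 100 // CPU_PERCENT_PER_RUN,
        "ram": ram_gb * 1000 // RAM_MB_PER_RUN,   # assumes 1 GB = 1000 MB
        "link": link_mbps // LINK_MBPS_PER_RUN,
    }


limits = reference_host_limits()
bottleneck = min(limits, key=limits.get)
print(limits, "-> first bottleneck:", bottleneck)
```

With these averages the cpu is the first nominal limit (20 runs), then
the link (33), then the memory (64).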

===== DISKS =====

The hard-disk I/O can be an issue. I don't measure it yet because
imapsync doesn't perform heavy I/O where it runs. Well, I really
don't know, since no measurement means no knowledge, just guesses.

Well, let's have a go: it'll be good to know why it isn't an issue
yet, and to estimate when it will become one.

A quick check of hard-disk read performance on Linux is done with:

  hdparm -t /dev/sda
  
From a bunch of kimsufi hosts I tested:
A SATA disk gives reads at 150 MB/sec, writes at 100 MB/sec (14 MB/sec with a block size of 200 bytes)
A SSD  disk gives reads at 740 MB/sec, writes at 170 MB/sec

In bits, to easily compare disk and network bandwidths:
A SATA disk gives reads at 1200 Mbps, writes at  800 Mbps
A SSD  disk gives reads at 5920 Mbps, writes at 1360 Mbps

The main I/O imapsync does on disk during a run is writing the logfile.
It writes a line of characters for each message, around 200 bytes.
14 MB/sec allows 70k lines written per second. It will be a
problem when imapsync syncs 70k messages per second.
Even in a parallel execution, we're far from it.
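
The estimate above is a single division. As a small Python sketch
(hypothetical function name, 1 MB taken as 1,000,000 bytes):

```python
def log_lines_per_second(disk_write_mb_per_s=14, bytes_per_line=200):
    """Log lines the disk can absorb per second, one line per message."""
    return disk_write_mb_per_s * 1_000_000 // bytes_per_line


print(log_lines_per_second())   # 70000
```

Changing the two parameters gives the same estimate for another disk
or another average log line length.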

===== Resource allocation =====


Let's consider we have a bunch of different hosts able to run imapsync 
processes. How should we distribute imapsync jobs among them?

A simple rule is: do not add more load to a host when one of its
resources has reached its maximum. The resources are memory,
bandwidth, cpu and disk. I ignore the disk part for now.

maximum number of imapsync processes for a host 
= min( 4 * RAM_in_GB - 4, 10 * nb_cores, bandwidth_in_mbps / 3 )
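
A minimal Python sketch of this rule, for anyone who wants to apply it
to other hosts. The function name is hypothetical. The bandwidth term
is rounded down, and the "- 4" term is read here as keeping roughly
1 GB of memory away from imapsync, which is an assumption since the
formula does not explain it.

```python
def max_imapsync_processes(ram_gb, nb_cores, bandwidth_mbps):
    """Nominal number of imapsync processes a host can run."""
    by_ram = 4 * ram_gb - 4          # 4 per GB; "- 4" assumed to reserve ~1 GB
    by_cpu = 10 * nb_cores           # 10 processes per core
    by_link = bandwidth_mbps // 3    # 3 Mbps each way per process
    return min(by_ram, by_cpu, by_link)


print(max_imapsync_processes(2, 1, 100))    # 4, memory is the limit
print(max_imapsync_processes(16, 4, 100))   # 33, the link is the limit
```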



Here are some ovh servers with their characteristics and their prices.

https://www.ovhcloud.com/fr/vps/

vps starter   RAM  2 GB cpu 1 vCore Link 100 Mbps 42 EUR/year 
imapsync_nominal = min( 4, 10, 33 ) = 4 ; 42/4 = 10.50 EUR/year/imapsync

vps value     RAM  2 GB cpu 1 vCore Link 250 Mbps 59 EUR/year 
imapsync_nominal = min( 4, 10, 83 ) = 4 ; 59/4 = 14.75 EUR/year/imapsync

vps essential RAM  4 GB cpu 2 vCore Link 500 Mbps 128 EUR/year 
imapsync_nominal = min( 12, 20, 166 ) = 12 ; 128/12 = 10.67 EUR/year/imapsync

vps comfort   RAM  8 GB cpu 4 vCore Link 1000 Mbps 240 EUR/year 
imapsync_nominal = min( 28, 40, 333 ) = 28 ; 240/28 = 8.57 EUR/year/imapsync


https://eco.ovhcloud.com/fr/kimsufi/

I rented 2 of these, i007/ks7 and i008/ks8; they are no longer rented:
kimsufi        RAM  4 GB cpu 4 Link 100 Mbps  60 EUR/year 
imapsync_nominal = min( 12, 40, 33 ) = 12 ;  60/12 = 5.00 EUR/year/imapsync

I rented 1 of these, i005/ks5; it is no longer rented:
kimsufi        RAM 16 GB cpu 4 Link 100 Mbps 172 EUR/year 
imapsync_nominal = min( 60, 40, 33 ) = 33 ; 172/33 = 5.21 EUR/year/imapsync

I rent 1 of these, i009/ks9:
kimsufi ks-le-c RAM 32 GB cpu 8 Link 300 Mbps 264 EUR/year 
imapsync_nominal = min( 124, 80, 100 ) = 80 ; 264/80 = 3.30 EUR/year/imapsync

Even if the last one looks the best, it depends on how many imapsync
processes are running on average, since the price per year per imapsync
is a nominal value: it is the price when the host is constantly loaded
with imapsync processes. Idle time is not counted in this value.

A better minimal cost estimation takes idle time into account.
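
One possible way to do so, sketched in Python under loud assumptions:
instead of dividing the price by imapsync_nominal, count how many
identical hosts are needed for a given number of simultaneous imapsync
processes and look at the resulting yearly bill. The function name and
the two load values (10 and 80) are hypothetical; the prices and
nominal values are the kimsufi ones listed above.

```python
import math


def yearly_bill(price_eur_per_year, nominal, load):
    """Yearly cost of enough identical hosts to run `load` imapsync at once."""
    hosts = max(1, math.ceil(load / nominal))   # at least one host is rented
    return hosts * price_eur_per_year


# (name, price in EUR/year, imapsync_nominal) from the kimsufi list above
offers = [
    ("kimsufi 4 GB", 60, 12),
    ("kimsufi 16 GB", 172, 33),
    ("ks-le-c 32 GB", 264, 80),
]

for load in (10, 80):   # hypothetical numbers of simultaneous syncs
    for name, price, nominal in offers:
        bill = yearly_bill(price, nominal, load)
        print(f"load={load:3d} {name:14s} {bill:4d} EUR/year, "
              f"{bill / load:.2f} EUR/year/imapsync")
```

With these assumed loads, the 4 GB host gives the smallest bill for 10
simultaneous syncs (60 EUR/year) and the 32 GB host gives the smallest
bill for 80 (264 EUR/year against 420 for seven 4 GB hosts).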


https://www.ovhcloud.com/fr/public-cloud/prices/#388

Sandbox S1-2  RAM  2 GB cpu 1 Link 100 Mbps 77 EUR/year
imapsync_nominal = min( 4, 10, 33 ) = 4 ;  77/4  =  19.25  EUR/year/imapsync

Discovery     RAM  2 GB cpu 1 Link 100 Mbps 87 EUR/year 
imapsync_nominal = min( 4, 10, 33 ) = 4 ;  87/4  =  21.75  EUR/year/imapsync

Discovery     RAM  4 GB cpu 2 Link 250 Mbps 174 EUR/year 
imapsync_nominal = min( 12, 20, 83 ) = 12 ;  174/12  =  14.5  EUR/year/imapsync

Discovery     RAM  8 GB cpu 4 Link 500 Mbps 313 EUR/year 
imapsync_nominal = min( 28, 40, 166 ) = 28 ;  313/28  =  11.18  EUR/year/imapsync

