This is Part 2 of a series; if you have not read Part 1, you can do so here:
How was Veeam configured?
Lastly, I needed a configuration for the backup copy jobs to exercise the file systems as thoroughly as possible. To that end, I created backup copy jobs targeting the same 8 servers, keeping 7 incremental restore points and 8 weekly backup copies configured via GFS. The final jobs looked something like this:
I want to see results!
Okay, following on from my previous post, here is what the early results showed. I wanted to see how the initial ingestion of data looked in order to compare the different storage techniques. With that in mind, before any post-processing was applied, I looked at the file systems and compared the sizes of the per-VM .VBK files (these are full backups in Veeam).
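As a quick way to make that comparison, a small script can total the .VBK sizes under each per-job folder in a repository. This is just a sketch: Veeam itself reports these sizes in the job statistics, and the one-folder-per-job layout is an assumption about how the repository is organised.

```python
from collections import defaultdict
from pathlib import Path


def vbk_sizes_by_folder(repo_root):
    """Total bytes of .vbk (full backup) files, grouped by top-level sub-folder.

    Assumes one sub-folder per backup job, which is Veeam's usual repo layout.
    """
    root = Path(repo_root)
    totals = defaultdict(int)
    for vbk in root.rglob("*.vbk"):
        totals[vbk.relative_to(root).parts[0]] += vbk.stat().st_size
    return dict(totals)
```

Incrementals (.vib) are deliberately excluded, since the comparison here is between full backups only.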
Interestingly, drives formatted as ReFS with a 64K cluster size carry additional file system overhead compared to 4K. Feel free to test this yourself: create two new large, identically sized, thin-provisioned disks on a VM running Windows 10 1703 (or above) or Server 2016, format one as ReFS with a 4K block size and the other with 64K, and compare the used and free space.
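That overhead can be quantified as the fraction of a freshly formatted, empty volume that is already consumed. A minimal sketch (the drive letters in the commented section are hypothetical placeholders for the two test volumes):

```python
def format_overhead_pct(total_bytes, free_bytes):
    """Percent of an empty, freshly formatted volume already used by the file system."""
    return 100.0 * (total_bytes - free_bytes) / total_bytes


# On a live Windows system, you could compare the two test volumes like this
# (hypothetical drive letters, e.g. R: = ReFS 4K, S: = ReFS 64K):
#
#   import shutil
#   for drive in ("R:\\", "S:\\"):
#       u = shutil.disk_usage(drive)
#       print(drive, f"{format_overhead_pct(u.total, u.free):.2f}% overhead")
```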
I will have to find out why this is; nothing turned up after a quick Google, so I assume it is something quite low down in the ReFS file system. I have publicly shared the graphs below, and all the other graphs used, via Power BI, and I would encourage you to expand the content below and have a closer look.
These are interactive graphs; use the double-sided arrow in the bottom-right corner to view them full screen.
So, what can we learn from the above?
Right away we can confirm that Veeam's default job settings give a varying amount of data reduction before the data even lands on the repository. As expected, the DB server, with its structured data, achieves the best reduction in space, while servers holding binary data see the least savings. The two graphs below compare the 4K and 64K block sizes:
How well did they dedupe?
Similarly, and as expected, raw uncompressed data overwhelmingly achieved the best levels of deduplication. I have included the ReFS repository data for comparison; obviously, no post-process operation ran on the ReFS repositories. The chart below shows the results of enabling deduplication on the NTFS repositories, comparing the capacity of each repository before and after deduplication had completed:
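The before/after numbers in that chart reduce to a ratio and a percentage saved; a trivial helper for reading any pair of them consistently:

```python
def dedupe_stats(before_bytes, after_bytes):
    """Return (reduction ratio, % capacity saved) for a post-process dedupe run."""
    ratio = before_bytes / after_bytes
    saved_pct = 100.0 * (1.0 - after_bytes / before_bytes)
    return ratio, saved_pct
```

For example, a repository that shrinks from 2 TB to 100 GB comes out as a 20:1 ratio, i.e. 95% of the capacity saved.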
Initial observations look good…
Over the course of the next 8 weeks the copy jobs did their thing, and during the tests I made an unexpected discovery about how dedupe-friendly behaves compared with optimal and uncompressed.
The main backups are stored using optimal compression. When a copy job targeting the optimal or uncompressed repositories runs, the Veeam data mover service sends the data in its deduplicated, compressed form over the network to the mount server for the target repository, where the blocks are written.
For dedupe-friendly, however, my observations suggest that when the optimally compressed source blocks are read from the backup, the data mover service inflates the data, reprocesses it with dedupe-friendly compression, and sends the resulting (relatively inflated) data over the network. You can observe this behaviour in the slides below:
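As a rough analogy only (zlib compression levels standing in for Veeam's actual codecs, which these are not), re-inflating a heavily compressed stream and recompressing it with a lighter setting yields a larger stream to ship over the wire:

```python
import zlib

# Stand-in payload; real backup blocks would be VM disk data.
block = b"guest file system block " * 4096

# Stand-in for "optimal" compression: heaviest zlib level.
optimal = zlib.compress(block, 9)

# Dedupe-friendly path: inflate the source block, then recompress it
# with a lighter setting before sending it to the target repository.
dedupe_friendly = zlib.compress(zlib.decompress(optimal), 1)

# The lighter stream is at least as large as the optimal one; that
# difference is the extra network traffic observed on the DF copy jobs.
```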
If you are unsure what a job name means, please refer to the jobs screenshot above. In short, jobs default to optimal compression; DF = dedupe-friendly, UC = uncompressed.
Images are best viewed full screen:
I will ask some of my Veeam friends to confirm whether this behaviour is expected; I imagine it exists to increase retention capabilities on deduplicating appliances such as Dell EMC Data Domain and HPE StoreOnce. To be fair, I am not sure whether the data mover service on an ExaGrid decompresses the data before committing it to disk either; if it does, it may be better to send optimal blocks over the network to increase throughput where bandwidth is a constraint, especially over a WAN link!
What was the usage on repositories over the course of the 8 weeks of testing?
The server that hosted the LUNs for the repositories was monitored by an agent that logged drive usage every couple of minutes to our monitoring platform. That would have created a silly number of data points, so in the end I settled on 478 data points per repository for file system usage over the course of the 8 weeks. This provided capacity reporting roughly every 2 hours 51 minutes, and I was able to export it to a CSV and analyse the data.
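As a sanity check on that cadence (treating the samples as evenly spaced, which the real logging only approximately was, so this lands within a couple of minutes of the quoted figure):

```python
# 478 capacity samples kept per repository over the 8-week window.
minutes_in_8_weeks = 8 * 7 * 24 * 60         # 80,640 minutes
samples = 478
interval_min = minutes_in_8_weeks / (samples - 1)  # 477 gaps between 478 samples
hours, minutes = divmod(interval_min, 60)
print(f"~{hours:.0f} h {minutes:.0f} min between samples")
```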
Below you can see how each respective compression setting per file system compared for 4K and 64K block sizes:
I think this is the first time we start to see the true nature of ReFS vs NTFS: the ReFS gradient is smooth and predictable, while the graphs for NTFS look choppy and come in waves. Additionally, 4K and 64K blocks appear to produce very similar results:
What if you only stored full backups and ignored the dailies?
Because the repositories contained both ReFS and NTFS file systems, to make a fair comparison I had to chop off the first and last weeks and use the middle 6 weeks for the next graph; I did not want to report potentially skewed results. Once I had all the other reporting data I needed, I removed the first and last weeks from the repositories and ran scrubbing and garbage collection on all NTFS volumes; the same backup copies were removed from the ReFS volumes.
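Trimming the exported data down to the middle six weeks is straightforward once the CSV is loaded; a sketch assuming an ISO-8601 timestamp column (the actual column name in the export is an assumption here):

```python
from datetime import datetime, timedelta


def middle_six_weeks(rows, run_start, ts_key="timestamp"):
    """Drop week 1 and week 8 of an 8-week run, keeping weeks 2-7 only."""
    lo = run_start + timedelta(weeks=1)   # end of the first week
    hi = run_start + timedelta(weeks=7)   # start of the last week
    return [r for r in rows if lo <= datetime.fromisoformat(r[ts_key]) < hi]
```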
The following graph is the middle 6 weeks of the backup test:
This is the only graph covering 6 weeks of data; all others report on 8 weeks.
There is a lot of information in this graph. The capacity savings of processed data in the NTFS uncompressed repositories are impossible to ignore, but neither is the additional space required to ingest the data in the first place. If a long-term retention repository is your goal, then within the constraints of NTFS deduplication (1 TB per file officially supported, though I have seen 4 TB restored without issue in testing), uncompressed offers huge gains in data reduction, 20:1 in this case, for free, with Windows.
10:1 can be achieved using dedupe-friendly, albeit at the cost of additional network bandwidth. Almost 4:1 can be attained using optimal compression, which works nicely within the officially supported 1 TB file size limit, depending on data type. With ReFS and optimal compression, we achieved an approximate 2.5:1 ratio with the data in this test; obviously, your real-world mileage may vary. On some deployments I have seen it as high as 3.5:1.
What are your thoughts on 4K block sizes?
4K blocks offer no benefits; if anything, they will be a hindrance long term, if for nothing else than increased volume fragmentation.
Part 3 is available here: