The current GemBox.Document class is not thread-safe. If you run mail merge in multiple threads, it will probably fail with an error.
Since merging huge documents can be time-consuming, and since most processors nowadays have two or more cores, it would be nice if the merge could use the full power of the CPU. That means:
- split the merge into several threads, each working with its own document model
- run them in parallel
- wait until all threads are complete
- join all document models into the final one
However, this is not possible now, at least not when you have a custom Document.MailMerge.FieldMerging event handler.
Official response
Hi,
Preliminary analysis based on the proposals listed here showed that the performance gain would be minimal or none. Based on that, we have decided to close this request.
Regards,
Stipo
Mail merge in large documents can be optimized so that only a small region of the document (instead of the entire document) is duplicated and filled with data for each record in the data source. For an example, visit the following link:
https://www.gemboxsoftware.com/document/examples/mail-merge-ranges/903
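To illustrate the idea of duplicating only a small region, here is a minimal sketch in Python. It does not use the GemBox.Document API: the document is a plain string, and the `[[range:start]]` / `[[range:end]]` markers and the `merge_range` function are hypothetical names chosen for this example only.

```python
# Hypothetical markers delimiting the repeatable region; this is not GemBox syntax.
START, END = "[[range:start]]", "[[range:end]]"


def merge_range(document, records):
    """Repeat only the text between START and END, once per record.

    The text before and after the region is copied a single time,
    so the cost per record depends on the region size, not the document size.
    """
    head, rest = document.split(START, 1)
    region, tail = rest.split(END, 1)
    body = "".join(region.format(**record) for record in records)
    return head + body + tail


doc = "Invoice\n[[range:start]]- {item}: {price}\n[[range:end]]Thank you\n"
rows = [{"item": "Pen", "price": 2}, {"item": "Book", "price": 10}]
print(merge_range(doc, rows))
```

Only the line between the markers is repeated for each record, while "Invoice" and "Thank you" appear once in the output.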
Running mail merge in parallel would require the following steps:
1. Partitioning of the original data source for each thread.
2. Loading or cloning the original template document in each thread.
3. Executing mail merge on the clone of the original template document with the partition of the original data source in each thread.
4. Waiting for all threads to finish and merging the resulting documents of each thread into one resulting document.
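The four steps above can be sketched roughly as follows. This is an illustration of the structure only, written in Python with a plain string standing in for the template document; it does not use the GemBox.Document API, and `TEMPLATE`, `partition`, `merge_partition`, and `parallel_mail_merge` are hypothetical names. Note that in CPython, threads will not speed up CPU-bound work like this, so the sketch shows the bookkeeping rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a template document: a string with placeholders.
TEMPLATE = "Dear {name}, your order {order} has shipped.\n"


def partition(records, parts):
    """Step 1: split the data source into at most `parts` contiguous chunks."""
    size = max(1, -(-len(records) // parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def merge_partition(template, chunk):
    """Steps 2 and 3: take a private copy of the template and fill it per record."""
    # Strings are immutable, so this copy is trivial; a real document
    # model would need a full load or deep clone here.
    local_template = str(template)
    return "".join(local_template.format(**record) for record in chunk)


def parallel_mail_merge(template, records, workers=4):
    chunks = partition(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so record order is preserved.
        partial_docs = list(pool.map(lambda c: merge_partition(template, c), chunks))
    # Step 4: all workers have finished; join the partial results into one document.
    return "".join(partial_docs)


def sequential_mail_merge(template, records):
    """Baseline: one pass over all records, with no partitioning or joining."""
    return "".join(template.format(**record) for record in records)
```

Both functions produce the same output for the same input; the parallel version simply adds the partitioning, copying, and joining work around the actual merge.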
Unlike regular mail merge, concurrent mail merge requires many extra operations to enable parallelism without blocking: partitioning the data source, cloning the template document, and merging the resulting documents. These extra operations make concurrent mail merge less efficient than regular mail merge.
The conclusion is that mail merge in GemBox.Document is naturally a sequential operation, because the cost of the housekeeping operations required to achieve parallelism outweighs the gain from the actual parallel execution.
Regards,
Mario