Join / Combine multiple documents into a single file

To join multiple documents into a single document, we need to add all the content from the source documents to a destination document.
An element instance cannot be inserted directly from one document into another; each element must first be imported into the destination document and then inserted at the desired place.

So, to join documents, we can loop through the sections of each source document, import them into the destination document, and add them at the end.

Here is an extension method which will achieve this task:

C# code

public static class GemBoxDocumentHelper
{
    public static DocumentModel JoinWith(this DocumentModel destinationDocument, string filePath)
    {
        DocumentModel sourceDocument = DocumentModel.Load(filePath);
        foreach (Section sourceSection in sourceDocument.Sections)
        {
            Section destinationSection = destinationDocument.Import(sourceSection, true, false);
            destinationDocument.Sections.Add(destinationSection);
        }
        return destinationDocument;
    }
}

VB.NET code

Module GemBoxDocumentHelper
    <System.Runtime.CompilerServices.Extension>
    Public Function JoinWith(destinationDocument As DocumentModel, filePath As String) As DocumentModel
        Dim sourceDocument As DocumentModel = DocumentModel.Load(filePath)
        For Each sourceSection As Section In sourceDocument.Sections
            Dim destinationSection As Section = destinationDocument.Import(sourceSection, True, False)
            destinationDocument.Sections.Add(destinationSection)
        Next
        Return destinationDocument
    End Function
End Module

And here is how we can use it:

C# code

string filePath1 = "In1.docx";
string filePath2 = "In2.docx";
string filePath3 = "In3.docx";
string filePath4 = "Out.docx";

DocumentModel.Load(filePath1)
             .JoinWith(filePath2)
             .JoinWith(filePath3)
             .Save(filePath4);

VB.NET code

Dim filePath1 As String = "In1.docx"
Dim filePath2 As String = "In2.docx"
Dim filePath3 As String = "In3.docx"
Dim filePath4 As String = "Out.docx"

DocumentModel.Load(filePath1) _
             .JoinWith(filePath2) _
             .JoinWith(filePath3) _
             .Save(filePath4)
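
When the number of input files is not known in advance, the same import-and-add loop can be driven by a list of paths. The following is only a minimal sketch: `GemBoxDocumentBatchHelper` and `JoinAll` are hypothetical names introduced here for illustration, not part of GemBox.Document, and the code uses only the calls already shown above (`DocumentModel.Load`, `Import`, `Sections.Add`, `Save`).

```csharp
using System;
using System.Collections.Generic;
using GemBox.Document;

public static class GemBoxDocumentBatchHelper
{
    // Hypothetical helper (not a GemBox.Document API): joins every file in
    // 'filePaths', in order, and saves the result to 'outputPath'.
    // Assumes the GemBox.Document license key has already been set by the caller.
    public static void JoinAll(IReadOnlyList<string> filePaths, string outputPath)
    {
        if (filePaths == null || filePaths.Count == 0)
            throw new ArgumentException("At least one input file is required.", nameof(filePaths));

        // The first file becomes the destination document.
        DocumentModel destinationDocument = DocumentModel.Load(filePaths[0]);

        // Every following file is loaded and its sections are imported and appended.
        for (int i = 1; i < filePaths.Count; i++)
        {
            DocumentModel sourceDocument = DocumentModel.Load(filePaths[i]);
            foreach (Section sourceSection in sourceDocument.Sections)
            {
                Section destinationSection = destinationDocument.Import(sourceSection, true, false);
                destinationDocument.Sections.Add(destinationSection);
            }
        }

        destinationDocument.Save(outputPath);
    }
}
```

It could then be called as `GemBoxDocumentBatchHelper.JoinAll(new[] { "In1.docx", "In2.docx", "In3.docx" }, "Out.docx");`.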

Comments(11)

Chris Crowley
Does this work with PDFs too?
Mario - GemBox
Hi,

Yes, you can load documents in any supported input file format and save them in any supported output file format.
For a list of supported input and output file formats, please refer to the following help page:
https://www.gemboxsoftware.com/document/help/html/Supported_File_Formats.htm#FileFormatSupport

However, I must point out that GemBox.Document's PDF reader is currently in beta and has limitations. For more information, please refer to the following help page:
https://www.gemboxsoftware.com/document/help/html/Supported_File_Formats.htm#PdfReaderSupportLevel
The current implementation of PDF reader does not provide high fidelity.

Instead, to merge multiple PDF files, I recommend using our other component, GemBox.Pdf.
See the following example:
https://www.gemboxsoftware.com/pdf/examples/c-sharp-vb-net-merge-pdf/201

Regards,

Mario
GemBox d.o.o.
Gary Rynearson
Two questions:
1. Can my source documents be .DOCX and my destination be a PDF?
2. If the answer to question 1 is yes, will the process scale to my requirement of aggregating a large number of .DOCX files (each containing about 300 pages) into a PDF that will eventually contain up to 40,000 pages?
Mario - GemBox
Hi,

1. Yes, as mentioned in the comment above, you can load files in any supported input format and save them to any supported output format.

2. In general, GemBox.Document should be able to meet that requirement, but I cannot say for sure:
https://support.gemboxsoftware.com/kb/articles/how-many-paragraphs-does-the-gembox-document-support

This will depend on the document's content and on the machine that executes the code.
Large documents have correspondingly large memory requirements, because the entire document is represented by an in-memory object, a rich content model called DocumentModel.

Also, converting a document to PDF or XPS requires more time and memory than saving it in DOCX format. This is because the rendering engine needs to paginate and render (calculate) all the objects in GemBox.Document's DocumentModel instance.

So how much data GemBox.Document can handle depends on the document's content and on the machine that executes the code. It also depends on whether the application is 32-bit or 64-bit.

Regards,

Mario
GemBox d.o.o.
Hoang Nguyen
is there a way to join documents without a section break/page break? I'd like these documents to be merged but continue to flow. e.g. if my first document has one paragraph and my 2nd doc has one paragraph, I'd like my merged doc to contain both paragraphs on the same page.
Mario - GemBox
Hi,

Yes, you could, for example, import all of the document's content block by block, like this:

public static DocumentModel JoinWith(this DocumentModel destinationDocument, string filePath)
{
    // Append the imported blocks to the last section of the destination document.
    Section destinationSection = destinationDocument.Sections[destinationDocument.Sections.Count - 1];

    DocumentModel sourceDocument = DocumentModel.Load(filePath);
    foreach (Section sourceSection in sourceDocument.Sections)
    {
        foreach (Block sourceBlock in sourceSection.Blocks)
        {
            Block destinationBlock = destinationDocument.Import(sourceBlock, true, false);
            destinationSection.Blocks.Add(destinationBlock);
        }
    }

    return destinationDocument;
}

Or you could just change the imported Section element's SectionStart property, like this:

public static DocumentModel JoinWith(this DocumentModel destinationDocument, string filePath)
{
    DocumentModel sourceDocument = DocumentModel.Load(filePath);

    foreach (Section sourceSection in sourceDocument.Sections)
    {
        Section destinationSection = destinationDocument.Import(sourceSection, true, false);
        // Let the imported section continue on the same page instead of starting a new one.
        destinationSection.PageSetup.SectionStart = SectionStart.Continuous;
        destinationDocument.Sections.Add(destinationSection);
    }
    return destinationDocument;
}

You could also consider doing this only for the first Section element in the source document and leave the rest of the Section elements as they were (keep their original SectionBreak).
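
That last variant could look roughly like the following. This is only a sketch under the same assumptions as the snippets above: `GemBoxDocumentFlowHelper` and `JoinWithContinuousStart` are hypothetical names chosen for illustration, and only the first imported Section has its SectionStart changed.

```csharp
using GemBox.Document;

public static class GemBoxDocumentFlowHelper
{
    // Hypothetical variant of JoinWith: only the first imported section is made
    // continuous, so the joined content flows on from the destination's last page,
    // while the remaining sections keep their original section start.
    public static DocumentModel JoinWithContinuousStart(this DocumentModel destinationDocument, string filePath)
    {
        DocumentModel sourceDocument = DocumentModel.Load(filePath);

        bool isFirstSection = true;
        foreach (Section sourceSection in sourceDocument.Sections)
        {
            Section destinationSection = destinationDocument.Import(sourceSection, true, false);

            if (isFirstSection)
            {
                destinationSection.PageSetup.SectionStart = SectionStart.Continuous;
                isFirstSection = false;
            }

            destinationDocument.Sections.Add(destinationSection);
        }

        return destinationDocument;
    }
}
```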

Regards,

Mario
GemBox d.o.o.
Hoang Nguyen
you guys are awesome. It worked perfectly. I really like the second method
David
Hi,
Can we merge multiple pdfs into one without page breaking between them?
Thanks.
Stipo - GemBox
Hi,

Yes, you can merge multiple PDFs into one without page breaking between them.
Use the code from the above comment that shows how to join documents without a section break/page break.

However, GemBox.Document's PDF reader is currently in beta and has limitations. For more information, please refer to the following help page:
https://www.gemboxsoftware.com/document/help/html/Supported_File_Formats.htm#PdfReaderSupportLevel
The current implementation of PDF reader does not provide high fidelity.

Therefore, for merging PDF files we recommend that you use the GemBox.Pdf component.
Here is an example: https://www.gemboxsoftware.com/pdf/examples/c-sharp-vb-net-merge-pdf/201

Regards,

Stipo
GemBox d.o.o.
David
Hi Stipo,

I tested the code above and I got the following exception:
Cannot handle iref streams. The current implementation of PDF component cannot handle this PDF feature introduced with Acrobat 6.

Thanks for the help.
Stipo - GemBox
Hi,

Currently GemBox.Document's PDF reader cannot handle PDF files with iref streams.

Therefore, for merging PDF files we recommend that you use the GemBox.Pdf component, which can handle PDF files with iref streams.
Here is an example: https://www.gemboxsoftware.com/pdf/examples/c-sharp-vb-net-merge-pdf/201

Regards,

Stipo
GemBox d.o.o.
