I want to split an RTF file (with C# or VB.Net) into 2 or more parts at the string [BreakPage]. For example, I have this file, containing one [BreakPage], which needs to be split into 2 parts:
{\rtf1\ansi\ansicpg1251\uc1\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1049\deflangfe1049{\fonttbl{\f0\froman\fcharset204\fprq2{*\panose
02020603050405020304}Times New Roman;}{\f38\froman\fcharset0\fprq2
Times New Roman;} {\f36\froman\fcharset238\fprq2 Times New Roman
CE;}{\f39\froman\fcharset161\fprq2 Times New Roman
Greek;}{\f40\froman\fcharset162\fprq2 Times New Roman
Tur;}{\f41\froman\fcharset177\fprq2 Times New Roman (Hebrew);}
{\f42\froman\fcharset178\fprq2 Times New Roman
(Arabic);}{\f43\froman\fcharset186\fprq2 Times New Roman
Baltic;}{\f44\froman\fcharset163\fprq2 Times New Roman
(Vietnamese);}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;
\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\ql
\li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0
\fs24\lang1049\langfe1049\cgrid\langnp1049\langfenp1049 \snext0
Normal;}{*\cs10 \additive \ssemihidden Default Paragraph
Font;}{*\ts11\tsrowd\trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\trcbpat1\trcfpat1\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv
\ql
\li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0
\fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \snext11
\ssemihidden Normal
Table;}}{*\latentstyles\lsdstimax156\lsdlockeddef0}{*\rsidtbl
\rsid2111663\rsid7154806 \rsid15558346}{*\generator Microsoft Word
11.0.5604;}{\info{\author Programmer}{\operator
Programmer}{\creatim\yr2011\mo8\dy2\hr12\min45}{\revtim\yr2011\mo8\dy5\hr12\min34}{\version3}{\edmins1}{\nofpages1}{\nofwords5}{\nofchars34}{\nofcharsws38}
{\vern24689}}\margl1701\margr850\margt1134\margb1134
\widowctrl\ftnbj\aenddoc\noxlattoyen\expshrtn\noultrlspc\dntblnsbdb\nospaceforul\hyphcaps0\horzdoc\dghspace120\dgvspace120\dghorigin1701\dgvorigin1984\dghshow0\dgvshow3
\jcompress\viewkind1\viewscale100\nolnhtadjtbl\rsidroot15558346
\fet0\sectd \linex0\sectdefaultcl\sftnbj
{*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta
.}}{*\pnseclvl2\pnucltr\pnstart1\pnindent720\pnhang {\pntxta
.}}{*\pnseclvl3 \pndec\pnstart1\pnindent720\pnhang {\pntxta
.}}{*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta
)}}{*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta
)}}{*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb
(}{\pntxta )}} {*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang
{\pntxtb (}{\pntxta
)}}{*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb
(}{\pntxta )}}{*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang
{\pntxtb (}{\pntxta )}}\pard\plain \ql
\li0\ri0\nowidctlpar\faauto\rin0\lin0\itap0
\fs24\lang1049\langfe1049\cgrid\langnp1049\langfenp1049
{\b\insrsid7154806\charrsid7154806 Line 1 \par }{\insrsid7154806 \par
}{\i\insrsid7154806\charrsid7154806
Line3}{\lang1048\langfe1049\langnp1048\insrsid7154806 \par
}{\lang1048\langfe1049\langnp1048\insrsid2111663 [BreakPage] \par
}{\insrsid7154806 Line4 \par \par Line5 \par }}
Can anyone help me?
Thanks!
The problem is that RTF keeps some (but not necessarily all) of its formatting information in a global header. To split the RTF text so that each result is again valid RTF with the formatting applied, you essentially need to know where the header information is and replicate it across all the splits.
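To illustrate why a plain string split is not enough, here is a minimal sketch. The shortened RTF string and the helper name BraceBalance are made up for this illustration; they are not taken from the file in the question. It counts the net group depth of each fragment after a naive split on the marker.

```csharp
using System;

static class NaiveSplitDemo
{
    // Hypothetical helper: net group depth of an RTF fragment.
    // +1 for each '{', -1 for each '}'. The character following a
    // backslash is skipped so that \{ and \} are not counted as groups.
    public static int BraceBalance(string fragment)
    {
        int depth = 0;
        for (int i = 0; i < fragment.Length; i++)
        {
            char c = fragment[i];
            if (c == '\\') { i++; continue; }
            if (c == '{') depth++;
            else if (c == '}') depth--;
        }
        return depth;
    }

    static void Main()
    {
        // Shortened stand-in for a real RTF document (assumption).
        string rtf = @"{\rtf1\ansi{\fonttbl{\f0 Times New Roman;}}{\b Line 1\par}{[BreakPage]\par}{Line 4\par}}";

        string[] parts = rtf.Split(new[] { "[BreakPage]" }, StringSplitOptions.None);

        Console.WriteLine(BraceBalance(parts[0])); // 2: two groups are left open
        Console.WriteLine(BraceBalance(parts[1])); // -2: closes groups it never opened
    }
}
```

Neither fragment is balanced, and the second one has no \rtf1 header or font table at all, so each part has to be given the header and properly closed groups before it is valid RTF again.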
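There are two ways of doing this: (1) parse the RTF yourself, or (2) let an existing component that understands RTF do the work.
Option (1) is doable, but will take time. Luckily, RTF parsers already exist, for example this one on CodeProject.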
Alternatively, you can also load the RTF text into a RichTextBox, then search for the split text "[BreakPage]" inside the RichTextBox, programmatically select the first and second part, and retrieve the RTF text of each using the SelectedRtf property.
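The RichTextBox approach could look roughly like the sketch below. The class name RtfSplitter and the method Split are invented for this example. RichTextBox, Find, Select, TextLength, Rtf and SelectedRtf are the regular System.Windows.Forms members. It assumes the code runs in a process that can create Windows Forms controls (the control is never shown), and it is meant as a starting point rather than a finished implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

static class RtfSplitter
{
    // Hypothetical helper: splits RTF markup at every occurrence of
    // `marker` in the visible text and returns each part as RTF.
    public static List<string> Split(string rtf, string marker)
    {
        var parts = new List<string>();

        using (var box = new RichTextBox())
        {
            box.Rtf = rtf;   // let the control parse the document
            int start = 0;

            while (true)
            {
                // Look for the next marker at or after `start`.
                int hit = start < box.TextLength
                    ? box.Find(marker, start, RichTextBoxFinds.MatchCase)
                    : -1;

                // The current part ends at the marker, or at the end of the text.
                int end = hit >= 0 ? hit : box.TextLength;

                box.Select(start, end - start);
                parts.Add(box.SelectedRtf);   // selection as RTF, header included

                if (hit < 0) break;
                start = hit + marker.Length;  // continue after the marker
            }
        }

        return parts;
    }
}
```

A call such as RtfSplitter.Split(File.ReadAllText("input.rtf"), "[BreakPage]") (the file name is a placeholder) would return one RTF string per part. Note that the paragraph break following the marker stays at the start of the next part, so you may want to skip it as well.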