myPREP

myPREP is a text aligner: a tool that automatically aligns, pair by pair, the documents of a multilingual corpus. The alignment is done at the sentence level, and its outcome is a translation memory in TMX format.
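
To illustrate what a sentence-level translation memory in TMX looks like to a consumer, here is a minimal sketch that reads source/target sentence pairs from a TMX document using only the Python standard library. The sample content, the language codes, and the function name read_pairs are assumptions made for illustration; they are not taken from myPREP or its output.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample, not actual myPREP output.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The meeting is adjourned.</seg></tuv>
      <tuv xml:lang="fr"><seg>La séance est levée.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) sentence pairs found in a TMX string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested in inline markup.
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext())
        # Keep only translation units that contain both languages.
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs
```

Calling read_pairs(SAMPLE_TMX, "en", "fr") yields one tuple per translation unit that contains both languages.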

myPREP can also produce training corpora for statistical machine translation (in Moses format). Such corpora are divided into several parts for training, tuning, and evaluation.
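
As a rough sketch of such a division, the following Python code splits a list of aligned sentence pairs into training, tuning, and evaluation parts and writes each part as two parallel plain-text files, one sentence per line, which is the layout Moses expects. The function names, the partition sizes, the random shuffling, and the file naming are all assumptions for illustration; how myPREP itself chooses the partitions is not described here.

```python
import random
from pathlib import Path


def split_pairs(pairs, tune_size, eval_size, seed=0):
    """Split sentence pairs into train/tune/eval parts (hypothetical scheme)."""
    if tune_size + eval_size >= len(pairs):
        raise ValueError("not enough sentence pairs left for training")
    shuffled = list(pairs)
    # A seeded shuffle keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    return {
        "tune": shuffled[:tune_size],
        "eval": shuffled[tune_size:tune_size + eval_size],
        "train": shuffled[tune_size + eval_size:],
    }


def write_parallel(pairs, out_dir, name, src_lang, tgt_lang):
    """Write pairs as two line-aligned UTF-8 files: name.src and name.tgt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    src_path = out / f"{name}.{src_lang}"
    tgt_path = out / f"{name}.{tgt_lang}"
    with open(src_path, "w", encoding="utf-8", newline="\n") as src_file, \
         open(tgt_path, "w", encoding="utf-8", newline="\n") as tgt_file:
        for source, target in pairs:
            # Line i of one file must translate line i of the other.
            src_file.write(source.strip() + "\n")
            tgt_file.write(target.strip() + "\n")
    return src_path, tgt_path
```

For example, iterating over split_pairs(pairs, 1000, 1000).items() and calling write_parallel(part, "corpus", name, "en", "fr") for each entry would produce train.en/train.fr, tune.en/tune.fr, and eval.en/eval.fr.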

myPREP can also align comparable corpora. The outcome is a set of sentence pairs, each associated with a score, the number of aligned terms, and the lengths of the two sentences. These values can be used to control the alignments.
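
One way such control could work is to filter the sentence pairs on those values. The sketch below is only an illustration: the ScoredPair record, its field names, the thresholds, and the assumption that a higher score means a better alignment are all hypothetical and do not describe myPREP's actual output format or options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredPair:
    """Hypothetical record for one aligned sentence pair."""
    source: str
    target: str
    score: float        # assumed: higher means a better alignment
    aligned_terms: int  # number of terms aligned between the sentences
    src_len: int        # sentence lengths, here assumed to be in words
    tgt_len: int


def keep(pair, min_score=0.5, min_terms=3, max_len_ratio=2.0):
    """Decide whether a pair passes all three (illustrative) thresholds."""
    shorter = min(pair.src_len, pair.tgt_len)
    longer = max(pair.src_len, pair.tgt_len)
    if shorter == 0:
        return False  # an empty sentence cannot be a valid alignment
    return (
        pair.score >= min_score
        and pair.aligned_terms >= min_terms
        and longer / shorter <= max_len_ratio
    )


def filter_pairs(pairs, **thresholds):
    """Return only the pairs accepted by keep()."""
    return [pair for pair in pairs if keep(pair, **thresholds)]
```

Raising min_score or min_terms trades recall for precision; the length-ratio check discards pairs whose sentences differ too much in size to be plausible translations.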

myPREP requires a corpus of segmented documents in UTF-8 format. The converter and the segmentation tool of the myCAT software are included in the myPREP installation.

myPREP is available for both Windows (tested on Windows 7 and Windows Server 2008) and GNU/Linux (tested on Ubuntu 12.04 LTS); please find the links to the respective installation files below. The sources are the same for both versions.

Software owned by Olanto is distributed under the GNU Affero General Public License version 3 (AGPLv3).
