Pages in topic:   < [1 2]
Free TM in 24 languages! The DGT TM
Thread poster: Philippe Locquet
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 05:02
English to French
+ ...
TOPIC STARTER
A few viable options Jun 30, 2017

Niann-Tsyr wrote:

So I followed the steps and completed the extraction to a tmx file with the language pair I wanted. I imported it into MemoQ via Resource Console TM tab. It took a long time (didn't time that). Then it was done apparently (window closed on finish, should have unchecked the box, since I didn't see what happened), but when I looked at the TM info zero entries -__- The tmx are supposed to be aligned, right? Or do I need to do that still? Also, are there better programs to use just only for termbases and translation memories? (Free or paid license)


Hello

You may have run into one of several problems.
If the TM did not extract properly, the TMX file could be unreadable in MemoQ. Or it could be too big, as has been suggested.
One way to find out is to download one of the already extracted TMs I have put below the video.
Create a project in MemoQ with the languages of the TM you downloaded, then load the TM in MemoQ and make sure to connect it to the file you wish to translate (the file has to be in the same source language as the selected TM).
The single TMs below the video are all under 60 000 entries, so they shouldn't be too big for your RAM to handle.

If it works, your own TM was probably not extracted properly; in that case, come back here so we can all figure out what went wrong.
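
For anyone who would rather check the TMX itself before importing it, here is a minimal Python sketch that counts the translation units in a file and lists the language codes it contains. It assumes the file is well-formed TMX with `<tu>` and `<tuv>` elements; the function name `summarize_tmx` is made up for this example. It says nothing about why MemoQ rejects a file, it only shows whether the extraction wrote any units at all.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Attribute name ElementTree uses for xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def summarize_tmx(tmx_path):
    """Return (number of <tu> elements, Counter of <tuv> language codes)."""
    units = 0
    languages = Counter()
    # iterparse streams the file, so a large TMX is not loaded in one piece.
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag == "tuv":
            # TMX 1.4 uses xml:lang; older files may use a plain "lang" attribute.
            code = elem.get(XML_LANG) or elem.get("lang") or "unknown"
            languages[code.upper()] += 1
        elif elem.tag == "tu":
            units += 1
            elem.clear()  # drop the segment text we no longer need
    return units, languages
```

Calling `summarize_tmx("my_extract.tmx")` (a placeholder file name) returns the unit count and the per-language counts. A count of zero, or a parse error, would point to the extraction step rather than to the CAT tool.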

As for other CAT tools, there's one that hasn't been suggested yet: Wordfast Anywhere. It's free and web-based; you can access it at https://www.freetm.com/
The CAT tool runs on Wordfast servers, so there's no issue with processes overloading your computer or your RAM, since the servers do the work. You can upload a TM of up to 500 000 TUs, lol.

I have a series showing tips for WFA on my YouTube channel; you can check it out here if you wish: https://www.youtube.com/watch?v=1-pydYh4Z9Q&list=PLcHsCymYkwXqUWwNXXsshS31u6--A-DkT

Hope this helps,

My bests


 
Philippe Noth
Philippe Noth  Identity Verified
Switzerland
Local time: 06:02
Member (2015)
German to French
+ ...
TMs for en-fr and de-fr (and how to make others) Jul 21, 2018

Hi everyone,

First of all, thanks to Fi2 n Co for the information and his informative video.

I am reviving this thread because I spent the afternoon working with these files from the EU's DGT
(https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory) and my findings and/or the resulting files may be of interest to you.

My goal was to have TMs for the language pairs en-fr and de-fr.

The problem is that the page has 74 ZIP files altogether! Downloading them is not the issue (there are download managers for that), but each of these files has to be converted into a TMX using TMXtract.jar. For my two language pairs that meant 148 operations, far too many.

Fortunately I noticed that TMXtract does not really care about the content of the ZIP file, it will just process all TMXs that are inside. That means that for example, instead of converting the 9 files released in 2017 (Vol_2016_1, Vol_2016_2, etc.), one can make a single archive of these 9 files and use TMXtract only once.
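
The merging step can also be scripted. Below is a minimal Python sketch, under the assumption that the merged archive should simply contain all the TMX files from the individual ZIPs at the top level; the function name `merge_zips` and the file names in the usage note are invented for this example and are not part of TMXtract.

```python
import zipfile


def merge_zips(sources, target):
    """Copy every file from each source ZIP into one new ZIP; return the file count."""
    seen = set()
    copied = 0
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as out:
        for source in sources:
            with zipfile.ZipFile(source) as archive:
                for info in archive.infolist():
                    # Skip folder entries and names already copied from an earlier ZIP.
                    if info.is_dir() or info.filename in seen:
                        continue
                    seen.add(info.filename)
                    out.writestr(info.filename, archive.read(info.filename))
                    copied += 1
    return copied
```

For one release this could be called as `merge_zips(["Vol_2016_1.zip", "Vol_2016_2.zip"], "Vol_2016_all.zip")`, listing all the volumes of that release, and the resulting archive is then fed to TMXtract once per language pair.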

Doing this I managed to make TMs for every DGT release in a couple of hours:

0000 - 11'168 files processed - 441'006 de-fr / 1'106'442 en-fr units written (I called 0000 the initial release)
2004 - 2'155 files processed - 141'147 de-fr / 150'001 en-fr units written
2005 - 2'133 files processed - 167'160 de-fr / 176'484 en-fr units written
2006 - 3'293 files processed - 346'797 de-fr / 367'758 en-fr units written
2007 - 2'656 files processed - 241'863 de-fr / 254'940 en-fr units written
2008 - 2'560 files processed - 327'676 de-fr / 345'418 en-fr units written
2009 - 1'906 files processed - 223'182 de-fr / 233'928 en-fr units written
2010 - 1'651 files processed - 223'701 de-fr / 232'464 en-fr units written
2011 - 2'449 files processed - 265'412 de-fr / 273'961 en-fr units written
2012 - 3'765 files processed - 441'612 de-fr / 462'431 en-fr units written
2013 - 3'169 files processed - 467'910 de-fr / 485'652 en-fr units written
2014 - 1'060 files processed - 195'024 de-fr / 200'361 en-fr units written
2015 - 3'270 files processed - 435'353 de-fr / 454'271 en-fr units written
2016 - 3'234 files processed - 558'200 de-fr / 603'848 en-fr units written
2017 - 1'956 files processed - 272'746 de-fr / 279'587 en-fr units written
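
To get an idea of the overall size before importing everything, the figures above can be added up. This is just a small Python sketch that copies the numbers from the list into a table and sums each column; the names `RELEASES` and `totals` are made up for the example.

```python
# (release, files processed, de-fr units, en-fr units), copied from the list above.
RELEASES = [
    ("0000", 11168, 441006, 1106442),
    ("2004", 2155, 141147, 150001),
    ("2005", 2133, 167160, 176484),
    ("2006", 3293, 346797, 367758),
    ("2007", 2656, 241863, 254940),
    ("2008", 2560, 327676, 345418),
    ("2009", 1906, 223182, 233928),
    ("2010", 1651, 223701, 232464),
    ("2011", 2449, 265412, 273961),
    ("2012", 3765, 441612, 462431),
    ("2013", 3169, 467910, 485652),
    ("2014", 1060, 195024, 200361),
    ("2015", 3270, 435353, 454271),
    ("2016", 3234, 558200, 603848),
    ("2017", 1956, 272746, 279587),
]


def totals(releases):
    """Sum files processed, de-fr units and en-fr units over all releases."""
    files = sum(row[1] for row in releases)
    de_fr = sum(row[2] for row in releases)
    en_fr = sum(row[3] for row in releases)
    return files, de_fr, en_fr


if __name__ == "__main__":
    files, de_fr, en_fr = totals(RELEASES)
    print(f"{files} files, {de_fr} de-fr units, {en_fr} en-fr units")
```

These are raw sums of the units written per release; they do not account for segments that may be repeated from one release to the next.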


If people are interested I can

  • post the 15 TMXs for en-fr and the 15 for de-fr on some cloud platform like Fi2 n Co did

  • post the 15 merged ZIP files for every release so you can extract other language pairs

  • document how to do it for other language pairs in case it is not clear


Just let me know,

Philippe


Philippe Locquet
 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 05:02
Member (2009)
Dutch to English
+ ...
farkastranslations.com > EU Translation Memories Jul 22, 2018

Just thought I'd remind everyone about this:

http://www.farkastranslations.com/eu_translation_memories.php
http://www.farkastranslations.com/glossaries.php

...because for a small fee, someone has already done all of this (and more) for us! I use Andras’ amazing EU collections via TMLookup (http://www.farkastranslations.com/tmlookup.php ) every day and am very happy with it.

Michael


Noe Tessmann
 
Noe Tessmann
Noe Tessmann  Identity Verified
Local time: 06:02
English to German
+ ...
If extract.jar doesn't open Jul 22, 2018

Hi,

I just downloaded the latest DGT-TM files (2017), but the extraction tool wouldn't open. I found this jarfix tool, which did the job.


https://www.heise.de/download/product/jarfix-41657/download

KR

Noe


Jorge Payan
 
Michael Albers
Michael Albers
Portugal
Local time: 05:02
English to Dutch
+ ...
How to open tmextract.jar Jun 17, 2019

Hi,

For all those who have trouble opening the tmextract.jar file on a 64-bit Windows system: uninstall the 64-bit version of Java and install the 32-bit version. This did the trick for me on my Windows 10 machine.

Good luck.
Michael


 
Milan Condak
Milan Condak  Identity Verified
Local time: 06:02
English to Czech
Yes, files are updated Jun 17, 2019

Noe Tessmann wrote:

... EU TM alignments. Are there any updates? What happened to the project?

The DGT Translation Memory files are published by the European Commission's Joint Research Centre (JRC).

Here is a list of zipped TMX:

http://www.condak.cz/nove/2019-05/27/cs/01.html

(See also the Czech ProZ.com forum:
https://www.proz.com/forum/czech/265325-aktualizované_soubory_dgt.html#2797039 )

Milan


 
José Manuel Miana
José Manuel Miana  Identity Verified
Spain
Local time: 06:02
English to Spanish
+ ...
Thanks Feb 26, 2020

Thanks! It will be useful.

 