Termbases

There are two ways to store terms in CafeTran:

  • In Termbases, which are TMX files.

Use them if you want fuzzy term recognition.

  • In Glossaries, which are tab-delimited plain-text files.

Use them if you want compact files that are easy to manage.
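
To make the difference between the two storage formats concrete, here is a minimal Python sketch that writes the same term pair once as a TMX translation unit and once as a tab-delimited glossary line. The element names (tmx, header, body, tu, tuv, seg) come from the TMX standard; the header attribute values, the language codes and both function names are assumptions made for illustration, and files written by CafeTran itself may carry additional attributes.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serialises as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def term_pair_as_tmx(source, target, src_lang="de", tgt_lang="nl"):
    """Return a minimal TMX document holding one term pair (illustrative only)."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values below are placeholders, not what CafeTran writes.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    tu = ET.SubElement(body, "tu")
    for lang, text in ((src_lang, source), (tgt_lang, target)):
        tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
        ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def term_pair_as_glossary_line(source, target):
    """Return the same pair as one tab-delimited glossary line."""
    return f"{source}\t{target}"


print(term_pair_as_tmx("Lüftungsanlage", "ventilatie-eenheid"))
print(term_pair_as_glossary_line("Lüftungsanlage", "ventilatie-eenheid"))
```

The glossary line is a single short string, while the TMX version wraps the same two terms in several layers of markup. That is the compactness trade-off mentioned above.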

Glossaries versus Termbases

Watch a video about Glossaries versus Termbases

PLEASE NOTE: After importing the Excel file, you should always save the termbase under a new name via Memory > Save memory as …

Creating a Termbase

Currently, you create a Termbase from the Memory menu:

  • In the Memory menu choose New memory.

The New memory dialogue box is displayed. (Note that this dialogue box should rather be called New Termbase.)

  • Adjust the settings in the dialogue box to optimise for Termbases (actually I think that these should be default settings for Termbases):
new-memory.png

Adding term pairs to your Termbase

Once you have created a termbase, you can start adding term pairs to it.

  • Select one or more words in the Source segment pane.
  • Select one or more words in the Target segment pane.
  • Click on the icon Add fragment to memory:
add-fragment.png

The New fragment dialogue box will open.

  • Make any necessary modifications and additions.
  • Click the OK button to close the dialogue box and add a term pair to your termbase:
new-fragment.png

Manually overriding the auto-assembling result

In both Termbases and Glossaries, you can manually override the result of the auto-assembling process, that is, change the target term that CafeTran inserts automatically in the Target segment pane.

  • Position the text cursor in the target term that you want to override, or, when the target term consists of several words (a so-called multi-word term), select all those words.
  • Click the right mouse button to open the context menu.
  • Use the right mouse button to override the auto-assembling result only once (in this segment).
  • Use the left mouse button to override the auto-assembling result only once (in this segment).

Here the process is shown for the Termbase:

alt-term-termbase.png

For completeness, here is the same process for the Glossary:

alt-term-glossary.png

Balancing your Termbases for auto-assembling

Let's have a look at the termbase that was created in the video:

termbase.png

Compare this with the structure of a glossary that uses source-side and target-side alternatives:

LA;Lüftungsanlage	VE;ventilatie-eenheid
WLA;Wohnraumlüftungsanlage	residentiële ventilatie-eenheid;RVE
NWLA;Nichtwohnraumlüftungsanlage	niet-residentiële ventilatie-eenheid;NRVE

Then ask yourself:

Which terminology resource is easier to manage with regard to balancing target terms for the auto-assembling process?
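
The glossary structure shown above can be read with a few lines of Python. This is only a sketch: it assumes that a tab separates the source side from the target side and that semicolons separate alternatives, as the sample lines suggest. The function name and the idea that the first target alternative is the one inserted by default are assumptions, not documented CafeTran behaviour.

```python
def parse_glossary_line(line):
    """Split one glossary line into lists of source and target alternatives."""
    # Assumption: exactly one tab separates the source and target sides.
    source_field, target_field = line.rstrip("\n").split("\t", 1)
    # Assumption: semicolons separate alternatives on either side.
    sources = [s.strip() for s in source_field.split(";") if s.strip()]
    targets = [t.strip() for t in target_field.split(";") if t.strip()]
    return sources, targets


SAMPLE = (
    "LA;Lüftungsanlage\tVE;ventilatie-eenheid\n"
    "WLA;Wohnraumlüftungsanlage\tresidentiële ventilatie-eenheid;RVE\n"
)

for line in SAMPLE.splitlines():
    sources, targets = parse_glossary_line(line)
    # Assumption: the first target alternative is treated as the preferred one.
    print(sources, "->", targets[0], "| other alternatives:", targets[1:])
```

Under these assumptions, rebalancing a glossary entry means reordering the alternatives on one line, which is easy to do in any text editor.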

Browsing in Termbases

After loading a Termbase, the last entries will be displayed. You can browse in the Termbase via the context menu:

browse-TMs.png

Using Termbases with truncation of terms

You can also use Termbases to let CafeTran ignore the ending of words and only match on the first part or stem of a word. This is sometimes called stemming, though truncation is probably a more appropriate name.

From Wikipedia:

Stemming is the term used in linguistic morphology and information retrieval to describe the process for reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form.

Watch a video about Using stemming in CafeTran

The relevant setting is the Prefix matching (%) selector in the New memory (New termbase) dialogue box.

When you choose Fixed length, only the first four letters will be used for term matching (term recognition), OR the last four letters will be truncated. This value of 4 is set in Edit > Options > Memory > Minimal prefix length:

5a.png

You can also set a percentage from 10% to 90%.

Here are four screenshots, made with Prefix matching set to 'Fixed length':

1a.png
2a.png
3a.png
4a.png

Note that all terms were recognised with Prefix matching set to 'Fixed length'. When the Prefix matching is set to '10%', all terms will be recognised too. When the Prefix matching is set to '80%', only the terms in the first 2 segments will be recognised.

PLEASE NOTE: You will have to experiment with the value for Prefix matching. The best value depends on characteristics of your source language and will likely vary per language.
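
The effect of the two kinds of Prefix matching settings can be illustrated with a small Python sketch. It is not CafeTran's actual algorithm, which this page does not describe. It assumes that 'Fixed length' compares the first four letters of a term, and that a percentage means the share of the term's letters that a word must match from the start. The function names, the case-insensitive comparison and the Dutch sample words are hypothetical.

```python
import math


def prefix_length(term, mode, minimal_prefix_length=4):
    """Number of leading characters of `term` that a word must share to match."""
    if mode == "fixed":
        # Assumption: 'Fixed length' uses the Minimal prefix length setting.
        return min(len(term), minimal_prefix_length)
    # Assumption: otherwise `mode` is a percentage (10 to 90) of the term length.
    return max(1, math.ceil(len(term) * mode / 100))


def matches(term, word, mode):
    """Return True if `word` starts with the required prefix of `term`."""
    n = prefix_length(term, mode)
    return word.lower().startswith(term.lower()[:n])


for mode in ("fixed", 10, 80):
    for word in ("ventilaties", "ventiel"):
        print(mode, word, matches("ventilatie", word, mode))
```

In this sketch a low percentage or the fixed length of 4 also matches the unrelated word, while 80% only accepts the inflected form. This mirrors the observation above that stricter settings recognise fewer terms, and it shows why the value needs tuning per language.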

Read an article written by a CafeTran user who only uses Termbases for his terminology
