CafeTran jargon

If you come across a term that is undefined or poorly defined, feel free to ask in the CafeTran forum or email Michael Beijer directly.

Alignment: See: http://cafetran.wikidot.com/using-the-alignment-workflow

Auto-assembling: Putting together the target translation from fragments from the TM and terms from the Glossary

Auto-assembling panel: The auto-assembling panel (F1) displays auto-assembly results and best fuzzy matches from TMs.

Auto-completion: CafeTran feeds its auto-complete engine from various resources: TM fragments, glossary matches, online machine translation output and typed words. You can remove incorrect words or expressions by selecting them in the auto-completion pop-up window or by using the Ctrl+Shift+R shortcut.

Auto-dock: Def …

Automatic case adjustment: CafeTran can automatically adjust the case of a term to fit the context in which it is inserted. To activate this, tick ‘Automatic case adjustment’ in the ‘Auto-assembling’ tab at Edit > Options.

Auto-propagation: Def …

‘Big Mama’: General TM

‘Big Papa’: General TB

Bilingual document: CafeTran can export a bilingual document (.docx) (Project > Export as bilingual document + Project > Export as bilingual document with notes). This file can be edited by a proofreader and the changes can be reimported back into your CT project using Project > Import bilingual document. If you make changes to a segment in this document, and you remove the ‘R’ in the last column (column 5), after reimporting the file into CT, the status of any changed segments will be switched from ‘Checked’ to ‘Unchecked’, so you can quickly see which segments have been changed. See also: http://cafetranhelp.com/collaborating-with-a-reviewer

Bilingual HTML review file: CafeTran can export a bilingual HTML file (Project > Save Project as…) that can be used for review purposes. This file contains any ‘Segment notes’ as well as a ‘Document note’ in its header.

Bookmark: Segments can be bookmarked using a shortcut (Alt+B), via Translation > Bookmark segment, or by placing the cursor in the source segment and selecting ‘Bookmark segment’ from the right-click menu. After bookmarking a segment, a little red ‘B’ will appear next to the segment number. Bookmarked segments can also be filtered on via ‘View | Bookmarked segments’.

Browser: Def …

Case sensitivity: See: http://cafetran.wikidot.com/translate-a-file-that-is-entirely-in-uppercase

CAT tool: A CAT tool (‘CAT’ = ‘computer-assisted/aided translation’) is a program to translate with. CAT tools almost always make use of translation memories (TMs), and will sometimes also include some sort of terminology system. CafeTran is a CAT tool. So are memoQ, SDL Studio, Wordfast, Fluency and OmegaT.

ChangeLog: A ChangeLog is a log or record of changes made to a project, such as a website or software project, usually including such records as bug fixes, new features, etc. Some open source projects include a ChangeLog as one of the top level files in their distribution. See: http://en.wikipedia.org/wiki/Changelog + ChangeLog

Clipboard sensitive target: When the Clipboard sensitive feature is on, CafeTran captures any text copied from other applications to its interface. This option determines where the copied text is pasted automatically in CafeTran. The choices in the drop-down field are ‘Look up’ (the Search field) and ‘Segment’ (the Target segment box). No selection in this field means that the user decides manually where to paste the copied text. This feature can be turned on/off at Edit | Clipboard sensitive.

Context-aware Auto-assembling (C-3A): Think of Context-aware Auto-assembling as a kind of 3-dimensional glossary, where source and target each represent 1 dimension of the glossary and the third ‘dimension’ is represented by the context of a segment or project.

Context matches: Matches that …

Dock, to: To manipulate an interface element, such as a toolbar or panel, in order to align it with the edge of another interface element, typically a window or pane. In CafeTran, you can dock pretty much anything to anything else, meaning, you are free to design your own user interface!

Document note: Def. …

Exact match: Def. …

External database: Def. …

Export: Def …

Fragment: Def …

Frequent word: Def …

Function word: Def …

Fuzzy match: Def …

Fuzzy match threshold (%): Sets the minimum accuracy of the fuzzy matches which are displayed by CafeTran.
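
To illustrate what a threshold does, here is a minimal Python sketch. It is not CafeTran's matching algorithm, which this page does not describe. It uses Python's difflib.SequenceMatcher as a stand-in similarity measure, and the function name fuzzy_matches and the sample memory are made up for illustration.

```python
from difflib import SequenceMatcher

def fuzzy_matches(segment, memory, threshold=70):
    """Return (score, source, target) tuples scoring at least `threshold` percent.

    The score is a stand-in (difflib ratio), not CafeTran's own measure.
    """
    hits = []
    for source, target in memory:
        ratio = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        score = round(ratio * 100)
        if score >= threshold:
            hits.append((score, source, target))
    # Best matches first.
    return sorted(hits, reverse=True)

# Hypothetical translation memory: (source, target) pairs.
memory = [
    ("Open the file menu.", "Open het menu Bestand."),
    ("Close the window.", "Sluit het venster."),
]

hits = fuzzy_matches("Open the edit menu.", memory, threshold=70)
for score, source, target in hits:
    print(score, source, "->", target)
```

Raising the threshold hides weaker matches, and lowering it shows more (and noisier) ones.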

Glossary: In CafeTran, a Glossary is a tab-delimited text file for storing terminology. You can organise the data in your glossary using metadata categories (such as user, client, subject, note, reference, definition, usage example, URL, etc.). These metadata categories can be defined in Edit > Options > Database. Once you have defined your metadata categories, CafeTran will automatically add them (in a header in your text file) when creating a new glossary. Your metadata category names will also be visible in the Quick Term Editor that opens when clicking on blue-highlighted glossary entries in the tabbed pane.

The header for a Dutch-English Glossary, e.g., might look something like this:

#nl-NL {TAB} #en-GB {TAB} #User or client {TAB} #Subject {TAB} #Note or definition {TAB} #Usage example.

Note the use of the hash character (‘#’) preceding each metadata category name.
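
Because a Glossary is just a tab-delimited text file, it is easy to read with a script. The following is a minimal Python sketch, assuming a header line as shown above. The sample entries and the function name read_glossary are hypothetical and only serve as illustration.

```python
import csv
import io

# Hypothetical two-entry glossary in the tab-delimited layout described above.
SAMPLE = (
    "#nl-NL\t#en-GB\t#User or client\t#Subject\n"
    "fiets\tbicycle\tAcme\tTransport\n"
    "huis\thouse\t\tGeneral\n"
)

def read_glossary(text):
    """Return a list of dicts keyed by the header names (without the '#')."""
    rows = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#").strip() for name in next(rows)]
    return [dict(zip(header, row)) for row in rows if row]

entries = read_glossary(SAMPLE)
for entry in entries:
    print(entry["nl-NL"], "->", entry["en-GB"], "| subject:", entry["Subject"])
```

To read a real glossary, pass the contents of the file instead of SAMPLE.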

Glossary pane: where the glossary hits are displayed in the Tabbed Pane.

Glueing: CafeTran allows you to work on many documents in a single project by selecting the folder that contains them when creating a new project. They are then available for translation via ‘Project | Documents’. During translation, you can either switch between them or ‘glue’ them together into a single ‘view’ for convenience. Keep in mind that your segments first need to be created, so it’s best to choose ‘Automatic segmentation’ when setting up a new project with several documents.

Grid: Left of …

Hunspell: Hunspell is a spell checker and morphological analyser designed for languages with rich morphology, complex word compounding and character encoding, originally designed for the Hungarian language. CafeTran uses Hunspell as its default spell checker. Hunspell is also used by other popular software including Apple’s Mac OS X, OpenOffice, LibreOffice, Mozilla, Opera and Google Chrome.

Include Project Segments: See http://cafetranhelp.com/case-adaptive-and-simultaneous-find-and-replace

Keymap: Def …

Library: Def …

Machine translation: Def …

Match case: If you are translating a file that is entirely in upper case, it is unlikely that you will also have a Glossary that is entirely in upper case. This can be handled by temporarily disabling case sensitivity, as follows:

1. For Auto-assembling (including Glossaries): Tick or untick ‘Match case’ at Edit > Options > Auto-assembling.
2. For Translation Memories: Tick or untick ‘Match case’ in the Translation Memory Start-up (Options) panel.
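
The effect of ‘Match case’ can be pictured with a small Python sketch. This is only an illustration of case-sensitive versus case-insensitive lookup, not CafeTran's code. The function name lookup and the one-entry glossary are hypothetical.

```python
def lookup(term, glossary, match_case=True):
    """Look up `term` in a {source: target} glossary.

    With match_case=False, the comparison ignores upper/lower case.
    """
    if match_case:
        return glossary.get(term)
    folded = {source.casefold(): target for source, target in glossary.items()}
    return folded.get(term.casefold())

glossary = {"fiets": "bicycle"}  # hypothetical lower-case glossary entry

print(lookup("FIETS", glossary))                    # no hit: case differs
print(lookup("FIETS", glossary, match_case=False))  # hit: case is ignored
```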

Memory matching: Def …

Merge alternative translations (in a Glossary): Optimise your glossary by combining multiple target terms for the same source term. The last added target term (the one with the highest line number in your glossary file) will get the highest priority during auto-assembling. This function is available by selecting a glossary in the tabbed pane and then choosing Library > Glossary > Merge alternative translations.

Please note: Do not use this function if you habitually place different target terms (concepts), with the same source term, on different ‘lines’ in your glossary (perhaps differentiated with metadata). E.g., if you have ‘CAT’, as in CAT tool, and ‘cat’, as in the animal, on two separate lines (i.e., as two different concepts), merging your glossary will lump these two concepts together into one.
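
The merging idea can be sketched in Python as follows. This is an illustration under assumptions, not CafeTran's implementation: the ‘;’ separator for merged target terms, the exact-match comparison of source terms and the function name merge_alternatives are all assumptions made for this example.

```python
def merge_alternatives(rows, separator=";"):
    """Merge (source, target) rows that share the same source term.

    Later rows are placed first in the merged target list, mirroring the
    'last added term gets the highest priority' rule described above.
    The separator is an assumption made for this sketch.
    """
    merged = {}
    for source, target in rows:
        targets = merged.setdefault(source, [])
        if target in targets:
            targets.remove(target)  # avoid duplicates, keep the newest position
        targets.insert(0, target)
    return [(source, separator.join(targets)) for source, targets in merged.items()]

# Hypothetical glossary rows, in file order.
rows = [("fiets", "bicycle"), ("huis", "house"), ("fiets", "bike")]
merged_rows = merge_alternatives(rows)
print(merged_rows)
```

The sketch also shows why the warning above matters: once two lines are merged, any metadata that distinguished them can no longer be attached to just one of the target terms.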

MT: Def …

Non-translatable fragments (or simply ‘non-translatables’): Fragments such as proper names, codes and part names that shouldn’t be translated. CafeTran colours them to identify them (you can adjust the colour via Edit > Appearance > Colors > Non-translatable fragments). They are stored in Lists with non-translatable fragments (Options > General > Non-translatable fragments). You can easily insert them into your target segment using a keyboard shortcut (the default is F4, which can be adjusted). Non-translatables are excluded from spell-checking. Non-translatables are stored in a text file, here: /Applications/CafeTran.app/Contents/Resources/Java/resources/placeables/Non-translatable fragments.txt (Mac) or C:\Users\usr\Dropbox\Programs\CafeTran\cafetran\resources\placeables (Windows). For more information, see Working with Non-translatables.

Note: See: ‘Document note’ or ‘Segment note’.

Notepad: Def …

OCRed: ‘OCRed’ is the simple past tense and past participle of OCR (to perform optical character recognition). See e.g.: http://en.m.wiktionary.org/wiki/OCR#English

On-the-fly: During the actual translation process.

Perfectionist: Fundamentalist whose principles aren't based on a religion but on some other unworldly principles. Example of a perfectionist: Hans van den Broek

Placeable: See Non-translatable fragments.

Prefix matching: Prefix matching is an automatic function that can be used in (TMX) Memories to match the beginnings of all words in a segment.

Prefix: Def …

Pretranslate only: This is a new TM function to deal with huge memories, which frees up all the RAM from all unused segments just after Pretranslation is over. For example:

1. Point to the folder containing TMXs
2. Open it with ‘Pretranslate only’ on.
3. Pretranslate, and go for a cup of coffee :)
4. You can export the Pretranslation (Memory > Export > Export Pretranslation…) and import it back (Memory > Import > Import Pretranslation…) when you reopen the project to avoid repeating the longish process.

Project page: Def …

Project: Def …

Punctuation character: Def …

QA: See Quality Assurance.

Quality Assurance: Def …

Read-only: In CafeTran, you can set certain resources as ‘read-only’ in order to optimise them for speed. Read-only resources can be queried but cannot be saved. More importantly, they also use less RAM and thus speed up CafeTran. It is therefore a good idea to set any resources that do not need to be edited (such as the gigantic DGT-TM translation memories) as read-only.

Regression testing: ‘Regression testing is any type of software testing that seeks to uncover new software bugs, or regressions, in existing functional and non-functional areas of a system after changes, such as enhancements, patches or configuration changes, have been made to them.’ (Wikipedia) That is, finding out if the new version broke any previous functionality.

Regular expression: Using + | etc.

Rendezvous Server: Def …

SDLXLIFF: Def …

Separable verb: A separable verb is a verb that is composed of a lexical core and a separable particle. In some sentence positions, the core verb and the particle appear in one word, whilst in others the core verb and the particle are separated. The particle cannot be accurately referred to as a prefix because it can be separated from the core verb. German, Dutch, and Hungarian are notable for having many separable verbs. Separable verbs challenge theories of sentence structure because when they are separated, it is not evident how the compositionality of meaning should be understood. (http://en.wikipedia.org/wiki/Separable_verb)

SDLPPX: Def …

Scope: In the Find and Replace dialog …

Segment note: Segment notes can be attached to segments using a shortcut (Alt+N), via ‘Translation | Add segment note’, or by placing the cursor in the source segment and selecting ‘New note’ from the right-click menu. After entering a note to a segment, it will appear below the segment in the Grid, highlighted in purple (I think it’s purple). Segments with notes can also be filtered on using ‘View | Project notes’.

Segment timer: Def …

Segmentation editor: Def …

Segmentation rules: Def …

Source document tab: Def …

Source segment box: def

Source-side regular expressions: See: http://cafetran.wikidot.com/using-source-side-variables

SRX: Segmentation Rules eXchange or SRX is an XML-based standard that was maintained by the Localization Industry Standards Association, until it became insolvent in 2011. It is currently maintained by GALA. (http://en.wikipedia.org/wiki/Segmentation_Rules_eXchange)

SRX file: File that contains rules to prevent splitting.

Stemming: Stemming can be used to define the root of a word using pipe characters (‘|’).
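
As a rough illustration, the Python sketch below treats the part of a glossary entry before the first pipe as the root and matches any word that begins with it. This is an assumption about how such an entry could be interpreted, not a description of CafeTran's actual matching. The entry ‘translat|ion’ and the function name matches_stem are hypothetical.

```python
def matches_stem(entry, word):
    """Return True if `word` starts with the root of `entry`.

    The root is assumed to be everything before the first '|'.
    """
    stem = entry.split("|", 1)[0]
    return word.casefold().startswith(stem.casefold())

print(matches_stem("translat|ion", "translated"))
print(matches_stem("translat|ion", "transfer"))
```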

Subsegment: Def. …

Subterms, adding: See: http://cafetran.wikidot.com/adding-term-pairs-to-the-glossary + entry ‘2013-12-01’ @ http://cafetranhelp.com/changelog

Subsegment match: Subsegment matches are approximate matches for subsegments of the current segment. Note: Move the mouse over the numbers of the hits to display the subsegment in full context. See here and here for more information.

Tabbed pane: The part of the screen normally placed …

Tag: Def. …

Tag insertion: http://cafetran.wikidot.com/using-different-ways-for-tag-placement

Target segment box: def

Terminology priority hierarchy: 1. Segment context, 2. Subject/Client and other fields, 3. Overall glossary/memory priorities (= 3 levels)

Text shortcut: Tired of typing ‘hot air balloon’? The next time you type the word in CafeTran, select the term and click Edit > Add selection to text shortcuts (Ctrl + Shift + A on Windows). The text shortcut dialogue will appear where you can define a text shortcut for the term (e.g., in this case, CafeTran will suggest ‘hab’). From now on, all you need to do is type this shortcut, and CafeTran will automatically insert the corresponding complete term. Note that text shortcuts you add while translating are added for the current session only, unless you do: Edit > List text shortcuts and click on ‘Save’. This will save them to a text file on your computer (@ …\CafeTran\cafetran\resources\shortcuts\) called ‘shortcuts.txt’.
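
The idea behind text shortcuts can be sketched in a few lines of Python. This is only an illustration: how CafeTran derives its suggestion and how it stores shortcuts.txt are not described here, so the initials-based suggestion and the in-memory dictionary below are assumptions, and the function names are hypothetical.

```python
def suggest_shortcut(term):
    """Suggest a shortcut from the initials of each word (an assumption)."""
    return "".join(word[0].lower() for word in term.split())

def expand(text, shortcuts):
    """Replace every word that is a known shortcut with its full term."""
    return " ".join(shortcuts.get(word, word) for word in text.split(" "))

shortcuts = {suggest_shortcut("hot air balloon"): "hot air balloon"}
print(shortcuts)
print(expand("We booked a hab ride", shortcuts))
```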

TM: See Translation Memory.

TMX: TMX (Translation Memory eXchange) is an open XML standard for the exchange of translation memory data created by computer-aided translation and localisation tools. The latest version of the TMX specification is 1.4b, which is the preferred version in most settings these days. The latest specification can be viewed here: http://www.gala-global.org/oscarStandards/tmx/
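
To give an idea of what a TMX file looks like, here is a minimal Python sketch that reads a tiny hand-written TMX 1.4 sample with the standard library. The sample content, the header attribute values and the function name read_units are made up for illustration; real TMX files exported by CAT tools usually carry more attributes and inline markup.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical one-unit TMX document.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="nl-NL" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="nl-NL"><seg>Goedemorgen.</seg></tuv>
      <tuv xml:lang="en-GB"><seg>Good morning.</seg></tuv>
    </tu>
  </body>
</tmx>"""

def read_units(tmx_text):
    """Return one {language: segment text} dict per translation unit (<tu>)."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            # itertext() also collects text inside any inline elements.
            unit[tuv.get(XML_LANG)] = "".join(tuv.find("seg").itertext())
        units.append(unit)
    return units

units = read_units(SAMPLE_TMX)
print(units)
```

Each tu element corresponds to one translation unit (see ‘Translation unit (TU)’), with one tuv element per language.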

Translated segment: A segment with a non-empty target.

Translation memory: def

Translation project: Def …

Translation unit (TU): In short: a TU is a collection of one source segment and its translation in one or more target languages. In the field of translation, a translation unit is a segment of a text which the translator treats as a single cognitive unit for the purposes of establishing an equivalence. The translation unit may be a single word, a phrase, one or more sentences, or even a larger unit. The most common unit used in CAT tools is the sentence. When a translator segments a text into translation units, the larger these units are, the better chance there is of obtaining an idiomatic translation. This is true not only of human translation, but also in cases where human translators use computer-assisted translation, such as translation memories, and also when translations are performed by machine translation systems.

Trim: Def …

TTX: text

TU: See ‘Translation unit’.

Untranslated segment: A segment with an empty target.

User ID: Def …

View: Several documents glued together. See ‘Glueing’, above.

Virtual match: These are the best subsegment matches of all the subsegment matches, selected by CafeTran. The Subsegment to Virtual threshold can be set in the Edit > Options > Memory dialogue. See: here and here for more information.

Whitespace character: Def. …

XLIFF: text

The CafeTran jargon page is maintained by Michael Beijer (Beijerdeas.com + Wordbook.nl).