Setting Up Web Searches

A few comments on the Web search in CafeTran.

1. The idea is to type a word in the Search field, or select one in the
source/target window, and look it up in a chosen Web resource.

2. It is possible to query numerous Web resources simultaneously. This is
implemented in the Aragorn version (a trial is already available for
download), officially released in June/July.

3. You can choose between the External (System) and Internal (in the tab)
Web browser.

4. Internal Browser (opened in the CT tab) in Java 6

Java 6 provides an HTML 3.0 implementation for the internal browser. It
works fine for some Web resources but may not work for other, more advanced
web pages. In that case, switch the option back to the External (System)
browser such as Safari, Firefox, Opera, Internet Explorer, Chrome etc., or
run CafeTran with Java 7.

5. Internal Browser (opened in the CT tab) in Java 7

For Java 7, there is a new internal web browser which should work as well
as the System browser. CafeTran recognizes Java 7 and switches on this new
web browser. The switch works fine on Linux and Windows systems. Mac OS X,
however, is tuned for the Java 6 released by Apple. It is possible to use
Java 7 there, but you still need Java 6 installed. Soon, I am going to
prepare a CafeTran.app for Mac OS that will work independently of Apple's
Java 6.

6. External browser.

It functions well on all operating systems, with both Java 6 and Java 7.
Since the web search is opened in the System browser, CafeTran creates a
Desktop view in the tab so that you can see the search result without
switching between windows. Click on this view to refresh it.

7. The default choice between the External and Internal browser is made in
the Edit | Options | Browser field. However, you may also choose per Web
page (resource). Select the System browser in the Style field of the
Internet resource configuration panel to make that resource open in the
System browser even if the default option is set to the Internal browser.
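
The precedence described in comment 7 can be pictured with a small Python sketch. It only illustrates the rule "a per-resource Style setting overrides the default"; the function name choose_browser and the string values are hypothetical and do not describe CafeTran's actual code.

```python
def choose_browser(default_browser, resource_style=""):
    """Return which browser a resource should open in (hypothetical helper).

    default_browser: the value of Edit | Options | Browser,
                     modelled here as "internal" or "system".
    resource_style:  the Style field of the resource, modelled as plain text.
    """
    # Assumption: the Style field holds the text "System browser"
    # when the per-resource override is selected.
    if resource_style.strip().lower() == "system browser":
        return "system"
    # Otherwise the global default applies.
    return default_browser
```

With this sketch, choose_browser("internal", "System browser") gives "system", while a resource with an empty Style field simply follows the default.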


The info below is not up to date and needs to be revised. Volunteers? Forward, please!

SlimBrowser:

[Igor] 
There is a free browser called SlimBrowser. Do the following there: 
1. Open the Internet resource. 
2. Point the mouse at the form search field. 
3. Select the Search engines drop-down box (at the right-hand side) 
4. Choose "Create your own search engine" command, and Voila! 
Then, it is easy to determine the Address start and end fields. 
[/Igor] 

Other version (by Igor or by me, I forgot): 

[other version] 
SlimBrowser can help you extract search engine data from the current web page automatically. To do automatic extraction, observe the following steps: 
- Open the search engine homepage, e.g., http://www.google.com   
- Click inside the input box where you are expected to type the words to search for. Keep the blinking caret inside the input box. 
- Click the dropdown arrow near the icon at the left end of the quick-search bar to get the list of search engines. Select "Create your own search engine" towards the end of the menu. You will then see a dialog showing the extracted data. 
- Review the data and make modifications if necessary. Click the "Save As" button to save the data as a search engine definition file (*.qseg). 
The new search engine will be immediately available from the drop-down list of engines. 

http://de.wikipedia.org/w/index.php?search=$key&button=%3CIMG+alt%3DVolltext+src%3D%22%2F%2Fbits.wikimedia.org%2Fstatic-1.20wmf9%2Fskins%2Fvector%2Fimages%2Fsearch-ltr.png%3F303-4%22%3E&title=Spezial%3ASuche 

end=&submit_button=Search 
version 

CafeTran lets you access and query your favorite web resources.

Accessing Internet resources

KudoZ terminology bases are a good example of a web resource that translators use and share. Accessing one takes a few steps.

Select Library | New resource.
Select Internet.
Fill in the fields in the following way:

Address:

Leave it empty if you intend to set Address start and end fields. This field is useful if you simply want to access a particular web page without querying it.

Address start and end:

A lot of Internet sites provide services such as searching terminology bases. They present a form to fill in and a button to press. Your browser then generates a URL query that is sent to the resource provider, and the response you get is based on this query.

Usually, the URL query that is generated and sent consists of three main parts: a prefix, the searched word and a suffix. Sometimes you can see this query in the browser address field and determine the parts from it.
For example, when you access the KudoZ Polish-English terminology base, the browser generates the following URL query:

http://www.proz.com/?sp=ksearch&submit=1&term=
your_searched_word
&from=pol&to=eng&whole_words=y&edited=y&unedited=y&glossaries=y&glosspost=y

As you can see the prefix part is:
http://www.proz.com/?sp=ksearch&submit=1&term=

and the suffix part:
&from=pol&to=eng&whole_words=y&edited=y&unedited=y&glossaries=y&glosspost=y

Now, fill in the Address start (prefix) and end (suffix) fields with this information. For other languages, replace pol and eng in the suffix part with your language pair.
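
The prefix / searched word / suffix scheme can be sketched in a few lines of Python. This only illustrates how the three parts combine. The function name build_query_url is hypothetical, and percent-encoding the searched word with quote_plus is an assumption about how a query is normally made URL-safe, not a description of what CafeTran does internally.

```python
from urllib.parse import quote_plus

# Address start (prefix) and Address end (suffix) from the KudoZ example above.
ADDRESS_START = "http://www.proz.com/?sp=ksearch&submit=1&term="
ADDRESS_END = ("&from=pol&to=eng&whole_words=y&edited=y"
               "&unedited=y&glossaries=y&glosspost=y")


def build_query_url(word, start=ADDRESS_START, end=ADDRESS_END):
    """Join prefix, URL-encoded search word and suffix (hypothetical helper)."""
    # Assumption: spaces and special characters are percent-encoded.
    return start + quote_plus(word) + end
```

Calling build_query_url("umowa najmu") places the encoded term between the two fixed parts, which is the same URL the browser would show after you submit the search form.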

Page start and end:

These two fields are optional. When you load a web page from the Internet, it often contains a lot of unnecessary information. It may be a good idea to filter the page and get only what you really need. In the start field you should provide the start of relevant information on the HTML page and in the end field the end of it.

Based on these markers, CafeTran will narrow the page down to what you really want to see. Occasionally, web page authors change the page content, so the Page start and end fields may need to be updated too.
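
As a rough illustration of this kind of filtering, the Python sketch below cuts a page down to the part between two marker strings. The function name narrow_page is hypothetical. Keeping the start marker in the result and dropping the end marker is an assumption made for the example, as is falling back to the whole page when a marker is not found.

```python
def narrow_page(html, page_start="", page_end=""):
    """Keep the text from page_start up to page_end (hypothetical helper)."""
    begin = html.find(page_start) if page_start else 0
    if begin < 0:
        # Start marker not found: keep the page from the top.
        begin = 0
        search_from = 0
    else:
        # Look for the end marker only after the start marker.
        search_from = begin + len(page_start)
    end = html.find(page_end, search_from) if page_end else -1
    if end < 0:
        # No end marker given, or not found: keep the page to the bottom.
        end = len(html)
    return html[begin:end]
```

If the site later renames the element used as a marker, the lookup fails and the whole page comes back, which is the situation in which the two fields need updating.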

Cache:

This field lets you set the number of web pages that are kept in the cache memory for fast browsing with the Back/Forward buttons in the main toolbar. The default number is 10.
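
One way to picture such a cache is a bounded page history. The Python sketch below is only an assumption about how a limited Back/Forward cache could behave; the class name PageHistory and its methods are hypothetical and are not taken from CafeTran.

```python
class PageHistory:
    """Bounded Back/Forward history (hypothetical model of the Cache field)."""

    def __init__(self, limit=10):
        self.limit = limit      # corresponds to the Cache value
        self.pages = []         # cached pages, oldest first
        self.index = -1         # position of the page currently shown

    def visit(self, page):
        # A new page discards anything "forward" of the current position.
        del self.pages[self.index + 1:]
        self.pages.append(page)
        # Assumption: the oldest page is dropped once the limit is exceeded.
        if len(self.pages) > self.limit:
            self.pages.pop(0)
        self.index = len(self.pages) - 1

    def back(self):
        if self.index > 0:
            self.index -= 1
        return self.pages[self.index] if self.pages else None

    def forward(self):
        if self.index < len(self.pages) - 1:
            self.index += 1
        return self.pages[self.index] if self.pages else None
```

A larger Cache value keeps more results reachable with Back at the cost of memory.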

Encoding:

If you have problems viewing national characters in the page, provide the character encoding for it (for example ISO-8859-2 for Central European languages). Since this value is often provided in the page's HTML code, you can usually leave this field empty.
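
The effect of a wrong encoding is easy to reproduce. The short Python sketch below decodes the same bytes twice; the helper name decode_page and the sample text are made up for the illustration.

```python
def decode_page(raw_bytes, encoding="utf-8"):
    """Turn downloaded bytes into text using the given encoding (hypothetical helper)."""
    return raw_bytes.decode(encoding)


# Sample Polish text, stored as a Central European (ISO-8859-2) page would store it.
SAMPLE = "żółty słoń"
RAW = SAMPLE.encode("iso-8859-2")

good = decode_page(RAW, "iso-8859-2")   # national characters survive
bad = decode_page(RAW, "iso-8859-1")    # same bytes, wrong table: garbled letters
```

Here good equals the original text, while bad has the same length but shows Western European characters in place of the Polish ones. That is the symptom the Encoding field is meant to cure.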

Style:

In this optional field you can provide the path to your preferred stylesheet to change the colors or fonts of the internal browser. Choosing System browser in this field will let you view the search results in the system browser such as Firefox or Internet Explorer.

Description:

Here you can type any short description of the resource.

Press the OK button and save the Internet configuration file under a name of your choice in the Internet directory. You can also create subdirectories of the parent directory and save the file there, organizing your Web resources in a tree-like structure.

Opening internet resources

Select Library | Internet | Your resource. A new window will appear to access the Internet site.

Type a word in the CafeTran search field or select it with the mouse in the source language window.

Press the Enter key or the Search button in the toolbar. The results of the query should appear in either CafeTran's internal or system browser. The Back and Forward buttons in the main toolbar let you navigate through the search results.

External browser

It is possible to open all your Internet resources in the system web browser such as Internet Explorer, Opera or Firefox. In the Edit | Options | Browser field choose System. Then, the search query will be redirected to the system web browser. If you wish to set only particular Web resources to view in the system browser, choose "System browser" in the Style field of their configuration dialog.

Editing the Library configuration file

Select your Internet resource tab.
Choose Library | Edit resource info from the main menu.

More information: Adding New Searches for IntelliWebSearch. Please jump directly to Adding on-line resources manually (without the Wizard)

Example Web Search definitions

Unterm

Here is a Library link for UNTERM (with English as source). You can replace en with fr, sp, ru, ch and ar.

Address start:
http://unterm.un.org/DGAACS/unterm.nsf/WebView?SearchView&Query=[en]%20contains%20(
Address end:
)SearchOrder=1&SearchMax=250&SearchWV=TRUE&SearchFuzzy=FALSE&Start=1&Count=0

It’s very nice inside CafeTran.

IATE

Please click here

Web search for Google Translate

Windows:

My ‘NL-EN_Google Translate.res’ file (C:\Users\usr\Dropbox\Programs\CafeTran\cafetran\infos\resources\Internet\) contains this:

Type=internet
Internet address=Google Translate NL>EN
Address start=http://translate.google.com/translate_t?&sl=nl&tl=en&text=
Address end=
Page start=
Page end=
Encoding=UTF-8
Style="C:\Users\usr\AppData\Local\Torch\Application\torch.exe"
POST method=no

PLEASE NOTE: You can specify an alternative browser (like Torch) by inserting its path into the Style field.

OS X:

Browser: Edit > General > Browser > CafeTran

Resource definition:

Type=internet
Internet address=
Address start=http://translate.google.com/translate_t?&sl=de&tl=nl&text=
Address end=
Page start=Translate text or webpage
Page end=
Encoding=UTF-8
Style=styles/userstyle.css
POST method=no

Adapt the ISO 639-1 two-letter codes for source and target (now de and nl) in Address start to your preferences.
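
Since the resource file is a plain list of key=value lines, adapting it can also be scripted. The Python sketch below reads such a listing and swaps the language codes in Address start. It is based only on the listing shown above; the function names parse_resource and with_language_pair are hypothetical, and whether CafeTran itself reads the file this way is an assumption.

```python
import re

RESOURCE_TEXT = """\
Type=internet
Internet address=
Address start=http://translate.google.com/translate_t?&sl=de&tl=nl&text=
Address end=
Page start=Translate text or webpage
Page end=
Encoding=UTF-8
Style=styles/userstyle.css
POST method=no
"""


def parse_resource(text):
    """Read key=value lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        # Split on the first "=" only: URLs contain further "=" signs.
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def with_language_pair(address_start, source, target):
    """Replace the sl= and tl= codes in a Google Translate prefix."""
    url = re.sub(r"(?<=[?&]sl=)[^&]*", source, address_start)
    return re.sub(r"(?<=[?&]tl=)[^&]*", target, url)
```

For example, with_language_pair(parse_resource(RESOURCE_TEXT)["Address start"], "fr", "en") yields the same prefix with sl=fr and tl=en.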

Searching Microsoft glossaries

Selcuk A. writes:

For Microsoft Terminology Search

Address start: http://www.microsoft.com/language/en-us/Search.aspx?sString=
Address end: &langID=tr-tr

(Turkish in my example)

It works like a charm!

TBX glossaries are good but the web resource searches in translation strings as well.

Examples of web resource definitions

You can download a collection of web resource definitions here.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License