Search results

96 records were found.

Doctoral thesis in Informatics - Knowledge Area: Natural Language Processing
Master's dissertation in Informatics (specialization in Distributed Systems, Computer Communications and Computer Architecture).
The MAP Doctoral Program in Computer Science of the Universities of Minho, Aveiro and Porto
The Marker Hypothesis was first formulated by Thomas Green in 1979. It is a psycholinguistic hypothesis stating that every language has a set of words that mark the boundaries of phrases in a sentence. While it remains a hypothesis, since it has never been formally proven, tests have shown that its results are comparable to those of basic shallow parsers, with higher efficiency. The chunking algorithm based on the Marker Hypothesis is simple, fast and almost language-independent. It depends only on a list of closed-class words, which is already available for most languages. This makes it suitable for bilingual chunking (there is no need for a separate shallow parser per language). This paper discusses the use of the Marker Hypothesis combined with Probabilistic Translation Dictionaries to extract example-based machine translation resources from parall...
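The chunking idea described in the abstract can be sketched in a few lines: start a new chunk whenever a closed-class marker word is encountered. The marker set below is a small hypothetical English sample for illustration, not the paper's actual word list.

```python
# Hypothetical sample of closed-class marker words (illustrative only;
# a real system would load a full list for the target language).
MARKERS = {"the", "a", "an", "in", "on", "at", "of", "to", "and", "that", "with"}

def marker_chunk(sentence):
    """Split a sentence into chunks, opening a new chunk at each marker word."""
    chunks, current = [], []
    for word in sentence.lower().split():
        if word in MARKERS and current:
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)
    return [" ".join(c) for c in chunks]

print(marker_chunk("The cat sat on the mat with a hat"))
# → ['the cat sat', 'on', 'the mat', 'with', 'a hat']
```

Because the only language-specific resource is the marker word list, the same routine applies to both sides of a bitext, which is what makes the approach attractive for bilingual chunking.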
Current music publishing on the Internet is mainly concerned with publishing sound. We claim that music publishing means not only making sound available but also defining relations between a set of music objects such as music scores, guitar chords, lyrics and their metadata. We want an easy way to publish music on the Internet, to produce high-quality paper booklets and even to create Audio CDs. In this document we present a workbench for music publishing based on open formats, using open-source tools and script programming over them. The workbench is based on an archive specification written in a text-based format, which includes sound references, music scores, chords and lyrics, and their meta-information.
Dictionaries and thesauri are valuable resources for Natural Language Processing, but they are not as freely available as one might expect, especially for languages other than English, and when they do exist they are often only available for online querying. Our main goal with T2O - the Thesaurus to Ontology framework - is to create a multilingual ontology that is: freely available, both online and for download; in a computer-readable format; provided with a good API; as richly structured as possible; reusing all the structured information we can get;
One of the bottlenecks of example-based machine translation (EBMT) is being able to automatically amass large quantities of good examples. In our EBMT work, we are investigating how far one can go by extracting examples from parallel corpora, using Probabilistic Translation Dictionaries to obtain example segmentation points. In fact, the success of EBMT depends heavily on the quality and quantity of the examples, but also on their length. Thus, we give special importance to methods that extract examples of different sizes from the same translation unit. In this article we show that it is possible to extract large quantities of examples from parallel corpora using only probabilistic translation dictionaries extracted from those same corpora.
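One way to picture the segmentation points mentioned above: word pairs whose translation probability exceeds a threshold suggest alignment anchors at which a translation unit can be split. The dictionary, threshold and sentence pair below are all hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical probabilistic translation dictionary:
# source word -> {target word: translation probability}.
# The entries and probabilities are illustrative only.
PTD = {
    "casa": {"house": 0.9},
    "branca": {"white": 0.8},
    "a": {"the": 0.6},
}

def segmentation_points(src_words, tgt_words, ptd, threshold=0.7):
    """Return (i, j) index pairs where a confident word translation
    suggests a point at which the translation unit could be split."""
    points = []
    for i, s in enumerate(src_words):
        for j, t in enumerate(tgt_words):
            if ptd.get(s, {}).get(t, 0.0) >= threshold:
                points.append((i, j))
    return points

src = "a casa branca".split()
tgt = "the white house".split()
print(segmentation_points(src, tgt, PTD))
# → [(1, 2), (2, 1)]
```

Taking different subsets of these anchor points is one way to obtain examples of different sizes from the same translation unit, as the abstract describes.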
Central library
Palácio Ceia
Rua da Escola Politécnica, nº 141 - 147
1269-001 Lisboa, Portugal

Phones: (+351) 300 002 922
(+351) 300 002 925 | (+351) 300 002 930
(+351) 300 002 931 | (+351) 300 002 932
Electronic mail: cdoc@uab.pt

Opening hours:
Monday to Friday, 9h to 18h
Coimbra delegation
Rua Alexandre Herculano, nº 52
3000-019 Coimbra, Portugal

Phone: (+351) 300 001 590
Electronic mail: cdocoimbra@uab.pt

Opening hours:
Monday to Friday, 9h to 12h30 and 14h to 18h
Porto delegation
Rua de Amial, nº 752
4200-055 Porto, Portugal

Phone: (+351) 300 001 700
Electronic mail: cdocporto@uab.pt

Opening hours:
Monday to Friday, 9h to 17h30