LT4CloseLang: Language Technology for Closely Related Languages and Language Variants
Important Information
- Workshop program is here
- Proceedings are here
- Acceptance rate: 55%, accepted 11 out of 20 submissions
- Presentations: oral, 17 minutes for presentation + 3 minutes for questions
- Registration: at least one author has to register and pay the registration fee.
- Visa: see here
Motivation
Recent initiatives in language technology have led to the development of at least minimal language processing toolkits for all EU-official languages as well as for languages with a large number of speakers worldwide such as Chinese and Arabic. This is a big step towards the automatic processing and/or extraction of information, especially from official documents and newspapers, where the standard, literary language is used.
Apart from those official languages, a large number of dialects or closely-related language variants are in daily use, not only as spoken colloquial languages but also in some written media, e.g., in SMS, chats, and social networks. Building language resources and tools for them from scratch is expensive, but the efforts can often be reduced by making use of pre-existing resources and tools for related, resource-richer languages.
Examples of closely-related language variants include the different variants of Spanish in Latin America, the Arabic dialects in North Africa and the Middle East, German in Germany, Austria and Switzerland, French in France and in Belgium, Dutch in the Netherlands and Flemish in Belgium, etc. Examples of pairs of related languages include Swedish-Norwegian, Bulgarian-Macedonian, Serbian-Bosnian, Spanish-Catalan, Russian-Ukrainian, Irish-Scottish Gaelic, Malay-Indonesian, Turkish-Azerbaijani, Mandarin-Cantonese, Hindi-Urdu, and many others.
The workshop aims to bring together researchers interested in building language technology applications that make use of language closeness to exploit existing resources in a related language or a language variant. A previous version of this workshop, organised at RANLP 2013, attracted a lot of research interest, showing the need for further activities.
Topics
Topics of interest include but are not limited to the following:
- Case studies of using language resources and tools for related languages and language variants
- Adaptation of monolingual tools and resources for closely-related languages and language variants
- Evaluation of language resources and tools when applied to closely-related languages and language variants
- Linguistic issues when adapting language resources and tools, e.g., semantic discrepancies, lexical gaps, false friends, etc.
- Machine translation between closely-related languages
Important Dates
- Submission deadline: July 26, 2014, 11:59 p.m. PST (extended to August 2, 2014)
- Acceptance/rejection notification: August 26, 2014
- Camera-ready deadline: September 12, 2014, 11:59 p.m. PST
- Workshop: October 29, 2014
Committee
- Laura Alonso y Alemany (University of Cordoba, Argentina)
- César Antonio Aguilar (Pontificia Universidad Católica de Chile, Santiago de Chile, Chile)
- José Castaño (University of Buenos Aires, Argentina)
- David Chiang (University of Southern California, USA)
- Marta Costa-Jussà (Institute for Infocomm Research, Singapore)
- Walter Daelemans (University of Antwerp, Belgium)
- Kareem Darwish (Qatar Computing Research Institute, Qatar)
- Tomaz Erjavec (Jozef Stefan Institute, Slovenia)
- Maria Gavrilidou (ILSP, Greece)
- Francisco Guzman (Qatar Computing Research Institute, Qatar)
- Barry Haddow (University of Edinburgh, UK)
- Nizar Habash (Columbia University, USA)
- Walther v. Hahn (University of Hamburg, Germany)
- Cvetana Krstev (University of Belgrade, Serbia)
- Vladislav Kubon (Charles University Prague, Czech Republic)
- Thang Luong Minh (Stanford University, USA)
- John Nerbonne (University of Groningen, Netherlands)
- Graham Neubig (Nara Institute of Science and Technology, Japan)
- Kemal Oflazer (Carnegie Mellon University, Qatar)
- Maciej Ogrodniczuk (IPAN, Polish Academy of Sciences, Poland)
- Slav Petrov (Google, New York, USA)
- Stefan Riezler (University of Heidelberg, Germany)
- Laurent Romary (INRIA, France)
- Hassan Sajjad (Qatar Computing Research Institute, Qatar)
- Kiril Simov (Bulgarian Academy of Sciences)
- Milena Slavcheva (Bulgarian Academy of Sciences)
- Marco Tadic (University of Zagreb, Croatia)
- Jörg Tiedemann (Uppsala University, Sweden)
- Dusko Vitas (University of Belgrade, Serbia)
- Stephan Vogel (Qatar Computing Research Institute, Qatar)
- Pidong Wang (National University of Singapore, Singapore)
- Taro Watanabe (NICT, Japan)
Contact
- Preslav Nakov, Qatar Computing Research Institute
- Petya Osenova, Bulgarian Academy of Sciences
- Cristina Vertan, University of Hamburg