NEWS 2009

Call for Papers

Workshop Focus

Named Entities (NEs) play a critical role in Natural Language Processing (NLP) and Information Retrieval (IR) tasks such as search, machine translation, document clustering, summarization and information extraction. While identifying and analyzing NEs in a given natural language is a challenging research problem in itself, the phenomenal growth in the Internet user population, especially in the non-English-speaking parts of the world, has extended this problem to the cross-language arena, making the handling of NEs in multiple languages critically important.

The purpose of this workshop is to bring together researchers interested in various aspects of NEs in natural language text. In addition, the NEWS workshop will feature a shared task on Machine Transliteration of NEs.

Important Dates

Research Paper Submissions
Research Paper Submission Deadline	1-May-2009
Shared Task
Registration Opens	16-Feb-2009
Registration Closes	9-Apr-2009
Release Training/Development Data	16-Feb-2009
Release Test Data	10-Apr-2009
Results Submission Due	14-Apr-2009
Results Announcement	29-Apr-2009
Task (short) Papers Due	3-May-2009
For All Submissions
Acceptance Notification	1-Jun-2009
Camera-Ready Copy Deadline	7-Jun-2009
Workshop Date	7-Aug-2009

Topics of Interest

This workshop invites original research contributions on all aspects of Named Entities (NEs), including identification, analysis, extraction, mining, transformation and applications to NLP and IR systems. The topics of interest include, but are not limited to, the following:

NE Analysis

Distributional characteristics of NEs in mono- & multi-lingual corpora
Orthographic/phonetic characteristics of NEs
NE origin/genre recognition
Social network analysis and entity resolution

NE extraction

Language-independent monolingual NE extraction
Cross-language NE extraction
- General techniques
- Specific datasets (such as Wikipedia, news, etc.)
Unsupervised and semi-supervised methods for NE extraction
Complex NEs, domain-specific term extraction
NE set expansion
Creation of annotated data

Machine Transliteration

Computational phonology, including modeling of phonological rules, structure, behavior, etc.
Transliteration modeling
- Phonetic, phonetic-semantic transliteration, grapheme → phoneme and phoneme → grapheme conversions
- Statistical and machine learning based approaches, transliteration unit alignment
- Forward and backward transliterations
- Learning transliteration from comparable corpora, transliteration lexicon construction
- Romanization of Asian languages
Transliteration evaluation metrics

Applications

Monolingual and Cross-Language IR
Machine Translation
Information Extraction and Management
Question Answering
Computational Journalism

Paper Format

Paper submissions to NEWS 2009 should follow the ACL-IJCNLP-2009 paper submission policy, including the paper format, blind review policy, and title and author format conventions. Full papers (research papers) should be in two-column format and must not exceed eight (8) pages of content, plus one extra page for references. Short papers (task papers) should also be in two-column format and must not exceed four (4) pages, including references. Submissions must conform to the official ACL-IJCNLP-2009 style guidelines. For details, please refer to

http://www.acl-ijcnlp-2009.org/main/authors/stylefiles/index.html.

Paper Submission

Submission is electronic, using the paper submission software at:

https://www.softconf.com/acl-ijcnlp09/NEWS/.

Shared Task on Transliteration

Transliteration of NEs is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. This calls for high-performance transliteration systems, which are the focus of the shared task in this workshop.

Details of the task are available here.

Organizing Committee

Haizhou Li	Institute for Infocomm Research
A Kumaran	Microsoft Research India
Sanjeev Khudanpur	Johns Hopkins University
Raghavendra Udupa	Microsoft Research India
Min Zhang	Institute for Infocomm Research
Monojit Choudhury	Microsoft Research India

Program Committee

Kalika Bali	Microsoft Research India
Rafael Banchs	UPC, Spain
Sivaji Bandyopadhyay	Univ of Jadavpur, India
Pushpak Bhattacharyya	IIT-Bombay, India
Monojit Choudhury	Microsoft Research India
Marta Ruiz Costa-jussà	UPC, Spain
Jianfeng Gao	Microsoft Research, USA
Gregory Grefenstette	Exalead, France
Sanjeev Khudanpur	Johns Hopkins University, USA
Kevin Knight	ISI, USA
Greg Kondrak	Univ of Alberta, Canada
Olivia Kwong	City U., Hong Kong
Gina-Anne Levow	Univ of Chicago, USA
Arul Menezes	Microsoft Research, USA
Jong-Hoon Oh	NICT, Japan
Yan Qu	Advertising.com, USA
Dan Roth	Univ of Illinois, Urbana-Champaign, USA
Sunita Sarawagi	IIT-Bombay, India
Sudeshna Sarkar	IIT-Kharagpur, India
Richard Sproat	Univ of Illinois, Urbana-Champaign, USA
Keh-Yih Su	Behavior Design Corporation, Taiwan
Raghavendra Udupa	Microsoft Research, India
Vasudeva Varma	IIIT-Hyderabad, India
Min Zhang	Institute for Infocomm Research, Singapore

Contact Information

For any information about the workshop or the shared task, please contact:

Dr. A. Kumaran
Microsoft Research India
Scientia, 196/36, Sadashivnagar 2nd Main Road, Bangalore
INDIA 560080
a.kumaran@microsoft.com