Student Research Workshop
May 2, 2004
http://inismor.ucd.ie/~nstokes/hlt-naacl04-student/
Paper submission deadline: February 8
Student researchers are invited to submit their work to the upcoming
HLT/NAACL 2004 Student Workshop. The main mission of the workshop is
to provide feedback for students' work in progress. Original and
unpublished research is invited on all aspects of speech, information
retrieval, and computational linguistics, but we encourage research
that is in the intersection of two or three of these areas.
5th SIGDIAL Workshop on Discourse and Dialogue
Friday April 30 and Saturday May 1, 2004
This workshop will be held at MIT.
http://sigdial04.eml-research.de
Paper submission deadline: January 12
Continuing with a series of successful workshops in Hong Kong,
Aalborg, Philadelphia, and Sapporo, this workshop spans the ACL and
ISCA SIGdial interest area of discourse and dialogue. This series
provides a regular forum for the presentation of research in this area
to both the larger SIGdial community and researchers outside
this community. The workshop is organized by SIGdial, which is
sponsored jointly by ACL and ISCA.
CoNLL-2004: Eighth Conference on Computational Natural Language Learning
Thursday and Friday May 6 and 7, 2004
http://cnts.uia.ac.be/conll2004/
Paper submission deadline: February 4
CoNLL is an international conference for discussion and presentation
of research on natural language learning. We invite submission of
papers about natural language learning topics, including, but not
limited to:
- Computational models of human language acquisition
- Computational models of the evolution of language
- Machine learning methods applied to natural language processing
tasks (speech processing, phonology, morphology, syntax,
semantics, discourse processing, language engineering
applications)
- Symbolic learning methods (Rule Induction and Decision Tree
Learning, Lazy Learning, Inductive Logic Programming, Analytical
Learning, Transformation-based Error-driven Learning)
- Biologically-inspired methods (Neural Networks, Evolutionary
Computing)
- Statistical methods (Bayesian Learning, HMM, maximum entropy,
SNoW, Support Vector Machines)
- Reinforcement Learning
- Active learning, ensemble methods, meta-learning
- Computational Learning Theory analysis of language learning
- Empirical and theoretical comparisons of language learning methods
- Models of induction and analogy in Linguistics
Workshop on Pragmatics of Question Answering
Thursday and Friday May 6 and 7, 2004
http://languagecomputer.com/hltnaacl04qa/
Paper submission deadline: January 26
Open-domain Question Answering (QA) has seen substantial advances
in the past few years. Answering factual questions is performed with
better and better accuracy; multiple forms of definition questions are
processed correctly, and list questions retrieve sequences of answers
with good recall from large text collections. Evaluations in the Text
REtrieval Conference (TREC) QA track, as well as ARDA's Advanced
Question Answering for Intelligence (AQUAINT) program, enable these
advances in QA. These results and some of the research that made them
possible were discussed in various workshops on QA topics organized at
ACL (2001 and 2003), COLING (2002), LREC (2002), EACL (2003) and the
AAAI Spring Symposium series (2002 and 2003). In the past year, with the
emergence of scenario-based questions, several forms of pragmatic
processing have started to influence the architecture of QA systems. Such
processing calls for handling multiple interactions with a QA system in
the context of a given scenario, the question decomposition required by
such contexts, and the use of the context and its interaction with the
user's background. These are just a few of the new features required in QA
systems processing complex questions.
Document Understanding Conference 2004
Thursday and Friday May 6 and 7, 2004
http://duc.nist.gov/
Paper submission deadline (only from DUC participants): April 26
Text summarization has enjoyed a rebirth, as can be seen from the number
of summarization meetings held recently. Summarization is of interest
to the NLP community and the IR community, both of which have made
significant contributions to this rebirth. In 2001 SIGIR hosted a
workshop that was the first official meeting of DUC (Document
Understanding Conference), a new evaluation for text summarization.
The DUC workshop has continued to grow, with 21 sites worldwide taking
part in DUC 2003, which was held as a workshop at the HLT/NAACL
meeting in Edmonton. The Boston workshop will present the results of
the 2004 DUC evaluation, along with papers by many of the DUC
participants. Additionally there will be open discussion of a new
roadmap to guide further DUC evaluations over the next couple of
years.
Workshop on Frontiers in Corpus Annotation
Thursday May 6, 2004
http://nlp.cs.nyu.edu/meyers/Frontiers_Workshop.html
Paper submission deadline: January 27
Corpus annotation has taken a pivotal role in computational
linguistics. As corpora become available with new sorts of annotation,
new tasks are born and new approaches are spawned to solve old
problems. The first treebanks made new types of statistical parsing
possible. Newer treebanks make it possible for treebank-based parsers
and related programs to provide more detailed output: we are seeing a
resurgence of multistage parsing, this time with a statistical bent.
Similarly, the annotation of corpora with part of speech, named
entity, coreference and sense disambiguation has resulted in new tasks
and extensions of old tasks.
Corpus annotation has also served as a bridge between knowledge-based
and statistical approaches. A model of research is emerging in which
the target analysis (the corpus annotation) is knowledge-based, but
the means of deriving that analysis are statistical. Corpus annotation
is providing a means for researchers with seemingly disparate research
agendas to work together in a way that simply was not possible before.
Workshop on Computational Lexical Semantics
Thursday May 6, 2004
http://cs.baylor.edu/~girju/hlt-naacl/cfp-CLS04.html
Paper submission deadline: February 2
Lexical semantics is the study of the semantic properties of words in
context, and it is at the core of NLP and many of its applications. Recently,
there has been a renewed interest in text semantics fueled in part by
the complexity of some major research initiatives, such as Question
Answering, Text Summarization, Machine Translation, Information
Extraction, Reasoning, and others.
The aim of this workshop is to bring together researchers from
academia, government, and industry interested in text understanding,
lexical semantics, knowledge representation, question answering,
information retrieval, machine translation, and speech processing, and
to invite papers reporting on recent advances and new perspectives in
computational lexical semantics.
Second International Workshop on Scalable Natural Language Understanding (ScaNaLU 2004)
Thursday May 6, 2004
http://www.eml-development.de/SCANALU2004
Paper submission deadline: February 5
There is a growing need for systems that can understand and generate
natural language in applications that require substantial amounts of
knowledge as well as reasoning capabilities. Most currently implemented
systems for natural language understanding (NLU) are decoupled from
any reasoning processes, which makes them narrow and
brittle. Furthermore, they do not appear to be scalable in the sense
that the techniques used in such systems do not appear to generalize
to more complex applications. While significant work has been done in
developing theoretical underpinnings of systems that use knowledge and
reasoning (e.g., development of models of linguistic interpretation
using abductive reasoning, intention recognition, formal models of
dialogue, formal models of lexical and utterance meaning, and
utterance planning), it has often proved difficult to utilize such
theories in robust working systems. Another major barrier has been the
vast amount of linguistic and world knowledge needed. There is now
significant progress in compiling the required knowledge, using
manual, statistical and hybrid techniques. But even as these resources
become available, we still lack some key conceptual and computational
frameworks that will form the foundation for effective scalable
natural language systems.
Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval
Thursday May 6, 2004
http://www.glue.umd.edu/~bhuvana/hlt2004/
Paper submission deadline: February 9
For nearly a decade, speech indexing and retrieval have been a focus
of research in two largely independent communities: one at the
intersection of speech recognition and information retrieval, the other
at the intersection of information science and digital libraries.
Much has been learned, but there has been remarkably little crossover
between the two communities. As a result, we know a lot about the
technical requirements for audio indexing of broadcast materials, but
the state of the art for many other types of spoken word collections
still depends on manual indexing or on automated harvesting of
associated text and metadata.
Higher-Level Linguistic and Other Knowledge for Automatic Speech Processing
Thursday May 6, 2004
http://www.speech.sri.com/hlt-workshop/
Paper submission deadline: January 21
The theme of this workshop is the use of higher-level linguistic and
other types of knowledge for automatic speech processing, especially,
but not limited to, speech recognition (ASR). Most current
state-of-the-art speech recognizers do not explicitly use linguistic
information (with the exception of pronunciation dictionaries),
relying mainly on information encoded in statistical N-gram language
models. Higher-level linguistic processes such as prosody, syntax,
semantics, and pragmatics are obviously important, but such
information is typically harder to label, model, and integrate into
the standard computational frameworks (such as hidden Markov
models). In addition, high-level meta-information, such as personal
information stored in a database or dialogue and pragmatic coherence
constraints, can also play important roles. All these sources of
information can potentially compensate for acoustic confusability
resulting from noisy environments and unexpected channel and speaker
mismatch, which are very challenging issues for automatic speech
recognizers. Furthermore, high-level information is typically crucial
when the ultimate goal is to interpret the spoken input (i.e., the
same sequence of words can mean different things depending on prosodic
and syntactic features, as well as pragmatic constraints). Speaker
recognition is another field that has recently recognized the
importance of higher-level linguistic features, due to the fact that
speakers exhibit idiosyncratic prosodic, lexico-syntactic, and
pragmatic patterns ("conversational biometrics").
Spoken Language Understanding for Conversational Systems
Friday May 7, 2004
http://www.research.att.com/~dtur/NAACL04-Workshop/
Paper submission deadline: January 26
The success of a conversational system depends on a synergistic
integration of technologies such as speech recognition, spoken
language understanding (SLU), dialog modeling, natural language
generation, speech synthesis and user interface design. In this
workshop, we will address the SLU component of a conversational system
and its relation to the speech recognizer and the dialog model. In
particular, we aim to bring together techniques that address the issue
of robustness of SLU to speech recognition errors, language
variability and dysfluencies in speech with issues of representation
that provide greater flexibility to the dialog model.
Linking Biological Literature, Ontologies and Databases: Tools for Users
Thursday May 6, 2004
http://www.biolink2004.org
Paper submission deadline: January 16
This workshop will bring together researchers from the
fields of bioinformatics, natural language processing,
ontologies, data mining, and information retrieval. Our
focus will be on tools that can provide improved access and
cross-indexing for the biomedical literature, databases and
ontologies. We strongly encourage presentation of
approaches that support end users and user-defined tasks.
Biological databases have become increasingly important
resources in this field. These databases contain a mix of
data types, including sequence data (DNA and protein
sequences), structured data such as molecular weights or GC
content, and annotations in terms of controlled
vocabularies and, increasingly, ontologies such as the Gene
Ontology (http://www.geneontology.org/), as well as free text
data in comment fields. Many biological databases are
manually curated, that is, constructed by PhD biologists
who read the literature and encode the information
contained in the literature in the appropriate fields of
the database that they are building.