The Second International Joint Conference on Natural Language Processing (IJCNLP-05)

  Conference Program

Progress in NLIP? What does the summarising task tell us? (Prof. Karen Sparck Jones)

Abstract:
Natural language information processing (NLIP) has made significant progress, in important ways, in the last twenty years. We have developed fairly comprehensive and robust tools like grammars and parsers, and have gained experience with applications including multilingual ones. We have been able not only to take advantage of the general advance in computing and communications technology but, more significantly, to exploit by-now vast text corpora to adapt our tools to actual patterns of language use. We have learnt, in particular, that many NLIP tasks can be sufficiently well done to be useful in many practical contexts by exploiting shallow text processing, i.e. by relying on surface indications of discourse meaning and communicative intent. We have also been learning how to do NLIP system evaluation.
Summarising illustrates very well what we have learnt, where we are, and where we need to go. The first experiments in automatic summarising used a simple statistical sentence extraction technology that seemed too simple to produce useful summaries. Subsequent research focused on deeper text analysis that could sometimes work better but could not readily be scaled up to large heterogeneous data sources or to some user needs. More recent work on summarising has largely returned to the simpler, extractive approach, though it has also sought to refine or enrich this by, for example, incorporating parsing or by exploiting machine learning.
Summarising has also been better contextualised, partly by being seen as encompassing a spectrum of types ranging from basic index descriptions for individual documents to multi-source syntheses of specific types of information, for example biographies. At the same time, summarising is increasingly, and rightly, seen as a task that is only one activity within a set that may all be useful for some larger purpose so that, for example, summarising may be related to search queries or to the need to encapsulate extended information-seeking interactions.
But this richer view of summarising presents significant challenges for system evaluation. NLIP research has been transformed since 1990 by the major task evaluation programmes that have been running, notably for information extraction and document retrieval and, later, question answering. These programmes have served to establish whether plausible ideas actually work and to disseminate effective techniques. Summarising itself has been the focus of its own evaluation programmes for five years. This evaluation work, and the summarising evaluation work in particular, has been important in promoting a better understanding both of NLIP tasks and of the impact of their application conditions. The summarising evaluations have, in particular, served to demonstrate both how crucial application contexts are for how tasks are handled, and how extremely challenging evaluation in itself is.

Karen Sparck Jones is emeritus Professor of Computers and Information at the Computer Laboratory, University of Cambridge. She has worked in automatic language and information processing research since the late fifties, and has many publications including nine books. She is a Fellow of the British Academy and of the American Association for Artificial Intelligence. She has received three awards for information retrieval research as well as, in 2004, the Association for Computational Linguistics' Lifetime Achievement Award. Her more recent research has been on information retrieval models and practice, on automatic summarising, and on system evaluation, where she is involved in international programmes.

Computer Laboratory
University of Cambridge
URL: www.cl.cam.ac.uk/~ksj

   Invited Speech

- Software and NLP R&D Strategy in Korea (Prof. Se Young Park)
- Progress in NLIP? What does the summarising task tell us? (Prof. Karen Sparck Jones)
- Situated Natural Language Understanding (Prof. Hozumi Tanaka)