Non-uniform Language Detection in Technical Writing

Weibo Wang¹, Abidalrahman Moh'd¹, Aminul Islam², Axel Soto³, Evangelos Milios¹
¹Dalhousie University, ²University of Louisiana at Lafayette, ³University of Manchester


Abstract

Technical writing in professional environments, such as user manual authoring, requires uniform language. Non-uniform language detection is a novel task that aims to ensure consistency in technical writing by detecting sentences in a document that are intended to have the same meaning in a similar context but use different words or a different writing style. This paper proposes an approach that applies text similarity algorithms at the lexical, syntactic, semantic, and pragmatic levels. The features extracted at each level are integrated by a machine learning classification method. We tested our method on smartphone user manuals and compared its performance against state-of-the-art methods from a related area. The experiments demonstrate that our approach achieves the upper-bound performance for this task.
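To make the general idea concrete, the following is a minimal sketch of scoring a sentence pair with a few similarity features and combining them with a learned classifier. It is not the method of this paper: the abstract does not specify the features or the classifier, so the three features (word overlap, word-order similarity, length ratio), the logistic regression model, the function names, and the four toy sentence pairs with their labels are all assumptions invented for illustration.

```python
import math
from difflib import SequenceMatcher

PUNCT = ".,;:!?\"'()"

def tokens(sentence):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]

def pair_features(a, b):
    """Hypothetical stand-in features for one sentence pair (not the paper's)."""
    ta, tb = tokens(a), tokens(b)
    sa, sb = set(ta), set(tb)
    # Lexical overlap: Jaccard similarity of the two word sets.
    jaccard = len(sa & sb) / len(sa | sb) if sa | sb else 0.0
    # Rough proxy for structural similarity: matching word sequences.
    order = SequenceMatcher(None, ta, tb).ratio()
    # Length ratio of the shorter sentence to the longer one.
    length = min(len(ta), len(tb)) / max(len(ta), len(tb)) if ta and tb else 0.0
    return [jaccard, order, length]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(model, x):
    """Probability that a pair is a non-uniform language candidate."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, epochs=500, lr=0.5):
    """Logistic regression fitted by plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba((w, b), xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Toy data invented for this sketch: label 1 = same intended meaning with
# different wording, label 0 = unrelated sentences.
pairs = [
    ("Press the power key to turn on the device.",
     "Press the power button to switch on the device."),
    ("Tap Settings to open the menu.",
     "Touch Settings to open the menu."),
    ("Press the power key to turn on the device.",
     "Insert the SIM card into the slot."),
    ("Charge the battery before first use.",
     "Tap Settings to open the menu."),
]
labels = [1, 1, 0, 0]

X = [pair_features(a, b) for a, b in pairs]
model = train(X, labels)
```

Calling `predict_proba(model, pair_features(s1, s2))` on a new sentence pair then yields a score that could be thresholded to flag the pair for review. A realistic system would need far more labeled data and, as the abstract indicates, features that go beyond surface word overlap.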