Studying the appropriateness of different formulations of a discourse relation in context

Lead Research Organisation: University of Aberdeen
Department Name: Computing Science

Publications

Advaith Siddharthan (author) (2010) Reformulating discourse connectives for non-expert readers in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Advaith Siddharthan (author) (2011) Text simplification using typed dependencies: a comparison of the robustness of different generation strategies in Proceedings of the 13th European Workshop on Natural Language Generation (ENLG)

Advaith Siddharthan (author) (2010) Complex lexico-syntactic reformulation of sentences using typed dependency representations in Proceedings of the 6th International Natural Language Generation Conference

Siddharthan, A. (2012) Offline sentence processing measures for testing readability with users in NAACL-HLT 2012 Workshop on Predicting and Improving Text Readability for target reader populations (PITR 2012)

 
Description Methods: We have demonstrated the potential for creating data sets that can produce useful insights in both behavioural (psycholinguistic) and computational disciplines. We have shown that it is possible to create a data set from reformulations of sentences extracted from corpora that is controlled enough to test specific hypotheses and varied enough for supervised machine learning. We believe such mixed methodologies provide a useful basis for collaboration between data-driven computational sciences and behavioural sciences (linguistics and psychology of language).
Tools: We have developed software for text reformulation based on applying transformation rules to typed dependency representations. The software and a small set of transformation rules are available on request; a full release is planned following further development work.
Networks: This project has led to new collaborations. Dr Katsos is working with Dr Naveed Ahmed, from the University of Islamabad, on a 9-month project on simplifying legal text for lay audiences, with emphasis on laws on women's rights in Pakistan. Dr Siddharthan has demonstrated the text simplification software developed on this project to researchers working on deaf education and on ageing. Based on their feedback, various extensions to the system are planned. Further, the University of Aberdeen is funding a PhD student (Andrew Walker, 2010-2013), supervised by Dr Siddharthan, to research lexical simplification. Dr Siddharthan is also a Co-I on a large collaborative proposal submitted to the EPSRC for summarising medical information (e.g., patient records) for lay audiences. This involves new research networks with Aberdeenshire GP practices and Aberdeen and Dundee Medical Schools.
Exploitation Route The key outcome is the development of a framework for lexico-syntactic text reformulation. This has been developed further under grant EP/J018805/1 and the software for text simplification (RegenT) is in the process of being commercialised.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description The key outcome is the development of a framework for lexico-syntactic text reformulation. This has been developed further under grant EP/J018805/1 and the software for text simplification (RegenT) is in the process of being commercialised.
First Year Of Impact 2014
Sector Digital/Communication/Information Technologies (including Software)
 
Description EPSRC First Grant
Amount £97,364 (GBP)
Funding ID EP/J018805/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 03/2013 
End 08/2014
 
Title Acceptability and recall of different formulations of causality 
Description This is the dataset described in the following paper: 'Learning the acceptability of different formulations of causality' by A. Siddharthan & N. Katsos (submitted to journal). It is provided as is; in particular: (a) There are no warranties and no representations that the data will be suitable for any particular purpose, or for use under any specific conditions. (b) In no circumstances will we accept any liability for loss of profits, goodwill or any kind of consequential losses of any nature, even if such loss was foreseeable. There are three files: 1) reformulations.xml: a set of manual reformulations of sentences expressing causality. 2) acceptability.data: data from the acceptability experiment. 3) recalled-sentences.xml: data from the sentence recall experiment. Examples of usage taken from the British National Corpus (BNC) were obtained under the terms of the BNC End User Licence. Copyright in the individual texts cited resides with the original IPR holders. For information and licensing conditions relating to the BNC, please see the web site at http://www.natcorp.ox.ac.uk.
Type Of Material Database/Collection of data 
Provided To Others? No  
Impact It led to the development of software for text simplification (the RegenT tool and framework for regenerating text)
 
Title RegenT Text Simplification 
Description The RegenT text simplification tool offers functionality for a range of lexico-syntactic text simplification operations.
Type Of Technology Software 
Year Produced 2014 
Impact none to date 
URL http://homepages.abdn.ac.uk/cgi-bin/cgiwrap/csc323/RegenT/demo.cgi