Studying the appropriateness of different formulations of a discourse relation in context
Lead Research Organisation:
University of Aberdeen
Department Name: Computing Science
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
Publications
Advaith Siddharthan (author)
(2010)
Reformulating discourse connectives for non-expert readers
in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Advaith Siddharthan (author)
(2011)
Text simplification using typed dependencies: a comparison of the robustness of different generation strategies
in Proceedings of the 13th European Workshop on Natural Language Generation (ENLG)
Advaith Siddharthan (author)
(2010)
Complex lexico-syntactic reformulation of sentences using typed dependency representations
in Proceedings of the 6th International Natural Language Generation Conference
Siddharthan, A.
(2012)
Offline sentence processing measures for testing readability with users
in NAACL-HLT 2012 Workshop on Predicting and Improving Text Readability for target reader populations (PITR 2012)
Description | Methods: We have demonstrated the potential for creating data sets that can produce useful insights in both behavioural (psycholinguistic) and computational disciplines. We have shown that it is possible to create a data set using reformulations of sentences extracted from corpora that is controlled enough to test specific hypotheses and varied enough for supervised machine learning. We believe such mixed methodologies provide a useful basis for collaboration between data-driven computational sciences and behavioural sciences (linguistics and psychology of language). Tools: We have developed software for text reformulation based on applying transformation rules to typed dependency representations. The software and a small set of transformation rules are available on request; a full release is planned following further development work. Networks: This project has led to new collaborations. Dr Katsos is working with Dr Naveed Ahmed, from the University of Islamabad, on a 9-month project on simplifying legal text for lay audiences, with emphasis on laws on women's rights in Pakistan. Dr Siddharthan has demonstrated the text simplification software developed on this project to researchers working on deaf education and on ageing. Based on their feedback, various extensions to the system are planned. Further, the University of Aberdeen is funding a PhD student (Andrew Walker, 2010-2013), supervised by Dr Siddharthan, to research lexical simplification. Dr Siddharthan is also a Co-I on a large collaborative proposal submitted to the EPSRC for summarising medical information (e.g., patient records) for lay audiences. This involves new research networks with Aberdeenshire GP practices and Aberdeen and Dundee Medical Schools. |
Exploitation Route | The key outcome is the development of a framework for lexico-syntactic text reformulation. This has been developed further under grant EP/J018805/1 and the software for text simplification (RegenT) is in the process of being commercialised. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | The key outcome is the development of a framework for lexico-syntactic text reformulation. This has been developed further under grant EP/J018805/1 and the software for text simplification (RegenT) is in the process of being commercialised. |
First Year Of Impact | 2014 |
Sector | Digital/Communication/Information Technologies (including Software) |
Description | EPSRC First Grant |
Amount | £97,364 (GBP) |
Funding ID | EP/J018805/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2013 |
End | 08/2014 |
Title | Acceptability and recall of different formulations of causality |
Description | This is the dataset described in the following paper: 'Learning the acceptability of different formulations of causality' by A. Siddharthan & N. Katsos (submitted to journal). It is provided as is; in particular: (a) There are no warranties and no representations that the data will be suitable for any particular purpose, or for use under any specific conditions. (b) In no circumstances will we accept any liability for loss of profits, goodwill or any kind of consequential losses of any nature, even if such loss was foreseeable. There are three files: 1) reformulations.xml: set of manual reformulations of sentences expressing causality. 2) acceptability.data: data from acceptability experiment. 3) recalled-sentences.xml: data from sentence recall experiment. Examples of usage taken from the British National Corpus (BNC) were obtained under the terms of the BNC End User Licence. Copyright in the individual texts cited resides with the original IPR holders. For information and licensing conditions relating to the BNC, please see the web site at http://www.natcorp.ox.ac.uk. |
Type Of Material | Database/Collection of data |
Provided To Others? | No |
Impact | It led to the development of software for text simplification (the RegenT tool and framework for regenerating text) |
Title | RegenT Text Simplification |
Description | The RegenT text simplification tool offers functionality for a range of lexico-syntactic text simplification operations. |
Type Of Technology | Software |
Year Produced | 2014 |
Impact | none to date |
URL | http://homepages.abdn.ac.uk/cgi-bin/cgiwrap/csc323/RegenT/demo.cgi |