An EPSRC-funded collaborative project (January 2007 - April 2010) between Anthony Hunter (UCL) and Weiru Liu (QUB).
There is a huge and rapidly expanding amount of information available to scientists in various online resources. However, this wealth of information creates challenges for scientists who wish to locate and analyse knowledge from heterogeneous sources. Two key problems are that individual sources of scientific knowledge carry much uncertainty, and that many conflicts arise between different sources. Scientists therefore need tools that are tolerant of uncertainty and inconsistency in order to query and merge scientific knowledge.
This project has aimed to facilitate the analysis of scientific knowledge by developing technology for structured scientific knowledge (SSK). SSK is represented by a set of SSK reports, each of which is a structured report that describes one or more scientific data sources (such as journal articles, empirical datasets, etc.). The format is an XML document with entries restricted to individual words, values, simple phrases in scientific terminology, or formulae of logic or statistics. Each SSK report can be constructed by hand, by information extraction technology, or as a result of analysing data sources.
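To make the report format more concrete, the following is a minimal sketch of what a structured report with short, restricted entries might look like and how it could be read into a program. The tag names (`sskreport`, `source`, `intervention`, `comparator`, `outcome`, `relativerisk`), the values, and the helper `parse_report` are all assumptions made for illustration; they are not the project's actual schema or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical SSK report. Tag names and values are illustrative only
# and do not reproduce the project's actual schema.
REPORT_XML = """
<sskreport>
  <source type="journal-article">trial-A</source>
  <intervention>drugX</intervention>
  <comparator>placebo</comparator>
  <outcome>mortality</outcome>
  <relativerisk>0.85</relativerisk>
</sskreport>
"""


def parse_report(xml_text):
    """Read one report into a flat dict mapping each tag name to its text."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


report = parse_report(REPORT_XML)
print(report["intervention"], report["relativerisk"])
```

Because every entry is a single word, value, or simple phrase, a flat mapping like this is enough for the sketch; a real report containing logical or statistical formulae would need a richer representation.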
In this project, we have extended our existing work on merging and analysing heterogeneous structured information by harnessing formal theories for representing and reasoning with uncertain and inconsistent information. We believe that a range of formalisms is needed, since no single formalism can effectively capture all aspects of scientific knowledge. We have therefore been working with a variety of numerically based theories and logical formalisms, including some extended with probability theory or possibility theory. For using the scientific knowledge, we have been developing a range of formal techniques including measures of inconsistency, fusion/merging operations based on social choice theory, and argumentation systems that provide arguments and counterarguments for claims.
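As a rough illustration of two of the ideas mentioned above, the sketch below merges several conflicting reports by strict majority voting (a very simple operation in the spirit of social choice theory) and counts the fields on which sources disagree (a crude stand-in for an inconsistency measure). The function names, field names, and data are hypothetical, and these are deliberately simplified versions, not the merging operators or inconsistency measures developed in the project.

```python
from collections import Counter


def merge_by_majority(reports):
    """For each field, keep the value backed by a strict majority of the
    sources that mention it; use None when no value has a majority."""
    fields = sorted({field for report in reports for field in report})
    merged = {}
    for field in fields:
        votes = Counter(r[field] for r in reports if field in r)
        value, count = votes.most_common(1)[0]
        total = sum(votes.values())
        merged[field] = value if count * 2 > total else None
    return merged


def conflict_count(reports):
    """Count the fields on which at least two sources report different values."""
    fields = {field for report in reports for field in report}
    return sum(
        1 for field in fields
        if len({r[field] for r in reports if field in r}) > 1
    )


# Hypothetical reports from three sources about the same outcome.
reports = [
    {"outcome": "mortality", "effect": "reduced"},
    {"outcome": "mortality", "effect": "reduced"},
    {"outcome": "mortality", "effect": "no-difference"},
]

merged = merge_by_majority(reports)
conflicts = conflict_count(reports)
print(merged, conflicts)
```

Majority voting treats every source as equally reliable; approaches that weight sources or attach probabilities or possibilities to values would need a different aggregation step.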
The results of the project include substantial developments of our general theoretical systems for handling uncertainty and inconsistency, and demonstrations of our approach in specific applications, including handling biomedical and biochemistry knowledge, undertaken in collaboration with domain experts in meta-analysis and biosciences. The project had two particular application focuses: handling results from clinical trials, and rapid screening for substrate prediction in bioscience. Often, when considering results from a number of trials, there is uncertain and conflicting information. To address these issues, we developed techniques for performing meta-analysis with missing data, for querying conflicting trial results using ontological information to describe the patient and intervention classes, and for constructing arguments to determine the relative superiority of particular interventions based on the available evidence. In parallel, a rapid screening method was developed to identify useful substrates based on previous experimental data. The results of these studies have been published in computer science and biomedical informatics forums. We have also written a state-of-the-art review of technology for representing and reasoning with scientific knowledge, which was published in Knowledge Engineering Review in 2010.
Personnel Involved (QUB)
Key publications from the project (total: 28 refereed journal and peer-reviewed conference publications and two software demos). Further details and more publications can be found at ssk@ucl