Knowledge Discovery from Scientific
Databases, Internet and Medical Records
Summary
There is only a limited amount of reliable knowledge
on the course of schizophrenic psychoses, although more than 100
years have passed since "Dementia praecox"
was first described.
Text Mining makes it possible to find relationships in unstructured
or loosely structured text files. About 80 to 90 percent of the information
in scientific papers and other sources on the Internet is textual.
The most valuable information is often hidden not in organized searchable
tables, but in freeform text.
By using Text Mining technology, it is likely that hidden links in
the scientific literature will be uncovered, suggesting new hypotheses
for further brain research.
Project aims
The main objective of this project is to create a Data Warehouse
of information and to use new and efficient techniques for
computer-based structuring of large amounts of textual data on schizophrenia
and related psychotic disorders. By doing that, it is likely that
new insights in brain research will be gained and that hidden
links in the scientific literature will be uncovered.
Furthermore, the categorization of data will be used to identify relevant
documents and forward them to the different project leaders of the HUBIN project.
The information extracted will also be used in a World Wide Web project
for scientists, journalists and the general public.
The need for software tools to deal with online documents on the
Internet is already large and is growing. Data sensing, acquisition
and storage technologies have led to vast observational data sets
being routinely reported in almost every area of biology and medicine.
Unstructured data, such as text, will still become the predominant
data type stored online. The problem with text is that it is not
structured like the tabular information typically stored in databases.
Advanced text analysis is needed to discover crucial information in
scientific documents. To mine text for information, extraction
tools are needed, for example tools that can:
Extract key information from text
Organize documents by subject
Find predominant themes in documents
Look for missing links between documents
Build relation maps between keywords
This information can be used as metadata
about the documents and, in turn, for Data Mining, allowing
the computer to generate new hypotheses.
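As a rough illustration of two of the tool functions listed above (extracting key information and building relation maps between keywords), the following minimal Python sketch ranks the terms of each document by a simple TF-IDF score and counts how often pairs of terms occur in the same document. It is not the project's actual software. The function names, the tiny stop-word list and the choice of TF-IDF are assumptions made only for this example.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical, very small stop-word list used only for this sketch.
STOPWORDS = {"the", "of", "in", "and", "a", "is", "to", "with", "are", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def key_terms(docs, top_n=3):
    """Return, for each document, its top_n terms ranked by TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        }
        # Sort by descending score; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result.append(ranked[:top_n])
    return result


def relation_map(docs):
    """Count, for each pair of terms, the number of documents containing both."""
    pairs = Counter()
    for d in docs:
        terms = sorted(set(tokenize(d)))
        pairs.update(combinations(terms, 2))
    return pairs
```

Calling key_terms on a list of abstracts yields candidate keywords that could be stored as metadata for each document, and relation_map yields co-occurrence counts that could serve as a first, crude relation map between keywords. Pairs that appear together only rarely are possible starting points when looking for missing links.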
Project leader: Jan-Eric Litton
