Human Cytome Project - A framework for cytome exploration - Update 13 Sept. 2005
As the online version of my article on the Human Cytome Project and the
application of cytomics in medicine and drug discovery (pharmaceutical
research) evolves, I post the updated version in this newsgroup for
reference. The original "question" about a Human Cytome Project was posted
in this newsgroup on Monday 1 December 2003.
A Human Cytome Project - an idea
A framework for cytome exploration
By Peter Van Osta
The aim is to create an analog-to-digital workflow concept that can be
applied to ultra-large-scale research on human cellular diversity, in order
to improve our understanding of cellular disease processes and to develop
better drugs (less attrition due to better functional predictions).
The framework should make it possible to manage highly diverse quantitative
processing of cellular structure and function. It should create in-silico
multi-scale and multidimensional representations of cellular structure and
function that are accessible to quantitative content and feature extraction.
The frontend technology mainly refers to optical systems, but CT, NMR, etc.
can also be used for molecular medical research.
This document only provides basic ideas and thoughts on a framework to
perform large-scale cytome research. It does not yet cover the concept of
operations, user requirements, functional requirements, etc. At the moment
it does not provide a complete roadmap towards an entire system to achieve
the goals outlined in the Human Cytome Project (HCP) idea. The potential
impact of the human cytome on drug discovery and development is discussed in
Human Cytome Project and Drug Discovery.
For those readers who are interested in a methodology to implement the
system under consideration (the software), I can recommend the Guide to the
Software Engineering Body of Knowledge. For project management principles I
can recommend the PMI Project Management Body of Knowledge. The choice of
development process model (Agile, Extreme Programming, RUP, V-Model, etc.)
for the system under consideration is beyond the scope of this document and
is left to the reader (see SEI CMMI). For more information on software
engineering you can read my webpage on Software for Science.
Let us now start with the thoughts and ideas for the framework. An entire
organism is an anisotropic, densely packed, 4D grid (or matrix) with a high
order of "recursive" information levels. We can study its structure and
function at multiple levels, where the structure and function at each level
are intertwined with those of the levels above and below it. The genotype
and the phenotype both exist in a continuum of (bidirectionally) interacting
organizational levels.
Here I want to present and discuss some ideas on the exploration of the
cytome and the conversion of the spatial, spectral and temporal properties
of the cytome and its cells into their in-silico digital representation. It
is a set of ideas about a concept which is still changing and growing, so do
not expect anything final or polished yet. For readers with a good
understanding of biotechnology and software engineering, the concepts in
this article should be clear and easy to understand.
A modular and distributed framework should provide a unified approach to the
management of the quantitative analysis of space (X, Y and Z), spectrum
(wavelength) and time (t) related phenomena. We want to go from physics to
quantitative features and finally come to a classification and understanding
of the underlying biological process. We want to extract attributes from the
physical process that tell us about the status and development of the
process and its underlying structures.
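A unified treatment of space, spectrum and time could start from a small
description of how each axis of an acquired data set was sampled. The sketch
below is only an illustration in Python: the names Axis and SamplingGrid, the
units and the example values are assumptions made for this example and are
not part of any existing system.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Axis:
    """One sampled dimension (hypothetical helper, for illustration only)."""
    name: str      # e.g. "x", "wavelength", "t"
    unit: str      # physical unit of the axis
    origin: float  # physical coordinate of sample index 0
    step: float    # physical distance between neighbouring samples
    size: int      # number of samples along this axis

    def coordinate(self, index: int) -> float:
        """Convert a sample index into a physical coordinate."""
        if not 0 <= index < self.size:
            raise IndexError(f"index {index} outside axis '{self.name}'")
        return self.origin + index * self.step


@dataclass(frozen=True)
class SamplingGrid:
    """Ordered set of axes describing one digitized data set."""
    axes: Tuple[Axis, ...]

    def shape(self) -> Tuple[int, ...]:
        return tuple(axis.size for axis in self.axes)

    def locate(self, index: Tuple[int, ...]) -> Dict[str, float]:
        """Map a multidimensional sample index to physical coordinates."""
        if len(index) != len(self.axes):
            raise ValueError("index must have one entry per axis")
        return {a.name: a.coordinate(i) for a, i in zip(self.axes, index)}


# Example values are invented: a time-lapse, multi-wavelength image stack.
grid = SamplingGrid(axes=(
    Axis("x", "um", 0.0, 0.5, 512),
    Axis("y", "um", 0.0, 0.5, 512),
    Axis("z", "um", 0.0, 2.0, 20),
    Axis("wavelength", "nm", 400.0, 20.0, 8),
    Axis("t", "s", 0.0, 30.0, 100),
))
position = grid.locate((4, 0, 1, 3, 2))
```

Keeping the physical step and unit next to the sample counts means that any
feature computed later on pixel indices can be converted back into physical
quantities without consulting the instrument again.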
First, we have to create an in-silico digital representation starting from
the analogue reality captured by an instrument. The second stage (after
creation of an in-silico representation) is to extract meaningful parts
(objects) related to biologically relevant structures and processes.
Thirdly, we compute features for the extracted objects, such as area and
(spectral) intensity, which represent relevant attributes of the observed
structure and process. Finally, we have to separate and cluster objects
based on their
feature properties into biologically relevant subgroups, such as healthy
In order to quantify the physical properties of space and time of a
biological sample we must be able to create an appropriate digital
representation of these physical properties in-silico. This digital
representation is then accessible to algorithms for content extraction. The
content or objects of interest are then presented to a quantification
engine, which associates physically meaningful properties or features with
the extracted objects. These object features span a multidimensional feature
space that can be passed to feature analyzers to find object/feature
clusters, trends, associations and correlations.
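The stages described above (digital representation, object extraction,
feature quantification, grouping in feature space) can be sketched on a toy
example. This is a minimal illustration in plain Python, not a proposed
implementation: the tiny image, the threshold, the function names and the
very simple grouping rule (splitting the objects at the largest gap in mean
intensity) are all assumptions chosen to keep the example short. Real
content extraction and feature analysis would use far more robust methods.

```python
from collections import deque


def segment(image, threshold):
    """Extract objects: 4-connected regions of pixels above `threshold`."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            label = len(objects) + 1
            labels[r][c] = label
            pixels, queue = [], deque([(r, c)])
            while queue:  # breadth-first flood fill of one object
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects


def measure(image, pixels):
    """Quantify one object: area in pixels and mean intensity."""
    area = len(pixels)
    total = sum(image[y][x] for y, x in pixels)
    return {"area": area, "mean_intensity": total / area}


def split_at_largest_gap(values):
    """Assign each value to group 0 or 1, cutting at the largest gap."""
    if len(values) < 2:
        return [0] * len(values)
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    i = gaps.index(max(gaps))
    cut = (ordered[i] + ordered[i + 1]) / 2
    return [0 if v < cut else 1 for v in values]


# Stage 1: an invented digital representation (one 2D intensity image).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 3, 0],
    [0, 9, 9, 0, 3, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 4, 0, 0, 8, 8],
]
objects = segment(image, threshold=1)                 # stage 2: extraction
features = [measure(image, obj) for obj in objects]   # stage 3: features
groups = split_at_largest_gap(                        # stage 4: grouping
    [f["mean_intensity"] for f in features])
```

Each stage only consumes the output of the previous one, so any single stage
can be replaced (another instrument, another segmentation method, another
feature analyzer) without changing the rest of the flow.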
Managing the flow