A Human Cytome Project - an idea - Update 14 March 2005

03-14-2005, 02:27 PM
Peter Van Osta


As the on-line version of my article on the Human Cytome Project and the
application of cytomics in medicine and drug discovery (pharmaceutical
research) evolves, I put the updated version in this newsgroup for
reference. The original "question" on a Human Cytome Project was posted in
this newsgroup on Monday 1 December 2003.

On-line version (split version):
A Human Cytome Project - an idea

Human Cytome Project and Drug Discovery

Human Cytome Project - How to Explore

A framework for cytome exploration

A Human Cytome Project - an idea

By Peter Van Osta

The completion of the Human Genome Project holds many promises for the
understanding of the genetics of man and the involvement of genes in human
diseases. However, if we want to use this knowledge to improve medicine
more efficiently, this information has to be viewed from a perspective
other than the one currently taken. Predicting the dynamics of the cell
and its fate in diseases from the genome upwards is likely to fail due to
the complexity of metabolic processing and environmental influences on the
cellular metabolism and the phenotype of the entire organism.

The clinical reality of disease processes extends beyond the present-day
disease models and the (current) boundaries of scientific development.
When we close the doors of our labs behind us and as physicians are
confronted with the clinical reality of diseases in the outside world, our
disease models fail all too often, as we can witness in the diagnosis and
treatment of complex diseases. This is also painfully obvious in the
dramatically high attrition rates during clinical development of new

When the endpoint of research is not only an experiment in a laboratory
but an impact on the clinical reality of everyday pathological processes,
we fail to deliver for more than 80 to 90 percent of all drugs being
developed. Reality extends beyond the frontiers of science. Outside the
boundaries of scientific knowledge, significant parts of
(biological/clinical) reality remain unexplained and not well

Drug discovery and development has to come up with drugs which can stand
the test of clinical reality, but is being squeezed between the failing
(theoretical) disease models and the demands for success of pharmaceutical
companies and society. Applied research has to provide the stepping stones to
cross the river from basic theoretical disease models to clinical reality,
ideally without getting our feet wet or drowning before we reach the other
side of the river.

How do we close the gap from model to clinic and find new directions for
research? The functional correlation between genome structure and
clinically expressed disease is too low to lead to functional predictions
from the genome and even proteome level upwards, without taking into
account the spatial and temporal dynamics of cells, organs and organisms.
Pathological processes have to be viewed from another organizational level
of biology in order to capture the dynamics of in-vivo processes involved
in diseases.

The current bottom-up view on genomic and proteomic research suffers from
a correlation and prediction deficit in relation to the entire organism.
The genome and proteome are the omega of biological research, not the
alpha of drug discovery or disease treatment. From disease to gene we may
find a link, but turning around and going back to develop a treatment for
the clinical disease fails in many cases. We may find that one or more
genes are part of a disease process, but we cannot explain the entire
disease process from the genome level alone. A gene may be involved in a
disease, but the entire disease process is not contained within the gene.
Discovering the involvement of a gene or protein in a disease does not
predict the potential for successful development of a treatment for the
clinical disease entity as such.

The extraction of the appropriate attributes of a biological process in
health and/or disease requires capturing the spatial and temporal dynamics
of its manifestations at multiple scales and dimensions of biological
organization. Disease entities express themselves in a space-time
continuum in which their physical and chemical attributes evolve in a
highly dynamic way. Capturing the appropriate features and
disease-describing parameters from the background noise of their
surrounding processes and structures is more difficult than finding a
needle in a haystack.

On Monday 1 December 2003 I posted a message about the idea of a Human
Cytome Project (HCP) to the bionet.cellbiol newsgroup (Van Osta P, 2003).
It seems that it was the right moment to ask the question, as there were
already ideas emerging on the role of the cell as the final arbiter in the
production of metabolic products and also the concept of predictive
medicine by cytomics (Valet G, 2003).

The idea of a Human Cytome Project is already being discussed at
scientific conferences (FOM 2004, ISLH 2004, ISAC XXII, EWGCCA 2004 …).
At Focus on Microscopy (FOM) in Philadelphia on Wednesday afternoon, 7
April 2004, the idea of a Human Cytome Project was for the first time
discussed at a scientific meeting. A round table discussion was held at
the European Microscopy Congress (EMC) and articles on the idea are
already starting to appear (Valet G, 2004; Valet G, 2004b; Valet G,
2004c). As the idea of a Human Cytome Project seems to have generated
some interest in the scientific community, I decided to put the original
message and question on my personal website for reference, so here it is.

Monday, 1 December 2003 10:57:46 +0100

Hi,

I was wondering if there is already something going on to set up a sort of
"Human Cytome Project"? In my opinion the hardware and most of the
software seem to be available to set up such a project? For the cellular
level, light-microscopy based reader technology would be very interesting
to use?

Studying and mapping the genome, transcriptome and proteome at the
organizational level of the cell for various cell types and organ models
could provide us with a lot of information of what actually goes on in
organisms in the spatio-spectro-temporal space?

I have been thinking (working) about a concept which could provide the
basic framework for exploring and managing this cellular level of
biological organization research on a large scale, but I would like to
know if there is already some thought/work going on in the direction of
setting up an initiative such as a "Human Cytome Project" ?

This is just an idea, so I am really interested to hear if there is
something in it, or even if it is not worth while what I just wrote.

Best regards,

Peter Van Osta.

The path which led to the idea of a Human Cytome Project

I will give a bit more background to the path which for me has led to the
idea that something of a Human Cytome Project might be feasible. The idea
for large scale screening of the dynamics of the (living) cell came when I
visited the Sanger Center in the UK in 2001 and was shown a big room
filled with DNA-sequencers. From then on I wanted to create a system which
could mean for cell-based research what DNA-sequencing had meant for Human
Genome research.

However I did not want to create a catalog of the cytome, but to allow for
the functional exploration of the cell in order to capture and describe
the dynamics of cellular processes and not only create a catalog of its
components. The multidimensional world of the cell requires a
higher-dimensional approach than the linear world of DNA and also a
different inner and outer resolution is needed for each level of
biological integration. It became clear to me that the cellular level is
the lowest level of biological organization close enough to the complex
dynamics of a disease process. Only a high correlation to the disease
process itself allows a model to be used as a valid disease model.

Today powerful techniques to explore the cytome are available, such as
flow cytometry (Edwards B.S., 2004) and advanced digital microscopy (Price
J. H., 2003; Tsien R, 2003), which enable the exploration of the cellular
function and phenotype. There are now exciting technological developments
going on in what is called High Content Screening which will allow us to
explore cellular systems on a large scale (Taylor DL, 2001; Giuliano KA,
2003). These developments and other technological advances made me feel
confident that the exploration of the human cytome would be feasible. We
should be able to open the door to the cell wide and look at cellular
structure and dynamics, rather than just looking through the keyhole as we
do now.

My personal interest and research

I myself wanted to know if a framework to explore cells on a very large
scale could be implemented and would work. Managing the flow of data from
physics to features is the centerpiece of such a system. I wanted to
transform the space-time continuum of biological processes in cells into
their digital representations on a truly massive scale. Once a process is
represented in a digital state it becomes accessible to quantitative
content extraction and analysis.

As technologies evolve, it should be easy to exchange components of a
system or expand it with new technologies. The system should therefore be
modular and scalable. The core of the system should be of a different
design than the interface to the outside world and they should evolve
separately, only linked to each other for the exchange of information. The
concept should allow for up-scaling the system for processing massive
amounts of high-dimensional data.

The core has to be able to deal with multidimensional spaces and datasets
and manage the dataflow between modules, each module dealing with a part
of the entire process, from acquisition and detection to data generation.
From center to periphery, the system becomes increasingly machine and
technology related, while the core is only a data-transfer module unaware
of technical or physical constraints. Each machine which becomes connected
to the core makes it possible to explore a subset of a physical space and
informs the core about its capacities and restrictions (0D up to 5D,
spatial, spectral and temporal).
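
The relation between the core and its attached machines could be sketched
as follows. This is only a minimal illustration under my own assumptions,
not the actual M5 framework: the class names (AxisRange, Device, Core), the
axis names and all resolution numbers are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical axis names for the 5D space described in the text:
# three spatial axes, one spectral axis and one temporal axis.
AXES = ("x", "y", "z", "spectral", "temporal")


@dataclass(frozen=True)
class AxisRange:
    """Inner (finest) and outer (largest) resolution limit along one axis."""
    inner: float  # smallest step the device can resolve
    outer: float  # largest extent the device can cover
    unit: str


@dataclass
class Device:
    """A peripheral machine that reports which part of the continuum it samples."""
    name: str
    axes: dict  # axis name -> AxisRange; axes that are absent are not sampled

    @property
    def dimensionality(self) -> int:
        return len(self.axes)


@dataclass
class Core:
    """Data-transfer hub: it records capabilities, not device physics."""
    devices: dict = field(default_factory=dict)

    def register(self, device: Device) -> None:
        unknown = set(device.axes) - set(AXES)
        if unknown:
            raise ValueError(f"unknown axes: {sorted(unknown)}")
        self.devices[device.name] = device

    def devices_covering(self, *axes: str) -> list:
        """Names of registered devices that sample all requested axes."""
        return sorted(
            name for name, dev in self.devices.items()
            if all(axis in dev.axes for axis in axes)
        )


# Illustrative numbers only; they do not describe real instruments.
core = Core()
core.register(Device("widefield", {
    "x": AxisRange(0.2, 1000.0, "micron"),
    "y": AxisRange(0.2, 1000.0, "micron"),
}))
core.register(Device("timelapse_confocal", {
    "x": AxisRange(0.2, 500.0, "micron"),
    "y": AxisRange(0.2, 500.0, "micron"),
    "z": AxisRange(0.5, 100.0, "micron"),
    "spectral": AxisRange(10.0, 300.0, "nm"),
    "temporal": AxisRange(1.0, 3600.0, "sec"),
}))
```

In this sketch the core never needs to know what kind of machine it is
talking to; it only asks which devices cover the axes an experiment needs,
for example core.devices_covering("z", "temporal").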

A device attached to the system as such should allow for the exploration
of a part of this spatio-spectro-temporal continuum. Devices differ in
their sampling of the electromagnetic spectrum (LM, EM, CT, NMR …), the
spatial scale at which they can operate (nm, microns, mm …) and their
temporal resolution (nsec, msec, sec, min …). A given device has an
inner and outer spatial, spectral and temporal resolution limit. All
(imaging) devices generate pixel or voxel density profiles which can be
used for (semi-) quantitative exploration. A given input data point
represents a spatial, spectral and temporal sampling of the
spatio-spectro-temporal continuum. The basic principles remain the same,
only our point of view and our perspective of the physical boundaries in
space and time change.

The physical dimensions of the high-dimensional space and the meaning of
each pixel/voxel are only relevant for the quantification module as the
detection module only deals with “density” patterns in a 5D space.
Anisotropy in spatial, temporal and spectral sampling is only accounted
for at the periphery of the system, as it has an impact on the
quantification of objects. Each dimension (XYZ, spectral, temporal) is
regarded as a continuum, sampled at discrete intervals, each with its own
inner and outer resolution.
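
Why anisotropy only matters at quantification can be illustrated with a
small sketch. It assumes, as a simplification of my own, that detection
delivers nothing more than a voxel count and that the quantification step
knows the sampling interval per axis; the function name and the numbers are
hypothetical.

```python
from math import prod


def physical_measure(voxel_count: int, spacing: dict) -> float:
    """Convert a voxel count into a physical measure.

    `spacing` maps each sampled axis to the physical step between two
    samples, e.g. {"x": 0.2, "y": 0.2, "z": 1.0} in microns.
    """
    return voxel_count * prod(spacing.values())


# The detection step sees the same object in both cases: 500 voxels.
object_voxels = 500

isotropic = {"x": 0.2, "y": 0.2, "z": 0.2}    # microns per sample
anisotropic = {"x": 0.2, "y": 0.2, "z": 1.0}  # coarser sampling along z

volume_iso = physical_measure(object_voxels, isotropic)
volume_aniso = physical_measure(object_voxels, anisotropic)
```

The same 500 voxels correspond to about 4 cubic microns with isotropic
sampling and about 20 cubic microns with the coarser z-step, which is why
the sampling intervals have to be known where objects are quantified, but
not where density patterns are detected.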

The system design allows for distributed operation, so a system could run
on different platforms and interact with components over a network. It
should use open standards for its communication with the outside world to
allow for easy integration in a heterogeneous environment (XML, CORBA
…). The output of the system should be a set of linked feature
hyperspaces, each describing structural and functional attributes of the
individual cell and its components. The data output must be in a format
which can easily be parsed and fed into data analysis and visualization
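
As an illustration of such an easily parsed output, the sketch below writes
a few per-cell features to XML and reads them back with the Python standard
library. The element names (cells, cell, feature) and the feature names are
assumptions made for this example only; they are not a format defined by
the system described here.

```python
import xml.etree.ElementTree as ET


def features_to_xml(cells: list) -> str:
    """Serialize a list of per-cell feature records to an XML string."""
    root = ET.Element("cells")
    for cell in cells:
        node = ET.SubElement(root, "cell", id=str(cell["id"]))
        for name, value in sorted(cell["features"].items()):
            feat = ET.SubElement(node, "feature", name=name)
            feat.text = repr(float(value))  # repr keeps the float round-trippable
    return ET.tostring(root, encoding="unicode")


def xml_to_features(document: str) -> list:
    """Parse the XML produced by features_to_xml back into records."""
    root = ET.fromstring(document)
    return [
        {
            "id": cell.get("id"),
            "features": {
                f.get("name"): float(f.text) for f in cell.findall("feature")
            },
        }
        for cell in root.findall("cell")
    ]


# Hypothetical feature values for two cells.
cells = [
    {"id": "c1", "features": {"area": 120.5, "mean_intensity": 0.42}},
    {"id": "c2", "features": {"area": 98.0, "mean_intensity": 0.57}},
]
document = features_to_xml(cells)
restored = xml_to_features(document)
```

The records only carry quantitative values; any interpretation of what a
change in a feature means is left to whatever tool reads the document.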

The system extracts features. Their meaning as such is not relevant for
the system itself, but for the observer of the feature space. Capturing
attributes is not the same process as assigning a meaning to the features
we extract from a biological system. The meaning of a change in the
multidimensional feature space can be built into a postprocessing system,
but the content extraction process has to capture quantitative data and
not interpretations of data. The ultimate data reduction is the
assignment of a meaning to a quantitative feature or attribute change, not
the extraction of only a minimum of features.

Since 2001 I have been thinking about, and working on, the design of such
a scalable system. The first version of the M5 framework is now
operational, which allows me to study its practical use in more detail
(Van Osta P., 2004). The core of the system is being built into a
framework for the exploration of cells, tissues and model organisms by
using a microscopy based reader. The system is designed to be used as a
discovery process plug-in which enables cell based experiments to flow
through its modules to convert physical events into a feature space for
exploration and interpretation.

The roots and predecessor of my own work

The predecessor of this system and a source of inspiration dates back to
the late eighties and early nineties of the twentieth century (Geerts H,
1987; Ver Donck L, 1992; Cornelissen F, 1993; Geerts, H, 1992; Geusebroek
J.M., 2000; Van Osta P, 2002).

This use of digital microscopy in drug discovery originated from Nanovid
microscopy long ago (De Mey J., 1981; De Brabander M., 1986; De Brabander
M, 1986b; Geuens G, 1986; Geerts H, 1987; De Brabander M, 1989; Geerts H.,
1991). Nanovid microscopy itself had its origin in the study of
microtubules (De Mey J., 1976; De Brabander M., 1977). Automated Calcium
(Ca2+) ratio imaging was used for studying the effect of drugs on isolated
cardiomyocytes. This research dates back to the mid-eighties of the
twentieth century (Borgers M, 1985; Ver Donck L, 1986; Ver Donck L, 1987;
Borgers M, 1988; Ver Donck L, 1988; Ver Donck L, 1990; Geerts H, 1989;
Olbrich HG, 1991; Ver Donck L, 1991; Ver Donck L, 1992; Cornelissen F,
1993; Ver Donck L, 1993; Cornelussen RN, 1996).

Drug discovery research by using cellular disease models with automated
microscopy based systems was done in this environment for many years,
before it became fashionable in the outside world (Geerts H, 1989; Ver
Donck L, 1992; Cornelissen F, 1993; Nuydens R, 1993; Nuydens R, 1995;
Nuydens R, 1995b; Geerts H, 1996; Nuydens R, 1998).

Why a Human Cytome

Human Genome Project

The Human Genome Project (Lander ES, 2003; Venter JC, 2003) has set a new
milestone in medicine and the understanding of human biology (Guttmacher,
A., 2002; Guttmacher, A., 2003). Since its conception in 1986, it has
answered many questions, but it has also left us with more questions to
answer and it opened new horizons for exploration (Dulbecco R., 1986;
Collins F., 2003). The results of the Human Genome Project led to a first
estimate that there are only about 34,000 genes in the human genome and by
the end of 2003 the number was reduced to some 25,000 genes (Claverie
J.-M., 2001; Wright F. A., 2001; Pennisi E., 2003). Now, at the end of
2004, the euchromatic sequence of the human genome is complete and the
number of genes is estimated to be about 20,000 to 25,000 (Collins FS,
2004).

The Caenorhabditis elegans (C. elegans) genome comprises over 18,000
genes. The fruit fly (D. melanogaster) genome consists of about 13,000
genes and as such it has fewer genes than C. elegans, although as an
organism it is far more complex. Gene number alone does not predict
functional complexity. Although genome sizes vary much more widely, this
is not reflected in the number of genes.

The functional uncoupling of the dynamics of cellular function from the
genomic gene count came as a shock. The complexity and diversity of
organisms is not reflected in the structural complexity of their genomes
alone, but to a large extent it is hidden in the dynamics of gene
expression and cellular processing. As there is no linear relation between
the complexity of an organism and the physical structure of its genome,
there is also no one-to-one relation between the phenotype of an organism
and its genome. Relatively small genomic differences between organisms,
such as man and chimpanzee, do result in large functional differences in
gene processing and functional expression.

The structural relatedness of the human and chimpanzee genomes does not
explain the large difference in brain function, for which gene expression
profiles in the brain are a better predictive instrument (Caceres M, 2003;
Uddin M, 2004). Functional differences between chimpanzee and man are more
pronounced in the brain than in other organs. Gene expression differences
are more related to cerebral physiology and function in humans than gene
sequences. Epigenetic phenomena within individual cells and differential
processing in different cell types have more predictive power than the
piecemeal and one-dimensional gene sequence approach, when applied to
complex structures such as the brain (Wilson KE, 2004).

From single gene and genome to the entire cell

Now that we are starting to use the information coming out of the Human
Genome Project, people are beginning to understand that the dynamics of
the cell and its
fate in disease processes cannot simply be explained from its individual
genes, genome or its proteome. Although all cells in the human body share
the same genome, there is considerable heterogeneity in their phenotype
and dynamics. Structural information alone or information from too low an
organizational level cannot sufficiently predict higher-order phenomena as
it does not sufficiently take into account interactions at higher
organizational levels and influences from outside the low-level
organizational unit. Cells have come up with compensation mechanisms to
maintain their structural and functional integrity in the face of
perturbations and uncertainty (Stelling J, 2004). Organisms are capable of
buffering genetic variation (Hartman JL 4th, 2001). Genetic buffering
mechanisms modify the genotype-phenotype relationship by concealing the
effects of genetic and environmental variation on phenotype (Rutherford
SL., 2000).

So if the structure of the genome alone cannot explain the differences
between species, disease processes and the dynamics of the cell, where
do our functional complexity and interspecies differences come from? How
do we continue in the post-genome era to study the dynamics of the cell
and entire organisms? How are genes related to the function of an organism
and where do we lose track? These questions are not of academic
importance alone, but their answers have a significant impact on the
diagnosis and treatment of (complex) diseases, drug discovery and

Let us take a walk from gene to protein and take a closer look at “The
Central Dogma of Molecular Biology”, which I personally prefer to call
an axiom instead of a dogma. Science should only have axioms and leave
dogmas to religion.

Associating genes with diseases

In order to start studying the contribution of a certain gene to a disease
we must first find the gene(s) which might play a role in a given disease.
The strength of the association must be detectable by the method being
applied, which in complex gene-disease relationships has to find the
association on a background of significant functional and phenotypical
noise, such as in multifactorial diseases like diabetes (Doria A., 2000).
Variation in the phenotypical expression of many quantitative traits
(length, weight …) is due to the simultaneous segregation of multiple
quantitative trait loci (QTL) as well as environmental influences. Genetic
dissection of complex traits and quantitative trait loci is a complex
process (Darvasi A., 1998; Darvasi A, 2002). A mono-factorial approach is
likely to fail in a multifactorial process of pathogenesis (Templeton AR.,

Giving a gene its place in a disease process is not a trivial endeavour
and it is complicated by both technological and methodological
difficulties. Association studies offer a potentially powerful approach to
identify genetic variants that influence disease processes (Lohmueller KE,
2003; Roeder K, 2005). The density of Single Nucleotide Polymorphisms
(SNP) makes them a popular target for studying gene-disease associations.
However it is not only the density alone which counts, but also the
information content of a given polymorphism (Bader JS. 2001; Ohashi J,
2001; Byng MC, 2003; Chapman JM, 2003; Garner C, 2003).

False positive correlations of genetic markers with disease are reported
as a result of flawed statistical analysis (Nurminen M., 1997; Edland SD,
2004; Wacholder S, 2004). In microarray experiments, defining the
appropriate sample size to find differentially expressed genes is an
important issue (Wang SJ, 2004). In complex diseases, in which not only
multiple genes but also the dynamics of gene products play a role,
associating particular genes with a disease entity is even more difficult
than in so-called monogenic diseases (Carey G., 1994; Long AD, 1999).
Proper subgroup analyses in a
randomised controlled trial (RCT) require careful design (Brookes ST,

Moving from a gene-disease association to determining the role of the
gene in the actual causation of a disease process is a further step beyond
finding and establishing a positive correlation (Templeton AR., 1998).

From genome sequence to gene activity

The genome sequence alone does not allow us to predict the functional
impact of sequence variations as epigenetic modulation influences
functional gene expression. Epigenetic modulation of gene function is a
cause of non-Mendelian inheritance patterns and variability in the
expression and penetrance of a disease. Even transmission of an identical
gene sequence is no guarantee of identical gene expression, as the
(in)activation of a gene by epigenetic modulation occurs differently when
a gene is of paternal or maternal origin. Where (in what cells or tissues)
and when (at what stage of development or under what conditions) genes are
expressed is a highly dynamic process. These spatial and temporal gene
expression patterns can be assembled into "localizome" maps (Dupuy D,

Epigenetic modulation of gene expression is heritable during cell division
but is not contained within the DNA sequence itself (Reik W, 2001;
Bjornsson HT, 2004; Kelly TL, 2004; Chong S, 2004). Epigenetic modulation
is one of the problems encountered when cloning, as the cloning process
differs in its epigenetic regulation of (embryonic) gene expression (Mann
M, 2002).

This differential inactivation of genes from maternal and paternal origin
even leads to functional X-chromosome mosaicism in women, as their cells
inactivate one of their X chromosomes at random. X-inactivation occurs
early in embryonic development and the descendants of each cell
subsequently inherit its choice, so that different cell lineages carry a
different functional X chromosome. The inactivated X chromosome can be seen
in a microscope as a Barr body in the interphase nuclei of female mammals.
Differential activation of genes creates a functional chimera.

Chemical modification by methylation of cytosine residues is a major
regulator of mammalian genome function and plays an important role in the
intra-uterine development of an organism and the regulation of gene
expression (Urnov FD, 2001). Tissue specific imprinting in genes leads to
differential gene expression in different tissues (Weinstein LS, 2001).
Aberrant DNA methylation has been implicated in the pathogenesis of a
number of diseases associated with aging, including cancer and
cardiovascular and neurological diseases (Walter J, 2003; Jiang YH, 2004;
Macaluso M, 2004). A dietary component such as folic acid is a key
component of DNA methylation during in utero development, disease
development and aging (McKay JA, 2004). Genes and environment interact and
this might play a critical role in the pathogenesis and inheritance of
complex diseases (Vercelli D, 2004).

The gene expression flow from mRNA to protein is not a smooth unregulated
process in itself. Cells use RNA-induced silencing complexes (RISCs)
programmed with small interfering RNA (siRNA) to knock down target RNA
levels (Robb GB, 2005). RNAi is used by eukaryotes for sequence-specific,
post-transcriptional gene silencing (Scherr M, 2003). This mechanism adds
another feedback loop onto the multiple layers of gene expression
regulating mechanisms.

Even the first steps in the expression of a gene do not show a one-to-one
relation to the gene sequence itself. Modulators and regulators of
transcription and translation form a highly dynamic regulation mechanism.
Cells use several
mechanisms to create functional flexibility from (relative) structural
(genome sequence) rigidity. The genome is a repository of our genetic
potential, but only a part of it is active at different spatial and
temporal locations during our lifetime. It is not only important to know
what we can do within the limitations of our genomic boundaries, but also
how we deal with this potential in spatial and temporal patterns during
our lives. We do not deploy the full potential of our genome at every
moment of our life and in all our cells in the same way. Although all our
cells share the same genome, they are highly diverse in their structure
and function; they are differentiated not only spatially but also
temporally. The relation of gene structure to its function is a
bidirectional process, and our understanding of the impact of the
different modulators is still not sufficient to create highly correlating
disease models.

From gene to protein, a bumpy road

A eukaryote, such as Homo sapiens, has no one-to-one relation to its
genes. The dynamics of gene expression is regulated by hypo-, iso- and
epigenetic operators. The gene may be the structural unit of inheritance,
but the protein domain is the functional unit of metabolism.

When we talk about protein structure, the primary structure refers to the
amino acid sequence in a protein (1D). The primary structure is most
closely related to mRNA and as such the gene sequence and gene structure
from which the protein originates. The terms secondary and tertiary
structure refer to the 3D conformation of a protein chain. Secondary
structure refers to the interactions of the backbone chain (alpha helical,
beta sheet, etc.). Tertiary structure refers to interactions of the side
chains. Quaternary structure refers to the interaction between separate
chains in a multi-chain protein (4D). The combined shape of the secondary
and tertiary structure and the quaternary structure is referred to as the
conformation of the protein. With increasing dimensionality, the relation
between a higher order organization of protein structure and its gene
relaxes as other physical and chemical influences play an increasingly
important role in its physical and functional integrity.

In a mature enzyme, only a relatively small number of its amino-acids
interact with a ligand; the majority of amino-acids help to create the
appropriate 3D and even 4D structures required for its in-vivo
functionality. Structural proteins and enzymes may show interactions over
larger parts of their molecular surface to form functional homo- or
hetero-polymers in their quaternary structure. From a single gene to a
protein, we have to deal with the dynamics of gene expression regulation
and mRNA formation (promoters, cis- and trans-regulation, transcription,
splicing). We have to deal with the interaction of tRNA with mRNA in the
translation of an mRNA sequence into a protein sequence and
post-processing of the protein sequence into a functional 3D and 4D
structure (Wobble, sequence processing, protein folding and interaction).

A structural similarity at the genome level does not lead to functional
similarity, due to epigenetic regulation (Eckhardt F., 2004). Sequence
variation due to mutations does not bleed through to the protein level
one-to-one. Basic mechanisms act as powerful uncouplers of gene structure
from protein function. Mutations in the DNA and errors during
transcription of the DNA-sequence into mRNA are not linearly predictive of
the structure and function of the protein resulting from the translation
of the DNA-sequence into the protein-sequence, due to the degeneracy of
the genetic code. The deleterious effects of sequence variations are to a
certain extent suppressed by the Wobble-mechanism used in base-pairing
in translating mRNA to protein (Crick F, 1966).
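
The buffering effect of the redundancy in the genetic code can be made
concrete with a small sketch. It uses a deliberately partial excerpt of the
standard codon table, and the three short mRNA sequences are invented for
this example only.

```python
# Partial excerpt of the standard genetic code (mRNA codons); it only
# contains the codons needed for this illustration.
CODON_TABLE = {
    "AUG": "Met",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "GAU": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "UAA": "Stop",
}


def translate(mrna: str) -> list:
    """Translate an mRNA string codon by codon until a stop codon."""
    usable = len(mrna) - len(mrna) % 3
    protein = []
    for i in range(0, usable, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein


def differences(a: str, b: str) -> int:
    """Number of positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Invented example sequences.
reference = "AUGGGUCUUGAUUAA"   # Met Gly Leu Asp
synonymous = "AUGGGCCUGGACUAA"  # three third-position changes
missense = "AUGGGUCUUGAAUAA"    # GAU -> GAA changes Asp into Glu

same_protein = translate(reference) == translate(synonymous)
changed_protein = translate(reference) != translate(missense)
```

Three nucleotide changes in the third codon position leave the protein
sequence untouched, while a single change of another kind alters it, so the
number of sequence differences alone says little about the effect at the
protein level.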

Protein sequence = k x gene sequence

In this formula, ‘k’ is smaller than one for most amino acids
built into a protein, due to mechanisms such as splicing variation, Wobble

Eukaryotes can make do with a genome that is relatively simple compared to
their functional and structural complexity, because of the existence of
introns and exons. An exon in general defines a functional domain and
these domains
are rearranged to create a more complex proteome than the genome it is
derived from. Constitutive and alternative splicing of genes is
dynamically regulated at the moment of transcription and pre-mRNA splicing
by cis- and trans-acting factors (Kornblihtt AR, 2004). Before the Human
Genome Project was completed, it was expected that man would need about
100,000 genes to explain the structural and functional complexity of our
species. This number has collapsed to about 25,000 genes, a quarter of
what was expected (75 percent lower) (Collins FS, 2004). The functional
differences between species are more related to differential processing,
due to different up- and down-regulation of genes in different cell types
and organs. Different promoters and splicing variants are used to tune
protein and enzyme structure and function in different cell locations and
organs (Ayoubi TA, 1996; Masure S, 1999; Nogues G, 2003; Yeo G, 2004).
Promoter variation and differential splicing allow for spatiotemporal
differentiation in protein expression, while the organism does not have to
manage an explosion in genomic size and sequence-complexity. This
mechanism helps to uncouple the protein from the rigidity of the gene
sequence in order to allow for functional variation while restricting
structural variation at the genome level. Functional differentiation in
gene expression allows for a better adaptability to changing conditions,
without the need for fast-paced changes in gene structure.
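
How splicing can multiply the number of products without enlarging the
genome can be shown with a toy calculation. The sketch assumes, as a strong
simplification, a gene with a fixed first and last exon and a few cassette
exons that can each be included or skipped independently; real splicing is
regulated and far less free. The exon names are hypothetical.

```python
from itertools import combinations


def isoforms(first_exon: str, cassette_exons: list, last_exon: str) -> list:
    """Enumerate exon combinations when each cassette exon may be skipped.

    The order of the exons is preserved in every combination.
    """
    result = []
    for k in range(len(cassette_exons) + 1):
        for kept in combinations(cassette_exons, k):
            result.append((first_exon, *kept, last_exon))
    return result


# One hypothetical gene with three optional exons.
variants = isoforms("E1", ["E2", "E3", "E4"], "E5")
variant_count = len(variants)  # 2 ** 3 combinations

# The gene count arithmetic quoted in the text.
expected_genes = 100_000
estimated_genes = 25_000
reduction = 1 - estimated_genes / expected_genes
```

Under these assumptions a single gene with three optional exons already
yields eight possible transcripts, while the gene count itself is 75 percent
lower than the earlier expectation of 100,000.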

Protein folding of a linear amino-acid sequence into a 3D protein also
acts as a functional uncoupler of gene sequence to protein function.
Changes in the physical and chemical environment of the protein may change
the shape and alter the conformation of a protein. Putting a protein in a
different physical and chemical environment affects the van der Waals,
hydrogen, ionic and covalent bonds which hold the protein together in its
particular conformation. By breaking those bonds it is possible to cause
the molecule to unfold and make it change or even lose its function
(denaturation). 3D and 4D protein folding is a
complex process. Even today the protein folding problem remains one of the
most basic unsolved problems in computational biology. Predicting protein
folding from the gene upwards ignores the influence of post-translational
modification (PTM) and of the in-vivo physico-chemical environment of the
protein. Proteoglycans and
glycoproteins are not derived from a gene sequence as such, but their
structure is the result of extensive post-translational modification. Cell
membranes contain phospholipids, which are not encoded by DNA as such, but
they result from metabolic processing and nutritional components.

While the protein-sequence at the moment of translation is related to the
gene-sequence, the final structure and function of an enzyme is in
addition defined by post-translational modification (PTM) and its
physico-chemical environment (Kukuruzinska MA, 1998; Uversky VN, 2003;
Schramm A, 2003; Seddon AM, 2004). Studying protein folding is
computationally complex and still the focus of intensive research
(Murzin A. G., 1995; Orengo, C.A., 1997; Dietmann S, 2001; Day R, 2003;
Harrison A, 2003; Pearl F, 2005). Epicellular regulation of protein
glycosylation also plays an important role in the dynamics of protein
activity (Medvedova L, 2004).

The majority of proteins are subjected to a multitude of
post-translational modifications. Post-translational modification involves
cleaving, attaching chemical groups (prosthetic groups) and internal
cross-linking (disulfide bonds). Already more than a hundred different types
of PTM are known, which act as functional uncouplers of protein structure
from the gene sequence (Hoogland C, 2004). A protein precursor may be
differently processed in different cell types and, in addition, diseased
cells may process a given precursor abnormally (Dockray GJ., 1987; Poly
WJ., 1997; Rehfeld JF., 1990; Rehfeld JF, 2003). Post-translational
protein modifications finely tune the cellular functions of each protein
and play an important role in cellular signaling, growth and
transformation (Parekh RB, 1997; Seo J, 2004).

In a functional protein only a very few specific residues are actually
responsible for enzyme activity, while the fold is much more closely
related to ligand type (Martin AC, 1998). The effect of an amino-acid
change on protein structure and function depends on the location of the
amino-acid in the 3D structure, its physico-chemical properties and the
physico-chemical environment in which it is processed and used. Amino-acids
which are distant neighbours in the protein sequence can become close
neighbours in the 3D structure of the protein and as such a protein
sequence variation is only a weak determinant of the function of a mature
protein.

By just going from DNA-sequence to 3D protein structure, the relation
between genome sequence and the functional status of a cell begins to
fade. By taking this relation even further from gene to organism, we lose
additional predictive power. How will we be able to design models that will
allow us to predict the functional outcome of a disease, when we use a
fuzzy model to start with? Powerful uncouplers of the structural relation
of even a protein to the gene it is primarily derived from, do not allow
us to draw hard conclusions about impact on the functional status of an
organism from the gene and genome sequence.

From proteome to cell

Eukaryotic cells are highly compartmentalized; proteins do not exist in
the cell as in a homogeneous fluid, but in different compartments of the
cell, each with a different physico-chemical environment. The 3D and 4D
structure of a protein and its functionality is highly dependent on the
in-vivo physico-chemical environment of the protein.

Studying proteins without taking into account their spatial and temporal
organization in a cell, ignores the complexity and dynamics of protein
expression and interaction in a cell. Studying proteins in-vivo reveals
more about their function and dynamics (Chen, X., 2002; Hesse J, 2002;
Pimpl P, 2002; Viallet PM, 2003; Murphy R. F., 2004). Without information
about the relation between cellular structure and function, a lot of
information is lost. A 2D protein-profile may show the entire protein
content of a cell, but we lose all information about the intracellular
spatial and temporal distribution of these proteins.

Eukaryotic cells are highly spatially differentiated structures. Proteins
involved in trans-membrane trafficking require a membrane to do their
work and cannot do their work outside this specific physico-chemical
environment. A protein has to reach the appropriate physico-chemical
environment in the cell in order to do its work properly (Graham TR.,
2004). Studying a protein outside its in-vivo physico-chemical context
leads to a loss of correlation with its in-vivo dynamics.

There are three main cellular compartments in a eukaryotic cell: the
nucleus, the cytoplasm and the cell membrane. The nucleus itself is a highly
organized 3D structure with highly spatially and temporally differentiated
DNA- and RNA-processing machinery (Lamond AI, 2003; Politz, J., 2003;
Pombo, A., 2003; Iborra F, 2003; Spector DL., 2003; Cremer T, 2004). Both
transcription and splicing of the mRNA message are carried out in the
nucleus (Sleeman JE., 2004). The distribution of eu- and heterochromatin
changes throughout the cell cycle; chromosomes and spindles appear during
cell division. The dynamics of gene transcription is visible in the
chromatin condensation patterns in the nucleus.

The cytoplasm itself contains several organelles: smooth and rough
endoplasmic reticulum (SER and RER), ribosomes, the Golgi apparatus,
mitochondria, lysosomes and the cell membrane. Each organelle deals with a
different set of processes necessary for cell development and maintenance.
The membranes of organelles are highly dynamic structures which undergo
profound changes during the life cycle of a cell (Ellenberg, J. 1997;
Zaal, K. J. M., 1999). The endoplasmic reticulum (ER) is a multifunctional
signalling organelle that controls a wide range of spatially and
temporally differentiated cellular processes (Berridge MJ., 2002).

The structural compartmentalisation of the intracellular environment
allows for a functional differentiation and provides a process flow
management mechanism. The membrane structure and the mitochondrial
membrane potentials (MMP) of mitochondria play an important role in their
function (Zhang H, 2001; Pham N.A, 2004). Microtubules play an important
role in cellular function and their organization and dynamics are being
studied by microscopy based techniques (De Mey J., 1981; De Brabander M.,
1986; Geuens G, 1986; De Brabander M, 1989; Geerts H., 1991; Olson KR,

The dynamics of intracellular ion-fluxes, such as those of calcium (Ca2+),
are organized in a highly dynamic, spatially and temporally complex pattern.
Ions are themselves not encoded by the genome, but play an important role
in cellular function. The intra- and extra-cellular dynamics of ions
(concentration, flux) interact with a spatially and temporally regulated
pattern for protein expression and differential protein activity. The
complexity of intracellular calcium-signaling extends beyond the mere
expression profiles of genes encoding the proteins involved in
calcium-dynamics (Berridge MJ., 1981; Bootman MD, 2002; Cancela JM, 2002;
Berridge MJ., 2003; Berridge MJ, 2003b). For their proper function and
survival cells have to manage Ca2+ concentration and flux in space, time
and amplitude (Bootman MD, 2001). Calcium is involved in the delicate
process of spatial and temporal organization of cellular communication
(Berridge MJ., 2004).

As an example of spatial compartmentalisation in the cell, hydrolytic
lysosomal enzymes require a specific physical and chemical environment to
do their work, which inside the cell only exists inside the lysosomes (De
Duve C, 1955). The boundary membrane of the lysosome keeps the hydrolytic
enzymes away from the rest of the cytoplasm and so controls what will be
digested (De Duve C., 1966).

The cell membrane separates the interior of the cell from its environment,
but is a highly dynamic structure (Kenworthy, A. K., 1998; Varma, R.,
1998). The appropriate spatial and temporal dynamics of the cell membrane
are vital for the survival of the cell. The cell membrane provides the
physical boundaries in which the cell can maintain a highly dynamic
physical and chemical environment. Cell-to-cell communication is
dynamically managed at the level of the cell membrane (Nohe A, 2004).

Proteins do their work in spatially different cellular environments and
with different spatial and temporal patterns. A protein can be mobile in
one cellular compartment and immobile in another (Ellenberg J., 1997).
Co-expressed proteins may in reality never interact with each other
because they do their work in separate cellular compartments. The
substrates of proteins may migrate through different cellular compartments
in order to be subjected to a highly dynamic interplay of enzymatic
processes. Proteins which do their work in the same cellular compartment
may only be expressed at different stages during the life cycle of a cell.
Spatial and temporal protein localization information can help us to find
entries into eukaryotic protein function (Kumar A, 2002).

An important temporal differentiation of cellular processes occurs during
the cell cycle. The different stages in the cell cycle each depend on the
spatial and temporal expression of multiple proteins. The passage of the
cell through the cell cycle is controlled by proteins in the cytoplasmic
compartment, such as different Cyclins, Cyclin-dependent kinases (Cdks)
and the Anaphase-Promoting Complex (APC). First there is the G1 phase
(growth and preparation of the chromosomes for replication). Secondly the
cell enters the S phase (synthesis of DNA and centrosomes) and finally the
G2 phase which prepares the cell for the actual mitosis (M). Mitosis
itself consists of a spatial and temporal sequence of events, called the
prophase (mitotic spindle), prometaphase (kinetochore), metaphase
(metaphase plate), anaphase (breakdown of cohesins) and telophase where a
nuclear envelope reforms around each cluster of chromosomes and these
return to their more extended form.
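
The order of phases described above can be written down as a small lookup.
This is a sketch only: the names CELL_CYCLE, MITOSIS and next_phase are
hypothetical, and the wrap-around from M back to G1 assumes a continuously
dividing cell (it ignores resting states and checkpoints).

```python
# Interphase and mitosis in the order given in the text.
CELL_CYCLE = ["G1", "S", "G2", "M"]

# Sub-stages of mitosis (M), each with the hallmark named in the text.
MITOSIS = [
    ("prophase", "mitotic spindle"),
    ("prometaphase", "kinetochore"),
    ("metaphase", "metaphase plate"),
    ("anaphase", "breakdown of cohesins"),
    ("telophase", "nuclear envelope reforms"),
]


def next_phase(phase: str) -> str:
    """Return the phase following `phase`; M wraps around to G1 (assumption)."""
    index = CELL_CYCLE.index(phase)
    return CELL_CYCLE[(index + 1) % len(CELL_CYCLE)]


if __name__ == "__main__":
    # Walk once around the cycle, starting in G1.
    phase = "G1"
    for _ in range(len(CELL_CYCLE)):
        print(phase, "->", next_phase(phase))
        phase = next_phase(phase)
    # List the mitotic sub-stages in order.
    for stage, hallmark in MITOSIS:
        print(f"{stage}: {hallmark}")
```

Such a list captures only the temporal order of the phases, not the spatial
organization or the regulatory proteins (Cyclins, Cdks, APC) that drive the
transitions.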

However our understanding of the cell cycle is still far from complete.
The regulation of the cell cycle by G1 cell cycle regulatory genes is more
complex than we thought (Pagano M, 2004).

Cells also operate in a temporal pattern based on internal and external
clocks. Cellular events must be organized in the time dimension as well as
in the space dimension for many proteins to perform their cellular
functions effectively (Okamura H., 2004). Circadian molecular clocks
regulate protein dynamics in temporal patterns (Crosthwaite SK., 2004;
Hardin PE., 2004; Harms E, 2004; Hastings MH, 2004; Ikeda M, 2004; Rudic
RD, 2004; Schwartz WJ, 2004; Shu Y, 2004; Takahashi JS., 2004).

We need to study and understand the intracellular in-vivo dynamics of
protein metabolism and its spatial and temporal organization in different
cell types. We need to study intracellular protein ecology, not just
ex-vivo protein interactions or building a protein catalogue of only
scalar dimensions. The spatial and temporal patterns of intracellular
protein dynamics are an important factor in health and disease.

The dynamics of cellular function

Taxonomy is the science of organism classification and refers to either a
hierarchical classification of things, or the principles underlying the
classification. Today the emphasis of biological research is on
classifying genes and proteins in large catalogues, instead of studying the
spatial and temporal dynamics of cellular processes in vivo. The global
analysis of cellular proteins or proteomics is now a key area of research
which is developing in the post-genome era (Chambers G, 2000; Ideker T.,
2001; Aitchison J.D, 2003). Proteins show functional grouping into modules
which can be grouped into elegant schemes (Hartwell, L.H., 1999; Segal,
E., 2003).

In vivo, however, the spatial and temporal distribution and interaction of
proteins with other proteins, substrates, etc., adds another layer of
complexity which is not taken into account by functional studies alone.
Expression studies, no matter how we group them, do not reveal the
intracellular spatial and temporal distribution of proteins and the
functional outcome of their metabolic activity (spatial and temporal
substrate trafficking) in various cellular compartments. Studying proteins
only from a functional point of view ignores the impact of their
intracellular spatial and temporal dynamics. Molecular taxonomy or systems
biology (genomics, proteomics) will not provide us with the functional
answers we need to know.

Systems biology studies biological systems systematically and extensively
and in the end tries to formulate mathematical models that describe the
structure of the system (Ideker T., 2001; Klapa MI, 2003; Rives A.W,
2003). However the level of biological integration which is being studied
(genes, proteins, pathways) is still too far away from pathological reality
to allow for the development of highly predictive and highly correlating
disease models. The end-point of present day systems biology only takes
into account infra-cellular dynamics and loses track when iso- and
epi-cellular phenomena interfere with the dynamics of the model. Studying
the physics and chemistry of protein interactions cannot ignore the
spatial and temporal dynamics of cellular processes.

The cell is at the crossroads of life itself, being the lowest order
functional unit operating in a functionally complete way. As such the cell
is for life what the atom is for physics, the smallest biological level of
organization, operating as a functional unit. Dysfunctional cells by
whatever cause, either gene malfunction, infection, nutritional or
environmental problems will eventually cause the entire organism to lose
its functional integrity. The dynamics of cellular systems allow for the
adaptation of the cell to a wide variety of conditions and challenges. A
relatively uniform physical structure combined with a web of interacting
dynamic processes leads to the multitude of cells which we see in living
organisms. In a living organism there is no such thing as an average cell
type from a functional point of view. Cells are functionally highly
diverse in both spatial and temporal dimensions.

The stochastic variation of cellular processing at the molecular level is
another cause of functional uncoupling of the cytome from the genome and
adds to the variability in functional behavior between cells (McAdams H.H.,
1999; Raser J.M., 2004). Structural research alone underestimates the
complexity of dynamic processes as it does not capture sufficiently the
dynamic complexity of the cell. The dynamic interaction of processes in
multiple pathways is the centerpiece of cellular life, not the individual
components or even individual enzymatic reactions in the cell. There is no
monotonic sequence of causation from genome structure to cellular

Cellular function can be compared to a symphony in which multiple
"instruments" contribute to a complex, but in a healthy state
harmonic, "sound".

Genes and the dynamics of disease processes

The challenges faced by the medical world today are no smaller than the
ones we faced a century ago. The spectrum of diseases may have changed
through time, as degenerative diseases and cancer play an increasing role
in modern society. On the other hand, an old enemy is on the rise again:
however much we thought that infectious diseases were a thing of the
past, they are back, with a new and frightening face.

Our increase in the knowledge of the involvement of our genes and large
scale proteomics in disease processes has not led to an increase in the
productivity of pharmaceutical research (Drews J., 2000; Huber, L.A.,
2003; Lansbury PT Jr., 2004). The gap between the gene and the functional
outcome of a disease is too wide to bridge it from one direction only
(Workman P., 2001). Much thought has gone into finding ways in which the
knowledge coming out of genomics and proteomics could revolutionize drug
discovery, such as for drug target discovery (Lindsay MA., 2003). The
target of a drug molecule may be a protein, but the target of disease
therapy is the entire cell and by extension the cell population of an
organism. Every drug and its target may be part of a disease therapy, but
the therapy is not restricted to the drug and its target. Every target is
part of a therapy, but not every therapy is confined to a traditional drug

In the case of diseases where we have already found a genetic basis, this
does not always allow us to create a model for the disease process. To
discover the involvement of a gene in a disease process does not tell us
anything about its place and relative importance in the multiple and
multilevel elements involved in the causation of a disease, such as genes,
nutrition, infectious agents and the environment. To discover a causative
element is not the same as understanding and predicting its dynamic
involvement in a disease process. What we do know is that all causation
has to pass through cells, as they constitute the "quanta" of the
organism itself.

Many diseases of clinical importance have heterogeneous mechanisms which
lead to the disease and only in a subpopulation can the disease be traced
back to a single gene. In most cases a multiplicity of mechanisms
contributes to the disease process. Genetic information has a high
predictive value in only a minority of cases.

Non-coding sequences, inter-gene and epigenetic interactions have a
significant impact on the prediction of the age of occurrence, severity,
and long-term prognosis of diseases (El-Osta A., 2004; Perkins DO, 2004).

The importance of the dynamics of the cell and its involvement in
pathological processes and current therapeutic efforts requires a better
understanding of cell function and phenotype in relation to pathological
processes in diseases such as cancer, Alzheimer disease and infectious
diseases such as AIDS, tuberculosis (TBC), influenza (flu), etc.

Trying to predict a disease process from the genome (proteome) upwards, is
like trying to solve a higher order polynomial while omitting the majority
of elements and expecting that the equation will work:

e.g.: Disease process = a * x + b

Instead of using a higher order multi-dimensional model, closer to in-vivo
functional dynamics in which a matrix or web of causation and consequences
interacts in a high-dimensional space-time continuum:

e.g.: Disease process = a * u^n + b * v^o + c * w^p + d * y^q + e * z^r
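
The contrast between the two example equations above can be sketched in a
few lines of code. All numbers below are arbitrary placeholders chosen for
illustration, not measurements, and the function names are hypothetical;
the point is only that dropping terms from a multi-factor sum changes the
outcome.

```python
def linear_model(a, x, b):
    """Single-factor form from the first example: a * x + b."""
    return a * x + b


def multi_factor_model(terms):
    """Sum coefficient * value ** exponent over (coefficient, value, exponent) terms."""
    return sum(coef * value ** exp for coef, value, exp in terms)


# Arbitrary illustrative numbers (assumption), one tuple per term.
terms = [
    (2.0, 3.0, 2),   # a * u^n
    (1.0, 2.0, 3),   # b * v^o
    (0.5, 4.0, 2),   # c * w^p
    (3.0, 1.0, 5),   # d * y^q
    (1.5, 2.0, 1),   # e * z^r
]

full = multi_factor_model(terms)           # all five factors included
truncated = multi_factor_model(terms[:1])  # only the first factor kept

print("all terms:", full)
print("first term only:", truncated)
```

With these placeholder values the truncated sum accounts for less than half
of the full one, which is the sense in which a model built on a single
factor "omits the majority of elements".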

In addition, each parameter which is being used in an equation is in
itself the result of an underlying or "overlying" dynamic process.
Each layer of organization can be fed into higher or lower order levels of
organization as there is always a cross-influence in both directions. It
is a matter of expanding or collapsing the set of parameters and taking
into account or ignoring underlying "modifying" influences. Reducing
the complexity allows for a better understanding of a simplified model,
but has a decreased match to the complexity and dynamics of biological
reality. When we create a model, we should not regard it as a one-on-one
substitute for reality which we capture only partially into our model.

Infectious diseases

Infectious diseases still pose a significant threat to the health and
well-being of (modern) society. After years of relative neglect, nations are
increasingly aware of the present and future threats of infectious
diseases and are even setting up new agencies, such as the European Centre
for Disease Prevention and Control (ECDC), or expanding the role of existing
organizations, such as the Centers for Disease Control and Prevention
(CDC). Beside their political and economic impact on society, how do we
deal with infectious diseases in science?

In infectious diseases the environment, in this case the infectious
agents, interacts in a complex way with the host defense system of which
much remains to be explored. We must be aware of the fact that the golden
era of antibiotics is already behind us as many infectious agents (e.g.
TBC, MRSA and other bacterial diseases) are showing an increasing
resistance against most classes of antibiotics which are available today
(Davies J, 1994). In less than a century we have succeeded in destroying
our best weapons against infectious diseases, due to misuse of antibiotics
both by physicians and their patients. Only the elderly remember the days
when mortality due to infections was a major cause of premature death, but
the moment is approaching when this nightmare will return. Emerging
infectious diseases (EIDs) and re-emerging infectious diseases challenge
our defenses (Ranga S, 1997; Fauci AS., 2004; Morens DM, 2004).

Viral diseases (e.g. AIDS, influenza) are even harder to fight as they use
the cellular machinery of the body itself to reproduce. We need to study
the pathological process in cells in more detail and in a different way,
in order to have a chance to succeed in the new therapeutic challenges
ahead of us. Viruses, under selective pressure of modern antiviral drugs,
are also showing increasing resistance to treatment. We are running out of
time in our battle against infectious diseases and a systematic approach
will only give us the answers when it is too late. We are not setting
the agenda, but the diseases are taking the lead.

Due to modern technology, the time to respond to a new infectious
challenge is being reduced. In modern times, diseases take planes too,
which makes it even harder to fight them by classical isolation or
quarantine. Airplanes may be safe to travel with, compared to other
transport systems, but they can cause secondary mortality by transporting
pathogens over large distances at a speed unknown to previous generations,
which gives a new meaning to airborne infections (Gerard E, 2002; Van
Herck K, 2004; Blair JE, 2004). Infectious diseases may initially go
unnoticed in underdeveloped areas of the world (e.g. Ebola virus, Lassa
fever, Marburg virus), but as soon as they board a plane, it is modern
technology which will give them free access to the world (Clayton AJ,
1979; Gillen PB, 1999). A relatively long incubation time combined with a
high mortality rate will allow a disease to spread widely and cause a
pandemic, before we can even start a treatment program. If an unknown
disease causes such a pandemic, we may run out of time before we can find
a cure as we first have to develop a diagnostic tool. A recent example
which is a model of what can happen was the Severe Acute Respiratory
Syndrome or SARS (Peiris, J.S.M., 2003; Berger A, 2004; Heymann DL, 2004;
Tambyah PA, 2004).

Robert Koch presented his work on Tuberculosis on 24 March 1882 before the
members of the Berlin Physiological Society, which meant a breakthrough in
the understanding of this terrible disease (Winkle S, 1997, pp. 137-141).
Now after more than 100 years of research and drug development, TB is on
the rise again. In the war against infections such as Tuberculosis, there
are no easy wins. We may win a fight, but for the majority of pathogens we
can only reach a status quo and never completely win the war. Variability
through mutation is a powerful weapon against our drug treatments and
pathogens use it to their great advantage.

We must keep our defenses up to date and changing in order to outsmart our
bacterial and viral enemies. New antibiotics are not found within the
human genome. Penicillin was discovered by accident and many important
antibiotics were found in the most unlikely places (Fleming, A, 1929). No
hypothesis or model can be formulated to find the unexpected, but we have
to find new antibiotics as bacteria are closing in on us and some of our
worst enemies are even winning the race.

Scientists are waiting with fear for the next influenza pandemic which
will hit us some day (Gust ID, 2001; Capua I, 2004). Scientists are trying
to understand the lethal potential of the deadliest influenza epidemic of
all time, which occurred after the First World War. Soon the virus which
caused that influenza pandemic, called the 'Spanish flu', will re-emerge
out of the test tubes of the laboratory. Recent outbreaks of avian flu
have given us a preview of what can happen and evidence is increasing that
the possibilities for spreading avian influenza A virus (H5 or H7 subtype)
are worse than was previously assumed (Koopmans M, 2004; Kuiken T, 2004).

New pathogens can have a devastating effect on a human population.
Examples of what can happen when a new infectious agent hits a population
with little or no immunological "experience" with a (re-)introduced
pathogen, can be found in the histories of indigenous people confronted
with infectious diseases introduced by European colonization as in
Australia and Tasmania. Within 100 years of European colonization the
total population of full-blood Aboriginal people in Tasmania became
extinct. Introduced infectious diseases killed many more Aborigines than
did direct conflict. Infectious diseases such as smallpox, measles, and
influenza were major killers and even chickenpox was deadly as the
Aboriginals had no immunological history even with chickenpox. Of the 90
percent of the Aboriginal population that died out as a result of European
contact, it is estimated that around 80 or 90 percent of the deaths were
the result of disease.

Most people have no idea of the role smallpox played in the destruction of
an entire civilization after it was brought to America by the
conquistadores. About 50 to 90 percent of the Native American population
died of smallpox and the speed at which people died is beyond our
imagination (McMichael AJ, 2004; Winkle S., 1997, pp. 855-861). A
mortality of 50 percent for a new disease, for which we have no immunity,
could kill half of the population of a country or an entire continent.
Western society now has to fear the introduction of new pathogens from
distant places and when the disease has the right pathological profile, it
will spread extensively into the population before it is diagnosed
(e.g. AIDS). Re-emerging infectious diseases are a global problem with a
local impact. It is an unpleasant thought that this time we will face the
fate of the indigenous people during European colonization. In modern
times we not only have to fear the accidental spreading of infectious
diseases, but bio-terrorism will challenge our defenses sooner or later
(Broussard LA, 2001; Gottschalk R, 2004).

Finding the infectious agent for a new and unknown disease requires
something other than sequencing a genome as this approach only works when
we have the time to do the sequencing while the pathogen takes its course.
Analyzing the genome sequence of a new infectious agent can only start
after it has been isolated by more traditional means (Berger A, 2004).
Once we know the new pathogen, we can use its genome sequence to develop
rapid diagnostic tools, based on PCR, but in order to do this we must
first isolate it from the patient. Developing a therapy after this takes
much longer and the genome sequence itself without additional functional
information is not enough. Only after Koch's postulates had been
fulfilled did the WHO officially declare, on 16 April 2003, that a
previously unknown coronavirus was the cause of SARS.

Modifying the disease progression requires an interaction with the actual
disease process which extends beyond understanding the genome structure of
the pathogen. Focusing more on the dynamics of the interaction of cellular
systems with pathogens and using tools for functional research of the
disease process at the cellular level (and beyond) will hopefully allow us
to respond in time when we are faced with an unknown pathogen.

When we do not already have an antibiotic, antiviral drug or vaccine at
hand at the moment a new disease hits us, either by accident or on purpose
in biological warfare or bioterrorism, we are in serious (and lethal)
trouble. In this case the only thing left is the medieval solution of
quarantining the infected people, which only works if we are able to
contain them before they spread over a country or even the planet (e.g.
Ebola, SARS or HIV).

Although all cells in the human body may share the same genome, there is a
high spatial and temporal differentiation in gene expression and metabolic
dynamics in different cell types and organs. In HIV, it is the CD4
lymphocytes which express the receptors by which the virus can enter the
cell (Fauci AS, 1996). A hepatocyte may share its entire genome with a CD4
lymphocyte, but it does not express the proteins encoded by the gene which
allows the virus to enter the cell. The progress of an HIV infection is
also a highly dynamic process of interaction between the host and the
virus (Wei, X., 1995). The observation of differences in disease progress
led to the discovery of a genetic restriction of HIV-1 infection and
progression to AIDS by a deletion allele of the CCR5 structural gene (Dean
M, 1996). The emerging picture on infectious diseases is one of highly
polygenic patterns, with occasional major genes, along with significant
inter-population heterogeneity (Frodsham AJ, 2004). The complex
interactions and regulation of the Interleukin-1 (IL-1) family of proteins
is just one of the issues in elucidating the dynamics of the human immune
system (Laurincova B., 2000).

Clinical observations lead to genetic conclusions, but the way back to
clinical treatment of diseases is a long and winding road for which the
gene sequence or protein structure does not provide us with all the
necessary information about the dynamics of the disease process. Studying
the cellular dynamics of disease processes provides us with one of the
stepping stones from gene to clinic. By focusing on genomics and proteomics
alone, there remains a correlation and predictive deficit in our disease
models.

Mendelian diseases

Mendelian inherited and monogenic diseases have always been at the center
of attention in the relation of genetic variation to diseases. Monogenic
diseases served as a model to prove the relevance of genetic information to the
development of a disease and the outcome of a disease process.
Phenotype-genotype relationships are complex even in the case of many
monogenic diseases. Increasingly complex interactions have now been
demonstrated in a number of monogenic Mendelian diseases (Nabholz CE,
2004). The (phenotypical and functional) expression and development of
even a monogenic disease depends on its context, which comprises both
other genes and environmental factors. These inter-gene and epigenetic
interactions have a significant impact on the prediction of the age of
occurrence, severity, and long-term prognosis of even 'genetic'
diseases (Cajiao I, 2004; Hull J, 1998; Frank RE, 2004; Salvatore F, 2002;
Sontag MK, 2004; Sangiuolo F, 2004).

The beta-thalassemias show a remarkable phenotypic diversity caused by the
action of many secondary and tertiary modifiers, and a wide range of
environmental factors (Weatherall DJ., 2001). Sickle cell anaemia and
cystic fibrosis can serve as an example that genotype at a single locus
rarely completely predicts phenotype (Summers KM., 1996). Although the
gene defect in Huntington's disease has been known for years, the
contribution of the gene defect to the functional outcome of the disease is
not yet known (Georgiou-Karistianis N, 2003). Cell based research will help
to elucidate the disease mechanism in Huntington's disease (Arrasate M,

In cystic fibrosis, the severity of the disease cannot be linked
one-on-one to genetic variation in CFTR (Grody W, 2003). Cystic fibrosis
is the most common autosomal recessive disorder in Caucasians, with a
frequency of approximately 1 in 3000 live births, so finding a cure for
this disease has a high impact on our society. Success stories with rare
diseases may sound impressive from a scientific point of view, but there
is no escape from the economic reality of the size of the patient
population. So let us take a closer look at cystic fibrosis as it is a
disease of which the gene held responsible for the disease was identified
about 14 years ago (Rommens JM, 1989; Collins FS., 1990). The method
(reverse genetics) used to identify the gene, did not require an
understanding of the gene function at that moment or any understanding of
the impact of genetic heterogeneity on the phenotypical expression of the
disease (Iannuzzi MC, 1990; Audrezet MP, 2004). By starting from the gene
for a single genetic disease such as cystic fibrosis, where did we get
after 14 years of hard labour?

A once ‘monogenic’ disease such as cystic fibrosis shows remarkable
phenotypic variation and clinical variation (Decaestecker K, 2004). By now
about 1000 gene mutations of the cystic fibrosis transmembrane conductance
regulator gene (CFTR) have been identified, which leads to a highly
variable phenotypic and clinical presentation of the disease (McKone EF,
2003). Mutations in the CFTR gene have been classified into 5 functional
categories (Welsh MJ, 1993). A list of 1000 mutations is thus reduced to 5
functional classes at the protein level, a ratio of 0.5 percent: on
average, some 200 different mutations end up in the same class of CFTR
chloride channel dysfunction. Due to this functional uncoupling of gene
structure from protein function in cystic fibrosis, genetic sequence
variation has a low impact on functional variation at the protein level
(1000 to 5). More important than gene sequence variation is the spatial
location of a mutation in the 3D structure of a protein (Rich DP, 1993).
Even more important is the cellular and organ location of a functional
defect, as in cystic fibrosis it is mainly the pathological process
(Pseudomonas aeruginosa infection) in the lungs that is a major cause of
morbidity and mortality (Elkin S, 2003).
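
The 1000-to-5 reduction can be checked with a few lines of arithmetic. The
sketch below is only an illustration: the variable names are chosen for
this example, and the counts are the rounded figures quoted in the text,
not exact database values.

```python
# Rounded figures quoted in the text (not exact database counts).
n_mutations = 1000         # known CFTR mutations
n_functional_classes = 5   # functional classes at the protein level

# Ratio of functional classes to mutations, expressed as a percentage.
ratio_percent = 100 * n_functional_classes / n_mutations

# Average number of mutations that share one functional class.
mutations_per_class = n_mutations / n_functional_classes

print(f"{ratio_percent:.1f} percent")
print(f"{mutations_per_class:.0f} mutations per class on average")
```

The second number is the more telling one: knowing which of the many
mutations a patient carries adds little once the functional class is known.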

Other genes act as modulators of the disease outcome, even in a disease
such as cystic fibrosis, once regarded as a monogenic disease (Hull J,
1998; Frank RE, 2004; Salvatore F, 2002; Sontag MK, 2004; Sangiuolo F,
2004). We even need to take into account epigenetic information and
environmental influences on disease outcome, even in a so-called monogenic
disease such as cystic fibrosis.

Human populations show considerable genetic heterogeneity (allelic
variation) and even geographic variation, which leads to difficulties in
using gene sequence based diagnostic tools (Liu W, 2004; Raskin S, 2003).
So, the sequence of one individual’s genome allows studying one
person’s genetic profile, but does not lead to a population-wide
prediction of genetic profiles. Genetic heterogeneity uncouples clinical
outcome from model gene sequences (Imahara SD, 2004). This problem is not
solved by simply adding more sequence information without a functional
understanding of the meaning of sequence variation on phenotypic
expression and disease outcome in the patient. Structural information
without functional understanding leads to predictive deficits. The
functional understanding of a disease process must be at the level of the
patient and his cells and not at a lower order organizational level, such
as the genome or proteome alone.

Genetic heterogeneity leads to a reduced sensitivity and an increase in
false negative results if a genetic test is not adapted to this genetic
heterogeneity. A mutational test leads to a simpler almost ‘binary’
readout, instead of the more ‘analog’ interpretation of a continuum of
values in a functional test, but this comes at a price. A test which
detects a disease marker at a higher organizational level can detect a
disease more easily and will lead to fewer false negatives in this case.
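
How quickly sensitivity erodes can be illustrated with a small calculation.
The sketch below rests on assumptions that are not in the text: an
autosomal recessive disease, a mutation panel that recognises a fixed
fraction ("coverage") of the disease alleles circulating in a population,
and two alleles per patient that are drawn independently. The function name
and the coverage values are hypothetical.

```python
def both_alleles_detected(coverage: float) -> float:
    """Probability that a mutation panel identifies both disease alleles
    of an affected person, assuming the two alleles are independent and
    each one is on the panel with probability `coverage`."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return coverage ** 2


# Hypothetical panel coverages for populations with a different allelic mix.
for coverage in (0.9, 0.7, 0.5):
    sensitivity = both_alleles_detected(coverage)
    print(f"coverage {coverage:.0%}: both alleles found in {sensitivity:.0%} "
          f"of patients, missed or incomplete in {1 - sensitivity:.0%}")
```

Under these assumptions the share of patients with an incomplete or
negative result grows faster than the share of alleles missing from the
panel, which is the price paid for the simple 'binary' readout.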

The complexity of even monogenic diseases and the web of functional
interactions at the genome level, protein interactions and
environmental influences on the disease outcome will dilute the predictive
power of structural sequence information at the DNA level. Using
low-dimensional intracellular data to predict iso- and epicellular
phenomena has a low predictive power to be used in clinical situations as

No pharmaceutical company would like the idea that it requires 14 years of
preclinical research to reach an IND after a new drug target was
identified, as in cystic fibrosis. Even if only 1000 genes out of our
25,000 were involved in human diseases and each required the same amount
of work, it would take the equivalent of 14,000 years of work to achieve
for all of them what has been achieved for the cystic fibrosis gene. And
up to this moment the identification of the CFTR gene has not produced a
causal (gene) therapy, only an improvement of prenatal diagnostics
(Klink D, 2004).
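
The extrapolation is a simple multiplication, sketched here with the
figures from the text. The number of disease genes is the hypothetical
"only 1000" of the argument, not a measured value, and the variable names
are chosen for this example.

```python
years_per_gene = 14     # years since the CFTR gene was identified (per the text)
disease_genes = 1000    # hypothetical number of genes involved in disease
total_genes = 25_000    # gene count quoted in the text

# Work needed if every disease gene took as long as CFTR, done one by one.
total_years = years_per_gene * disease_genes

# Share of the genome covered by this deliberately low estimate.
fraction_of_genome = disease_genes / total_genes

print(f"{total_years:,} years of work")
print(f"{fraction_of_genome:.0%} of all genes")
```

Work on different genes runs in parallel, of course, so this is a measure
of total effort and not of calendar time.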

Pseudomonas aeruginosa lung infection is the major cause of morbidity and
mortality in patients with cystic fibrosis (Elkin S, 2003). Over the past
decades we have seen an improvement of symptomatic therapy, but still no
causal therapy, leaving aside a lung transplant.

How are we going to develop drugs which have a large enough patient
population to pay for the costs of drug discovery and development if we
need to target individual mutant protein molecules? If it can be so
difficult to go from a single gene to develop a therapy based on genetic
information, how do we expect to proceed for the entire genome and

Degenerative diseases and cancer

The increasing longevity of Western populations is increasingly straining
public healthcare systems, due to an increase in incidence of degenerative
diseases and cancer. A diminishing active population has to support the
growing financial demands of a healthcare system. Improving the health and
self-reliance of the growing number of elderly people by efficient
treatments of degenerative diseases and cancer is an important political
issue. Where are we, and where are we going in science, to solve these
fundamental problems of modern society?

Unraveling the pathological mechanism of a complex disease is a major
scientific challenge and still beyond reach of present day science in many
cases. For degenerative diseases, such as Alzheimer’s disease, cancer,
birth defects, cardiovascular diseases, Parkinson’s disease, diabetes,
and nerve degeneration it is the dynamics of the cellular machinery itself
which fails. Sharing one genome does not lead to sharing the same
pathology, as cellular differentiation leads to a highly diverse spatial
and temporal cellular function and morphology. Differential and
heterogeneous degeneration patterns of different cell types are the
consequence of a highly differentiated spatial and temporal expression
pattern of proteins in different cell types and different sub-cellular

Unravelling part of the genetics of a disease does not yet bring
therapeutic success. Multiple genes and (multiple) environmental factors
contribute to the disease process and its clinical outcome in complex
diseases (Liebman MN, 2002). In Crohn’s disease the gene defect found
does not explain the severity of the disease (Peltekova VD, 2004). In
breast cancer, genetic variants of BRCA1 and BRCA2 do not have a consistent
level of penetrance, and as such their presence alone does not explain the
disease process (Ford D et al, 1998; Hartge, 2003). Although there is
evidence for the involvement of the gene for PPAR-gamma in type 2
diabetes, the mechanism by which it contributes to the disease process of
diabetes is not clear and could not be deduced from genetic information
alone (Barroso I, 1999).

APC (Adenomatous Polyposis Coli) and HNPCC (Hereditary Non-Polyposis
Colorectal Cancer), which have a genetic origin, account for only about 5
percent of all cases of colorectal cancer (Kinzler, 1996). Genes which are
involved in diabetes, such as GCK (glucokinase), HNF1A and HNF4A
(Hepatocyte Nuclear Factor), are linked to less than 5 percent of cases of
diabetes (Edlund, 1998; Fajans, 2001).

One of the major emerging health problems of modern society is
Alzheimer’s disease (AD). This is not only because widely known people,
such as the former president of the USA, Ronald Reagan, suffered from the
disease in a long and unpleasant disease process. Today AD is still a
chronic disease without a cure which causes patients to receive long-term
care (Souder E, 2004).

Presently available drugs improve symptoms, but do not have a profound
disease-modifying effect and fail to alter the course of AD, so it may be
time to change the way we think about AD therapeutics (Crentsil V., 2004;
Citron M., 2004; Kostrzewa RM, 2004). Will we see a breakthrough in the
understanding of the cellular and molecular alterations that are
responsible for the degeneration of neurons in AD patients (Mattson MP.,

In Alzheimer’s disease (AD), only a minority of cases can be linked to a
single hereditary gene mutation; the complexity of the disease process
extends beyond our present understanding and disease models (Selkoe DJ.,
2001; Eikelenboom P, 2004). Neurodegeneration in AD may be caused by
deposition of amyloid beta-peptide in plaques in brain tissue (Amyloid
Hypothesis), but no causal treatment has come out of this in 10 years of
hard work (Hardy J, 2002; Lee HG, 2004; Lee HG, 2004b). Little is
understood about the dynamics of amyloid beta-peptide and its fundamental
role in the disease process of AD (Regland B, 1992; Koo EH., 2002; LeVine
H 3rd., 2004).

A complex disease requires studying and understanding a complex in-vivo
pattern of a spatially and temporally changing metabolic process, which
goes beyond studying gene expression profiles, either single or
multiplexed. Studying the multi-scale spatial and temporal dynamics of a
complex disease process in a long-term space-time continuum is a
tremendous scientific challenge. Instead of focusing on individual
(molecular) targets in drug research and therapy, complex diseases may
require pathway-engineering or cell replacement to restore the appropriate
dynamics of spatial and temporal patterns of intracellular molecular
processes. Functional or structural protein (re-) modeling or restoration
in-vivo may be a better approach for complex diseases than just docking a
small molecule to an active binding site.

At this moment the cell is the target for many therapeutic efforts to come
to a causal therapy of complex diseases, which we can now only treat with
external substitution, such as diabetes. Many diseases are far more
complex and multi-factorial than monogenic diseases and should be studied
with more power at a higher biological level than the genome or proteome
to capture the complexity of the disease process.

One of the most promising domains of research today is stem cell research
(He Q, 2003; Doss MX, 2004). Since the isolation and growth in culture of
proliferative cells derived from mouse embryos in 1981, stem cell research
has come a long way (Evans MJ, 1981; Martin GR., 1981). Instead of
treating complex disease processes with a multitude of drugs, each with
its own spectrum of sometimes serious and cumulative side effects, failing
components of the human cytome could be engineered or replaced by stem
cells (adult or embryonic) differentiated into the appropriate cell type.

When the distortion of cellular metabolism goes beyond a mere dysfunction
of a single protein, a complete replacement of the dysfunctional cells has
a better chance to restore the complex and delicate balance and regulation
of metabolic processing. The fine dynamics of spatial and temporal
regulation of cellular metabolism and its response to changing demands of
an organism in complex diseases are best met by replacing the failing part
of the cytome with a well balanced cellular substitute. Those parts of
cellular processes which are beyond the reach of (present-day) drug
therapy or which are insufficiently treated by non-cellular means have the
prospect of being restored to a physiologically appropriate level. With
stem cell therapy we would be able to replace a non-functional part of the
human cytome with a set of functioning and dynamically regulated cellular

Several diseases which currently cannot be treated or cured completely are
the target of intensive research. In diabetes, long-term insulin
replacement therapy does not prevent a multitude of chronic and severe
complications, such as circulatory abnormalities, retinopathy, nephropathy,
neuropathy and foot ulcers. In juvenile diabetes, however, there is an
immunological component which complicates treatment. The prospect of
finding a cure for diabetes which would restore the dynamics of insulin
production is an important scientific and social challenge (Heit JJ, 2004).

There is hope for the development of stem cell therapies in human
neurodegenerative disorders (Kim SU., 2004; Lazic SE, 2004; Lindvall O,
2004). Much research goes into finding a cure for degenerative diseases
such as Parkinson’s disease (Drucker-Colin R, 2004; Hermann A, 2004;
Roitberg B, 2004). Scientists are investigating the possibility of treating
a failing heart with cellular cardiomyoplasty (Wold LE, 2004).

When we want to use stem cells for disease therapy we have to deal with
the functional and structural characteristics of cells which are being
used (Baksh D, 2004). The differentiation of stem cells of either adult or
embryonic origin, into mature and functional cells is a complex and
dynamically regulated process. Understanding the differentiation pathways
of embryonic and adult stem cells and their spatio-temporal dynamics of
differentiation and structural organization will require intensive
research (Raff M., 2003). When using stem cells from an individual who
suffers from a degenerative disease, the disease may not be cured when the
same deficient pathway is activated in the differentiating stem cell. The
molecular process may need to be corrected first in this case, for
instance by gene therapy or by using exogenous stem cells.

Gene therapy also holds many promises for the therapy of life threatening
diseases, but in order to improve gene therapy we will need a better
understanding of what goes on inside the cell and what the consequences
are on the cellular metabolism when we modify its function by inserting
genes. At this moment monogenic diseases are the target for gene therapy,
but in the future entire parts of pathways may need reconstruction. The
gene is the means to achieve the ultimate goal to change the cellular
metabolism to cure a disease.

The scientific challenges posed by complex diseases, such as many
degenerative and chronic diseases and cancer will keep scientists busy,
far beyond the current scope of present day science. Drug discovery and

How to explore and find new directions for research

Conclusion

The future development of this idea will decide if a Human Cytome Project
(HCP) will become reality. The road from gene to phenotype is not a simple
path, but a multidimensional space built from an extensive web of
interacting processes. I can only provide ideas and explain why it would
benefit society and science to explore the cytome in a more organized and
systematic way than is currently being done. The cellular level of
biological organization deserves more in-depth exploration and
quantitative analysis to improve our understanding of important human
disease processes in order to allow us to deal with the scientific and
medical challenges we are facing today and will be facing in the future.

History of article

Original HCP message, 1 December 2003

Update and first article on website 30 Jan. 2004

Posting of HCP article version 24 Sept. 2004

Posting of HCP article version 12 Oct. 2004

Posting of HCP article version 19 Oct. 2004

Posting of HCP article version 25 Oct. 2004

Posting of HCP article version 10 Nov. 2004

Posting of HCP article version 22 Nov. 2004

Posting of HCP article version 6 Jan. 2005

Meetings

* Focus On Microscopy, Philadelphia, USA - 2004
* ISAC XXII, Montpellier, France - 2004
* European Microscopy Congress, Antwerp, Belgium - 2004
* EWGCCA, Mol, Belgium - 2004
* 10th Leipziger Workshop, Leipzig, Germany - 2005
* ISAC XXIII, Quebec, Canada - 2006


* Towards a Human Cytome Project
* Draft: Human Cytome Project
* Cytomics
* Functional genomics
* Cyttron
* Prediction in Cell-based Systems (Predictive Cytomics)
* Biomedical Structural Research


I am indebted, for their pioneering work in automated digital microscopy,
to Frans Cornelissen, Hugo Geerts, Jan-Mark Geusebroek, Roger Nuyens, Rony
Nuydens, Luk Ver Donck and their colleagues. Many thanks also to the
pioneers of Nanovid microscopy, Marc De Brabander, Jan De Mey, Hugo
Geerts, Marc Moeremans, Rony Nuydens and their colleagues.

References

References can be found here

Copyright notice and disclaimer

My web pages represent my interests, my opinions and my ideas, not those
of my employer or anyone else. I have created these web pages without any
commercial goal, but solely out of personal and scientific interest. You
may download, display, print and copy, any material at this website, in
unaltered form only, for your personal use or for non-commercial use
within your organization. Should my web pages or portions of my web pages
be used on any Internet or World Wide Web page or informational
presentation, I ask that a link back to my website (and where appropriate
back to the source document) be established. I expect at least a short
notice by email when you copy my web pages, or parts of them, for your own
use. Any
information here is provided in good faith but no warranty can be made for
its accuracy. As this is a work in progress, it is still incomplete and
even inaccurate. Although care has been taken in preparing the information
contained in my web pages, I do not and cannot guarantee the accuracy
thereof. Anyone using the information does so at their own risk and shall
be deemed to indemnify me from any and all injury or damage arising from
such use. To the best of my knowledge, all graphics, text and other
presentations not created by me on my web pages are in the public domain
and freely available from various sources on the Internet or elsewhere
and/or kindly provided by the owner. If you notice something incorrect or
have any questions, send me an email.

Email: pvosta at cs dot com

The author of this webpage is Peter Van Osta, MD.

A first draft was published on Monday, 1 December 2003 in the
bionet.cellbiol newsgroup. I plan to post regular updates of this text to
the bionet.cellbiol newsgroup.

Latest revision on 5 March 2005