Hi Austin,

Thanks again for replying. The k-means algorithm should be a snap. But
how do I convert the proteins, which are in the format
"UPSP_SLDJK_HUMAN_P12182", to vectors that can be handled by the
mathematical algorithm (i.e., what is the "distance" between two
proteins)? Is there already a program that does this? (I understand
there's something on the NCBI's website.)

It seems there is no before-and-after comparison. This is a simple
measurement of proteins in people with and without a disease. I do not
know which is which; I simply have a list of proteins.

Mathematical background: graph theory, combinatorics, probability,
theory of computation, linear algebra, multivariable calculus,
differential equations. Also a lot of programming.

I can't tell you how much I appreciate your help. My friend is swamped
and trying to make a deadline. I'm trying to give him a hand but am not
up to snuff, it seems.

Best,
Rex

Austin P. So (Hae-Jin) wrote: