| Protocols and Methods Forum Post Any Protocol, Method, Technique, Procedure or Tips / Troubleshooting for any Molecular Biology Technique. |
|
#1
| |||
| |||
| Hello, I apologize in advance for the extreme vagueness of the following question. A friend is asking me if I can help with a project of his on protein microarrays, and I don't know anything about the field myself. Here goes: for someone new to the field, how long would it take to learn to do some meaningful analysis of protein microarrays with a software package? I heard TM4 is good. I have a strong mathematical background but little knowledge of molecular biology. Is this generally the kind of thing that can be picked up in a week or so? Any kind of response would be greatly appreciated. Thank you, Rex |
|
#2
| |||
| |||
| Ultimately it depends on what you want to do. There is no need for a molecular biology background if you want to help your friend analyze his data. If you know how to analyze a robust data set in general, then you can help him out. All you need to be aware of is that the data will essentially be in the form of a large matrix, typically with each row representing a gene/protein/cell and each column representing a different sample/time/dose. You want to arrange the data by some algorithm, either by row or by column across all the data points, so that some pattern will reveal itself. If you have a mathematical model, you can see which data fit the model. If you don't, you can build a model (most people use a linear or a Fourier model) based on the profile generated. Then let your friend interpret the results. In some ways this is "better", because without the biological background you cannot bias the analysis. One package that is commonly used is "R". It is a good general statistical package, but the learning curve is a bit steep. Or you can simply use any software that lets you decompose a data matrix or cluster it (Eisen's group has freeware that does hierarchical clustering). Just be aware that any analysis is only as good as the quality of the data obtained, and there is no quality standard for any of the arrays out there (i.e. the data is very noisy). Austin |
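For someone coming from the math side, the "large matrix" description can be made concrete with a small sketch. The protein names and values below are made up for illustration, and Pearson correlation is only one assumed choice of similarity between row profiles; it is not something the thread prescribes.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # flat profile: correlation is undefined, treat as unrelated
    return cov / (norm_x * norm_y)

# Hypothetical data matrix: one row per protein, one column per sample.
data = {
    "protA": [1.0, 2.0, 3.0, 4.0],
    "protB": [2.0, 4.0, 6.0, 8.0],
    "protC": [4.0, 3.0, 2.0, 1.0],
}

def correlation_matrix(rows):
    """Correlation of every row profile with every other row profile."""
    names = sorted(rows)
    return {(a, b): pearson(rows[a], rows[b]) for a in names for b in names}

corr = correlation_matrix(data)
for (a, b), r in sorted(corr.items()):
    if a < b:
        print(a, b, round(r, 3))
```

Rows that rise and fall together score near +1, and rows that move in opposite directions score near -1. A matrix like this is a typical starting point for the clustering tools mentioned above.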
|
#3
| |||
| |||
| Thank you, Austin! Very helpful. I've got a better description of what I need to do now. I have a collection of proteins in a spreadsheet, and need to find out:
- Are some of them related to one another?
- Do they participate in a single transduction pathway in the cell or between cells?
Does this sound like a hard task for someone with lots of math but no bio? Thanks again, Rex |
|
#4
| |||
| |||
| Hi Rex... If it is a matter of "relationship" (in the sense that a set of proteins behave, i.e. go up or down, in the same way over a given treatment), then any k-means algorithm will do the trick. I'm pretty sure this would be a no-brainer for you. I'm actually surprised they don't give it a go by themselves... For anything else (like your second point...though I'm curious about the experimental design) you'd have to give more details, say: col1=proteinID col2=expt1 col3=expt2 etc... BTW...If you don't want to do it, then don't. Technically, if the person is doing a PhD, then he/she should know how to do it themselves anyway and not just hand it off... Austin P.S. What is your math background? I should probably have asked that first... |
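A minimal sketch of the k-means idea, assuming the col1=proteinID, col2=expt1, col3=expt2 layout described above. The profile values are invented, and taking the first k rows as starting centroids is an arbitrary choice made only to keep the example deterministic.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means. Initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical profiles: proteinID -> [expt1, expt2, expt3]
profiles = {
    "protein1": [1.0, 2.0, 3.0],  # goes up
    "protein2": [3.0, 2.0, 1.0],  # goes down
    "protein3": [1.1, 2.1, 2.9],  # goes up
    "protein4": [2.9, 1.9, 1.2],  # goes down
}
names = list(profiles)
points = [profiles[n] for n in names]
labels, centroids = kmeans(points, k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

Real data would normally be normalized first, and k-means is usually run from several random starts, since the result depends on the initial centroids.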
|
#5
| |||
| |||
| Hi Austin, Thanks again for replying. The k-means algorithm should be a snap. But how do I convert the proteins, which are in the format "UPSP_SLDJK_HUMAN_P12182", to vectors that can be handled by the mathematical algorithm (i.e. what is the "distance" between two proteins)? Is there already a program that does this? (I understand there's something on the NCBI's website.) It seems there is no before-and-after. This is a simple measurement of proteins in people with a disease and without a disease. I do not know which is which; I simply have a list of proteins. Mathematical background: graph theory, combinatorics, probability, theory of computation, linear algebra, multivariable calculus, differential equations. Also a lot of programming. I can't tell you how much I appreciate your help. My friend is swamped and trying to make a deadline. I'm trying to give him a hand but am not up to snuff, it seems. Best, Rex |
|
#6
| |||
| |||
| So, if I understand the format of the data:
1. "UPSP_SLDJK_HUMAN_P12182" is just a name; say it is a row id.
2. With that name (i.e. in each row), you will have a series of data points, each data point corresponding to the amount of protein found in patient X (technically you don't have to know whether they have the disease or not).
3. Each column (i.e. patient data) will therefore be a (multidimensional) data vector, with each protein being an "axis".

          patient1  patient2  patient3  patient4
protein1  1         50        49        3
protein2  2         35        30        1
protein3  30        20        20        31

In this way you can apply (hierarchical) k-means clustering on the column "vectors". Note that you may not get anything either, since ultimately your analysis is only as good as your data... Austin |
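As a rough illustration of clustering the column "vectors", here is a sketch that uses the small table from the post. It swaps in single-linkage agglomerative clustering with Euclidean distance, which is one simple hierarchical method and not necessarily what Austin had in mind. The function name and the choice of two clusters are assumptions.

```python
import math

proteins = ["protein1", "protein2", "protein3"]
patients = ["patient1", "patient2", "patient3", "patient4"]
matrix = [
    [1, 50, 49, 3],    # protein1
    [2, 35, 30, 1],    # protein2
    [30, 20, 20, 31],  # protein3
]

# Transpose so each patient becomes one vector with a coordinate per protein.
vectors = dict(zip(patients, zip(*matrix)))

def single_linkage(vecs, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in vecs]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(vecs[a], vecs[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

groups = single_linkage(vectors, n_clusters=2)
print(groups)
```

On this table, patient1 and patient4 end up in one group and patient2 and patient3 in the other, which matches what the numbers suggest by eye.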
|
#7
| |||
| |||
| Hi Austin, I just have a plain list of 200 proteins, without data from the experiment. I need to cluster the proteins by their inherent characteristics (function, ancestry). I used the protein database on the NCBI website to get the sequences. Now I want to take all 200 sequences and get some measure of how similar each one is to every other. I figure this would require some specific software that would allow me to enter all the proteins and see how they're related. I found ProtoNet, but it seems you can only enter one protein and explore its specific cluster. Are there any other tools for this I might not be aware of? I'm sorry to keep asking you questions like this; just referring me to a website that explains this would be greatly appreciated. Thank you, Rex |
|
#8
| |||
| |||
| Oh...then just ignore everything I've said...you seem to be asking a number of different, unrelated questions though... This is not an easy project for you then, if you are going to go outside the standard tools at NCBI. A simple comparison can be made through the BLAST server: [Only registered users see links. ] A good resource for constructing phylogenetic trees (which is what you seem to want to do here): [Only registered users see links. ] Good luck, Austin |
|
#9
| |||
| |||
| Thanks a lot Austin! |
|
#10
| |||
| |||
| The standard way to do this is to use ClustalX, which aligns the amino acid sequence of every protein with every other (Needleman/Wunsch algorithm) and from that calculates a similarity matrix. From this matrix a phylogenetic tree is calculated and written out in a text format. Programs like TreeView can read this and display the trees graphically. All these programs are freely available on the net. Note, however, that this has nothing to do with protein array data. In a protein array experiment you measure the expression of proteins in cells or tissues depending on experimental factors (e.g. the presence or absence of a disease) and then find groups of proteins which react in a similar way (e.g. expression goes up). "Similarity" thus has totally different meanings in the two fields. |
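To show what "align every sequence with every other and build a similarity matrix" means, here is a toy sketch of Needleman-Wunsch global alignment scoring. The scoring scheme (match +1, mismatch -1, gap -1) and the three short sequences are assumptions for illustration only. ClustalX itself uses amino acid substitution matrices and more elaborate gap penalties, so this is not its actual implementation.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score, computed row by row."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap       # gap in b
            left = cur[j - 1] + gap  # gap in a
            cur.append(max(diag, up, left))
        prev = cur
    return prev[-1]

# Hypothetical toy sequences; real input would be the 200 protein sequences.
seqs = {
    "p1": "MKVLAT",
    "p2": "MKVLAT",
    "p3": "MKLAT",
}

def similarity_matrix(sequences):
    """Pairwise alignment score of every sequence with every other."""
    names = sorted(sequences)
    return {(x, y): nw_score(sequences[x], sequences[y]) for x in names for y in names}

sim = similarity_matrix(seqs)
for (x, y), score in sorted(sim.items()):
    if x <= y:
        print(x, y, score)
```

A tree-building method would then take scores like these, usually converted to distances, as its input. For 200 real proteins the existing tools are the practical route; the sketch is only meant to show where the "distance between two proteins" comes from.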
| Thread | Thread Starter | Forum | Replies | Last Post |
| Newbie question about microarray analysis | Rex Eastbourne | Protein Forum | 9 | 06-08-2006 12:13 PM |
| [Protein-analysis] Re: Newbie question about microarray analysis | Derek Potter | Protein Forum | 0 | 05-31-2006 02:49 PM |