I am designing a time-series microarray experiment. We have genome-wide microarrays, and I plan to use 15-20 biological replicates, as per the Churchill and Simons formulas for the number of replicates. I need to confirm a pre-processing step before we apply the actual analysis to the data.
As a pre-processing step, after normalizing the microarray data, do we identify differentially expressed genes (using t-tests, correlation methods, or clustering) before applying the Dynamic Bayesian Network (DBN)? Or do we apply the DBN directly after normalization? I think we do need to find the differentially expressed genes first. Could you please confirm?
My reasoning is that identifying differentially expressed genes reduces the number of genes many-fold. Also, if a specific gene does not change its expression across time 0, time 1, time 2, and so on, it is most probably not related to the aims of our analysis, which in my case are (1) gene cascade prediction and (2) identifying differentially expressed genes. Basically, we are measuring the effect of genes over time under a particular condition.
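To make the filtering idea concrete, here is a minimal Python sketch of what I have in mind. It assumes the normalized data is a genes x time-points matrix of log-ratios, and it uses a plain range cutoff instead of a proper statistical test such as a t-test. The function name `filter_flat_genes`, the `min_range` threshold, and the toy numbers are all made up for illustration.

```python
import numpy as np

def filter_flat_genes(expr, min_range=1.0):
    """Return row indices of genes whose expression range over time is >= min_range.

    expr: 2-D array, rows = genes, columns = time points (assumed log-ratios).
    min_range: hypothetical cutoff; a real analysis would use a statistical test.
    """
    expr = np.asarray(expr, dtype=float)
    # Spread of each gene across all time points (max minus min per row).
    spread = expr.max(axis=1) - expr.min(axis=1)
    return np.flatnonzero(spread >= min_range)

# Toy data: 4 genes measured at 4 time points (invented values).
expr = np.array([
    [0.0,  0.1, -0.1, 0.05],  # essentially flat
    [0.0,  0.8,  1.6, 2.4],   # steadily increasing
    [0.2,  0.2,  0.2, 0.2],   # constant
    [0.0, -1.5, -0.5, 0.3],   # transient dip
])

kept = filter_flat_genes(expr, min_range=1.0)
print(kept)  # indices of the genes that would be passed on to the DBN step
```

Only the genes that survive this kind of filter would then go into the DBN, which is the step I am asking about.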
One more question regarding the experimental design of a time-series microarray analysis: we always use a reference design (compare time0-time1, time0-time2, and so on up to time0-time10) because the data is then easy to analyze with Bayesian networks, clustering, or other techniques. The alternative is a loop design (time0-time1, time1-time2, ..., time9-time10, time10-time0), which is more efficient in the sense that the number of biological replicates is reduced by half, but I think it is difficult to analyze data from a loop design with Bayesian networks. Is this true?
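To spell out the two designs, here is a small Python sketch that simply lists which time points are paired on each array for 11 time points (time0 to time10) and counts how often each time point is used. The function names are my own, and it says nothing about how the resulting data should be analyzed.

```python
from collections import Counter

def reference_design(n_timepoints):
    """Pair every later time point with time0: (0,1), (0,2), ..., (0,n-1)."""
    return [(0, t) for t in range(1, n_timepoints)]

def loop_design(n_timepoints):
    """Pair consecutive time points and close the loop back to time0."""
    return [(t, (t + 1) % n_timepoints) for t in range(n_timepoints)]

def usage(pairs):
    """Count how many arrays each time point appears on."""
    return Counter(t for pair in pairs for t in pair)

n = 11  # time0 ... time10
ref = reference_design(n)
loop = loop_design(n)

print("reference arrays:", ref)
print("loop arrays:     ", loop)
print("reference usage: ", dict(usage(ref)))
print("loop usage:      ", dict(usage(loop)))
```

In the reference design, time0 sits on every array and each other time point on exactly one. In the loop design, every time point sits on exactly two arrays, but most pairs of time points are only connected indirectly through the loop.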
PS: I am planning to use a DBN architecture like the one shown in the attached file.