I am about to receive a human gene expression dataset output by Illumina BeadStudio. The data comprises an AVG_DETECTION column and a DETECTION_P column for each signal in every person. This p-value represents the probability that the signal was derived from non-specific probe hybridisation (i.e. noise).
Does anyone have any insight into the best way to quality control this data with respect to the p-values? For example, should I remove a probe from my analysis if it fails to reach a detection threshold of, say, p < 0.05 in a certain number of people? Or should I be removing people instead?
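To make the question concrete, here is a rough Python sketch of the kind of filtering I have in mind. Everything in it is an assumption on my part rather than an established method: the function name `detection_qc` is made up, the detection p-values are assumed to have already been pulled into a probes-by-people matrix, and the cutoffs (0.05 and the two minimum fractions) are placeholders, not recommendations.

```python
import numpy as np

def detection_qc(pvals, alpha=0.05, min_sample_frac=0.1, min_probe_frac=0.3):
    """Flag probes and people to keep, based on detection p-values.

    pvals: 2-D array, rows = probes, columns = people (hypothetical layout).
    alpha: p-value below which a signal is called "detected" (placeholder).
    min_sample_frac: a probe is kept if detected in at least this fraction of people.
    min_probe_frac: a person is kept if at least this fraction of probes is detected.
    """
    pvals = np.asarray(pvals, dtype=float)
    detected = pvals < alpha                 # True where the signal is distinguishable from noise
    probe_frac = detected.mean(axis=1)       # per probe: fraction of people with a detected signal
    sample_frac = detected.mean(axis=0)      # per person: fraction of probes detected
    keep_probes = probe_frac >= min_sample_frac
    keep_samples = sample_frac >= min_probe_frac
    return keep_probes, keep_samples

# Toy data: 3 probes x 4 people
pvals = np.array([
    [0.001, 0.02, 0.50, 0.01],
    [0.60,  0.70, 0.90, 0.80],
    [0.03,  0.04, 0.40, 0.002],
])
keep_probes, keep_samples = detection_qc(pvals, min_sample_frac=0.5, min_probe_frac=0.5)
filtered = pvals[keep_probes][:, keep_samples]
```

In this toy case the second probe is never detected and the third person has no detected probes, so both would be dropped. What I don't know is whether filtering along either axis like this is sensible, or what cutoffs people actually use.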
Could someone at least point me to a paper that has done this before?
Thanks in advance.