I need to do a chi-squared test between two sequences and their reference.
I have a table of 2 columns and 2 rows.
column 1 : cell type 1
column 2 : cell type 2
row 1 : reference nucleotide
row 2 : mutation (SNP)
I want to know if the difference between the reference and the mutation is significant, so we counted the number of reads (from the sequencer) aligned to the reference and to the mutation.
To do this, I decided to use a chi-squared test on a 2x2 contingency table with the following formula:
X^2 = N . (AD-BC)^2 / ((A+B)(C+D)(A+C)(B+D))
The problem is that every time we have a de novo SNP with no reads aligned to the reference (value 0), the chi-squared and p-value are returned as null (because with this formula you end up multiplying by 0).
For example if
==> For us this means that we have a de novo SNP, so it should be significant, but X^2 returns a value of 0...
Does anyone have a better idea to improve my method of finding de novo and significant mutations?
Abbreviations used in X^2 table :
ROW 1 / CELL 1 : A
ROW 1 / CELL 2 : B
ROW 2 / CELL 1 : C
ROW 2 / CELL 2 : D
A + B + C + D = N
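
To make the zero-count case concrete, here is a minimal Python sketch of the standard 2x2 shortcut formula using the A/B/C/D labels above. The function name chi2_2x2 and the read counts are hypothetical, made up for illustration only, and no continuity correction is applied. It assumes the usual denominator (A+B)(C+D)(A+C)(B+D), and returns None when a whole row or column is empty, since the statistic is then 0/0 and undefined.

```python
import math

def chi2_2x2(a, b, c, d):
    """Return (chi2, p) for a 2x2 table, or None if any row/column total is 0."""
    n = a + b + c + d
    # Product of the two row totals and the two column totals.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column gives 0/0: the statistic is undefined.
        return None
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical read counts, for illustration only.
print(chi2_2x2(0, 0, 30, 25))   # reference row empty in both cell types
print(chi2_2x2(0, 40, 30, 25))  # reference reads missing in cell type 1 only
```

In this sketch only the first call is undefined. A single 0 cell still gives a usable statistic as long as every row and column total is non-zero.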