learningbioinformatics 04-23-2012 07:04 PM

cnv map to gene

I am trying to find overlaps between CNV genomic regions (start and stop bp positions) and genes (start and stop bp positions). There are three possible cases: i) there is no overlap, i.e. the gene region does not fall in the CNV region; ii) the gene falls completely within the CNV region; or iii) the gene overlaps partially, i.e. the gene region spans either the CNV start position or the CNV stop position. If there is an overlap the check should return 1, and if not it should return 0. The final output file should list every overlapping CNV region together with the name of the gene it overlaps.
So I am reading in the main CNV file and two other gene list files.
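
To make the three cases concrete, here is a minimal sketch of the test I have in mind. It assumes both regions are on the same chromosome and that start and stop are inclusive with start <= stop. The sub name `regions_overlap` and the coordinates are made up for illustration only.

```perl
use strict;
use warnings;

# Return 1 if the gene region shares at least one base with the CNV region,
# otherwise 0. Assumes inclusive coordinates with start <= stop.
sub regions_overlap {
    my ($cnv_start, $cnv_stop, $gene_start, $gene_stop) = @_;
    # Two regions fail to overlap only if one ends before the other begins.
    return 0 if $gene_stop < $cnv_start || $gene_start > $cnv_stop;
    return 1;
}

# Hypothetical CNV at 1000-2000 checked against made-up gene coordinates
print regions_overlap(1000, 2000, 100, 500), "\n";    # i)   gene entirely outside the CNV
print regions_overlap(1000, 2000, 1200, 1800), "\n";  # ii)  gene completely inside the CNV
print regions_overlap(1000, 2000, 800, 1100), "\n";   # iii) gene overlaps the CNV start
print regions_overlap(1000, 2000, 1900, 2500), "\n";  # iii) gene overlaps the CNV stop
```

Cases ii) and iii) do not need separate tests here: anything that is not case i) counts as an overlap.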

Am I doing this correctly? Any suggestions?


#Open the CNV file with start, stop

my $input_gene_file = "path to file";
die "Cannot open $input_gene_file\n" unless (open(IN, $input_gene_file));

#Keyed by CNV size; note that two CNVs of the same size would overwrite each other
my (%size_and_CNV_start, %size_and_CNV_stop);
while (my $line = <IN>) {
    chomp $line;
    my @columns = split /\s+/, $line;
    my $size = $columns[3];
    my $CNV_start = $columns[1];
    my $CNV_stop = $columns[2];
    $size_and_CNV_start{$size} = $CNV_start;
    $size_and_CNV_stop{$size} = $CNV_stop;
}
close IN;

#Now open files for genelist1 and genelist2 with respective start and stops

open(MYFILE, "path to file") or die "Cannot open genelist1\n";
chomp(my @file1 = <MYFILE>);
close MYFILE;
open(MYFILE2, "path to file") or die "Cannot open genelist2\n";
chomp(my @file2 = <MYFILE2>);
close MYFILE2;

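#Return 1 if the gene overlaps the CNV (fully or partially), otherwise 0
sub overlap {
    my ($CNV_start, $CNV_stop, $Gene_start, $Gene_stop) = @_;
    #No overlap: the gene ends before the CNV starts, or starts after the CNV ends
    if ($Gene_stop < $CNV_start || $Gene_start > $CNV_stop) {
        return 0;
    }
    return 1;
}

#Open output file and write each CNV with the genes that overlap it
die "Cannot open output.txt\n" unless (open(OUT, "> output.txt"));

foreach my $size (keys %size_and_CNV_start) {
    my $CNV_start = $size_and_CNV_start{$size};
    my $CNV_stop = $size_and_CNV_stop{$size};
    foreach my $line (@file1, @file2) {
        #Assumed gene file columns: name, start, stop (adjust to the real layout)
        my ($gene, $Gene_start, $Gene_stop) = split /\s+/, $line;
        if (overlap($CNV_start, $CNV_stop, $Gene_start, $Gene_stop)) {
            print OUT "$size $CNV_start $CNV_stop $gene\n";
        }
    }
}
close OUT;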
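
One thing I am not sure about is using the size as the hash key, since two CNVs can have the same size. A possible alternative, sketched below, keeps each CNV as its own record in an array. The data is made up and hard-coded instead of being read from the files. The field names (`start`, `stop`, `name`) and the sub name `genes_overlapping` are only for illustration, and it again assumes everything is on the same chromosome.

```perl
use strict;
use warnings;

# Made-up example data; in the real script these would come from the CNV and gene files.
my @cnvs = (
    { start => 1000, stop => 2000 },
    { start => 5000, stop => 6000 },   # same size as the first CNV, kept as a separate record
);
my @genes = (
    { name => 'geneA', start => 1500, stop => 1700 },
    { name => 'geneB', start => 1900, stop => 5100 },
    { name => 'geneC', start => 7000, stop => 8000 },
);

# Return the names of all genes that overlap the given CNV record.
sub genes_overlapping {
    my ($cnv, $genes) = @_;
    return map  { $_->{name} }
           grep { $_->{stop} >= $cnv->{start} && $_->{start} <= $cnv->{stop} }
           @$genes;
}

foreach my $cnv (@cnvs) {
    my @hits = genes_overlapping($cnv, \@genes);
    # 1 if at least one gene overlaps this CNV, otherwise 0
    my $flag = @hits ? 1 : 0;
    print join("\t", $cnv->{start}, $cnv->{stop}, $flag, join(",", @hits)), "\n";
}
```

Each output line has the CNV start, the CNV stop, the 1/0 flag, and a comma-separated list of overlapping gene names.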
