Hi to all,

I have data like this

DATA HAVE

pop A B C D E
P1 T/T C/C C/C T/T C/C
P2 A/A G/G C/C T/T C/C
1 A/A G/G C/C T/T C/C
2 A/A G/G C/C T/T C/C
3 A/T A/C A/G A/T A/C
4 T/A T/G T/C T/A T/G
5 G/A G/T G/C G/A G/T
6 C/A C/T C/G C/A C/T

pop A B C D E
P1 T/T C/C C/C T/T C/C
P2 A/A G/G C/C T/T C/C
1 A/A G/G C/C T/T C/C
2 A/A G/G C/C T/T C/C
3 A/T A/C A/G A/T A/C
4 T/A T/G T/C T/A T/G
5 G/A G/T G/C G/A G/T
6 C/A C/T C/G C/A C/T

Guidelines for the work:

1. First, I want to convert A/A to A, T/T to T, C/C to C and G/G to G, and both Z/Z and -/- to -. Every remaining combination of A, T, G and C (A/T, G/T, C/G, T/C, etc.) should become H.

2. Next, I want a status for each of the columns A to E, found by comparing P1 with P2: if P1 = P2, or if either P1 or P2 contains Z/Z or -/-, the status is mono; otherwise the status is poly.

3. Then I want to compare row 1 in the pop column with P2 in the pop column for columns A to E. If the value in row 1 matches P2 and the status of that column is poly, I would like to replace it with 1; otherwise the value stays as it is. If the status is mono, I do not want to change anything.

4. Now I will count the number of 1s (#1) and the number of Hs (#H), and finally calculate %sim with this formula: %sim = ((#1*2 + #H) / ((#1 + #H)*2)) * 100.

5. I want to repeat the same procedure for the second set of parents P1 and P2.
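As a small worked illustration of the formula in guideline 4, here is a sketch in Perl. The sub name pct_sim and the decision to return undef when there are no 1s and no Hs are assumptions made for the example; they are not part of the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: %sim = ((#1*2 + #H) / ((#1 + #H)*2)) * 100
sub pct_sim {
    my ($ones, $hets) = @_;
    my $total = $ones + $hets;
    return undef if $total == 0;    # assumption: undefined when nothing was counted
    return (($ones * 2 + $hets) / ($total * 2)) * 100;
}

# Three 1s and one H: (3*2 + 1) / ((3 + 1)*2) * 100 = 7/8 * 100
printf "%.1f\n", pct_sim(3, 1);    # prints 87.5
```

With only 1s the result is 100, and with only Hs it is 50, so the value always lies between 50 and 100 whenever at least one cell is counted.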

I tried this code for the first guideline:

Code:
#!/usr/bin/perl
use strict;
use warnings;

open(my $in, '<', 'input.txt') or die "Cannot open input.txt: $!";
my @lines = <$in>;
close($in);

foreach (@lines) {
    s{([ACGT])/\1}{$1}g;      # homozygous calls: A/A -> A, T/T -> T, ...
    s{Z/Z|-/-}{-}g;           # missing calls: Z/Z and -/- -> -
    s{[ACGT]/[ACGT]}{H}g;     # any remaining pair (A/T, G/C, ...) -> H
}

open(my $out, '>', 'input1.txt') or die "Cannot open input1.txt: $!";
print $out @lines;
close($out);

Can anyone help me proceed with the remaining guidelines? I know all of this can go into one single program, but as a newbie in Perl it looks difficult to me. Any help would be highly appreciated.

Code for 2nd guideline

Code:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use Data::Dumper;

# Read from the __DATA__ section when no input file is given.
*ARGV = *DATA{IO} unless @ARGV;

my (@parents, @rows);
# Each set is one header line, two parent lines and six numbered rows.
sub {
    my $header = <>;
    push @parents, map [ split ' ', <> ], 1, 2;
    push @rows,    map [ split ' ', <> ], 1 .. 6;
}->() for 1, 2;

# Guideline 1: recode every cell in place.
for (map @$_, @parents, @rows) {
    s= ([ACTG]) / \1 =$1=x;
    s= ([-Z]) / \1 =-=x;
    s= . / . =H=x;
}

# Guideline 2: mono/poly status per column for both parent sets.
say join "\t", 'pop', ('A' .. 'E') x 2;
print 'P1';
for my $parent (0, 1) {
    print join "\t", q(), map {
        my $p1 = $parents[ $parent * 2 ][$_];
        my $p2 = $parents[ 1 + $parent * 2 ][$_];
        ($p1 eq $p2 or '-' eq $p1 or '-' eq $p2) ? 'mono' : 'poly';
    } 1 .. 5;
}
print "\n";

__DATA__
pop A B C D E
P1 T/T C/C C/C T/T C/C
P2 A/A G/G C/C T/T C/C
1 A/A G/G C/C T/T C/C
2 A/A G/G C/C T/T C/C
3 A/T A/C A/G A/T A/C
4 T/A T/G T/C T/A T/G
5 G/A G/T G/C G/A G/T
6 C/A C/T C/G C/A C/T
pop A B C D E
P1 T/T C/C C/C T/T C/C
P2 A/A G/G C/C T/T C/C
1 A/A G/G C/C T/T C/C
2 A/A G/G C/C T/T C/C
3 A/T A/C A/G A/T A/C
4 T/A T/G T/C T/A T/G
5 G/A G/T G/C G/A G/T
6 C/A C/T C/G C/A C/T
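Guideline 3 could be approached along the following lines. This is only a sketch: the sub name score_row and the array-ref layout are hypothetical, it assumes the values have already been recoded to A, C, G, T, H or -, and it assumes the same rule applies to every numbered row, not just row 1.

```perl
use strict;
use warnings;

# Hypothetical helper for guideline 3. All three arguments are array refs
# holding already-recoded values for columns A to E.
# Assumption: a cell becomes 1 only where the column is poly and the row's
# value equals P2; every other cell is returned unchanged.
sub score_row {
    my ($row, $p1, $p2) = @_;
    return [ map {
        my $mono = $p1->[$_] eq $p2->[$_]
                || $p1->[$_] eq '-'
                || $p2->[$_] eq '-';
        (!$mono && $row->[$_] eq $p2->[$_]) ? 1 : $row->[$_];
    } 0 .. $#$row ];
}

my @p1   = qw(T C C T C);    # P1 after recoding
my @p2   = qw(A G C T C);    # P2 after recoding
my @row1 = qw(A G C T C);    # row 1 after recoding
my @row3 = qw(H H H H H);    # row 3 after recoding

print join(' ', @{ score_row(\@row1, \@p1, \@p2) }), "\n";    # 1 1 C T C
print join(' ', @{ score_row(\@row3, \@p1, \@p2) }), "\n";    # H H H H H

# The counts needed for guideline 4 can then be taken from a scored row.
my $scored = score_row(\@row1, \@p1, \@p2);
my $ones   = grep { $_ eq '1' } @$scored;    # 2 for row 1
my $hets   = grep { $_ eq 'H' } @$scored;    # 0 for row 1
```

Only columns A and B are poly for this parent pair, so those are the only cells that can turn into 1; columns C, D and E stay as they are because P1 equals P2 there.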
