Getting non unique lines from concatenated files


 
# 113  
Old 04-04-2011
Assuming that there are different values in the first field (SK1.chr15), you can try this:
Code:
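#!/usr/bin/perl
# Read the whole input file into $_ as a single string.
open(my $in, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
local $/;    # undefine the input record separator (slurp mode)
$_ = <$in>;
close $in;
$x = 1;      # position expected to follow the previous range
# Each record: name line, "start-end" line, "Ref:" plus one line, blank line, "Gen:" plus one line.
while (/(.*)\n(\d+)-(\d+)\nRef:\n.*\n\nGen:\n(.*)/g) {
  # Pad the gap before this range with "N", then append the line after "Gen:".
  $h{$1} .= "N" x ($2 - $x) . $4;
  $x = $3 + 1;
}
# Print each name followed by its assembled string.
for $i (keys %h) {
  print "$i\n";
  print "$h{$i}\n";
}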

# 114  
Old 04-04-2011
That was really short, quick and efficient, thank you. Could you please explain how it does the job, especially the use of the $/ and $_ variables? The rest is pretty clear to me. The four numbered variables store the regex matches in the while loop, if I'm right? And .= is short for concatenation?
Cheers, Bartus
# 115  
Old 04-04-2011
Yes, what you said about $1, $2, ... and .= is right. Undefining the $/ variable (the input record separator) makes the next read return the whole file as a single string (stored in $_ in this case). $_ is the default variable for many of Perl's operators and functions. For example, if you use the "print" function like this:
Code:
print;

it is equivalent to:
Code:
print $_;

In this code $_ is the implicit target of the regular expression in the while condition, so the matching is performed on the file's contents. It is equivalent to this:
Code:
while ($_=~/(.*)\n(\d+)-(\d+)\nRef:\n.*\n\nGen:\n(.*)/g){

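To see both points together, here is a minimal sketch, not part of the original script. It assumes a made-up three-line sample held in the hypothetical variable $data and read through an in-memory filehandle, so no file on disk is needed. It slurps the data with $/ undefined and then runs the same /g loop twice, once with the implicit $_ target and once with $_ =~ written out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, used only for this illustration.
my $data = "first line\nsecond line\nthird line\n";

# Open an in-memory filehandle so the example needs no file on disk.
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my (@implicit, @explicit);
{
    local $/;          # undefined: the next read returns everything up to end of file
    local $_ = <$fh>;  # the whole "file" is now one string in $_

    # Implicit target: the match operator works on $_ when no =~ is given.
    while (/(\w+) line/g) { push @implicit, $1 }

    # Explicit target: the same loop with $_ =~ written out.
    while ($_ =~ /(\w+) line/g) { push @explicit, $1 }
}
close $fh;

print "implicit: @implicit\n";
print "explicit: @explicit\n";
```

Both loops collect the same three words. Using local inside a block keeps the change to $/ and $_ from leaking into the rest of a larger program.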
# 116  
Old 04-04-2011
So $/ can store whole files as single strings, thanks for that tip. Does this mean it performs chomp and removes newline characters automatically from the input file while it is being fed into $_?
Thanks for your input and comments.
Cheers ++
# 117  
Old 04-04-2011
No, $/ is only the input record separator. Unsetting it does not chomp anything (chomp removes the last newline), so the newlines stay in the string that is read into $_.
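A small sketch of this, assuming a hypothetical two-line sample in $data read through an in-memory filehandle: the newlines survive both a normal line read and a slurp, and they only disappear when chomp is called explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = "one\ntwo\n";   # hypothetical two-line "file"

# Default $/ ("\n"): each read returns one line, newline still attached.
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
my $line = <$fh>;
close $fh;

# $/ undefined: one read returns the whole content, newlines still attached.
open($fh, '<', \$data) or die "Cannot open in-memory handle: $!";
my $all = do { local $/; <$fh> };
close $fh;

my $newlines = () = $all =~ /\n/g;   # count the newlines in the slurped string
print "line ends with newline: ", ($line =~ /\n\z/ ? "yes" : "no"), "\n";
print "newlines in slurped string: $newlines\n";

# chomp has to be called explicitly; here it strips the one trailing newline.
chomp(my $trimmed = $line);
print "after chomp: '$trimmed'\n";
```

chomp does consult $/ to decide what to strip from the end of a string, but reading from a filehandle never calls it for you.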
# 118  
Old 04-04-2011
Quote:
Originally Posted by pawannoel
Thank you very much ... by the way, we have won the World Cup in Cricket today! So that's why I had no questions to post today. Have a good weekend.
Congratulations! Great (if nail-biting) match.

This thread has deviated so much from the original question, and will soon have more posts than India got runs!

If you have a new question please start a new thread.

Thank you.

Closed.
# 119  
Old 04-22-2011
Help with modifying script

Hi All,
I want to make some changes to the code here: https://www.unix.com/302508078-post55.html
Basically, I want to input a coverage value of my choice and report the results of the above code only for the lines whose coverage value is equal to or greater than the value I enter. What could be changed in the code to achieve this? Can someone enlighten me on this?
Thanks for your input.
Have a nice day.
Cheers
 