Using perl to output specific fields to one file


 
# 1  
Old 03-25-2016

I am trying to use perl to output specific fields from all text files in a directory to one new file, with each file's fields starting on a new line. The command below seems to work for one text file but not for more than one. Thank you.

Code:
 perl -ne 's/^#//; @n = (6, 7, 8, 16); print if $. ~~ @n' *.txt > out.txt

Format of all text files in the directory:

Code:
#Sample = xxxxx
#Sample Type = xxxxxx
#Build = xxxxxxx
#Platform = xxxxxxx
#Display Name= XXXXX  (keep this field without the #)
#identifier = xxxxxx  (keep this field without the #)
#Gender = xxxxx       (keep this field without the #)
#Control Gender = xxxxx
#Control Sample = xxxxx
#Quality = X.XXXXXX   (keep this field without the # and X.XXX)

Desired output (the command gets this right for only one of the files in the directory):

Code:
Display Name= XXXXX  (keep this field without the #)
identifier = xxxxxx  (keep this field without the #)
Gender = xxxxx       (keep this field without the #)
Quality = X.XXXXXX   (keep this field without the # and X.XXX)
Display Name= XXXXX  (keep this field without the #)
identifier = xxxxxx  (keep this field without the #)
Gender = xxxxx       (keep this field without the #)
Quality = X.XXXXXX   (keep this field without the # and X.XXX)

# 2  
Old 03-25-2016
Quote:
Originally Posted by cmccabe
The below seems to work for one text file but not more.
Code:
 perl -ne 's/^#//; @n = (6, 7, 8, 16); print if $. ~~ @n' *.txt > out.txt

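It "seems" to, but it doesn't.
To work across files, the array @n would have to contain every line number you want to keep, because $. keeps counting across all the input files instead of restarting at 1 for each one. Even for a single file in the example format, it would have to be @n = (5, 6, 7, 10), not (6, 7, 8, 16).
If each of your "*.txt" files has the same layout, resetting $. whenever the file name changes might work: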
Code:
perl -ne 'BEGIN{@n=(5,6,7,10)} $p ne $ARGV and $.=1; s/^#//; print if $. ~~ @n; $p=$ARGV' *.txt > output.txt


Matching on the field names instead of the line numbers might be better:
Code:
perl -ne 's/^#(Display|identifier|Gender|Quality)/$1/ and print' *.txt > out.txt
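
A small check of why the name-based match is less fragile, assuming a hypothetical file whose lines are shifted by one extra header: the pattern is anchored at ^#, so "#Gender" is kept while "#Control Gender" is not, and the line numbers no longer matter. The file name and values are invented for the demonstration.

```shell
tmp=$(mktemp -d)
# Hypothetical file with an extra leading line, so fixed line numbers would be off by one.
printf '%s\n' "#Extra = x" "#Display Name= S1" "#identifier = id1" "#Gender = F" \
  "#Control Gender = M" "#Quality = 0.987654" > "$tmp/shifted.txt"

# Same one-liner as above: strip the leading # only from the wanted fields and print them.
out=$(perl -ne 's/^#(Display|identifier|Gender|Quality)/$1/ and print' "$tmp"/*.txt)
printf '%s\n' "$out"
rm -rf "$tmp"
```

Only the four wanted fields should come out, in file order, with the # removed.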

Your comment within the parentheses is a bit ambiguous:
Quote:
#Quality = X.XXXXXX (keep this field without the # and X.XXX)
You show the desired output as:
Quote:
Quality = X.XXXXXX (keep this field without the # and X.XXX)
If what you mean is:
Quote:
Quality = X.XXX (keep this field without the # and X.XXX)
... and the X.XXXXXX represents a floating-point number, then:

Code:
perl -ne '(s/^#(Display|identifier|Gender)/$1/ or s/^#(Quality = \d\.\d{3})\d+/$1/) and print' *.txt > output.txt


Last edited by Aia; 03-25-2016 at 10:38 PM.. Reason: Addressing the ambiguous part
# 3  
Old 03-26-2016
Thank you very much, it works great.