txt file modification which is beyond me


 
# 1  
Old 08-29-2011
txt file modification which is beyond me

Dear all,

I'd like to create a new txt file from an old one. For example, if count=2 in the old file, the new file should repeat that row twice. The only difference is that on the first row the 'start' column contains the 1st part of the 'start' value from the old file, while on the 2nd row it contains the 2nd part of the 'start' column from the old file.

Thanks a lot!

old file example:
chr gene_name count start
1_q OMA1 2 43210,243007,
1_q TFR1 3 429310,2412007,23366,
1_q FTOMA1 5 3290,243317,24333,2433136,24334,
...

new file I want:
chr gene_name count start
1_q OMA1 2 43210
1_q OMA1 2 243007
1_q TFR1 3 429310
1_q TFR1 3 2412007
1_q TFR1 3 23366
1_q FTOMA1 5 3290
1_q FTOMA1 5 243317
1_q FTOMA1 5 24333
1_q FTOMA1 5 2433136
1_q FTOMA1 5 24334
...
# 2  
Old 08-29-2011
Using Perl :

Code:
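perl -e '
while (<>) {
    # Split the line on whitespace into its four columns
    my ($chr, $gene, $count, $start) = split;
    # split drops trailing empty fields, so the final comma adds no extra row
    print "$chr\t$gene\t$count\t$_\n" foreach split(/,/, $start);
}
' old_file > new_file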

Using awk :
Code:
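awk '{
# split returns the number of elements; loop by index to keep the original order
n = split($4, arr, ",");
# \s is a GNU awk extension; with other awks use [[:space:]] instead
for (i = 1; i <= n; i++) if (arr[i] !~ /^\s*$/) printf "%s\t%s\t%s\t%s\n", $1, $2, $3, arr[i]
}' old_file > new_file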


Last edited by pludi; 08-30-2011 at 04:37 AM..
# 3  
Old 08-30-2011
Another awk solution:

Code:
$ awk -F"," '{ if(NR>1){print $1;split($1,a," ");i=2;{while(a[3]-1!=0) {print a[1],a[2],NF-1,$(i);--a[3];++i}}}else print $0}' oldfile > newfile

Output from above:
Quote:
chr gene_name count start
1_q OMA1 2 43210
1_q OMA1 2 243007
1_q TFR1 3 429310
1_q TFR1 3 2412007
1_q TFR1 3 23366
1_q FTOMA1 5 3290
1_q FTOMA1 5 243317
1_q FTOMA1 5 24333
1_q FTOMA1 5 2433136
1_q FTOMA1 5 24334
# 4  
Old 08-30-2011
Code:
ruby -ane 'BEGIN{a=[]};b=$F[3].split(",") if /^\d_q/;b.map!{|x|x="#{$F[0..2].join(" ")} #{x}"} if b; puts b if b ' file
1_q OMA1 2 43210
1_q OMA1 2 243007
1_q TFR1 3 429310
1_q TFR1 3 2412007
1_q TFR1 3 23366
1_q FTOMA1 5 3290
1_q FTOMA1 5 243317
1_q FTOMA1 5 24333
1_q FTOMA1 5 2433136
1_q FTOMA1 5 24334

# 5  
Old 08-30-2011
questions

Hi,

Thank you for your reply! When I use the awk code, I get a new file like the one below, which contains additional rows that I don't want. How do I remove them? By the way, what does '/^\s*$/' mean? I don't understand. Thanks!

1_q OMA1 2 43210
1_q OMA1 2 243007
1_q OMA1 2
1_q TFR1 3 429310
1_q TFR1 3 2412007
1_q TFR1 3 23366
1_q TFR1 3
1_q FTOMA1 5 3290
1_q FTOMA1 5 243317
1_q FTOMA1 5 24333
1_q FTOMA1 5 2433136
1_q FTOMA1 5 24334
1_q FTOMA1 5


Quote:
Originally Posted by daptal
Using Perl :

cat old_file | perl -e '
while(<>){
my ($chr,$gene,$count,$start) = split ;
print "$chr\t$gene\t$count\t$_\n" foreach (split(/,/,$start));
}
' > new_file

Using awk :
awk '{
split($4,arr,",");
for (tmp in arr) if (arr[tmp] !~ /^\s*$/) printf $1"\t"$2"\t"$3"\t"arr[tmp]"\n"
}' old_file > new_file
# 6  
Old 09-05-2011
You get the additional rows if you leave out the '/^\s*$/' check.

The pattern /^\s*$/ matches any string that is empty or contains only whitespace.

Notice that each data line in your input file ends with a comma:
1_q OMA1 2 43210,243007,

When split($4,arr,","); runs on the fourth field of that line, the array arr holds the values (43210, 243007, ''). The empty third element is what produces the additional row. Adding the check for empty values eliminates that row.
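
To make this concrete, here is a minimal sketch that prints what split() returns for one 'start' value. It assumes a POSIX shell and any POSIX awk; the function name show_split is made up for this illustration and is not part of the solutions above.

```shell
#!/bin/sh
# show_split is a hypothetical helper used only for illustration.
# It prints the element count reported by split() and then each element
# in index order, wrapped in brackets so an empty element is visible.
show_split() {
    printf '%s\n' "$1" | awk '{
        n = split($0, arr, ",")
        print "n=" n
        for (i = 1; i <= n; i++) print i ": [" arr[i] "]"
    }'
}

show_split "43210,243007,"
```

For the value "43210,243007," this prints n=3 followed by [43210], [243007] and an empty [] for the third element. That empty element is the one the check filters out.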

Hope this helps


Quote:
Originally Posted by forevertl
Hi,

Thank you for your reply! When I use the awk code, I get a new file like the one below, which contains additional rows that I don't want. How do I remove them? By the way, what does '/^\s*$/' mean? I don't understand. Thanks!

1_q OMA1 2 43210
1_q OMA1 2 243007
1_q OMA1 2
1_q TFR1 3 429310
1_q TFR1 3 2412007
1_q TFR1 3 23366
1_q TFR1 3
1_q FTOMA1 5 3290
1_q FTOMA1 5 243317
1_q FTOMA1 5 24333
1_q FTOMA1 5 2433136
1_q FTOMA1 5 24334
1_q FTOMA1 5
# 7  
Old 09-05-2011
If the only commas in the input occur in that last field:

Code:
tr -s , \\n < file | awk 'NF>1 {p=$0} NF==1 {f=$0; $0=p; $NF=f} 1'

f=final field's value
p=prefix of records with common origin
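
To see how the two stages cooperate, it may help to run them separately. The following is only a sketch, assuming a POSIX shell; the function names sample, stage1 and stage2 are made up for illustration, and the sample rows are taken from post #1.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the sample data is from post #1.
sample() {
    printf '%s\n' 'chr gene_name count start' \
                  '1_q OMA1 2 43210,243007,' \
                  '1_q TFR1 3 429310,2412007,23366,'
}

# Stage 1: turn every comma into a newline; -s squeezes the run of newlines
# left behind by the trailing comma, so no empty line is produced.
stage1() { tr -s , '\n'; }

# Stage 2: a line with several fields is remembered in p and printed as-is;
# a single-field line is printed as p with its last field replaced by f.
stage2() { awk 'NF>1 {p=$0} NF==1 {f=$0; $0=p; $NF=f} 1'; }

echo '--- after tr ---'
sample | stage1
echo '--- after awk ---'
sample | stage1 | stage2
```

After tr, each extra 'start' value sits alone on its own line (243007, 2412007, 23366). The awk stage then rebuilds those lines from the last full record it saw, which is why the approach depends on commas appearing only in the last field.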


Regards,
Alister