Rearrange groups of lines from several files


 
# 1  
Old 09-13-2019

I have three input files, and I need to rearrange their contents to match the order in which the processing program consumes the data.


My files are:
Code:
/tmp$ cat F[123]
# file -1-
FS00|0|zero-zero|
FSTA|0|10|
FSTA|0|12|
FSTA|0|15|
FSTA|0|17|
FS00|3|negative|
FSTA|3|-1|
FS00|5|regular|
FSTA|5|21|
#
GXRM|0|0|1|0
GXRX|120|200|220|220|
GXRX|116|202|222|222|
GXRM|2|1|0|0
GXRX|1|0|0|0|
#
STPE|Initial|
#
FNMR|20|12|11|12|11|0|
FNMR|21|15|10|10|10|5|
# file -2-
RGSS|0|1|1|1|1|0|
RGSS|1|1|1|1|1|2|
RGSS|7|6|1|9|2|2|
# file -3-
TESR|0.00|0.00|0|0|0.00|FIX
TESR|1.00|0.00|5|0|0.00|FIX CNT
TESR|4.20|1.05|5|5|0.10|FIX REV
TESR|0.00|0.00|8|7|0.00|FLEX




I need to combine them in this fashion:
Code:
STPE|Initial|
FS00|0|zero-zero|
FSTA|0|10|
FSTA|0|12|
FSTA|0|15|
FSTA|0|17|
FS00|3|negative|
FSTA|3|-1|
FS00|5|regular|
FSTA|5|21|
GXRM|0|0|1|0
GXRX|120|200|220|220|
GXRX|116|202|222|222|
GXRM|2|1|0|0
GXRX|1|0|0|0|
TESR|0.00|0.00|0|0|0.00|FIX
TESR|1.00|0.00|5|0|0.00|FIX CNT
TESR|4.20|1.05|5|5|0.10|FIX REV
TESR|0.00|0.00|8|7|0.00|FLEX
RGSS|0|1|1|1|1|0|
RGSS|1|1|1|1|1|2|
RGSS|7|6|1|9|2|2|
FNMR|20|12|11|12|11|0|
FNMR|21|15|10|10|10|5|

As you can see, I need the groups of lines to follow each other in a fixed order: the STPE group first, then the FS group, and so on. It is important to keep the lines within each group in their original order.


I tried this awk script (NOTE: gawk is not available; this is old awk). I accumulate each group of lines into its own array, prefixing every line with NR so that the output can later be re-sorted into the order the lines came in, and then I cut that prefix off on output.


Code:
/tmp$ cat f.awk
BEGIN { FS = OFS = "|" }
/^TESR/ {       TESR[NR"|"$0] = 1;      }
/^RGSS/ {       RGSS[NR"|"$0] = 1;      }
/^FNMR/ {       FNMR[NR"|"$0] = 1;      }
/^STPE/ {       STPE[NR"|"$0] = 1;      }
/^GX../ {       GX__[NR"|"$0] = 1;      }
/^FS../ {       FS__[NR"|"$0] = 1;      }
{
        continue;
}
END {
        for(ln in STPE) { print ln | "sort -n|cut -d'|' -f2-" }
        for(ln in FS__) { print ln | "sort -n|cut -d'|' -f2-" }
        for(ln in GX__) { print ln | "sort -n|cut -d'|' -f2-" }
        for(ln in TESR) { print ln | "sort -n|cut -d'|' -f2-" }
        for(ln in RGSS) { print ln | "sort -n|cut -d'|' -f2-" }
        for(ln in FNMR) { print ln | "sort -n|cut -d'|' -f2-" }
}


The output is not at all what I expect:
Code:
/tmp$ awk -f f.awk F[123]
FS00|0|zero-zero|
FSTA|0|10|
FSTA|0|12|
FSTA|0|15|
FSTA|0|17|
FS00|3|negative|
FSTA|3|-1|
FS00|5|regular|
FSTA|5|21|
GXRM|0|0|1|0
GXRX|120|200|220|220|
GXRX|116|202|222|222|
GXRM|2|1|0|0
GXRX|1|0|0|0|
STPE|Initial|
FNMR|20|12|11|12|11|0|
FNMR|21|15|10|10|10|5|
RGSS|0|1|1|1|1|0|
RGSS|1|1|1|1|1|2|
RGSS|7|6|1|9|2|2|
TESR|0.00|0.00|0|0|0.00|FIX
TESR|1.00|0.00|5|0|0.00|FIX CNT
TESR|4.20|1.05|5|5|0.10|FIX REV
TESR|0.00|0.00|8|7|0.00|FLEX


Please help; any ideas will be appreciated.
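
A likely explanation for the ordering above, offered as a sketch rather than a verified diagnosis of this particular awk: every `print ln | "..."` uses the identical command string, and awk keeps one pipe open per distinct command string. All six loops therefore feed a single `sort -n`, which orders everything by NR, that is, the original file order. Assuming the awk at hand supports `close()` (POSIX awk does; some very old awks may not), closing the pipe after each group lets each group be sorted and written out separately. The sample file and the `cmd` variable below are illustrative, and only three of the groups are shown.

```shell
# Small illustrative sample (hypothetical temp file), groups in the "wrong" order
tmp=$(mktemp -d)
printf '%s\n' \
    'FS00|0|zero-zero|' \
    'FSTA|0|10|' \
    '#' \
    'GXRM|0|0|1|0' \
    '#' \
    'STPE|Initial|' > "$tmp/F1"

out=$(awk '
/^STPE/ { STPE[NR "|" $0] = 1 }
/^FS../ { FS__[NR "|" $0] = 1 }
/^GX../ { GX__[NR "|" $0] = 1 }
END {
    # One command string means one pipe, so close it between groups
    cmd = "sort -n | cut -d\"|\" -f2-"
    for (ln in STPE) print ln | cmd
    close(cmd)
    for (ln in FS__) print ln | cmd
    close(cmd)
    for (ln in GX__) print ln | cmd
    close(cmd)
}' "$tmp/F1")

printf '%s\n' "$out"
rm -rf "$tmp"
```

With the `close(cmd)` calls in place, the STPE line should come out first, followed by the FS lines and then the GX line.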
# 2  
Old 09-13-2019
So, I added a "group id" to be included in the sort and solved it that way. I no longer pipe the individual arrays to sort (which I suspect was the bug).
Just in case, my code now looks like this:
Code:
/^STPE/ {       STPE["1|"NR"|"$0] = 1;  }
/^FS../ {       FS__["2|"NR"|"$0] = 1;  }
/^GX../ {       GX__["3|"NR"|"$0] = 1;  }
/^TESR/ {       TESR["4|"NR"|"$0] = 1;  }
/^RGSS/ {       RGSS["5|"NR"|"$0] = 1;  }
/^FNMR/ {       FNMR["6|"NR"|"$0] = 1;  }


...


END {
        for(ln in STPE) { print ln }
        for(ln in FS__) { print ln }
        for(ln in GX__) { print ln }
        for(ln in TESR) { print ln }
        for(ln in RGSS) { print ln }
        for(ln in FNMR) { print ln }
}

and on the output I run
Code:
 sort -t'|' -k1,1n -k2,2n | cut -d'|' -f3-

to strip those sort-related columns.
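
The same group-id idea can also be sketched without any arrays: each matching line is printed straight away with its group id and NR in front, and the sort does all the reordering. This is only an illustration under assumptions: the three temp files below are a shortened, hypothetical copy of the sample data, and the group numbers follow the order used in the code above.

```shell
# Shortened stand-ins for F1, F2 and F3 (hypothetical temp files)
tmp=$(mktemp -d)
printf '%s\n' 'FS00|0|zero-zero|' 'FSTA|0|10|' '#' 'GXRM|0|0|1|0' '#' \
    'STPE|Initial|' '#' 'FNMR|20|12|11|12|11|0|' > "$tmp/F1"
printf '%s\n' 'RGSS|0|1|1|1|1|0|' 'RGSS|1|1|1|1|1|2|' > "$tmp/F2"
printf '%s\n' 'TESR|0.00|0.00|0|0|0.00|FIX' \
    'TESR|1.00|0.00|5|0|0.00|FIX CNT' > "$tmp/F3"

# Decorate each line with "group|NR|", sort on those two keys, then strip them.
# NR keeps counting across files, so it preserves the original order in a group.
out=$(awk '
/^STPE/ { print "1|" NR "|" $0 }
/^FS../ { print "2|" NR "|" $0 }
/^GX../ { print "3|" NR "|" $0 }
/^TESR/ { print "4|" NR "|" $0 }
/^RGSS/ { print "5|" NR "|" $0 }
/^FNMR/ { print "6|" NR "|" $0 }
' "$tmp/F1" "$tmp/F2" "$tmp/F3" | sort -t'|' -k1,1n -k2,2n | cut -d'|' -f3-)

printf '%s\n' "$out"
rm -rf "$tmp"
```

Lines that match none of the patterns (the `#` separators) are simply never printed, so no catch-all rule is needed.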
# 3  
Old 09-14-2019
Why make it that difficult? Try
Code:
awk '
BEGIN { FS = OFS = "|" }

/^TESR/ {TESR[++CNTTESR] = $0}
/^RGSS/ {RGSS[++CNTRGSS] = $0}
/^FNMR/ {FNMR[++CNTFNMR] = $0}
/^STPE/ {STPE[++CNTSTPE] = $0}
/^GX../ {GX__[++CNTGX__] = $0}
/^FS../ {FS__[++CNTFS__] = $0}

END     {for (i=1; i<=CNTSTPE; i++) print STPE[i]
         for (i=1; i<=CNTFS__; i++) print FS__[i]
         for (i=1; i<=CNTGX__; i++) print GX__[i]
         for (i=1; i<=CNTTESR; i++) print TESR[i]
         for (i=1; i<=CNTRGSS; i++) print RGSS[i]
         for (i=1; i<=CNTFNMR; i++) print FNMR[i]
        }
 ' F[123]

to get exactly your desired output.




EDIT: Or even
Code:
awk '
BEGIN   {FS = OFS = "|"
         MX = split ("STPE|FS..|GX..|TESR|RGSS|FNMR", KEYS)
        }

        {for (i=1; i<=MX; i++)
           if ($1 ~ KEYS[i]) {RES[i,++CNT[i]] = $0
                              break
                             }
        }
END     {for (i=1; i<=MX; i++) 
           for (j=1; j<=CNT[i]; j++) print RES[i, j]
        }
' F[123]
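
One thing to be aware of in the loop above: `$1 ~ KEYS[i]` is an unanchored match, so a key would also match in the middle of a longer first field. That makes no difference for the sample data. A sketch of an anchored variant follows, assuming the field separator is passed with `-F` and the key list is split on spaces (both are choices made for this example). The `XSTPE` record is a made-up line added only to show that it is skipped.

```shell
# Hypothetical sample, including one record that an unanchored match would pick up
tmp=$(mktemp -d)
printf '%s\n' \
    'FSTA|0|10|' \
    'XSTPE|not-a-real-record|' \
    'GXRM|0|0|1|0' \
    'STPE|Initial|' \
    'FS00|3|negative|' > "$tmp/F1"

out=$(awk -F'|' '
BEGIN {
    # Group patterns in the required output order
    MX = split("STPE FS.. GX.. TESR RGSS FNMR", KEYS, " ")
}
{
    # Anchor each pattern so it has to match the whole first field
    for (i = 1; i <= MX; i++)
        if ($1 ~ ("^" KEYS[i] "$")) {
            RES[i, ++CNT[i]] = $0
            break
        }
}
END {
    for (i = 1; i <= MX; i++)
        for (j = 1; j <= CNT[i]; j++)
            print RES[i, j]
}' "$tmp/F1")

printf '%s\n' "$out"
rm -rf "$tmp"
```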


Last edited by RudiC; 09-14-2019 at 07:37 AM..