Sorting/Arranging file based on tags using awk


 
# 1  
Old 01-04-2017

Hi,

I have a file that contains data based on tags. The output should list the values in the order of the tags.

Below are the files:

Tags.txt
Code:
f12
f13
f23
f45
f56

The original data looks like this:
Data.txt
Code:
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR
2017/01/04|09:07:00:021|T|LMN|38|7|1234|f23=ORANGE|f12=DOG|f45=BOOK|f56=ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457|f56=CUSTARD|f13=RAILWAY
2017/01/04|09:02:00:976|L|PQR|38|9|5644|f56=CHOCOLATE|f12=SNAKE|f13=AUTO|f23=BANANA

And the output should look like this:
Expected Result -
Code:
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|CAT|CAR|APPLE|PENCIL|
2017/01/04|09:07:00:021|T|LMN|38|7|1234|DOG||ORANGE|BOOK|ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457||RAILWAY|||CUSTARD
2017/01/04|09:02:00:976|L|PQR|38|9|5644|SNAKE|AUTO||BANANA|CHOCOLATE

I was thinking of using an associative array in AWK, but I have not been able to get it working properly. Can someone please help?
# 2  
Old 01-04-2017
Hello Prathmesh,

Can I just confirm if this sorting is just to be within each record and that the output lines should be in the same order, i.e. it's horizontal sorting, so this:-
Code:
a,4,3,2,1
c,5,4,3,2
b,1,5,4,2

...would deliver:-
Code:
a,1,2,3,4
c,2,3,4,5
b,1,2,4,5
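
For reference, the horizontal sort shown above could be sketched in awk roughly as follows. This is only an illustration under assumptions: the function name hsort is made up, the fields after the first are taken to be numeric, and a simple insertion sort is used because the records are short.

```shell
# Sort fields 2..NF of each comma-separated record numerically,
# leaving field 1 and the order of the records unchanged.
# (hsort is a hypothetical name used only for this sketch.)
hsort() {
    awk -F',' -v OFS=',' '{
        # Insertion sort over fields 2..NF (fine for short records).
        for (i = 3; i <= NF; i++) {
            v = $i
            for (j = i - 1; j >= 2 && $j + 0 > v + 0; j--)
                $(j + 1) = $j
            $(j + 1) = v
        }
        print
    }'
}

printf '%s\n' 'a,4,3,2,1' 'c,5,4,3,2' 'b,1,5,4,2' | hsort
```

The records stay in their original order (a, c, b); only the values inside each record are rearranged.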

If so, I have a few questions to pose in response first:-
  • Is this homework/assignment? There are specific forums for these.
  • What have you tried so far?
  • What output/errors do you get?
  • What OS and version are you using?
  • What are your preferred tools? (C, shell, perl, awk, etc.)
  • What logical process have you considered? (to help steer us to follow what you are trying to achieve)
Most importantly, what have you tried so far?

There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.


We're all here to learn and getting the relevant information will help us all.


Kind regards,
Robin
# 3  
Old 01-04-2017
Thanks Robin for your reply.

This is not an assignment problem. My OS is GNU/Linux and I prefer to use shell script/AWK.

I am thinking of listing all possible tags in one file, Tags.txt, then matching each tag against one line at a time, taking the value after the = sign for that tag, and presenting it as output. However, I have not yet been able to come up with the correct AWK statement for this.

And yes, it is horizontal sorting based on the order of the tags.
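
The per-tag lookup described above might look something like the sketch below. It is only a rough illustration: tag_value is a hypothetical helper name, and the record is one sample line from Data.txt. It assumes a tag field has the form tag=value and that each tag occurs at most once per record.

```shell
# Hypothetical helper: print the value stored under one tag in one record.
# Usage: tag_value TAG RECORD
tag_value() {
    printf '%s\n' "$2" | awk -F'|' -v tag="$1" '{
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")                      # position of "=", or 0
            if (n > 0 && substr($i, 1, n - 1) == tag) {
                print substr($i, n + 1)             # text after "="
                exit
            }
        }
    }'
}

line='2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR'

# Walk the tags in Tags.txt order; a missing tag yields an empty value.
for t in f12 f13 f23 f45 f56; do
    printf '%s=[%s]\n' "$t" "$(tag_value "$t" "$line")"
done
```

This starts one awk process per tag and per record, so it only illustrates the logic. For a real file it is better to do everything in a single awk pass with an associative array.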
# 4  
Old 01-04-2017
Looks like the pipe character is the field separator.
Are the tags always in field 8 and higher?
# 5  
Old 01-04-2017
Try
Code:
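awk -F'|' -v OFS='|' '
# file1 is the tag list (Tags.txt), file2 is the data (Data.txt).
NR == FNR {                        # first file only: store tags in listed order
    F[NR] = $1
    MX = NR
    next
}
{
    for (i = 8; i <= NF; i++) {    # split each tag=value field
        split($i, T, "=")
        R[T[1]] = T[2]
    }
    for (i = 1; i <= MX; i++)      # rewrite fields 8 and up in tag order
        $(7 + i) = R[F[i]]
    delete R                       # reset the lookup for the next record
}
1                                  # print the rebuilt record
' file1 file2

Output:
Code:
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|CAT|CAR|APPLE|PENCIL|
2017/01/04|09:07:00:021|T|LMN|38|7|1234|DOG||ORANGE|BOOK|ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457||RAILWAY|||CUSTARD
2017/01/04|09:02:00:976|L|PQR|38|9|5644|SNAKE|AUTO|BANANA||CHOCOLATE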

You seem to have a small error in your desired output sample: in the last line, BANANA (f23) belongs in the third tag column and the f45 column should be empty.
# 6  
Old 01-04-2017
Quote:
Originally Posted by MadeInGermany
Looks like the pipe character is the field separator.
Are the tags always in field 8 and higher?
Yes, pipe is the delimiter. And tags may or may not be in field 8 or higher.
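
If the tag fields cannot be relied on to start at field 8, one possible variation is sketched below. It is not from the thread. It assumes that any field containing "=" is a tag=value pair, that the plain fields never contain "=", and that all plain fields come before the tags. The function name arrange_by_tags and the temporary directory are made up for the example.

```shell
# Assumed setup: a scratch directory holding the sample files from post #1.
dir=$(mktemp -d)
cat > "$dir/Tags.txt" <<'EOF'
f12
f13
f23
f45
f56
EOF
cat > "$dir/Data.txt" <<'EOF'
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR
2017/01/04|09:07:00:021|T|LMN|38|7|1234|f23=ORANGE|f12=DOG|f45=BOOK|f56=ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457|f56=CUSTARD|f13=RAILWAY
2017/01/04|09:02:00:976|L|PQR|38|9|5644|f56=CHOCOLATE|f12=SNAKE|f13=AUTO|f23=BANANA
EOF

# Hypothetical wrapper: arrange_by_tags TAGFILE DATAFILE
arrange_by_tags() {
    awk -F'|' -v OFS='|' '
    NR == FNR { tag[++n] = $1; next }       # first file: remember tag order
    {
        split("", val)                       # portable way to empty the array
        out = ""
        for (i = 1; i <= NF; i++) {
            p = index($i, "=")
            if (p > 0)                       # tag field: remember its value
                val[substr($i, 1, p - 1)] = substr($i, p + 1)
            else                             # plain field: copy it through
                out = out $i OFS
        }
        for (i = 1; i <= n; i++)             # append values in tag order
            out = out val[tag[i]] (i < n ? OFS : "")
        print out
    }' "$1" "$2"
}

arrange_by_tags "$dir/Tags.txt" "$dir/Data.txt"
```

On the sample data this prints the same four lines as the listing in post #5. Using index and substr instead of split also keeps values that themselves contain an "=" intact.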

Thanks, RudiC. I will try it and let you know.
