Data manipulation with Awk


 
# 8  
Old 08-25-2009
I just noticed something wrong with my data. Some points are duplicated for no reason. Here's a list of long/lat coordinates (just a sample from a huge data file):
Quote:
-0.1194983E+03 0.6000011E+02
-0.1194394E+03 0.6000006E+02
-0.1195464E+03 0.6000011E+02 <-------- A
-0.1195464E+03 0.6000011E+02 <-------- A'
-0.1195002E+03 0.6000021E+02
-0.1194394E+03 0.6000006E+02 # END
-0.1065872E+03 0.5324551E+02
-0.1100000E+03 0.6000000E+02
-0.1099966E+03 0.6000000E+02 <-------- B
-0.1099966E+03 0.6000000E+02 <-------- B'
-0.1090837E+03 0.6000000E+02 <-------- C
-0.1090837E+03 0.6000000E+02 <-------- C'
-0.1090004E+03 0.6000012E+02
....
-0.1194394E+03 0.6000006E+02 # END
So I now need to remove all the duplicates (A', B', C' in the example above) while keeping the first occurrence (A, B, C). Note that each duplicate is always located immediately after its original.

So how can I remove them using Awk? I'm pretty sure this should be easy, but I'm really not an Awk specialist.
# 9  
Old 08-25-2009
You can remove the duplicate lines first with uniq and pipe the result to the awk command:

Code:
uniq file | awk '
{s=n?s " " c++:c++; n++}
/# End/{print "NumberOfLines " n; print s "\n"; n=0; s =""}
'

Regards
# 10  
Old 08-25-2009
Quote:
Originally Posted by Franklin52
You can remove the duplicate lines first with uniq and pipe the result to the awk command:

Code:
uniq file | awk '
{s=n?s " " c++:c++; n++}
/# End/{print "NumberOfLines " n; print s "\n"; n=0; s =""}
'

Regards
Ok, but there's a constraint: the curves are all loops (same START and END points), and I don't want to remove those. I just want to remove the useless duplicates, which are always right next to each other (one following the other, as in the example I gave above).

---------- Post updated at 01:11 PM ---------- Previous update was at 12:10 PM ----------

Okay, the uniq command works very well.

Geez, I'm learning!

Again, thank you very much for your help!
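
A note on why uniq is safe for the loops here: uniq only collapses a line when it is identical to the line directly before it, so a START point that shows up again as the END point much later in the curve is left alone. For anyone who would rather stay in awk, below is a minimal sketch of the same adjacent-only behaviour. The function name dedup_adjacent and the short sample are made up for illustration; they are not from the posts above.

```shell
# Sketch only: print a line unless it is identical to the line right before it.
# dedup_adjacent is a hypothetical helper name.
dedup_adjacent() {
    awk 'NR == 1 || $0 != prev { print } { prev = $0 }' "$@"
}

# Small made-up loop: the first and last points match but are not adjacent,
# so only the doubled middle point is dropped.
printf '%s\n' \
    '-0.1194394E+03 0.6000006E+02' \
    '-0.1195464E+03 0.6000011E+02' \
    '-0.1195464E+03 0.6000011E+02' \
    '-0.1195002E+03 0.6000021E+02' \
    '-0.1194394E+03 0.6000006E+02' | dedup_adjacent
```

The comparison is on the whole line as text, so two points that differ only in spacing or in a trailing marker would both be kept, the same as with plain uniq.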
# 11  
Old 08-28-2009
Okay, I have one more problem to solve with Awk (or any other simple method in UNIX) :

I have to remove some lines in the data. The original data have the following shape :
Quote:
data file
1
1.23 2.33 1.34
1.33 2.67 4.65
3.56 3.78 1.34
END
2 <-- to be removed
4.67 3.78 6.78 <-- to be removed
6.21 5.55 4.90
3.45 2.57 2.78
3.67 3.71 6.78
END
3 <-- to be removed
3.56 4.89 3.45 <-- to be removed
2.31 6.67 8.90
...

The first two lines after each "END" should be removed. How can I do that? Any ideas?
# 12  
Old 08-28-2009
Quote:
Originally Posted by Cham
Okay, I have one more problem to solve with Awk (or any other simple method in UNIX) :

I have to remove some lines in the data. The original data have the following shape :



The first two lines after each "END" should be removed. How can I do that? Any ideas?
Code:
nawk 'c&&c--{next} /END/ {c=2}1' myFile

# 13  
Old 08-28-2009
Quote:
Originally Posted by vgersh99
Code:
nawk 'c&&c--{next} /END/ {c=2}1' myFile

Thanks. This is working well, but only if I use awk (and not nawk, which isn't recognized on my system).

What is the logic behind this code ? What if we want to remove only 1 line, or three lines ?

Sorry if I'm such a noob !
# 14  
Old 08-29-2009
Quote:
Originally Posted by Cham
Thanks. This is working well, but only if I use awk (and not nawk, which isn't recognized on my system).

What is the logic behind this code ? What if we want to remove only 1 line, or three lines ?

Sorry if I'm such a noob !
Adjust the 'c=N' accordingly: c=1 removes one line after each END, c=3 removes three.
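
To spell out the logic, here is a longer sketch that should behave the same way as the one-liner. The counter c holds the number of lines still to be skipped. In the original, 'c&&c--' is true while c is non-zero and decrements it as a side effect, and the trailing '1' is an always-true pattern whose default action is to print the line. The wrapper name skip_after and the n variable passed with -v are assumptions added for illustration, not part of the original answer.

```shell
# Sketch only: skip the first N lines after every line matching END.
# skip_after is a hypothetical wrapper; usage: skip_after N [file...]
skip_after() {
    n=$1
    shift
    awk -v n="$n" '
        c > 0 { c--; next }   # still inside the skip window: drop this line
        /END/ { c = n }       # marker seen: skip the next n lines
              { print }       # all other lines, and END itself, are printed
    ' "$@"
}

# Made-up sample in the shape described above; with N=2 the lines
# "2" and "4.67 3.78 6.78" are dropped.
skip_after 2 <<'EOF'
1
1.23 2.33 1.34
END
2
4.67 3.78 6.78
6.21 5.55 4.90
END
EOF
```

As in the one-liner, an END that falls inside the skip window is dropped without resetting the counter, and /END/ matches any line that contains END anywhere.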
