Sorting


 
# 15  
Old 05-20-2010
Quote:
Originally Posted by Ernst
I am using:

FMLISHELL=/sbin/sh
SHELL=/sbin/sh


I did not get an output file for either script. I did not get an error message either.
I asked about the OS, not the shell.
# 16  
Old 05-20-2010
Windows XP
# 17  
Old 05-21-2010
Quote:
Originally Posted by Ernst
Windows XP
So Windows XP + Cygwin?

Your real data file is different from the sample data.

1. There is no blank line between groups.
2. The first column contains spaces, but in your sample data there are no spaces:

Code:
1  .6=
1  .5=
1  .4=
1  .3=12
1  .2=348
1  .1=180

3. Each group has 6 lines. Some groups have no data in lines 4, 5 or 6.

That's why our scripts can't work on your real data.
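
One possible way to bridge the two layouts, as a rough sketch only: strip the padding from the group id and the value, and print an empty line whenever the group id changes, so that scripts written for the sample layout have a chance of working. The function name normalize_groups is made up for illustration, and the sketch assumes a POSIX shell and awk, integer values, and that the lines of each group sit next to each other in the file.

```shell
# Hypothetical helper: rewrite "1  .3=12 " style lines as "1.3=12" and
# put an empty line between groups. Assumes the lines of a group are adjacent.
normalize_groups() {
    awk -F'[.=]' '
        NF < 3 { next }                  # skip lines that are not id.n=value
        {
            id = $1; n = $2; val = $3
            gsub(/[ \t]/, "", id)        # "1  " becomes "1"
            gsub(/[ \t]/, "", n)
            gsub(/[ \t]/, "", val)       # "12 " becomes "12", "   " becomes ""
            if (seen && id != prev) print ""
            print id "." n "=" val
            prev = id; seen = 1
        }' "$@"
}

# Small demonstration on four lines in the padded layout
printf '1  .2=348\n1  .1=180\n10 .2=192\n10 .1=24 \n' | normalize_groups
```

It can also be given a file name, for example normalize_groups urfile > normalized.txt, where normalized.txt is just a placeholder name.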

---------- Post updated at 01:18 PM ---------- Previous update was at 12:36 PM ----------

Based on your real data, I updated the script:

Code:
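awk -F "[.=]" '
    # Split on "." and "=": $1 = group id, $2 = line number, $3 = value
    {a[$1]; b[$1,$2]=$3}
    END {
        for (i in a) {
            # Keep the group when .1 > .2, .2 > .3 or .1 > .3
            if (b[i,1]>b[i,2]||b[i,2]>b[i,3]||b[i,1]>b[i,3])
                for (j=6;j>=1;j--) printf "%3s.%s=%s\n", i,j,b[i,j]
        }
    }' urfile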
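
To see what the -F "[.=]" field separator does with one line of the real data, a quick check along these lines may help. It is only an illustration; the function name show_fields is invented here and a POSIX awk is assumed.

```shell
# Hypothetical helper: print the three fields awk sees, each wrapped in [ ]
show_fields() {
    awk -F '[.=]' '{ printf "[%s] [%s] [%s]\n", $1, $2, $3 }'
}

echo '1  .3=12' | show_fields
# prints: [1  ] [3] [12]
```

The group id keeps its trailing spaces, so "1  " is used as the array key, and printing it with %3s reproduces the original padded layout.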

# 18  
Old 05-21-2010
Did this script work for you? I get this error message every time I run it.

Code:
awk: syntax error near line 1
awk: bailing out near line 1

Moderator's Comments:
Mod Comment Code Tags
# 19  
Old 05-21-2010
Quote:
Originally Posted by Ernst
... However, they do not work for my huge file.
With my huge file, I do not get an output file.
Well, how huge is your huge file? 1,000 lines? 10,000 lines? 100,000 lines? 1 million lines?

And, there was no *output file* in my suggested script. So if you executed the Perl one-liner as I had posted, you wouldn't see any output file either.

The Perl one-liner processes your input file ("input.dat" in my post) and spews the output on stdout - which is your Terminal screen by default.

Quote:
... Whenever I cat the output, I do not get any data.
I did not get an error message either.
If you mean displaying the output file with the "cat" command, did you redirect the output to a file first?
If so, can you post exactly what you typed on your Terminal screen (i.e. copy/paste the session from your Terminal)?
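
As a small illustration of the difference between output on stdout and output in a file: in the sketch below, grep merely stands in for the real filter (any command that writes to stdout behaves the same way), and the directory and file names are placeholders.

```shell
# Work in a scratch directory with a tiny placeholder input file
workdir=$(mktemp -d)
printf '3.3=20\n3.2=19\n3.1=18\n' > "$workdir/input.dat"

# Without redirection the result goes to the terminal and no file is created
grep '\.1=' "$workdir/input.dat"

# With ">" the same result is written to output.txt instead of the terminal
grep '\.1=' "$workdir/input.dat" > "$workdir/output.txt"

# Only now is there a file to look at with cat
cat "$workdir/output.txt"
```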

Quote:
...
Below is my input file:
Code:
1  .6=   
1  .5=   
1  .4=   
1  .3=12 
1  .2=348
1  .1=180
10 .6=   
10 .5=   
10 .4=   
10 .3=360
10 .2=192
10 .1=24 
100.6=   
100.5=   
100.4=   
100.3=364
100.2=196
100.1=28 
101.6=   
101.5=   
101.4=   
101.3=464
101.2=296
101.1=128
102.6=   
102.5=   
102.4=   
102.3=444
...
...

As posted by others, your input files do not show consistent data. This is what you've posted earlier:

Quote:
Originally Posted by Ernst
Okay, I have 6 groups. Now I want the script to go through these groups and look at the structure of each group. If .1 (within a group) is greater than .2 or .3, OR if .2 is greater than .3 within a group, then output that group. In our case the output would be groups.

input file
Code:
1.6=176
1.5=172
1.4=168
1.3=14
1.2=13
1.1=12
 
230.3=146
230.2=147
230.1=148
 
3.3=20
3.2=19
3.1=18
 
5.6=166
5.5=122
5.4=160
5.3=103
5.2=102
5.1=100
 
100.6=176
100.5=172
100.4=168
100.3=20
100.2=12
100.1=16
 
117.3=24
117.2=82
117.1=79

...
I hope it is clear enough.
The differences are listed below:

Difference # 1 : Your old input file did not have a space between "1" and ".", whereas your new file does.

Code:
# First line of old input file
1.6=176
 
# First line of new input file
1  .6=

Difference # 2 : Your old input file has a number to the right of every single "=" character. Your new input file does not have a number to the right of every single "=" character.

Code:
# First 5 lines of old input file
1.6=176
1.5=172
1.4=168
1.3=14
1.2=13
 
# First 5 lines of new input file
1  .6=   
1  .5=   
1  .4=   
1  .3=12 
1  .2=348

Difference # 3 : Your old input file has blank lines at the end of each "group". Your new input file does not have even a single blank line.

Code:
# First 10 lines of old input file; it has two "groups" with a blank line to separate them
1.6=176
1.5=172
1.4=168
1.3=14
1.2=13
1.1=12
 
230.3=146
230.2=147
230.1=148
 
# First 10 lines of new input file; it has no blank lines anywhere in the file
1  .6=   
1  .5=   
1  .4=   
1  .3=12 
1  .2=348
1  .1=180
10 .6=   
10 .5=   
10 .4=   
10 .3=360

Needless to say, you shouldn't expect consistent solutions to inconsistent problems !
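
A quick way to check which layout a given file follows, sketched under the assumption of a POSIX awk: count the separator lines, the lines with a padded group id, and the lines with nothing after "=". The function name describe_layout is invented for this example.

```shell
# Hypothetical helper: summarise the three layout differences listed above
describe_layout() {
    awk '
        /^[ \t]*$/    { blank++; next }   # separator lines between groups
        /^[0-9]+ +\./ { padded++ }        # "1  .6=" style group ids
        /=[ \t]*$/    { empty++ }         # nothing after "="
        END {
            printf "blank=%d padded=%d empty=%d\n", blank, padded, empty
        }' "$@"
}

printf '1  .6=   \n1  .1=180\n\n2.1=5\n' | describe_layout
# prints: blank=1 padded=2 empty=1
```

On a file in the old layout one would expect padded=0 and empty=0; on the new layout, blank=0.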

Quote:
...
Try your scripts and let me know whether or not it works for you.
...
Sure thing. Since you did not mention how huge your input file is, I'll assume it has 2 million lines.

Here's what I did. I took this input file "input.dat" and kept on appending the content over and over to another file called "input.txt".

Code:
$
$ cat input.dat
1.6=176
1.5=172
1.4=168
1.3=14
1.2=13
1.1=12
 
230.3=146
230.2=147
230.1=148
 
3.3=20
3.2=19
3.1=18
 
5.6=166
5.5=122
5.4=160
5.3=103
5.2=102
5.1=100
 
100.6=176
100.5=172
100.4=168
100.3=20
100.2=12
100.1=16
 
117.3=24
117.2=82
117.1=79
$
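
The appending step can be reproduced with a loop along the following lines. This is a sketch of one way to do it, not necessarily the exact commands used; build_big_file and its arguments are names chosen for illustration, and it adds one separator line after each copy so the last group of one copy does not run into the first group of the next.

```shell
# Hypothetical helper: write `copies` copies of src to dest,
# each followed by one empty separator line.
build_big_file() {
    src=$1; dest=$2; copies=$3
    : > "$dest"                      # start with an empty destination file
    i=0
    while [ "$i" -lt "$copies" ]; do
        cat "$src" >> "$dest"
        echo >> "$dest"              # separator between copies
        i=$((i + 1))
    done
}

# Small demonstration with a two-line source file and three copies
demo=$(mktemp -d)
printf '1.2=13\n1.1=12\n' > "$demo/input.dat"
build_big_file "$demo/input.dat" "$demo/input.txt" 3
wc -l < "$demo/input.txt"
```

With a 32-line source file plus one separator per copy, 62,500 copies would give 2,062,500 lines.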

The final line count of "input.txt" is roughly 2 million lines.
Here's some information about "input.txt".

Code:
$
$ # the line, word and character counts of "input.txt"; note that it has 2,062,500 lines
$ wc input.txt
 2062500  1687500 14625000 input.txt
$
$ # the first 10 lines of "input.txt"
$ head input.txt
1.6=176
1.5=172
1.4=168
1.3=14
1.2=13
1.1=12
 
230.3=146
230.2=147
230.1=148
$
$ # the last 10 lines of "input.txt"
$ tail input.txt
100.5=172
100.4=168
100.3=20
100.2=12
100.1=16
 
117.3=24
117.2=82
117.1=79
$

And now, I run the Perl one-liner on the file "input.txt" and redirect the output to file "output.txt".

I also feed the entire one-liner to the "time" command.

Code:
$
$
$ time perl -lne 'chomp;
           if (/^\s*$/) {
             if ($x>$y or $x>$z or $y>$z) {print foreach (@a); print}
             @a=(); $x=$y=$z="";
           } else {
             push @a,$_;
             if (/^\d+\.1=(.*)$/) {$x = $1}
             elsif (/^\d+\.2=(.*)$/) {$y = $1}
             elsif (/^\d+\.3=(.*)$/) {$z = $1}
           }
           END {if ($x>$y or $x>$z or $y>$z) {print foreach (@a); print}}
          ' input.txt >output.txt
real    0m15.125s
user    0m0.015s
sys     0m0.031s
$
$
$ wc output.txt
 937500  750000 8250000 output.txt
$
$ head output.txt
230.3=146
230.2=147
230.1=148
 
100.6=176
100.5=172
100.4=168
100.3=20
100.2=12
100.1=16
$
$ tail output.txt
100.5=172
100.4=168
100.3=20
100.2=12
100.1=16
 
117.3=24
117.2=82
117.1=79
$
$

And that's 15.125 seconds to process 2 million lines.
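
For anyone who prefers awk, the same buffering logic can be sketched roughly as follows. It is meant to mirror the Perl one-liner rather than replace it, has not been timed, and assumes an awk that supports user-defined functions (nawk, gawk, mawk or another POSIX awk). The names filter_groups and emit_group are made up for this sketch.

```shell
# Hypothetical helper: print every blank-line-separated group in which
# .1 > .2, .1 > .3 or .2 > .3, followed by an empty line.
filter_groups() {
    awk -F'[.=]' '
        function emit_group(    i) {
            if (n > 0 && (x > y || x > z || y > z)) {
                for (i = 1; i <= n; i++) print buf[i]
                print ""
            }
            n = 0; x = y = z = 0         # reset for the next group
        }
        /^[ \t]*$/ { emit_group(); next }    # separator line ends a group
        {
            buf[++n] = $0                # remember the line as it was read
            if ($2 == 1)      x = $3 + 0
            else if ($2 == 2) y = $3 + 0
            else if ($2 == 3) z = $3 + 0
        }
        END { emit_group() }             # the last group has no separator after it
    ' "$@"
}
```

It would be run the same way as the one-liner, for example filter_groups input.txt > output.txt.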

tyler_durden
# 20  
Old 05-23-2010
Quote:
Originally Posted by Ernst
Did this script work for you? I get this error message every time I run it.

Code:
awk: syntax error near line 1
awk: bailing out near line 1

Moderator's Comments:
Mod Comment Code Tags
I tested it in Cygwin and got the output without any problem.

Code:
47 .6=
47 .5=
47 .4=
47 .3=148
47 .2=484
47 .1=316
129.6=
129.5=
129.4=
129.3=40
129.2=376
129.1=208
82 .6=
82 .5=
82 .4=
82 .3=148
82 .2=484
82 .1=316
2  .6=
2  .5=
2  .4=
2  .3=40
2  .2=208
2  .1=376
94 .6=
94 .5=
94 .4=
94 .3=52
94 .2=388
94 .1=220
67 .6=
67 .5=
67 .4=
67 .3=16
67 .2=352
67 .1=184
32 .6=
32 .5=
32 .4=
32 .3=32
32 .2=368
32 .1=200
363.6=352
363.5=496
363.4=328
363.3=160
363.2=184
363.1=508
486.6=
486.5=
486.4=
486.3=40
486.2=376
486.1=208
177.6=
177.5=
177.4=
177.3=124
177.2=460
177.1=292
178.6=
178.5=
178.4=
178.3=96
178.2=432
178.1=264
139.6=
139.5=
139.4=
139.3=96
139.2=432
139.1=264
290.6=
290.5=
290.4=
290.3=124
290.2=460
290.1=292
250.6=
250.5=
250.4=
250.3=60
250.2=396
250.1=228
251.6=
251.5=
251.4=
251.3=124
251.2=460
251.1=292
217.6=
217.5=
217.4=
217.3=100
217.2=436
217.1=268
95 .6=
95 .5=
95 .4=
95 .3=204
95 .2=372
95 .1=36
68 .6=
68 .5=
68 .4=
68 .3=64
68 .2=400
68 .1=232
64 .6=
64 .5=
64 .4=
64 .3=128
64 .2=464
64 .1=296
41 .6=
41 .5=
41 .4=
41 .3=100
41 .2=436
41 .1=268
1  .6=
1  .5=
1  .4=
1  .3=12
1  .2=348
1  .1=180
186.6=
186.5=
186.4=
186.3=172
186.2=4
186.1=340
145.6=
145.5=
145.4=
145.3=16
145.2=352
145.1=184
57 .6=364
57 .5=420
57 .4=196
57 .3=252
57 .2=28
57 .1=84
225.6=
225.5=
225.4=
225.3=108
225.2=444
225.1=276
30 .6=
30 .5=
30 .4=
30 .3=68
30 .2=404
30 .1=236
380.6=
380.5=
380.4=
380.3=80
380.2=416
380.1=248
300.6=
300.5=
300.4=
300.3=136
300.2=472
300.1=304
110.6=
110.5=
110.4=
110.3=12
110.2=348
110.1=180
111.6=
111.5=
111.4=
111.3=8
111.2=344
111.1=176
112.6=
112.5=
112.4=
112.3=64
112.2=400
112.1=232
156.6=
156.5=
156.4=
156.3=48
156.2=384
156.1=216
159.6=
159.5=
159.4=
159.3=68
159.2=404
159.1=236
81 .6=
81 .5=
81 .4=
81 .3=92
81 .2=428
81 .1=260
274.6=
274.5=
274.4=
274.3=120
274.2=456
274.1=288
275.6=
275.5=
275.4=
275.3=144
275.2=480
275.1=312
237.6=
237.5=
237.4=
237.3=200
237.2=368
237.1=32
310.6=
310.5=
310.4=
310.3=8
310.2=344
310.1=176
312.6=
312.5=
312.4=
312.3=224
312.2=392
312.1=56
314.6=
314.5=
314.4=
314.3=140
314.2=476
314.1=308
357.6=96
357.5=432
357.4=444
357.3=264
357.2=276
357.1=108
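
The pair of messages "syntax error near line 1" and "bailing out near line 1" is what the old awk on some Unix systems (Solaris's /usr/bin/awk, for example) prints when it meets syntax it does not understand. Whether that is the cause here is only a guess, but trying a newer awk is cheap. The sketch below assumes a POSIX shell; pick_awk is a made-up name, and the candidate list is just the usual suspects.

```shell
# Hypothetical helper: print the first awk-like command found on this system.
# nawk and /usr/xpg4/bin/awk are the usual newer awks on Solaris; gawk is GNU awk.
pick_awk() {
    for candidate in nawk /usr/xpg4/bin/awk gawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(pick_awk)
echo "using: $AWK"

# Quick sanity check that the chosen awk accepts a bracket expression for -F
echo '1.2=348' | "$AWK" -F '[.=]' '{ print $3 }'
```

If that prints 348, the script above can be run with "$AWK" in place of awk.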

