Sort and summarise between patterns


 
# 1  
Old 11-21-2018

Hi!

I have a text file whose lines I would like to sort, summarise and count within each block that starts with the pattern "-- Current Database:".

This is my text file:

Code:
-- Current Database: `city`
New York
Chicago
Las Vegas
San Francisco
-- Current Database: `country`
United States
Mexico
Portugal
Mexico
Mexico
Norway
-- Current Database: `name`
Kevin Hart
Caroline
Max
Kevin Hart
-- Current Database: `phone`
669874223
236897556
478896542
669874223
-- Current Database: `addres`
menk st 
guitar st 15

And I would like the output to look like this:

Code:
-- Current Database: `city`
1 Chicago
1 Las Vegas
1 New York
1 San Francisco
-- Current Database: `country`
3 Mexico
1 Norway
1 Portugal
1 United States
-- Current Database: `name`
1 Caroline
2 Kevin Hart
1 Max
-- Current Database: `phone`
1 236897556
1 478896542
2 669874223
-- Current Database: `addres`
1 guitar st 15
1 menk st

I know that cat file.txt | sort | uniq -c would sort, summarise and count all the lines in the file, but I don't know how to do it separately between patterns. I also tried the "split" command, but I couldn't get the result I was expecting.

Could somebody help me?

Thanks

Last edited by mac-arrow; 11-21-2018 at 07:08 AM..
# 2  
Old 11-21-2018
Hi,
Maybe as:
Code:
awk '/^-- Current Database:/ {A++}{print A" "$0}' file.txt | LC_COLLATE=C sort | uniq -c | sed 's/ *\([0-9]\+\) [0-9]\+ /\1 /;/-- Current Database:/s/^[0-9]\+ //'

Regards.
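
For anyone reusing the idea above, here is a sketch of a slightly more defensive variant. It is not the poster's command. It tags every line with its section number plus a 0/1 flag, so the header always sorts ahead of its data, and it sorts the section number numerically, since a plain text sort would place section 10 before section 2. The function name count_by_section and the flag column are assumptions made for illustration only.

```shell
# Hypothetical helper: reads the file on stdin, prints per-section counts.
count_by_section() {
    awk '
        # Header line: start a new section, flag 0 so it sorts first.
        /^-- Current Database:/ { n++; print n, 0, $0; next }
        # Data line: same section number, flag 1.
                                { print n + 0, 1, $0 }
    ' |
    # Section number and flag numerically, then the text itself.
    LC_ALL=C sort -k1,1n -k2,2n -k3 |
    uniq -c |
    # Headers: drop count, section and flag. Data: keep only the count.
    sed -e 's/^ *[0-9][0-9]* [0-9][0-9]* 0 //' \
        -e 's/^ *\([0-9][0-9]*\) [0-9][0-9]* 1 /\1 /'
}

# Usage: count_by_section < file.txt
```

Lines that differ only in trailing whitespace are still counted as different lines, as with plain sort | uniq -c.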
# 3  
Old 11-21-2018
WOW!! it worked!! Thank you so much!

Just one more thing... How could I sort the results from the highest count to the lowest instead of alphabetically?

Like this:
Code:
-- Current Database: `country`
3 Mexico
1 Norway
1 Portugal
1 United States
-- Current Database: `name`
2 Kevin Hart
1 Caroline
1 Max
-- Current Database: `phone`
2 669874223
1 236897556
1 478896542

Regards
# 4  
Old 11-21-2018
It's a little harder, and the result is slightly different (highest count first, then reverse alphabetical order among equal counts):
Code:
awk '/^-- Current Database:/ {A--}{print A" "$0}' /tmp/file.txt | LC_COLLATE=C sort | uniq -c | sed '/-- Current Database:/s/[0-9]\+/Z/' | LC_COLLATE=C  sort -rn -k2,2 | sed 's/ *Z -[0-9]* //;s/  \+\|-[0-9]\+//g'

Regards.
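
If the ordering asked for in post #3 is needed exactly (highest count first, then alphabetical among equal counts), one possible sketch is to count first and then sort a second time on several keys. This is an alternative under stated assumptions, not the command above: the function name count_desc_by_section and the section/flag columns are made up for illustration.

```shell
# Hypothetical helper: per-section counts, highest count first.
count_desc_by_section() {
    awk '
        # Tag headers with flag 0 and data lines with flag 1.
        /^-- Current Database:/ { n++; print n, 0, $0; next }
                                { print n + 0, 1, $0 }
    ' |
    LC_ALL=C sort |          # group identical lines for uniq
    uniq -c |                # fields are now: count section flag text...
    # section asc, header before data, count desc, then text asc
    LC_ALL=C sort -k2,2n -k3,3n -k1,1nr -k4 |
    sed -e 's/^ *[0-9][0-9]* [0-9][0-9]* 0 //' \
        -e 's/^ *\([0-9][0-9]*\) [0-9][0-9]* 1 /\1 /'
}

# Usage: count_desc_by_section < file.txt
```

Because the section number and the flag are separate sort keys, data lines cannot drift into a neighbouring section, and nothing in the data itself (a "-" followed by digits, for example) is stripped by the final sed.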
# 5  
Old 11-21-2018
Try also
Code:
awk '
/-- /   {if (cmd) close (cmd)
         print
         cmd = "sort | uniq -c | sort -k1,1r -k2"
         next
        }
        {print | cmd
        }

END     {close (cmd)
        }
' file
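
One caveat worth knowing with this approach: awk may buffer its own output when it is not writing to a terminal, while the sort child process writes directly, so headers can end up after the data when the result is redirected to a file or a pipe. A sketch of a variant that flushes after each header, uses a numeric count key, and matches the header more strictly could look like this. The function name summarise_sections is an assumption, and fflush() is assumed to be available (it is in gawk, mawk and current POSIX awk).

```shell
# Hypothetical wrapper around the awk approach above; takes file names or stdin.
summarise_sections() {
    awk '
        BEGIN { cmd = "LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2" }
        /^-- Current Database:/ {
            close(cmd)   # let the previous section finish and print
            print        # header goes to awk stdout
            fflush()     # push it out before sort writes anything
            next
        }
        { print | cmd }  # data lines go to the per-section pipeline
        END { close(cmd) }
    ' "$@"
}

# Usage: summarise_sections file.txt > summary.txt
```

The counts keep the left padding that uniq -c produces; pipe the result through sed 's/^ *//' if that is not wanted.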

# 6  
Old 11-22-2018
Thank you @disedorgue for your help! This second command mixes up some lines between different databases, but don't worry.
I think your first answer will do for the task I needed :)

------ Post updated at 09:03 AM ------

Hi @RudiC

I tried yours, but it's giving me some errors:

Code:
awk: /-- /   {if (cmd) close (cmd) print cmd = "sort | uniq -c | sort -k1,1r -k2" next} {print | cmd} END {close (cmd)}
awk:                               ^ syntax error
awk: /-- /   {if (cmd) close (cmd) print cmd = "sort | uniq -c | sort -k1,1r -k2" next} {print | cmd} END {close (cmd)}
awk:                                                                              ^ syntax error

regards
Moderator's Comments:
Please use CODE tags when displaying input and output as well as code segments.

Last edited by Don Cragun; 11-22-2018 at 05:51 AM.. Reason: Add missing CODE tags.
# 7  
Old 11-22-2018
Please try RudiC's code the way he presented it.

You introduced syntax errors when you combined lines that RudiC had as separate lines in post #5 in this thread.
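
To illustrate the point: in awk, a newline ends a statement, so statements joined onto one line need a ";" between them. A one-line form of the code from post #5 could be sketched as follows. The wrapper name one_line_summary is only an assumption so that the command can be fed from stdin or from file names.

```shell
# Hypothetical wrapper; the awk program is post #5 with ";" in place of newlines.
one_line_summary() {
    awk '/-- / { if (cmd) close(cmd); print; cmd = "sort | uniq -c | sort -k1,1r -k2"; next } { print | cmd } END { close(cmd) }' "$@"
}

# Usage: one_line_summary file
```

Both syntax errors in post #6 sit exactly where a ";" is missing: before print and before next.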