How to sort a content of a text file using a shell script?


 
# 8  
Old 12-08-2014
My output should look like this

Image
# 9  
Old 12-08-2014
The output I want should be sorted like this.

http://i62.tinypic.com/15xkcvp.png

Moreover, I'm new to Unix, so I need to know how to use awk and, if possible, to see the script itself.

I run a small business and need help, please!
# 10  
Old 12-08-2014
Please give us a CLEAR English explanation for the process that the script you want is supposed to use to transform your input into the output you want. And, please show us THE EXACT text in CODE tags for your desired sample output; not an image or a pointer to a webpage containing an image.

Your sample data shows five <app> values "bsi", "fap", "fsp", "unit", and "user" (in sorted order) and your sample output shows only two of those <app> values (in reverse sorted order, or input file order, or random order). How does this "sort a content of a text file"???

Your sample input has 20 lines. Your sample output has a total count of 565075???

If you won't show us the exact output you want from a 20 line input sample and won't give us a clear specification of what you're trying to do, any time that volunteers here devote to trying to help you is a waste of time. Please help us help you!

Last edited by Don Cragun; 12-08-2014 at 07:55 AM.. Reason: Explicitly state that we want to see exact sample output.
# 11  
Old 12-08-2014
Here is the list of 59 entries which is the sample input. I need this list to be sorted as explained below.

Code:
fap022-curtis.feitty
fap0217-dawn.kalani
fap0253-Debbie.Miller
fap0284-Donald.Gray
fap0285-Elaine.Barnes
fap0285-Elaine.Shoudy
fap0285-elizabeth.cabezas
fap0289-Elizabeth.Mavery
fap0299-evan.graham
fap0300-francis.schuckman
NMDemo164-llonjon
NMDemo217-mgrant
NMDemo217-mpinsent
NMDemo217-nrienks
NMDemo217-pcampbell
NMDemo217-pcurrey
NMDemo217-pkoch
NMDemo400-plee
NMDemo599-pmanger
NMDemo655-pnroman
NMDemo017-pschaefer
NMTalent6-gcook
NMTalent6-hatsma
NMTalent4-hdrechsler
NMTalent600-hhartmann
NMTalent3-hhay
NMTalent002-hmanager
NMTalent971-hmatsui
unit1700-zahrms
unit1732-exists
unit1788-accountinquiry
unit1788-adb
unit1625-admissions
unit1234-alert
unit100-auhrms
psf03-LMCKINLEY
psf037-LMULLEN
psf0377-LNILS
psf0378-LPARKER
psf0378-LPEABODY
psf0399-LPEPPER
psf0254-LROBERTS
sbl03400-tnance
sbl0383-tsmythe
sbl0310-vtaylor
sbl03-wmimms
sbl0310-exists
jde0261-ACCTPAY
jde0261-ACCTREV
jde024-AUST
jde0289-BIDDER1
jde0100-BIDDER2
jde0009-BUYER1
adsweb_noreplies
webmaster
dm
bg
rd
jm

These entries can be classified as follows.

10 entries of the type 'fap'
11 entries of the type 'NMDemo'
7 entries of the type 'NMTalent'
7 entries of the type 'unit'
7 entries of the type 'psf'
5 entries of the type 'sbl'
6 entries of the type 'jde'
6 unclassified entries (the last 6 in the list)

In each entry, the type is followed by a number that represents its environment, e.g. in psf0254-LROBERTS, psf is the type and 0254 is the environment.

Two entries of a type may share the same environment, e.g. jde0261-ACCTPAY and jde0261-ACCTREV are different entries but share the same environment 0261.
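
As a rough illustration of the naming rule described above (type, then environment number, then "-" and a name), the split could be sketched as follows. The function name split_entry is hypothetical and not from the thread, and treating anything that does not fit the pattern as "unclassified" is an assumption based on the sample list.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print "type env" for one entry,
# or "unclassified N/A" when the entry has no type+number+"-" prefix.
split_entry() {
    printf '%s\n' "$1" | awk '{
        if (match($0, /^[A-Za-z]+[0-9]+-/)) {
            key = substr($0, 1, RLENGTH - 1)        # e.g. psf0254
            type = key; sub(/[0-9]+$/, "", type)    # letters only: psf
            env = key;  sub(/^[A-Za-z]+/, "", env)  # digits only: 0254
            print type, env
        } else {
            print "unclassified", "N/A"
        }
    }'
}

split_entry psf0254-LROBERTS
split_entry jde0261-ACCTPAY
split_entry jde0261-ACCTREV
split_entry webmaster
```

The two jde entries both yield "jde 0261", which is why they count as two entries but only one environment.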

Now, I would like to have the above input sorted and the output to look like this.

Code:
type            env     count
----            ---     -----
fap             8       10
NMDemo          6       11
NMTalent        6       7
unit            6       7
psf             6       7
sbl             4       5
jde             5       6
unclassified    N/A     6

NOTE 1 : The 'count' column should be the total number of entries under a particular type.
NOTE 2 : The 'env' column shows the number of unique environments in a type, e.g. fap has 10 entries but 8 unique environments.
NOTE 3 : The 'unclassified' type does not have any environment associated with it and hence contains all the 'non-type' values.
# 12  
Old 12-08-2014
And what should be done with the entries:
Code:
user1
user2

you had in an earlier sample input? The "type" (formerly "app name") "user" and two "env" (formerly "number of env") "1" and "2" values are present, but there is no "-" or name entry.


Will all entries for a given type and environment be adjacent as they are in your input samples? If not, will all entries for a given type be adjacent as they are in your input samples?

You say that you want your output sorted, but the types shown in your sample output are not sorted; the output order is in the same order as your input. If the output is supposed to be sorted (other than having the header lines come first and the "unclassified" entry come last), what are the sort keys?

Your earlier posts said you wanted a total count at the end of the output; your last sample output does not show a total count. Which is correct?
# 13  
Old 12-08-2014
This may come close to what you want, based on the sample in post #11. As I can't recognize any sort pattern in your sample output, I've left out that part, as well as the total count that disappeared somewhere down the road. Try
Code:
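awk     'NF==1          {# no "-" separator: count the line as unclassified
                         CNTAPP["unclassified"]++
                         CNTENV["unclassified"]="N/A"
                         next
                        }
                        {APP=$1                         # $1 is type plus environment, e.g. fap0285
                         sub(/[0-9]*$/,"",APP)          # strip the trailing digits to get the type
                         CNTAPP[APP]++                  # entries per type
                         ENV=$1
                         sub(/^[A-Za-z]*/,"",ENV)       # environment number (not used further)
                        }
         !($1 in DUP)   {DUP[$1]                        # first time this type plus environment is seen
                         CNTENV[APP]++                  # unique environments per type
                        }
         END            {for (A in CNTAPP) print A, CNTENV[A], CNTAPP[A]}
        ' FS="-" OFS="\t" file4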
unclassified    N/A     6
unit    6       7
fap     8       10
NMTalent        6       7
jde     5       6
sbl     4       5
NMDemo  6       11
psf     6       7
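
The script above leaves out the output order and the total. As a hedged sketch only, the same counting idea can remember the order in which each type first appears (which is the order used in the sample output of post #11) and append a totals row. The function name summarize, the header text, and the short inline sample are illustrative assumptions, not something taken from the thread.

```shell
#!/bin/sh
# Sketch only: summarize() is a hypothetical name, not part of the thread.
# Reads entries from the named file(s) or from standard input.
summarize() {
    awk -F- -v OFS='\t' '
        BEGIN   { print "type", "env", "count" }
        NF == 0 { next }                              # skip empty lines
        NF == 1 { unclassified++; next }              # no "-": unclassified entry
        {
            type = $1
            sub(/[0-9]*$/, "", type)                  # fap0285 -> fap
            if (!(type in count)) order[++n] = type   # remember first-seen order
            count[type]++
            if (!seen[$1]++) env[type]++              # count each type+env once
            total++
        }
        END {
            for (i = 1; i <= n; i++) {
                t = order[i]
                print t, env[t], count[t]
                envtotal += env[t]
            }
            print "unclassified", "N/A", unclassified + 0
            print "total", envtotal + 0, total + unclassified
        }
    ' "$@"
}

# Small subset of the sample from post #11, for demonstration
summarize <<'EOF'
fap0285-Elaine.Barnes
fap0285-Elaine.Shoudy
fap0289-Elizabeth.Mavery
jde0261-ACCTPAY
jde0261-ACCTREV
jde024-AUST
webmaster
dm
EOF
```

For this subset the rows are fap 2 3, jde 2 3, unclassified N/A 2 and total 4 8. To process a real file instead, pass its name as an argument, for example summarize file4.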

# 14  
Old 12-09-2014
This needs to be tested. You'll find that awk can do these kinds of jobs 10 times faster.
Code:
#!/bin/bash

# infile=sample.txt
infile=Input_File.txt

fmt="%-12s\t%-6s\t%-6s\n"
line="~~~~~~"
line2="~~~~~~~~~~~~"
printf "$fmt" "    App" "Env   " "Count "
printf "$fmt" "$line2" "$line" "$line"

# keep the "type+env-name" lines, put a "-" between type and env,
# and save the sorted "type env" pairs
grep -e '^.*[/A-Za-z].*[0-9].*-.*$' "$infile" | sed 's|^[/A-Za-z]*|&-|' | awk -F '-' '{ print $1, $2 }' | sort > outfile
sort -u outfile -o outfile2                    # unique "type env" pairs
idx=($(cut -d ' ' -f 1 outfile | sort -u))     # unique types

envtot=0
apptot=0
for X in "${idx[@]}"
do
    appcnt=$(grep -c "^$X " outfile)           # entries of this type
    envcnt=$(grep -c "^$X " outfile2)          # unique environments of this type
    envtot=$(( envtot + envcnt ))
    apptot=$(( apptot + appcnt ))
    printf "$fmt" "$X" "$envcnt" "$appcnt"
done

# next two lines added for speed up
# line commented out was for testing
fsz=$(wc -l < "$infile")
unclass=$(( fsz - apptot ))
# unclass=$(grep -v -e '^.*[/A-Za-z].*[0-9].*-.*$' "$infile" | wc -l)
printf "$fmt" "Unclassified" "N/A" "$unclass"
printf "$fmt" "$line2" "$line" "$line"
printf "$fmt" "Totals" "$envtot" "$(( apptot + unclass ))"

# eof #

output
-------
    App         Env       Count 
~~~~~~~~~~~~    ~~~~~~    ~~~~~~
app             1         44    
bics            1         49    
fap             1254      475732
fapbs           1         160   
jde             77        4383  
la              2         3     
NMDemo          35        3754  
/NMDemo         1         160   
NMTalent        6         592   
ofm             1         592   
PRI             40        40    
psf             114       67601 
sbl             34        2890  
unit            166       7415  
Unclassified    N/A       1660  
~~~~~~~~~~~~    ~~~~~~    ~~~~~~
Totals          1733      565075


Last edited by ongoto; 12-10-2014 at 08:46 PM.. Reason: fixed error; /NMDemo now recognized
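
To see what the grep, sed and awk stage of the script in this post does to individual lines, it can be tried on a few entries in isolation. This is only a sketch: the function name normalize is hypothetical, and the "/NMDemo217-mgrant" line is an invented example of an entry with a leading slash, since the thread shows "/NMDemo" only in the output.

```shell
#!/bin/sh
# Sketch of the normalization stage only; normalize() is a hypothetical name.
# Input: one entry per line. Output: "type env" for each classified entry.
normalize() {
    grep -e '^.*[/A-Za-z].*[0-9].*-.*$' |     # keep lines with a letter, a digit and a "-"
    sed 's|^[/A-Za-z]*|&-|' |                 # fap0285-x -> fap-0285-x
    awk -F '-' '{ print $1, $2 }'             # -> fap 0285
}

printf '%s\n' 'fap0285-Elaine.Barnes' '/NMDemo217-mgrant' 'webmaster' | normalize
```

This prints "fap 0285" and "/NMDemo 217"; the "webmaster" line is dropped by grep, so it ends up in the Unclassified count that the script derives from the line total.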
 