Sorting and saving values based on unique entries


 
# 1  
Old 01-07-2014

Hi all,

I want to split a file into separate output files, one for each unique value in a specific column (column 4). My sample file looks like the following:
Code:
input file: 200006-07file.txt
145 35 10 3
147 35 12 4
146 36 11 3
145 34 12 5
143 31 15 4
146 30 14 5

desired output files:
200006-07_003.txt (this contains:)
145 35 10 3
146 36 11 3

200006-07_004.txt
147 35 12 4
143 31 15 4

200006-07_004.txt
145 34 12 5
146 30 14 5

Thank you and happy new year!
# 2  
Old 01-07-2014
What have you tried so far?

PS Am I correct in assuming that the second output file named 200006-07_004.txt is really supposed to be named 200006-07_005.txt?

Last edited by Don Cragun; 01-07-2014 at 11:52 PM.. Reason: Add PS.
# 3  
Old 01-08-2014
Hi,
Yes, I typed it incorrectly. It should be 200006-07_005.txt. I have tried the following, but I created each output file manually, which is very inefficient.

Code:
sort -n -k4 200006-07file.txt > file.tmp

awk '$4==3 {print}' file.tmp > 200006-07_003.txt
awk '$4==4 {print}' file.tmp > 200006-07_004.txt
awk '$4==5 {print}' file.tmp > 200006-07_005.txt

Thanks!
# 4  
Old 01-08-2014
You could try something like:
Code:
awk '
FNR == 1 {
        # Get filename base and clear list of output files for this input file.
        if((i = index(FILENAME, "file.txt")) == 0) {
                printf("Filename \"%s\" does not contain \"file.txt\"\n",
                        FILENAME)
                exit 1
        }
        base = substr(FILENAME, 1, i - 1)
        for (i in outlist) delete outlist[i]
}
{       # Generate output filename for this line:
        of = sprintf("%s_%03d.txt", base, $4)
        if(lf != of) {
                # Close previous output file, if there was one.
                if(lf != "") close(lf)
                # If this is the 1st time for a new output file, add it to the
                # list of output files and remove any existing file with that
                # name.
                if(!($4 in outlist)) {
                        # Remove any existing file with this name:
                        system("rm -f " of)
                        # Save this index in outlist:
                        outlist[$4]
                }
                lf = of
        }
        # Save the current line in the current output file:
        print >> lf
}' 200006-07file.txt

This is probably more complex than is needed, but it works when given multiple input files, and it removes any pre-existing output file the first time it writes to that name.

If you want to try this script on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.
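
If you only ever process one input file at a time, a shorter variant may be enough. The following is a minimal sketch under stated assumptions, not a replacement for the script above: it assumes a single input file whose name ends in file.txt and that column 4 holds small non-negative integers. The names workdir, in, and base are my own, and the temporary directory and here-document exist only to recreate the sample data from post #1.

```shell
#!/bin/sh
# Sketch only: assumes one input file named <base>file.txt and that
# column 4 holds small non-negative integers.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample input from post #1.
cat > 200006-07file.txt <<'EOF'
145 35 10 3
147 35 12 4
146 36 11 3
145 34 12 5
143 31 15 4
146 30 14 5
EOF

in=200006-07file.txt
base=${in%file.txt}     # strip the "file.txt" suffix, leaving "200006-07"

# Remove output left over from earlier runs, because awk appends below.
rm -f "${base}"_[0-9][0-9][0-9].txt

awk -v base="$base" '
{
        # Build the output name from column 4, zero-padded to 3 digits.
        of = sprintf("%s_%03d.txt", base, $4)
        print >> of
        # Closing after every line is slow on big files, but it keeps the
        # number of open files at one no matter how many values occur.
        close(of)
}' "$in"

# Show one of the resulting files.
cat "${base}_003.txt"
```

No sort step is needed here: each line is appended to the file that matches its fourth field, so lines keep their original relative order within each output file.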
# 5  
Old 01-08-2014
Is there any way to get return status of script after 2-3 hours of its completion to check whether it got aborted or finished successfully?
Moderator's Comments:
When posting a question about a new topic, please start a new thread. Hijacking a thread by posting an unrelated question as a response to an existing thread creates confusion for anyone trying to help answer either question.

Last edited by Don Cragun; 01-08-2014 at 01:40 AM..
# 6  
Old 01-08-2014
Thank you very much for the quick reply. I will give it a try and get back to you.
 