Sort, sed, and zero padding date column csv bash scripting


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Sort, sed, and zero padding date column csv bash scripting
# 1  
Old 07-20-2013
Sort, sed, and zero padding date column csv bash scripting

Hello people,

I am having problem to sort, sed and zero padding of column in csv file.

7th column only.

Input of csv file:
Code:
1,2,3,4,5,6,4/1/2010 12:00 AM,8
1,2,3,4,5,6,3/11/2010 9:39 AM,8
1,2,3,4,5,6,5/12/2011 3:43 PM,8
1,2,3,4,5,6,12/20/2009 7:23 PM,8

Output:
Code:
1,2,3,4,5,6,12/20/2009,8
1,2,3,4,5,6,03/11/2010,8
1,2,3,4,5,6,04/01/2010,8
1,2,3,4,5,6,05/12/2011,8

Very thank you for you help.

Last edited by Don Cragun; 07-20-2013 at 03:11 PM.. Reason: Add CODE tags
# 2  
Old 07-20-2013
How about using awk and sort?
Code:
awk -F, '
        {
                sub( / .*/, X, $7 )
                split ( $7, V, "/" )
                $7 = sprintf ( "%02d/%02d/%d", V[1], V[2], V[3] )
                print
        }
' OFS=, file | sort -t"," -k 7.9 -k 7.5 -k 7

# 3  
Old 07-20-2013
It works. Thank you very muh for your help. Now, I having another problem how to remove row from range date.

range from 04/01/2010 - 05/04/2013, but the problem is that 04/01/2010 or 05/04/2013 maybe doesn't existed in csv file. Is the a way that you just compare date of input?

Example:
Code:
1,2,3,4,5,6,03/02/2009,8   -> need to remove column
1,2,3,4,5,6,08/03/2009,8   -> need to remove column
1,2,3,4,5,6,04/01/2010,8   -> keep from here
....
....
1,2,3,4,5,6,05/01/2013,8   -> keep to end
1,2,3,4,5,6,05/11/2013,9  -> need to remove from this point

# 4  
Old 07-20-2013
What is your SHELL and OS?
# 5  
Old 07-20-2013
Thanks. I'm using Cygwin bash script on Windows 7. Is this will be a problem?
# 6  
Old 07-20-2013
Quote:
Originally Posted by Yoda
How about using awk and sort?
Code:
awk -F, '
        {
                sub( / .*/, X, $7 )
                split ( $7, V, "/" )
                $7 = sprintf ( "%02d/%02d/%d", V[1], V[2], V[3] )
                print
        }
' OFS=, file | sort -t"," -k 7.9 -k 7.5 -k 7

I'm a little bit confused by this. Yoda's code is usually right on, but the sort keys used here make the last two digits of the year to the end of the line the primary sort key, the last digit of the day to the end of the line the secondary sort key, and the start of the date field to the end of the line the tertiary sort key. (This seems to have worked with the sample data only because the 8th field in all of the input lines was identical and the last digit of the day was a constant on the two input lines with the same year.) I think the sort he intended to perform on the awk output was:
Code:
sort -t"," -k 7.7,7 -k 7.1,7.5

which makes the entire year the primary sort key and the month and day together as the secondary sort key.

With the new requirement to exclude a date range from the output, you could try something like:
Code:
awk -F, -v sd="04/01/2010" -v ed="05/04/2013" '
BEGIN { ss = substr(sd, 7) substr(sd,1,2) substr(sd,4,2)
        es = substr(ed, 7) substr(ed,1,2) substr(ed,4,2)
}
{       sub(/ .*/, X, $7)
        split($7, V, "/")
        cs = sprintf("%d%02d%02d", V[3], V[1], V[2])
        if(cs >= ss && cs <= es) next
        $7 = sprintf("%02d/%02d/%d", V[1], V[2], V[3])
        print
}' OFS=, input.csv | sort -t"," -k 7.7,7 -k 7.1,7.5

If you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /usr/bin/awk or /bin/awk.

With the input file:
Code:
1,2,3,4,5,6,4/1/2010 12:00 AM,1
1,2,3,4,5,6,3/11/2010 9:39 AM,2
1,2,3,4,5,6,5/12/2011 3:43 PM,3
1,2,3,4,5,6,12/20/2009 7:23 PM,4
1,2,3,4,5,6,03/02/2009 1:23 AM,5
1,2,3,4,5,6,08/03/2009 2:34 AM,6
1,2,3,4,5,6,04/01/2010 3:45 AM,7
1,2,3,4,5,6,05/01/2013 4:56 AM,8
1,2,3,4,5,6,05/11/2013 12:03 PM,9

the output produced is:
Code:
1,2,3,4,5,6,03/02/2009,5
1,2,3,4,5,6,08/03/2009,6
1,2,3,4,5,6,12/20/2009,4
1,2,3,4,5,6,03/11/2010,2
1,2,3,4,5,6,05/11/2013,9

which drops the lines shown in red in the input because they are in the date exclusion range.
This User Gave Thanks to Don Cragun For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

How to sort a column in excel/csv file?

I have to sort the 4th column of an excel/csv file. I tried the following command sort -u --field-separator=, --numeric-sort -k 2 -n dinesh.csv > test.csv But, it's not working. Moreover, I have to do the same for more than 30 excel/csv file. So please help me to do the same. (6 Replies)
Discussion started by: dineshkumarsrk
6 Replies

2. Shell Programming and Scripting

Csv files sort by date and zip monthly

Hi, I'm an absolute beginner in shell programming. I would need a script for a NAS that makes the csv files sorted by date and always monthly a zip. In the current month, the data should be integrated into this folder, so there are only monthly files. Thanks for your help (1 Reply)
Discussion started by: Pipo
1 Replies

3. Shell Programming and Scripting

Get maximum per column from CSV file, based on date column

Hello everyone, I am using ksh on Solaris 10 and I'm gathering data in a CSV file that looks like this: 20170628-23:25:01,1,0,0,1,1,1,1,55,55,1 20170628-23:30:01,1,0,0,1,1,1,1,56,56,1 20170628-23:35:00,1,0,0,1,1,2,1,57,57,2 20170628-23:40:00,1,0,0,1,1,1,1,58,58,2... (6 Replies)
Discussion started by: ejianu
6 Replies

4. Shell Programming and Scripting

Need a piece of shell scripting to remove column from a csv file

Hi, I need to remove first column from a csv file and i can do this by using below command. cut -f1 -d, --complement Mytest.csv I need to implement this in shell scripting, Whenever i am using the above command alone in command line it is working fine. I have 5 files in my directory and... (3 Replies)
Discussion started by: Samah
3 Replies

5. Shell Programming and Scripting

Bash - delete from csv all the row if the first column is length >

Hi guys, i have a csv file like: USERID;COG;DESCR;FIL;OFF user001;user;test1;001;A01 user002;user;test2;002;A02 user0003;user;test3;003;A03 user004;user;test4;004;A04 user0005;user;test5;005;A05 etc.. I need to read line for line and, if value of first column is > 7 char (in this example... (4 Replies)
Discussion started by: kamose
4 Replies

6. Shell Programming and Scripting

sed --> sort data by date

Hi, i "tried" to sort data by date. So far, i used sed to take the data from the last and the actual month. Now, after changing the year it is not working properly. i use: GNU bash, version 4.2.45(1)-release (x86_64-suse-linux-gnu) sed -n '/\//p' $Home/../scripte/pd_0.txt y is a... (6 Replies)
Discussion started by: IMPe
6 Replies

7. Shell Programming and Scripting

Padding a csv value with 0's

I have this csv file that I would like to sort on the 20th and 21st field. They are high lighted below. My challenge is that when I sort on those fields they are not in order as I would have liked. It seems like I have to pad those fields to the longest value in that fields data. ... (6 Replies)
Discussion started by: GroveTuckey
6 Replies

8. Shell Programming and Scripting

Sort data by date and then search by column

Hi, I have a file where data is pipe separated.First i want to sort the file content by date . Then i want to pick up the records based on the first column which should be unique and not have duplicates. NYSE|yyyrrrddd|toronto|isin|ticker|2013-05-15... (2 Replies)
Discussion started by: samrat dutta
2 Replies

9. UNIX for Dummies Questions & Answers

Sort csv file by duplicated column value

hello, I have a large file (about 1gb) that is in a file similar to the following: I want to make it so that I can put all the duplicates where column 3 (delimited by the commas) are shown on top. Meaning all people with the same age are listed at the top. The command I used was ... (3 Replies)
Discussion started by: jl487
3 Replies

10. Shell Programming and Scripting

How to insert a sequence number column inside a pipe delimited csv file using shell scripting?

Hi All, I need a shell script which could insert a sequence number column inside a dat file(pipe delimited). I have the dat file similar to the one as shown below.. |A|B|C||D|E |F|G|H||I|J |K|L|M||N|O |P|Q|R||S|T As shown above, the column 4 is currently blank and i need to insert sequence... (5 Replies)
Discussion started by: nithins007
5 Replies
Login or Register to Ask a Question