Sort, sed, and zero padding date column csv bash scripting | Unix Linux Forums | Shell Programming and Scripting

  Go Back    


Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.

Sort, sed, and zero padding date column csv bash scripting

Shell Programming and Scripting


Closed Thread    
 
Thread Tools Search this Thread Display Modes
    #1  
Old 07-20-2013
sean1357 sean1357 is offline
Registered User
 
Join Date: Jul 2013
Last Activity: 20 July 2013, 7:05 PM EDT
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Sort, sed, and zero padding date column csv bash scripting

Hello people,

I am having problem to sort, sed and zero padding of column in csv file.

7th column only.

Input of csv file:

Code:
1,2,3,4,5,6,4/1/2010 12:00 AM,8
1,2,3,4,5,6,3/11/2010 9:39 AM,8
1,2,3,4,5,6,5/12/2011 3:43 PM,8
1,2,3,4,5,6,12/20/2009 7:23 PM,8

Output:

Code:
1,2,3,4,5,6,12/20/2009,8
1,2,3,4,5,6,03/11/2010,8
1,2,3,4,5,6,04/01/2010,8
1,2,3,4,5,6,05/12/2011,8

Very thank you for you help.

Last edited by Don Cragun; 07-20-2013 at 02:11 PM.. Reason: Add CODE tags
Sponsored Links
    #2  
Old 07-20-2013
Yoda's Avatar
Yoda Yoda is offline Forum Advisor  
Jedi Master
 
Join Date: Jan 2012
Last Activity: 24 July 2014, 5:28 PM EDT
Location: Galactic Empire
Posts: 3,354
Thanks: 230
Thanked 1,190 Times in 1,123 Posts
How about using awk and sort?

Code:
awk -F, '
        {
                sub( / .*/, X, $7 )
                split ( $7, V, "/" )
                $7 = sprintf ( "%02d/%02d/%d", V[1], V[2], V[3] )
                print
        }
' OFS=, file | sort -t"," -k 7.9 -k 7.5 -k 7

Sponsored Links
    #3  
Old 07-20-2013
sean1357 sean1357 is offline
Registered User
 
Join Date: Jul 2013
Last Activity: 20 July 2013, 7:05 PM EDT
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
It works. Thank you very muh for your help. Now, I having another problem how to remove row from range date.

range from 04/01/2010 - 05/04/2013, but the problem is that 04/01/2010 or 05/04/2013 maybe doesn't existed in csv file. Is the a way that you just compare date of input?

Example:

Code:
1,2,3,4,5,6,03/02/2009,8   -> need to remove column
1,2,3,4,5,6,08/03/2009,8   -> need to remove column
1,2,3,4,5,6,04/01/2010,8   -> keep from here
....
....
1,2,3,4,5,6,05/01/2013,8   -> keep to end
1,2,3,4,5,6,05/11/2013,9  -> need to remove from this point

    #4  
Old 07-20-2013
Yoda's Avatar
Yoda Yoda is offline Forum Advisor  
Jedi Master
 
Join Date: Jan 2012
Last Activity: 24 July 2014, 5:28 PM EDT
Location: Galactic Empire
Posts: 3,354
Thanks: 230
Thanked 1,190 Times in 1,123 Posts
What is your SHELL and OS ?
Sponsored Links
    #5  
Old 07-20-2013
sean1357 sean1357 is offline
Registered User
 
Join Date: Jul 2013
Last Activity: 20 July 2013, 7:05 PM EDT
Posts: 3
Thanks: 0
Thanked 0 Times in 0 Posts
Thanks. I'm using Cygwin bash script on Windows 7. Is this will be a problem?
Sponsored Links
    #6  
Old 07-20-2013
Don Cragun's Avatar
Don Cragun Don Cragun is online now Forum Staff  
Moderator
 
Join Date: Jul 2012
Last Activity: 24 July 2014, 5:37 PM EDT
Location: San Jose, CA, USA
Posts: 4,132
Thanks: 162
Thanked 1,412 Times in 1,197 Posts
Quote:
Originally Posted by Yoda View Post
How about using awk and sort?

Code:
awk -F, '
        {
                sub( / .*/, X, $7 )
                split ( $7, V, "/" )
                $7 = sprintf ( "%02d/%02d/%d", V[1], V[2], V[3] )
                print
        }
' OFS=, file | sort -t"," -k 7.9 -k 7.5 -k 7

I'm a little bit confused by this. Yoda's code is usually right on, but the sort keys used here make the last two digits of the year to the end of the line the primary sort key, the last digit of the day to the end of the line the secondary sort key, and the start of the date field to the end of the line the tertiary sort key. (This seems to have worked with the sample data only because the 8th field in all of the input lines was identical and the last digit of the day was a constant on the two input lines with the same year.) I think the sort he intended to perform on the awk output was:

Code:
sort -t"," -k 7.7,7 -k 7.1,7.5

which makes the entire year the primary sort key and the month and day together as the secondary sort key.

With the new requirement to exclude a date range from the output, you could try something like:

Code:
awk -F, -v sd="04/01/2010" -v ed="05/04/2013" '
BEGIN { ss = substr(sd, 7) substr(sd,1,2) substr(sd,4,2)
        es = substr(ed, 7) substr(ed,1,2) substr(ed,4,2)
}
{       sub(/ .*/, X, $7)
        split($7, V, "/")
        cs = sprintf("%d%02d%02d", V[3], V[1], V[2])
        if(cs >= ss && cs <= es) next
        $7 = sprintf("%02d/%02d/%d", V[1], V[2], V[3])
        print
}' OFS=, input.csv | sort -t"," -k 7.7,7 -k 7.1,7.5

If you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk instead of /usr/bin/awk or /bin/awk .

With the input file:

Code:
1,2,3,4,5,6,4/1/2010 12:00 AM,1
1,2,3,4,5,6,3/11/2010 9:39 AM,2
1,2,3,4,5,6,5/12/2011 3:43 PM,3
1,2,3,4,5,6,12/20/2009 7:23 PM,4
1,2,3,4,5,6,03/02/2009 1:23 AM,5
1,2,3,4,5,6,08/03/2009 2:34 AM,6
1,2,3,4,5,6,04/01/2010 3:45 AM,7
1,2,3,4,5,6,05/01/2013 4:56 AM,8
1,2,3,4,5,6,05/11/2013 12:03 PM,9

the output produced is:

Code:
1,2,3,4,5,6,03/02/2009,5
1,2,3,4,5,6,08/03/2009,6
1,2,3,4,5,6,12/20/2009,4
1,2,3,4,5,6,03/11/2010,2
1,2,3,4,5,6,05/11/2013,9

which drops the lines shown in red in the input because they are in the date exclusion range.
The Following User Says Thank You to Don Cragun For This Useful Post:
Yoda (07-20-2013)
Sponsored Links
Closed Thread

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Sort data by date and then search by column samrat dutta Shell Programming and Scripting 2 05-17-2013 10:23 AM
Sort csv file by duplicated column value jl487 UNIX for Dummies Questions & Answers 3 04-26-2013 01:56 AM
Use sed on column (csv) file if data in colmns is greater > than? Chris Eagleson UNIX for Dummies Questions & Answers 2 10-17-2012 02:18 PM
How to insert a sequence number column inside a pipe delimited csv file using shell scripting? nithins007 Shell Programming and Scripting 5 09-25-2011 03:03 AM
CSV formatting with prefixing, appending and padding field meself Shell Programming and Scripting 7 01-12-2009 11:40 PM



All times are GMT -4. The time now is 05:38 PM.