Adding Leading Zeros for date in a file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Adding Leading Zeros for date in a file
# 1  
Old 09-08-2016
Adding Leading Zeros for date in a file

Hello,

I have a pipe separated file with two major lines. One is header and another is detail line. Header starts with H and Detail start with D.

Sample Content:

Code:
 H|123456|Joes Watson|UK|4/5/2016|12/5/2016|3456|HC|NW||||||
 D|123456|Joes Watson|4/5/2016|12/5/2016|LR|LW||||||||||||||||||
 D|123456|Joes Watson|4/5/2016|6/11/2016|WR|LW||||||||||||||||||
 H|123457|Jack Roes|UK|11/11/2015|12/11/2015|3456|HC|NW||||||
 D|123457|Jack Roes|1/5/2016|1/5/2016|LR|LW||||||||||||||||||
 D|123457|Jack Roes|4/5/2016|6/12/2016|WR|LW||||||||||||||||||

Here Date format should be in mm/dd/yyyy but file has date format in different format - m/dd/yyyy or mm/d/yyyy or m/d/yyyy. If the format is not correct then a leading zero should be added to make it in mm/dd/yyyy like 4/5/2016 to 04/05/2016.

I tried few awk printf commands but nothing works out for multiple occurrences within the line.

Please let me know your possible way to crack this scenario.

Thanks,
Mannu

Moderator's Comments:
Mod Comment edit by bakunin: CODE, not ICODE-tags, please. Thank you for your consideration.

Last edited by bakunin; 09-08-2016 at 05:42 AM..
# 2  
Old 09-08-2016
It looks to me as though when it's a header ($1 == 'H') you are looking at fields 5 and 6 ($5 and $6). When it's a detail line you are looking at fields 4 and 5.

I don't know awk but is seems to me you need a construct like:
Code:
if ($1 == 'H') {
   # code for $5
   # identical code but for $6
} elsif ($1 == 'D') {
   # code for $5
   # identical code but for $4
}

As for the modification for the date you need to split on '/' and rebuild using printf.

I don't really know awk so the above construct is probably nonsensical.

Concentrate on modifying field 5 and then try putting the above construct in for the other fields.

Andrew
# 3  
Old 09-08-2016
You can try this awk script. It looks for fields looking like dates, splits them into month, day and year and recombines them using a proper format.

Code:
BEGIN {
    FS="|"
    OFS="|"
}
{
    for (field = 1; field <= NF; field++) {
        if (match ($field, ".*/.*/.*")) {
            split ($field, mdy, "/");
            $field = sprintf ("%02d/%02d/%d", mdy[1], mdy[2], mdy[3]);
        }
    }
    print
}

Very old versions of awk, like /usr/bin/awk on Solaris do not understand sprintf. You should be on the safe side with GNU-awk or nawk though.
This User Gave Thanks to hergp For This Post:
# 4  
Old 09-08-2016
Quote:
Originally Posted by Mannu2525
Here Date format should be in mm/dd/yyyy but file has date format in different format - m/dd/yyyy or mm/d/yyyy or m/d/yyyy. If the format is not correct then a leading zero should be added to make it in mm/dd/yyyy like 4/5/2016 to 04/05/2016.
In your sample there seem to be no other data similar to dates, so the following abbreviation may (or may not, depending on how representative your sample is) work:

Replace every instance of a single digit followed by a slash by this sequence with a "0" prepended. This would amount to a simple sed-command:

Code:
sed 's/\([^0-9]\)\([0-9]\/\)/\10\2/g' /path/to/your/file

This will replace in i.e. "|4/5/2016" the "|4/" with "|04/" and then the "/5/" to "/05/".

I hope this helps.

bakunin
# 5  
Old 09-08-2016
Quote:
Originally Posted by hergp
You can try this awk script. It looks for fields looking like dates, splits them into month, day and year and recombines them using a proper format.

Code:
BEGIN {
    FS="|"
    OFS="|"
}
{
    for (field = 1; field <= NF; field++) {
        if (match ($field, ".*/.*/.*")) {
            split ($field, mdy, "/");
            $field = sprintf ("%02d/%02d/%d", mdy[1], mdy[2], mdy[3]);
        }
    }
    print
}

Very old versions of awk, like /usr/bin/awk on Solaris do not understand sprintf. You should be on the safe side with GNU-awk or nawk though.
Hello hergp,

Thank you for nice code. I think you could change (match ($field, ".*/.*/.*")) to (match($field,/[0-9]+\/[0-9]+\/[0-9]+/)), which will get only digits(for matching dates only).

Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post:
# 6  
Old 09-08-2016
Hi.

A suite of programs dealing with many aspects of dates is dateutils, found in many repositories ( ArchLinux, Debian, Fedora, FreeBSD, Gentoo, NetBSD, OpenSuSE, OS, Slackware, Ubuntu ).

Most of the codes scan the input lines for data that looks like a date format, then processes it. In this case we ask dconv to read it and write it with the same format:
Code:
#!/usr/bin/env bash

# @(#) s1       Demonstrate date reformatting, dateutils.dconv
# Suite dateutils is in many repositories, and:
# https://github.com/hroptatyr/dateutils (verified 2016.09.08)

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C dateutils.dconv
which dateutils.dconv

FILE=${1-data1}

pl " Input data file $FILE:"
cat $FILE

# ...4/5/2016|12/5/2016...
pl " Results:"
dateutils.dconv -S -i "%m/%d/%Y" -f "%m/%d/%Y" < $FILE

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.4 (jessie) 
bash GNU bash 4.3.30
dateutils.dconv dconv 0.3.1
/usr/bin/dateutils.dconv

-----
 Input data file data1:
 H|123456|Joes Watson|UK|4/5/2016|12/5/2016|3456|HC|NW||||||
 D|123456|Joes Watson|4/5/2016|12/5/2016|LR|LW||||||||||||||||||
 D|123456|Joes Watson|4/5/2016|6/11/2016|WR|LW||||||||||||||||||
 H|123457|Jack Roes|UK|11/11/2015|12/11/2015|3456|HC|NW||||||
 D|123457|Jack Roes|1/5/2016|1/5/2016|LR|LW||||||||||||||||||
 D|123457|Jack Roes|4/5/2016|6/12/2016|WR|LW||||||||||||||||||

-----
 Results:
 H|123456|Joes Watson|UK|04/05/2016|12/05/2016|3456|HC|NW||||||
 D|123456|Joes Watson|04/05/2016|12/05/2016|LR|LW||||||||||||||||||
 D|123456|Joes Watson|04/05/2016|06/11/2016|WR|LW||||||||||||||||||
 H|123457|Jack Roes|UK|11/11/2015|12/11/2015|3456|HC|NW||||||
 D|123457|Jack Roes|01/05/2016|01/05/2016|LR|LW||||||||||||||||||
 D|123457|Jack Roes|04/05/2016|06/12/2016|WR|LW||||||||||||||||||

If not in your repository or you cannot change your system, see the github source -- that is how I originally obtained the codes until dateutils appeared in my repository.

Best wishes ... cheers, drl
# 7  
Old 09-08-2016
Quote:
Originally Posted by bakunin
In your sample there seem to be no other data similar to dates, so the following abbreviation may (or may not, depending on how representative your sample is) work:

Replace every instance of a single digit followed by a slash by this sequence with a "0" prepended. This would amount to a simple sed-command:

Code:
sed 's/\([^0-9]\)\([0-9]\/\)/\10\2/g' /path/to/your/file

This will replace in i.e. "|4/5/2016" the "|4/" with "|04/" and then the "/5/" to "/05/".

I hope this helps.

bakunin
Not quite. If I have one of the OP's input lines:
Code:
 D|123457|Jack Roes|1/5/2016|1/5/2016|LR|LW|||||||||||||||||

the output I get is:
Code:
 D|123457|Jack Roes|01/5/2016|01/5/2016|LR|LW||||||||||||||||||

This is because the first match on the BRE is |1/ and you want to match /5/ to change the 5 to 05, but the leading / for the 2nd match was consumed as the trailing / in the previous match.

This can be fixed by applying that substitute command twice:
Code:
sed -e 's/\([^0-9]\)\([0-9]\/\)/\10\2/g' -e 's/\([^0-9]\)\([0-9]\/\)/\10\2/g' /path/to/your/file

or with a slightly simpler second substitute command:
Code:
sed -e 'sx\([^0-9]\)\([0-9]/\)x\10\2xg' -e 'sx/\([0-9]/\)x/0\1xg' /path/to/your/file

(using x as the substitution delimiter instead of / to avoid a few backslashes).

Note that this sed solution will also work if a date field also includes time (e.g., 1/5/2016 11:23:45) while the awk solutions suggested will fail in this case. And note that this sed solution may "fix" too much if there is a field containing a simple fraction where the numerator is a single digit (e.g., 1/2 off will become 01/2 off and 2/3 majority will become 02/3 majority). None of these are problems with the given sample input, but with all of those empty fields in the sample we have to wonder what else might appear in those field in other input lines???
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Ho to remove leading zeros from a csv file which is sent from a UNIX script

Hi All, I am using a informatica job to create a csv file and a unix script the mail the generated file.Everything is working fine but I am not seeing leading zeros in the csv file sent in the mail.These zeros were present when the .csv file was generated by informatica procees. Is there any... (11 Replies)
Discussion started by: karthik adiga
11 Replies

2. Shell Programming and Scripting

Fixed with file- removing leading zeros and adding the space

Hi All, i have a fixed width file , where each line is 3200 length. File: 1ABC 1111 2222 3333 000012341 1001 2ABC 1111 2222 3333 000012342 1002 3ABC 1111 2222 3333 000112343 1003 1DEF 5555 4444 9696 000012344 1004 2DEF 5555 2323 8686 000012345 1005 3DEF 5555 1212 7676 000012346 1006 ... (1 Reply)
Discussion started by: mechvijays
1 Replies

3. Shell Programming and Scripting

Help deleting leading zeros in a file

I have a list of numbers extracted and need to delete the leading zeros from them, but when i do so, the command I am using also deletes numbers that end in Zero as well. eg 10, 20, 30, etc this is part of a larger script and the only way I can think of is to try and detect the 10,20 30 etc in... (19 Replies)
Discussion started by: kcpoole
19 Replies

4. Shell Programming and Scripting

Numbers with leading zeros

Hi, i have a variable which conatins values like 00001,0003,00067,00459. I want to use the values one by one and in the same form as they are like 00001,0003,00067,00459. Also can anyone tell me how to increment those numbers by 1,keeping the format as same like 00002,0004,00068,00460.... (5 Replies)
Discussion started by: arijitsaha
5 Replies

5. UNIX for Dummies Questions & Answers

Add leading zeros to columns in a file

Hello Gurus, Quick question. I have a file with the following records: A~000000000000518000~SLP ~99991231~20090701~88.50~USD~CS~ A~000000000000518000~SLP ~99991231~20090701~102.00~USD~CS~ A~000000000000772000~SLP ~99991231~20100701~118.08~USD~CS~ I wold like to do the following: 1. Add... (1 Reply)
Discussion started by: chumsky
1 Replies

6. Shell Programming and Scripting

Help with adding leading zeros to a filename

Hi i need help in adding leading zero to filenames e.g file name in my folder are 1_234sd.txt 23_234sd.txt the output i need is 001_234sd.txt 023_234sd.txt can i do this shell scripting please help (2 Replies)
Discussion started by: rsmpk
2 Replies

7. Shell Programming and Scripting

Help needed in padding leading zeros

Hi all, I have file with numeric values. I need to pad each value with leading zeros such that total lenght of each value is 16. Example: cat tmp.txt 502455 50255 5026 5027 5028 Output 0000000000502455 0000000000050255 0000000000005026 0000000000005027 0000000000005028 Any... (12 Replies)
Discussion started by: jakSun8
12 Replies

8. Shell Programming and Scripting

truncating leading zeros of a column in a file

Hi I have a file in which I have 5 columns which are delimited by “|” as shown ABC|12|YAK|METRIC|000000019.5 XYZ|10|ABX|META|000000002.5 Now my requirement is to take the last column trim the leading zero's for that column values and write back to the same file in the same... (7 Replies)
Discussion started by: nvuradi
7 Replies

9. Shell Programming and Scripting

how to retain leading zeros

Hi All, I am working with a fixed width file Forrmat. C1 Number (10,3) C2 Number (10,3) e.g. c1= 0000000100.000 c2= 0000000020.000 0000000100.0000000000020.000 I have to perform c1 - c2 . i.e. I want answer to be 0000000080.000. but I am loosing the leading zeros( only getting... (3 Replies)
Discussion started by: Manish Jha
3 Replies

10. Shell Programming and Scripting

Leading zeros

How to insert leading zeros into a left-justisfied zip code? e.g. Zip code is written as 60320 which is left-justified to make it be read as 0060320. We have to move it to right-justifiable then insert 2 leading zeros into it... ;) (1 Reply)
Discussion started by: wtofu
1 Replies
Login or Register to Ask a Question