Masking data for different file format


 
# 1  
Old 06-08-2011

Hi,
I have three kinds of files containing date data that needs to be masked. The files look like this:

File 1 (all contents on one line):
input:
Code:
DTM+7:201103281411:203'LOC+175+SGSIN:139:6+TERMINATOR......'DTM+132:201103281413:203'LOC....

output:
Code:
DTM+7:'''''''''''':203'LOC+175+SGSIN:139:6+TERMINATOR......'DTM+132:'''''''''''':203'LOC....

File 2 (contents divided across many lines):
input:
Code:
EXIT201103281044IAAU 3680272 4363400018000DD                 FIL  IL  LEA PERDANA      006     006         MANLEE   FD LD RD DD                    N
 
DISC201103281101TCKU 9672645 45G1400022500DD                 FIL  RC      VIRO BHUM        S079    S079                                             0300312TCNTXG                     NN N

output:
Code:
EXIT''''''''1044IAAU 3680272 4363400018000DD                 FIL  IL  LEA PERDANA      006     006         MANLEE   FD LD RD DD                    N
 
DISC''''''''1101TCKU 9672645 45G1400022500DD                 FIL  RC      VIRO BHUM        S079    S079                                             0300312TCNTXG                     NN N

File 3:
input (the date to be masked is near HHDR):
Code:
HHDR   01010020110208000004NYK VERANICD     266                                                    2011020704100020110207181500                                                                                                                                                                                                                                                                                 D              NYKU 5629211                                                                                              F22521  NY   NY      N                                                                                                                          N                                                 NZLYT             NZZ063854            4510LCI                                       C                                                                                                                                                                                                                                                                                                                                                                                                               HHDR   01010020110208000004ARUNIRICKMER     008W                                                   2011020717050020110208010000

output:
Code:
HHDR   010100''''''''000004NYK VERANICD     266                                                    2011020704100020110207181500                                                                                                                                                                                                                                                                                 D              NYKU 5629211                                                                                              F22521  NY   NY      N                                                                                                                          N                                                 NZLYT             NZZ063854            4510LCI                                       C                                                                                                                                                                                                                                                                                                                                                                                                               HHDR   010100''''''''000004ARUNIRICKMER     008W                                                   2011020717050020110208010000

I need a generic solution that masks this data no matter which of the three kinds of file is supplied. Handling each kind of file on its own is not hard, but covering all three at once has me stuck.
# 2  
Old 06-09-2011
Code:
sed '
/^EXIT/s/^\(....\).\{8\}/\1########/
/^DISC/s/^\(....\).\{8\}/\1########/
/DTM/s/\(DTM[^:]*:\).\{12\}/\1############/g
/HHDR/s/\(HHDR[^0-9]*......\).\{8\}/\1########/g
' infile >outfile

or
Code:
sed '
/^EXIT/s/^\(....\)......../\1########/
/^DISC/s/^\(....\)......../\1########/
/DTM/s/\(DTM[^:]*:\)............/\1############/g
/HHDR/s/\(HHDR[^0-9]*......\)......../\1########/g
' infile >outfile

You can still append ... | tr \# \' >outfile if you want to mask with single quotes instead of hashes:

Code:
sed '
/^EXIT/s/^\(....\)......../\1########/
/^DISC/s/^\(....\)......../\1########/
/DTM/s/\(DTM[^:]*:\)............/\1############/g
/HHDR/s/\(HHDR[^0-9]*......\)......../\1########/g
' infile | tr \# \' >outfile
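
As a quick check, the script above can be run against shortened sample records from the question. This is only a sketch: the function name mask_dates and the trimmed sample lines are illustrative assumptions, not part of the original data.

```shell
#!/bin/sh
# mask_dates: hypothetical wrapper around the sed script above.
# Reads records on stdin and writes them to stdout with the dates
# replaced by apostrophes.
mask_dates() {
    sed '
        /^EXIT/s/^\(....\).\{8\}/\1########/
        /^DISC/s/^\(....\).\{8\}/\1########/
        /DTM/s/\(DTM[^:]*:\).\{12\}/\1############/g
        /HHDR/s/\(HHDR[^0-9]*......\).\{8\}/\1########/g
    ' | tr '#' "'"
}

# Shortened sample records, one per file type (trimmed from the question).
printf '%s\n' \
    "DTM+7:201103281411:203'LOC+175+SGSIN:139:6'DTM+132:201103281413:203'LOC" \
    "EXIT201103281044IAAU 3680272" \
    "HHDR   01010020110208000004NYK VERANICD" | mask_dates
```

To process a real file, something like mask_dates <file1 >file1.out should do. Keep in mind that tr turns every # in the input into an apostrophe, so this approach is only safe when the data contains no # of its own.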


Last edited by ctsgnb; 06-09-2011 at 09:17 AM..
# 3  
Old 06-09-2011
Hi,

Using 'perl':
Code:
$ cat script.pl
use strict;
use warnings;

@ARGV || die "Usage: perl $0 file1 file2 ...\n";

while ( my $file = shift @ARGV ) {
    # Open files. On failure, warn about the problem and move on to the next file.
    open my $ifh, "<", $file or do { warn "Cannot open $file for reading: $!\n"; next };
    open my $ofh, ">", $file . ".out" or do { warn "Cannot open $file.out for writing: $!\n"; next };

    # Check type of file.
    while ( <$ifh> ) {
        # File 1.
        if ( /^DTM/ ) {
            s/(DTM\+\d*:)\d{12}/$1 . ("'" x 12)/eg;
        # File 2.
        } elsif ( /^(?:EXIT|DISC)/ ) {
            s/^(EXIT|DISC)\d{8}/$1 . ("'" x 8)/e;
        # File 3.
        } elsif ( /^HHDR/ ) {
            s/(HHDR\s+\d{6})\d{8}/$1 . ("'" x 8)/eg;
        }

        print $ofh $_;
    }

    close $ifh or warn "Error found closing $file: $!\n";
    close $ofh or warn "Error found closing $file.out: $!\n";
}
$ perl script.pl
Usage: perl script.pl file1 file2 ...
$ perl script.pl file1 file2 file3
(Each output file gets a '.out' suffix: file1.out, file2.out and file3.out in this sample.)

Regards,
Birei
# 4  
Old 06-11-2011
Thanks ctsgnb and birei!
I will try both your solutions.
By the way, I don't really understand what this code is doing (in birei's post):
Code:
        # File 1.
        if ( /^DTM/ ) {
            s/(DTM\+\d*:)\d{12}/$1 . ("'" x 12)/eg;
        # File 2.
        } elsif ( /^(?:EXIT|DISC)/ ) {
            s/^(EXIT|DISC)\d{8}/$1 . ("'" x 8)/e;

What are 'eg' and 'e'?
And what does 's/' mean?
Note that there are many 'DTM' segments in the one-line file; can the above code handle all of them?
Regards.
# 5  
Old 06-11-2011
Hi,

I'll paste part of the 'perl' help, which I'm sure explains it much better than I can.
Quote:
s/PATTERN/REPLACEMENT/msixpogce
Searches a string for a pattern, and if found, replaces that pattern with the replacement text and returns the number of substitutions made. Otherwise it returns false (specifically, the empty string).

Options:
e Evaluate the right side as an expression.
g Match globally, i.e., find all occurrences.
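
The effect of 'g' is easy to see on a line with several DTM segments. The sketch below uses sed, whose g flag works the same way for this purpose; the pattern mirrors the Perl one with [0-9] in place of \d, and the sample line is a shortened, illustrative version of File 1. There is no counterpart to 'e' here: it is needed in the Perl code only because the replacement $1 . ("'" x 12) is an expression to be evaluated rather than literal text.

```shell
#!/bin/sh
# Illustrative one-line input with two DTM segments (shortened from File 1).
line="DTM+7:201103281411:203'LOC+175'DTM+132:201103281413:203'LOC"

# Without g: only the first DTM date on the line is masked.
first_only=$(printf '%s\n' "$line" | sed 's/\(DTM[^:]*:\)[0-9]\{12\}/\1############/')

# With g: every DTM date on the line is masked.
all_dates=$(printf '%s\n' "$line" | sed 's/\(DTM[^:]*:\)[0-9]\{12\}/\1############/g')

printf '%s\n' "$first_only" "$all_dates"
```

So yes, with 'g' the substitution covers every DTM on the line, not just the first one.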
Regards,
Birei