Need a shell script to clean data


 
# 1  
Old 06-27-2012

Hi, I would appreciate it if anyone could give me some hints.

I have a file format like this:
Code:
old(1): PRCNCP                                                                  
1                                                                              
old(2): PRSKU                                                                   
6480966         
old(2): PRSKU2                                                                                                                                 
old(3): PRPRCE                                                                  
89.99            
new(1): PRCNCP                                                                  
8                                                                               
new(2): PRSKU                                                                   
6480966                                                                         
new(3): PRPRCE                                                                  
89.99

I would like the script to:

1. Remove all old data.
2. Put each new entry and its value on one line.
Code:
new(1): PRCNCP 8                                                                               
new(2): PRSKU  6480966                                                                         
new(3): PRPRCE 89.99


# 2  
Old 06-27-2012
Try this:

Code:
awk 'j&&!/:/{j=j" "$0;next}j{print j;j=x}/^new/{j=$0}END{print j}' infile
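
For anyone reading along, the one-liner above can be spread over several lines. This is only a sketch of the same idea, not the poster's exact code: the function name `join_new` and the variable `pending` are made up for illustration, and stripping trailing blanks is an addition so the joined lines do not carry the column padding. Like the one-liner, it assumes that value lines never contain a colon.

```shell
# join_new: keep only the new(N) entries and put each value on the same line.
join_new() {
  awk '
    # Strip trailing blanks so the joined line carries no column padding.
    { sub(/[ \t]+$/, "") }

    # A "new(N): NAME" line: print the entry collected so far, start the next.
    /^new\(/ { if (pending != "") print pending; pending = $0; next }

    # Any other line with a colon (an old(N) line, for example) ends the entry.
    /:/ { if (pending != "") print pending; pending = ""; next }

    # A value line: append it, but only while a new(N) entry is open.
    pending != "" && NF { pending = pending " " $0 }

    END { if (pending != "") print pending }
  ' "$@"
}
```

Usage would be something like `join_new infile`.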

# 3  
Old 06-27-2012
Thank you, sir.

My file has some other data that I do not want to delete:
Code:
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE      
old(1): PRCNCP 
1 
old(2): PRSKU 
6480966 
old(2): PRSKU2 
old(3): PRPRCE 
89.99 
new(1): PRCNCP 
8 
new(2): PRSKU 
6480966 
new(3): PRPRCE 
89.99

Can I have this output:
Code:
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE  
new(1): PRCNCP 8 
new(2): PRSKU 6480966 
new(3): PRPRCE 89.99


# 4  
Old 06-27-2012
See if this will work for you:
Code:
$ sed -e 'H;$!d;x;s/\n\([0-9]\{1,\}\)/\@\1/g;s/\@/ /g' test.txt | sed '/old/'d

source database: WEBSUN1.WORLD
owner: WS_APP_OWNER
object: DZPRCP
is tag null: Y
command_type: UPDATE
new(1): PRCNCP 8
new(2): PRSKU 6480966
new(3): PRPRCE 89.99



---------- Post updated at 09:28 PM ---------- Previous update was at 08:37 PM ----------

Oops, I forgot one thing: you also need to delete the blank first line that is always left in the hold buffer when you append to it:
Code:
$ sed -e 'H;$!d;x;s/\n\([0-9]\{1,\}\)/\@\1/g;s/\@/ /g' test.txt | sed -e '/old/'d -e '1d'
source database: WEBSUN1.WORLD
owner: WS_APP_OWNER
object: DZPRCP
is tag null: Y
command_type: UPDATE
new(1): PRCNCP 8
new(2): PRSKU 6480966
new(3): PRPRCE 89.99
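
A step-by-step reading of that pipeline may help. The sketch below is a simplified variant, not the exact command above: it writes the space directly instead of going through the `@` placeholder, anchors the delete to `^old`, and wraps everything in a hypothetical function called `join_numeric`. Note that, like the original, it only joins a value whose first character is a digit.

```shell
# join_numeric: a reading of the two-stage sed pipeline, one step per line.
join_numeric() {
  # Stage 1 (first sed):
  #   H     append every input line to the hold space (which starts empty,
  #         so the collected text begins with a newline)
  #   $!d   on every line but the last, discard the pattern space and move on
  #   x     on the last line, swap in the hold space: the whole file in one buffer
  #   s///  turn "newline followed by a digit" into "space followed by that digit"
  # Stage 2 (second sed): drop lines starting with "old" and the leading blank line.
  sed -e 'H;$!d;x' -e 's/\n\([0-9]\)/ \1/g' "$@" |
    sed -e '/^old/d' -e '1d'
}
```

It would be called as `join_numeric test.txt`.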

# 5  
Old 07-02-2012
Thanks, spacebar. I hate to keep asking, but here is a more challenging text:

Code:
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE                                                            
old(1): PRCNCP                                                                  
7                                                                               
old(2): PRSKU                                                                   
2514297                                                                         
old(3): PRPRCE                                                                  
56                                                                              
old(4): PRPCAT                                                                  
83                                                                              
old(5): PREFFP                                                                  
40929                                                                           
old(6): PRMCHR                                                                  
old(7): PRSUR$                                                                  
0                                                                               
old(8): PRWTGL                                                                  
N                                                                               
old(9): PRCERT                                                                  
N                                                                               
old(10): PRRET                                                                  
56                                                                              
old(11): PRDATE                                                                 
20120627                                                                        
old(12): PRTIME                                                                 
100331                                                                          
old(13): PREDSCP                                                                
.4                                                                              
old(14): PREFSHP                                                                
N                                                                               
new(1): PRCNCP                                                                  
7                                                                               
new(2): PRSKU                                                                   
2514297                                                                         
new(3): PRPRCE                                                                  
56                                                                              
new(4): PRPCAT                                                                  
83                                                                              
new(5): PREFFP                                                                  
40929                                                                           
new(6): PRMCHR                                                                  
new(7): PRSUR$                                                                  
0                                                                               
new(8): PRWTGL                                                                  
N                                                                               
new(9): PRCERT                                                                  
N                                                                               
new(10): PRRET                                                                  
56                                                                              
new(11): PRDATE                                                                 
20120627                                                                        
new(12): PRTIME                                                                 
120334                                                                          
new(13): PREDSCP                                                                
.4                                                                              
new(14): PREFSHP                                                                
N


Can we have output like this:

Code:
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE                                                                                                                                        
new(1): PRCNCP  :  7                                                                               
new(2): PRSKU   :  2514297                                                                         
new(3): PRPRCE  :  56                                                                              
new(4): PRPCAT  :  83                                                                              
new(5): PREFFP  :  40929                                                                           
new(6): PRMCHR  : 
new(7): PRSUR$  :  0                                                                               
new(8): PRWTGL  : N                                                                               
new(9): PRCERT  : N                                                                               
new(10): PRRET  :  56                                                                              
new(11): PRDATE :  20120627                                                                        
new(12): PRTIME :  120334                                                                          
new(13): PREDSCP: .4                                                                              
new(14): PREFSHP: N

---------- Post updated at 03:25 PM ---------- Previous update was at 03:24 PM ----------

This is what I got using your great script:

Code:
sed -e 'H;$!d;x;s/\n\([0-9]\{1,\}\)/\@\1/g;s/\@/ /g'  aa.txt | sed -e '/old/'d -e '1d' 
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE                                                            
N                                                                               
N                                                                               
.4                                                                              
N                                                                               
new(1): PRCNCP                                                                   7                                                                               
new(2): PRSKU                                                                    2514297                                                                         
new(3): PRPRCE                                                                   56                                                                              
new(4): PRPCAT                                                                   83                                                                              
new(5): PREFFP                                                                   40929                                                                           
new(6): PRMCHR                                                                  
new(7): PRSUR$                                                                   0                                                                               
new(8): PRWTGL                                                                  
N                                                                               
new(9): PRCERT                                                                  
N                                                                               
new(10): PRRET                                                                   56                                                                              
new(11): PRDATE                                                                  20120627                                                                        
new(12): PRTIME                                                                  120334                                                                          
new(13): PREDSCP                                                                
.4                                                                              
new(14): PREFSHP                                                                
N


# 6  
Old 07-02-2012
How about this:

Code:
awk '/^(new|old)/{o++}!o{print; next}j&&!/:/{j=j" "$0;next}j{print j;j=x}/^new/{sub(/ *$/," : ");j=$0}END{print j}' infile
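
The command above joins each name and value with ` : ` but does not line the colons up as in the requested output. If that alignment matters, one option is a small filter applied to its output. This is a sketch under assumptions: the function name `align_colons` and the fixed column width of 16 are made up for illustration, and it expects lines already joined in the `name : value` form.

```shell
# align_colons: pad the name part of each joined new(N) line to a fixed width.
align_colons() {
  awk -v width=16 '
    BEGIN { fmt = "%-" width "s:%s\n" }      # left-justify the name in "width" columns
    { sub(/[ \t]+$/, "") }                   # drop trailing padding on every line
    /^new\(/ && match($0, / +: */) {
      name  = substr($0, 1, RSTART - 1)      # text before the " : " separator
      value = substr($0, RSTART + RLENGTH)   # text after it, possibly empty
      printf fmt, name, (value == "" ? "" : " " value)
      next
    }
    { print }                                # header lines pass through
  ' "$@"
}
```

It would sit at the end of the pipeline, for example `awk '...' infile | align_colons`.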

# 7  
Old 07-09-2012
Thank you, sir.

Unfortunately, there is still a little flaw. As the sample file shows, the object can differ from record to record, so I would like to keep the object and command type for every record:

Code:
type name: SYS.LCR$_ROW_RECORD
source database: WEBSUN1.WORLD
owner: WS_APP_OWNER
object: DZPRCP
is tag null: Y
command_type: UPDATE
old(1): PRCNCP
8
old(2): PRSKU
6480966
old(3): PRPRCE
89.99
old(4): PRPCAT
18
old(5): PREFFP
37814
old(6): PRMCHR
old(7): PRSUR$
0
old(8): PRWTGL
N
old(9): PRCERT
N
old(10): PRRET
0
old(11): PRDATE
20120627
old(12): PRTIME
104015
old(13): PREDSCP
.4
old(14): PREFSHP
N
new(1): PRCNCP
8
new(2): PRSKU
6480966
new(3): PRPRCE
89.99
new(4): PRPCAT
18
new(5): PREFFP
37814
new(6): PRMCHR
new(7): PRSUR$
0
new(8): PRWTGL
N
new(9): PRCERT
N
new(10): PRRET
0
new(11): PRDATE
20120627
new(12): PRTIME
120105
new(13): PREDSCP
.4
new(14): PREFSHP
N
type name: SYS.LCR$_ROW_RECORD
source database: WEBSUN1.WORLD
"txt.log" 252 lines, 19655 characters
new(5): PREFFP
40922
new(6): PRMCHR
new(7): PRSUR$
0
new(8): PRWTGL
N
new(9): PRCERT
N
new(10): PRRET
46
new(11): PRDATE
20120627
new(12): PRTIME
120348
new(13): PREDSCP
.4
new(14): PREFSHP
N
typename:SYS.LCR$_ROW_RECORD
sourcedatabase:WEBSUN1.WORLD
owner:WS_APP_OWNER
object:MOPITCT
istagnull:Y
command_type:UPDATE
new(1):ICSKU1741347
new(2):ICCAT52
new(3):ICPRCE396
new(4):ICEXPR72313
new(5):ICEFFP22112

Your output:

Code:
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE                                                            
new(1): PRCNCP :  8                                                                               
new(2): PRSKU :  6480966                                                                         
new(3): PRPRCE :  89.99                                                                           
new(4): PRPCAT :  18                                                                              
new(5): PREFFP :  37814                                                                           
new(6): PRMCHR : 
new(7): PRSUR$ :  0                                                                               
new(8): PRWTGL :  N                                                                               
new(9): PRCERT :  N                                                                               
new(10): PRRET :  0                                                                               
new(11): PRDATE :  20120627                                                                        
new(12): PRTIME :  120105                                                                          
new(13): PREDSCP :  .4                                                                              
new(14): PREFSHP :  N       
...
...
new(1):ICSKU1741347 : 
new(2):ICCAT52 : 
new(3):ICPRCE396 : 
new(4):ICEXPR72313 : 
new(5):ICEFFP22112 :

It should be:
Code:
type name: SYS.LCR$_ROW_RECORD                                                  
source database: WEBSUN1.WORLD                                                  
owner: WS_APP_OWNER                                                             
object: DZPRCP                                                                  
is tag null: Y                                                                  
command_type: UPDATE                                                            
new(1): PRCNCP :  8                                                                               
new(2): PRSKU :  6480966                                                                         
new(3): PRPRCE :  89.99                                                                           
new(4): PRPCAT :  18                                                                              
new(5): PREFFP :  37814                                                                           
new(6): PRMCHR : 
new(7): PRSUR$ :  0                                                                               
new(8): PRWTGL :  N                                                                               
new(9): PRCERT :  N                                                                               
new(10): PRRET :  0                                                                               
new(11): PRDATE :  20120627                                                                        
new(12): PRTIME :  120105
...

...

typename:SYS.LCR$_ROW_RECORD
sourcedatabase:WEBSUN1.WORLD
owner:WS_APP_OWNER
object:MOPITCT
istagnull:Y
command_type:UPDATE
new(1):ICSKU1741347 : 
new(2):ICCAT52 : 
new(3):ICPRCE396 : 
new(4):ICEXPR72313 : 
new(5):ICEFFP22112 :
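
One possible way to get there, offered as an untested-on-the-real-log sketch rather than a definitive answer: treat every line that has a colon and does not start with `old(` or `new(` as a record header and always print it, instead of only printing the lines before the first entry. The names `clean_records`, `entry` and `emit` are made up for illustration. It assumes that values never contain a colon, and it writes a single space after ` :`, so the spacing differs slightly from the output shown above.

```shell
# clean_records: keep every header line, drop old(N) entries and their values,
# and join each new(N) entry with its value.
clean_records() {
  awk '
    # Print the open new(N) entry, if any, and close it.
    function emit() { if (entry != "") print entry; entry = "" }

    { sub(/[ \t]+$/, "") }                  # drop trailing padding

    /^new\(/ { emit(); entry = $0 " :"; next }   # start a new(N) entry
    /^old\(/ { emit(); next }                    # old(N): nothing stays open
    /:/      { emit(); print; next }             # any other "key: value" header

    # A value line: kept only while a new(N) entry is open, so values that
    # follow an old(N) line fall through and are dropped.
    entry != "" && NF { entry = entry " " $0 }

    END { emit() }
  ' "$@"
}
```

Usage would be `clean_records txt.log`. Lines without a colon that follow a header, such as a stray editor status line, are dropped as well.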

