Shell script Help - Data cleansing: Post 302970539 by RudiC on Thursday 7th of April 2016 05:32:20 PM
Well, one pass only, and slightly prettier:
Code:
awk '
# The two fixed leading output columns; more names are added from the #col lines.
BEGIN           {HD[++HDCNT]  = "InsertTime"
                 HD[++HDCNT]  = "DocID"
                }

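# #col line: split on single quotes (\047) so the even-numbered fields are the
# quoted column names; add unseen names to HD, then keep the line in RW.
/^#col/         {FS = "\047"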
                 $0 = $0
                 for (i=2; i<NF; i+=2)  {gsub (" ", "_", $i)
                                         if (!($i in X))        {X[$i]
                                                                 HD[++HDCNT] = $i
                                                                }
                                        }
                 FS = " "
                 $0 = $0
                 RW[++RCNT] = $0
                }

# Skip comment lines, blank lines, and lines with DocID right after the first character.
/^#/       ||
/^ *$/     ||
/^.DocID/       {next
                }

# Line starting with Inse: strip every "label:" prefix and remember the first two values.
/^Inse/         {gsub (/(^| )[^ :]*:/, " ")
                 INS = $1
                 DOC = $2
                 next
                }

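# Any other line is data: drop single quotes and store it under the current #col
# group, prefixed with the remembered INS and DOC values.
                {gsub (/\047/, "")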
                 LN[RCNT,++LCNT[RCNT]] = INS FS DOC FS $0
                }


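# Print the header row, then each stored line with its values moved to the
# columns whose header names match the names on its #col line.
END             {for (i=1; i<HDCNT; i++) printf "%s|", HD[i]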
                 printf "%s%s", HD[HDCNT], ORS
                 OFS = "|"
                 for (r=1; r<=RCNT; r++)
                        {delete COL 
                         COL[1] = 1
                         COL[2] = 2
                         n = split (RW[r], T)
                         for (i=2; i<=n; i++) for (j=3; j<=HDCNT; j++) if (T[i] == HD[j]) COL[i+1] = j
                         for (l=1; l<=LCNT[r]; l++)
                                {n = split (LN[r,l], T)
                                 $0 = ""
                                 for (i=1; i<=n; i++) $(COL[i]) = T[i]
                                 # touch the last field so short rows are padded to HDCNT columns
                                 $HDCNT = $HDCNT
                                 print
                                }
                        }
                }
'  file
InsertTime|DocID|TargetDoc|GRank|LRank|Priority|Loc_ID|Rank|Check_Name
201604070523|101|aaaaa|1|1|Slow|8gkahinka.01||
201604070523|101|aaaaa|1|0|Slow|7nlafnjbaflnbja.01||
201604070523|102|aa||||8gkahinka.01|1|xyz
201604070523|102|aax||||7nlafnjbaflnbja.01|1|none
201604070750|101|xxxx|1|1|Slow|bjkkacka.01||
201604070750|101|yyyy|1|0|Slow|jiafjklas.001||
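
In case the `$HDCNT = $HDCNT` line in the END section looks odd: assigning to a field beyond NF makes awk create the missing fields as empty strings and rebuild the record with OFS, which is how the short rows get their trailing empty columns. A minimal sketch of just that idiom, assuming blank-separated input (`pad_to` is a made-up helper name for illustration, not part of the script above):

```shell
# pad_to N: hypothetical helper; reads blank-separated lines and prints them
# pipe-delimited, padded with empty fields up to N columns.
pad_to() {
    awk -v n="$1" '
    BEGIN   {OFS = "|"}
            {$n = $n        # assigning to field n creates any missing fields and rebuilds $0 with OFS
             print
            }
    '
}

printf 'a b c\n' | pad_to 5
```

With three input fields and n set to 5, fields 4 and 5 come out empty, so the line ends in two pipe characters.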

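
The other trick worth isolating is the `FS = "\047"` in the `#col` section: with the single quote as field separator, every even-numbered field is the text between a pair of quotes. A small sketch under the assumption that the names sit in single quotes on the line (the sample line and the function name `quoted_names` are invented for illustration):

```shell
# quoted_names: hypothetical helper; prints each single-quoted string found
# on an input line, one per output line.
quoted_names() {
    awk '
    BEGIN   {FS = "\047"}                           # \047 is the octal code of the single quote
            {for (i=2; i<NF; i+=2) print $i         # even-numbered fields lie between quote pairs
            }
    '
}

printf "#col: 'Loc ID', 'Rank'\n" | quoted_names
```

Using the octal escape avoids having to embed a literal single quote inside the single-quoted awk program.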

Last edited by RudiC; 04-07-2016 at 06:52 PM..
