Hello community, I am getting a log files from system and I need to clean the data and store as txt files for reporting purposes. Since these files are generated in Unix box, so we have to write shell script to handle the data cleansing.
Please find the sample file data looks like:
This was a raw data and I need to write a shell script to cleanse the data.
1. row started with # is like comment and we need to ignore that other than #coulmns
2. #columns are give the columns names and #rows give the actual data.
3. unwanted data highlighted with red color and useful data highlighted as black color
4. The header for out put file is always all the #headers in the data along with InsertTime and DocID
5. assign the values as per header and add InsertTime & DocID values too.
6. data delimiter is | in the out put file.
Please find the desired out put:
Last edited by RudiC; 04-07-2016 at 11:11 AM..
Reason: Added code tags
Phew, what an ordeal! Far from elegant, utterly clumsy, but it does what is requested:
We're running through the file twice, and I had to replace the spaces in the column headers by underscore char to avoid additional field splits. Anybody out there to try to pimp it up?
Hi pdathu,
Someone who has been posting questions to this forum for well over 4 years is not "a newbie who has never had a chance to work on the advanced commands".
You may have chosen to use this site as your unpaid programming staff instead of using it as a way to learn how to use basic UNIX, Linux, and BSD system utilities; but that doesn't qualify you to be someone who has never had a chance to work on the advanced commands.
In future threads, we will be very disappointed if you still claim to be a newbie and refuse to make any attempt to show us that you have invested any effort to solve problems on your own.
Hi RudiC,
I won't claim that this is any prettier than your suggestions (and it does read the input file twice), but it is a slightly different approach and doesn't need to convert spaces in field names to underscores. But, it does depend on the fact that there are no spaces in any of the fixed header fields names or data (i.e., those that are not defined by a "#columns" comment line) since it uses spaces and colons as field separators on the:
lines.
And, since it defines SQ outside of the awk script using "'" instead of using "\047", it will also work on mainframe systems using EBCDIC as well as on systems using codesets that are supersets of ASCII.
If anyone wants to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
With the given sample input data, the above script produces the output:
as requested.
This User Gave Thanks to Don Cragun For This Post:
Hi,
I have below data in my flat file.I would like to remove the quotes and comma necessary from the data.Below is the details I would like to have in my output.
Could anybody help me providing the Unix shell script for this.
Input :
ABC,ABC,10/15/2012,"47,936,164.567 ","1,036,997.453... (2 Replies)
Hi All,
I have a .csv file pipe delimter.., I am using excel data import option for importing the data from a pipe delimter file to xls...I want to make this happen using shell script.
Please let me know how can I do this using shell script.
Regards,
Deepti (2 Replies)
I have a file with 27 fields seperated by pipe. I have a field 17 that is defined as numeric and the data coming in might contain character and
other miscellaneous data like (@,!,~,#,%,^,&,*,(,)).
I have to make sure that the column strictly contains numeric data and if it contains any of the... (2 Replies)
Can anyone help me with a shell script that can do the following:
I have a data in fasta format (first line is the header, followed by a sequence of characters).
>ALLLY
GGCCCCTCGAGCCTCGAACCGGAACCTCCAAATCCGAGACGCTCTGCTTATGAGGACCTC
GAAATATGCCGGCCAGTGAAAAAATCTTGTGGCTTTGAGGGCTTTTGGTTGGCCAGGGGC... (5 Replies)
Hi Experts,
Our requirement is to cleanse a specific formatted file in unix.
For example :
File pattern is :
Job name..........................................
\\\\Jobs\Amey
ABC
PQRS
ABCD
XYZ
Job name..........................................
WEQ
RED
AAA
Desired Result: (2 Replies)
Hi,
I need to get the details (File System status & Memory status) of a remote server. I am executing a shell script in ksh and preparing the report.
Pls help.
Regards,
armohans. (1 Reply)
Sorry about the noobish question but...
How do I capture data thats piped to my script?
For instance,
ls -al | myscript.sh
How do I access the output from ls -al in myscript.sh? (3 Replies)