Shell script change new format on the file.


 
# 1  
Old 06-08-2017

Hi, is there a way to write a small shell script or Perl script that opens the file "abc.txt" and creates a new file "new_abc.txt" in the output format shown below? Thanks

cat abc.txt
Code:
 
 ###########################Readme###############################
Contained with this README.TXT file are all of the 
file specs for your
directory abt.
###########################Readme###############################
 
Filename : SW_PP_CTRL_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 2019064
Unzipped Bytes : 11413730
Records : 95788
Record Length : 157
 
Filename : SW_PP_DATA_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 691778058
Unzipped Bytes : 8316153069
Records : 60400481
Record Length : 158
 
Filename : SW_PP_DEMO_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 26240709
Unzipped Bytes : 77053000
Records : 543250
Record Length : 227
 
Filename : SW_PP_PLANXREF_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 557904
Unzipped Bytes : 3061930
Records : 16262
Record Length : 310
 
Filename : SW_PP_PRODUCT_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 21375
Unzipped Bytes : 229431
Records : 1264
Record Length : 211
 
Filename : SW_PP_REASSIGN_20170505.txt.gz
Data Format : ASCII with carriage returns and linefeeds
Compression : GZIP
GZIP Bytes : 32681
Unzipped Bytes : 69399
Records : 802
Record Length : 130

Output: cat new_abc.txt
Code:
FILE_NAME,Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length
SW_PP_CTRL_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|2019064|11413730|95788|157
SW_PP_DATA_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|691778058|8316153069|60400481|158
SW_PP_DEMO_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|26240709|77053000|543250|227
SW_PP_PLANXREF_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|557904|3061930|16262|310
SW_PP_PRODUCT_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|21375|229431|1264|211
SW_PP_REASSIGN_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|32681|69399|802|130

---------- Post updated at 08:39 PM ---------- Previous update was at 05:33 PM ----------

I have this Perl code, but when I run it I see an extra "|" at the end of each line. I'm not sure how to remove it, or whether a shell script would be easier. Can someone help? Thanks
Code:
 
#!/usr/bin/perl
my $filename = 'abc.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)  or die "Could not open file '$filename' $!";
print "FILE_NAME,Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length";
while (my $row = <$fh>) {
  chomp $row;
  my ($label, $value) = split /: /, $row;
  if ($row eq '') {
    print "\n";
  }
  else
  {
     print "$value|";
  }
}

/test1> ./test.pl
Code:
FILE_NAME,Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length|||||
SW_PP_CTRL_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|2019064|11413730|95788|157|
SW_PP_DATA_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|691778058|8316153069|60400481|158|
SW_PP_DEMO_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|26240709|77053000|543250|227|
SW_PP_PLANXREF_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|557904|3061930|16262|310|
SW_PP_PRODUCT_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|21375|229431|1264|211|
SW_PP_REASSIGN_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|32681|69399|802|130|


Last edited by dotran; 06-08-2017 at 07:48 PM..
# 2  
Old 06-08-2017
Except for the strange field 1 header in your output, the following awk script seems to produce the output you requested:
Code:
awk -F ' : ' '
NF == 2 {
	if(h < 7)
		printf("%s%s", $1, (++h == 7) ? ORS : OFS)
	o = f++ ? (o OFS $2) : $2
	if(f == 7) {
		print o
		o = ""
		f = 0
	}
}' OFS='|' file

which prints:
Code:
Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length
SW_PP_CTRL_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|2019064|11413730|95788|157
SW_PP_DATA_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|691778058|8316153069|60400481|158
SW_PP_DEMO_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|26240709|77053000|543250|227
SW_PP_PLANXREF_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|557904|3061930|16262|310
SW_PP_PRODUCT_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|21375|229431|1264|211
SW_PP_REASSIGN_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|32681|69399|802|130

This could have been done much more simply with:
Code:
awk -F ' : |\n' '
NF == 14 {
	if(!h) {h = 1
		print $1, $3, $5, $7, $9, $11, $13
	}
	print $2, $4, $6, $8, $10, $12, $14
}' OFS='|' RS='' file2

if the blank lines in your input file had been empty lines instead (i.e., no <space> characters on the blank lines between records).

If you want to try either of these on a Solaris/SunOS system, change awk to nawk or /usr/xpg4/bin/awk.
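
For input whose separator lines contain stray spaces, one possible workaround is to blank those lines out first so that the paragraph-mode script can be used unchanged. This is only a sketch: the function name readme_to_psv is made up for illustration, and it assumes every record has exactly seven "label : value" lines with nothing padded around the " : " separator.

```shell
# Hypothetical helper (name chosen for illustration only).
# Step 1: sed turns whitespace-only lines into truly empty lines.
# Step 2: awk reads blank-line-separated records (RS="") and splits
#         each record on " : " or newline, giving 14 fields per record.
readme_to_psv() {
    sed 's/^[[:space:]]*$//' "$1" |
    awk -F ' : |\n' -v OFS='|' -v RS='' '
    NF == 14 {
        if (!h) {                  # print the labels once, as the header
            h = 1
            print $1, $3, $5, $7, $9, $11, $13
        }
        print $2, $4, $6, $8, $10, $12, $14
    }'
}
```

Usage would be something like readme_to_psv abc.txt > new_abc.txt. The README block at the top is skipped because it does not split into 14 fields.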
# 3  
Old 06-08-2017
Thank you for your help, Mr. Don. I ran the code, but the header line in the output should have no spaces and needs an extra 'FILE_NAME,' prefix, like below. Could you please help out again?

Code:
FILE_NAME,Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length

/test :./abc1

Code:
 
 Filename        | Data Format    | Compression    | GZIP Bytes     | Unzipped Bytes | Records        | Record Length  
SW_PP_CTRL_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|2019064|11413730|95788|157
SW_PP_DATA_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|691778058|8316153069|60400481|158
SW_PP_DEMO_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|26240709|77053000|543250|227
SW_PP_PLANXREF_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|557904|3061930|16262|310
SW_PP_PRODUCT_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|21375|229431|1264|211
SW_PP_REASSIGN_20170505.txt.gz|ASCII with carriage returns and linefeeds|GZIP|32681|69399|802|130

# 4  
Old 06-09-2017
If you copied the code I gave you and you executed that code as given, the output you showed us is not the output that would have been produced unless your sample input file format is significantly different from the input file you used when you ran my script.

With a very simple script like the one I suggested, it should be easy for you to modify it to print a constant header line that doesn't try to use the field headings found in the data being read. Why don't you try modifying the code I suggested and let us know where you run into problems if you can't make it work?
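
One way such a modification might look is sketched here. It is not the script the original poster ended up with (that was never shown). The function name readme_with_fixed_header is invented for illustration, the header text is copied from the requested output, and the padding around the colon is an assumption based on the aligned input sample posted in this thread.

```shell
# Hypothetical variant of the first awk script (name is illustrative).
# The FS regex absorbs any blanks or tabs around the colon, so padded
# input such as "Filename         : x.gz" still yields two fields.
readme_with_fixed_header() {
    awk -F '[ \t]*:[ \t]+' -v OFS='|' '
    BEGIN {
        # Constant header, copied from the requested output.
        print "FILE_NAME,Filename|Data Format|Compression|GZIP Bytes|Unzipped Bytes|Records|Record Length"
    }
    NF == 2 {
        v = $2
        sub(/[ \t\r]+$/, "", v)       # drop trailing blanks or a stray CR
        o = f++ ? (o OFS v) : v       # join values with OFS, no trailing bar
        if (f == 7) {                 # seven values complete one record
            print o
            o = ""
            f = 0
        }
    }' "$1"
}
```

Like the original script, this relies on no line outside the records containing a colon; a README block with a colon in it would need an extra test.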
# 5  
Old 06-09-2017
Thank you, Mr. Don. Yes, the input had spaces. Anyway, I got it to work. Thank you for your input.

Code:
 
 Filename         : SW_PP_PRODUCT_20170505.txt.gz
  Data Format     : ASCII with carriage returns and linefeeds
  Compression     : GZIP
  GZIP Bytes      : 21375
  Unzipped Bytes  : 229431
  Records         : 1264
  Record Length   : 211

# 6  
Old 06-09-2017
You're welcome. Would you like to show us what you did so others reading your thread can learn from your experience?
# 7  
Old 06-09-2017
(1) Your input file has a "readme" block at the start. You process it as if it were a normal line, which results in the vertical bars added to your header line.

(2) Instead of
Code:
if ($row eq '')

, I would reverse the test and ask whether the line contains a colon, and only then split it:

Code:
if ($row =~ /:/) {
   # .... split
} else {
   # .... process other lines
}

Actually, you can even get rid of the split by capturing the value in the match itself:

Code:
if ($row =~ /:(.*)/) {
  # ..... The part after the colon (including any leading space) is now stored in $1
} else {
  # .... process non-colon lines
}



(3) Note that, if you find a value, you always print it as "$value|". This means that every line has a vertical bar at the end.

(4) If your input has a sequence of more than one line without a colon, you would also produce the same number of empty lines in the output.
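
The four points above can be sketched together as follows. The sketch is written in awk, to match the scripts earlier in this thread, rather than in Perl; a Perl version would follow the same shape. The function names colon_values_to_psv and emit are made up for illustration, and the sketch prints no header line.

```shell
# Hypothetical helper (name is illustrative). Only lines containing a
# colon contribute a value; any other line ends the current record.
colon_values_to_psv() {
    awk '
    # Print the collected values, if any, then reset the buffer.
    # Nothing is printed for runs of non-colon lines, so no empty
    # output lines appear (point 4) and the README block is ignored
    # as long as it contains no colon (point 1).
    function emit() { if (n) print line; line = ""; n = 0 }
    {
        pos = index($0, ":")
        if (pos) {                                # point 2: colon test
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", val)    # trim the value
            line = (n++ ? line "|" : "") val      # point 3: bar only between values
        } else {
            emit()
        }
    }
    END { emit() }' "$1"
}
```

Putting the separator in front of every value except the first is what avoids the trailing "|" without having to strip it afterwards.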

Last edited by rovf; 06-09-2017 at 01:39 AM.. Reason: Improving solution