The UNIX and Linux Forums  

#1  01-23-2007  rahulrathod

Generate csv file

I have a file with several thousand records in the following format:

File: input.txt ->


<option value="14333">VISWANADH VELAMURI</option>

<option value="17020">VISWANADHA RAMA KRISHNA</option>


I want to generate a csv file from the above file as follows

File: output.txt ->

14333,VISWANADH VELAMURI
17020,VISWANADHA RAMA KRISHNA


The HTML option tags need to be removed, along with the unwanted tabs and the empty lines in between. I have tried cut and awk, but I cannot find the right combination. Can you please help me out? I want to upload this data into a database table.

Thanks.
#2  01-23-2007  ghostdog74 (Forum Advisor)

If you have Python, here's an alternative:

Code:
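import re
# Blank lines have no match, so skip them instead of indexing an empty list
for line in open("inputfile"):
    m = re.search(r'<option value="([^"]*)">(.*?)</option>', line)
    if m:
        print(','.join(m.groups()))

From the command line: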

Code:
#/home: python script.py > output.csv
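
Since the data is headed for a database load, it may be worth letting Python's csv module do the quoting, so that a name containing a comma does not turn into an extra column. The following is only a sketch under assumptions: the helper name options_to_csv is made up for illustration, and the pattern assumes one option element per line, as in the sample input.

```python
import csv
import io
import re

# Assumes one <option> element per line, as in the sample from post #1.
OPTION_RE = re.compile(r'<option value="([^"]*)">(.*?)</option>')

def options_to_csv(lines, out):
    """Write one CSV row (value, name) for each line that holds an <option> tag."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        match = OPTION_RE.search(line)
        if match:  # blank lines and tab-only lines do not match and are skipped
            value, name = match.groups()
            writer.writerow([value, name.strip()])

sample = [
    '\t<option value="14333">VISWANADH VELAMURI</option>\n',
    '\n',
    '<option value="17020">VISWANADHA RAMA KRISHNA</option>\n',
]
buf = io.StringIO()
options_to_csv(sample, buf)
print(buf.getvalue(), end="")
```

To process a real file, the same function could be called with open("input.txt") as lines and an open output file (opened with newline="") as out.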

#3  01-23-2007  radoulov (Forum Staff)

With GNU awk/nawk:


Code:
awk '$0=$0{printf "%s,%s\n",$2,$3}' \
FS="<option value=\"|\">|</option>" infile
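
To see why the awk one-liner prints $2 and $3, it can help to split a sample line on the same three markers. The sketch below does that with Python's re.split; it is an illustration of the field numbering only, not a replacement for the awk command, and the variable names are made up.

```python
import re

# Same alternation as the awk FS: any of the three markers ends a field.
FS = r'<option value="|">|</option>'

line = '<option value="14333">VISWANADH VELAMURI</option>'
fields = re.split(FS, line)

# The line starts and ends with a separator, so the first and last fields are empty.
# awk numbers fields from 1, so awk's $2 and $3 are fields[1] and fields[2] here.
print(fields)
print(f"{fields[1]},{fields[2]}")
```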

#4  01-23-2007  matrixmadhan (Forum Advisor)

Code:
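#! /opt/third-party/bin/perl
use strict;
use warnings;

open(my $fh, '<', $ARGV[0]) or die "Unable to open $ARGV[0] <$!>\n";

my (@split_fields, @second_split, @further);

while (my $line = <$fh>) {
  chomp $line;
  next unless $line =~ /<option/;            # skip blank and non-option lines
  @split_fields = split(/"/, $line);         # element 1 is the value attribute
  @second_split = split(/>/, $line);         # element 1 is "NAME</option"
  @further = split(/</, $second_split[1]);   # element 0 is the name
  print "$split_fields[1],$further[0]\n";
}

close($fh);

exit 0;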

#5  01-23-2007  matrixmadhan (Forum Advisor)

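One more, with sed (-n together with the p flag prints only the lines where the substitution matched, so the blank lines are dropped):

Code:
sed -n 's/\(.*\)"\(.*\)"\(.*\)>\(.*\)<\(.*\)/\2,\4/p' filename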
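
A substitution on its own leaves non-matching lines untouched, which matters here because the input has blank lines between the records. A rough Python rendering of the same pattern shows both cases; the function name sed_like is made up for illustration, and Python's regex engine is only assumed to behave like sed's for this particular input.

```python
import re

# The sed pattern in Python syntax: groups are written without backslashes.
PATTERN = re.compile(r'(.*)"(.*)"(.*)>(.*)<(.*)')

def sed_like(line):
    """Replace a matching line with 'group 2,group 4'; return other lines unchanged."""
    return PATTERN.sub(r'\2,\4', line, count=1)

# A record line with a leading tab: the first group swallows the tab.
print(repr(sed_like('\t<option value="14333">VISWANADH VELAMURI</option>')))
# A blank line does not match and comes back as is.
print(repr(sed_like('')))
```

The second call returns the empty line unchanged, so blank lines still have to be filtered out separately.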
