TEXT to CSV using Perl


 
# 1  
Old 10-11-2012

Hi Folks

I need some help with this, and my Perl isn't the hottest. I also have Text::CSV installed on my Perl install.
The large text file, with a few million entries, is in the format below.

example text file

Fig Leafs
Cake No: 0000001
Author: King s.
Record No: 995-34343-232-232
Comments:SSDFGDFGDFFGDFGDFGDFGDFGDFGDFGFGDFGDFGDFGFDFDASSFDSFSDFSDFDSFSD
SDFSDFSDFSDFSDFDFDFDFDDFDSFDFDFDFDFSD SDFGSDFFGE RFGSDFGFDG DFGDFGFDG
Food Source: MY mUMS house
Ingredients: dsfsdfsd dsfsdfsdf sdfsdfsdf dfsdf dsfsdf sdfsdfdsf sdfsdfsd sdf sd
sdfsdsdfsd sdfsdfsdf sdsfsdfsdf sdfsdfdf sdfsdfd sdf dsdfsd fr.
Another No: 999999
Location: The kitchen

Insane batman

Cake No: 0000002
"Author: nannao y.y.,carloff t.t.,mello.i."
Record No: 123-83412-111-666
Comment: sdfjhfgfgkmfbf djkndfdjfnd dfjkndjf dfjndfdfdf dfjdfn DFGDFGDFGDFGFGDFGDFGDFGFDFDASSFDSFSDFSDFDSFSD
SDFSDFSDFSDFSDFDFDFDFDDFDSFDFDFDFDFSD SDFGSDFFGE RFGSDFGFDG DFGDFGFDG sdsdfsffsd sdsdsdsd xxxxx
Food Source: MY Dads house over there
Ingredients: dsfsdfsd dsfsdfsdf sdfsdfsdf dfsdf dsfsdf sdfsdfdsf sdfsdfsd sdf sd
sdfsdsdfsd sdfsdfsdf (sdsfsdfsdf) sdfsdfdf sdfsdfd sdf dsdfsd fr.
Another No: 002234
Location: The Hallway

From the file above I want output like:
Cake No,Author,Record No,Comments,Food Source,Ingredients,Another No,Location


Notice the formatting isn't that good: there's a " in front of Author on the second record and also at the end of that line.
Words like Fig Leafs and Insane batman I want ignored and not picked up.
The other thing I'm worried about is the processing on this: the text file is about 2 MB, around a million entries.
Thanks in advance for any help on this! I will repay you!
Cheers

---------- Post updated at 12:02 PM ---------- Previous update was at 10:15 AM ----------

Could this be achieved with a Kshell script?
# 2  
Old 10-11-2012
CSV has just a few rules:
  1. Separate columns with commas.
  2. Separate rows with ^M^J (carriage-return + line-feed)
  3. For any " in a column, expand to two: ""
  4. For any column containing a comma, carriage-return or line-feed, enclose in ". (Other columns can be " enclosed if desired.)
So, CSV columns can contain any character! Of course, if you are going to have UTF-8 multibyte characters, you need a tool that knows and respects that!
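
As a rough illustration of rules 3 and 4, here is a minimal sketch in sh and awk. The helper name csv_field is made up for this example, and it quotes only the fields that need it; it is not taken from any library.

```shell
#!/bin/sh
# csv_field: hypothetical helper name; prints one value as a CSV field.
csv_field() {
  printf '%s' "$1" | awk '
    { buf = (NR == 1 ? "" : buf "\n") $0 }   # re-join input that spans lines
    END {
      q = (buf ~ /[",\r\n]/)                  # rule 4: does it need enclosing?
      gsub(/"/, "\"\"", buf)                  # rule 3: double every quote
      printf "%s", (q ? "\"" buf "\"" : buf)
    }'
}

csv_field 'plain'; echo
csv_field 'King s., jr.'; echo
csv_field 'say "hi"'; echo
```

The three calls should print plain, then "King s., jr.", then "say ""hi""". Enclosing every field unconditionally is also valid CSV and is simpler if the receiving tool does not mind.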

You need to parse your file and store each column value, detecting and compensating for embedded metacharacters (',', '"', carriage-return and line-feed). Detect row boundaries at each new cake number or at end of file, then spit out your columns in the desired order with commas between them and the row separator after.
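
The parsing steps just described could be sketched roughly as follows, in sh and awk. The wrapper name to_csv is hypothetical, and the sketch assumes two things the thread does not state: that a blank line ends a multi-line value, and that any line without a known label and not continuing a value (such as "Fig Leafs") is a title to skip.

```shell
#!/bin/sh
# to_csv: hypothetical wrapper name. Reads the record text on stdin and
# writes CSV on stdout. Assumes a blank line ends a multi-line value.
to_csv() {
  awk '
    function q(s) {                  # enclose a field, doubling any quotes
      gsub(/"/, "\"\"", s)
      return "\"" s "\""
    }
    function emit(   i, row) {       # print the buffered record, then reset
      if (!("Cake No" in rec)) return
      row = q(rec[col[1]])
      for (i = 2; i <= n; i++) row = row "," q(rec[col[i]])
      printf "%s%s", row, eol
      split("", rec)                 # portable way to empty an array
    }
    BEGIN {
      eol = "\r\n"                   # CR+LF row separator
      hdr = "Cake No,Author,Record No,Comments,Food Source,Ingredients,Another No,Location"
      n = split(hdr, col, ",")
      for (i = 1; i <= n; i++) known[col[i]] = 1
      printf "%s%s", hdr, eol
    }
    {
      sub(/^[ "]+/, ""); sub(/[ "]+$/, "")   # strip stray blanks and quotes
      sub(/^Comment *:/, "Comments:")        # fold the label variant
    }
    /^$/ { key = ""; next }          # blank line: stop appending
    {
      p = index($0, ":")
      label = (p ? substr($0, 1, p - 1) : "")
      sub(/ +$/, "", label)
      if (label in known) {          # a labelled line starts a new field
        if (label == "Cake No") { emit() }
        key = label
        val = substr($0, p + 1)
        sub(/^ +/, "", val)
        rec[key] = val
      } else if (key != "") {        # continuation of the previous field
        rec[key] = rec[key] " " $0
      }                              # otherwise a title line: ignored
    }
    END { emit() }
  '
}

# Small demonstration on a cut-down record.
to_csv <<'EOF'
Fig Leafs
Cake No: 0000001
Author: King s.
Comments:AAA
BBB
Location: The kitchen
EOF
```

For a real file it would be run as to_csv < infile > outfile.csv. Note that the stripping step also removes a quote that legitimately ends a value, and that setting eol to "\n" gives Unix line endings if the receiving program prefers them.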

You can write it in any language, but I would prefer C or Perl. I am not sure which of the Perl functions encodes data into CSV, as opposed to decoding from it. Like many tools, you can play with it and see what results. Some options for data typing, white space interpretation and binary are only relevant to some receiving systems.

Last edited by DGPickett; 10-11-2012 at 02:20 PM..
# 3  
Old 10-11-2012
Hi.

Here is a sample of the basic operation for combining with Text::CSV:
Code:
#!/usr/bin/env perl

# @(#) p1	Demonstrate creation of csv with Text::CSV.
# See perldoc Text::CSV

use warnings;
use strict;

use Text::CSV;

my ( $csv, @columns, $status, $line );

$csv = Text::CSV->new();    # create a new object

@columns = qw/ Now is the time /;
$status  = $csv->combine(@columns);    # combine columns into a string
$line    = $csv->string();
print " csv line is: ", $line, "\n";

@columns = qw/ Now "is the" time /;
$status  = $csv->combine(@columns);    # combine columns into a string
$line    = $csv->string();
print " csv line is: ", $line, "\n";

exit(0);

producing:
Code:
% ./p1
 csv line is: Now,is,the,time
 csv line is: Now,"""is","the""",time

Best wishes ... cheers, drl
# 4  
Old 10-11-2012
Code:
awk '
BEGIN {
  read_comment=0;
  fl=0;
  t="Cake No,Author,Record No,Comments,Food Source,Ingredients,Another No,Location";
  c=split(t,b,",");
}
{sub("^[ \"]*","");
 sub("[ \"]*$","");
 sub("Comment[s]* *: *","Comments:");
}
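$0 !~ /:/ && w ~ /./ {
  a[w]=a[w] " " $0;          # continuation line: join with a space
}
/:/ {
  w=$0;
  sub(" *:.*", "", w);       # label: text before the first colon
  v=$0;
  sub("^[^:]*: *", "", v);   # value: text after the first colon
  a[w]=v;
}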
/^Location:/ {
  sub("Comments *: *","",a["Comments"]);
  sub("^ *","",a["Comments"]);
  if (fl++<1) {
    printf "\"" b[1] "\"";
    for (i=2; i<=c; i++) {
      printf ","  "\"" b[i] "\"";
    }
    print "";
  }
  printf "\"%s\"", a[b[1]];
  a[b[1]]="";
  for (i=2; i<=c; i++) {
    printf ",\"%s\"", a[b[i]];
    a[b[i]]="";
  }
  print "";
  w="";
}' infile > outfile.csv

# 5  
Old 10-12-2012
Guys, thanks for all your help so far. It's been good learning this, and I'll make sure to share the solution.
RDRTX1, thank you for your time and the script.
Do I run it just like this?

cat cairstest.txt |
awk '
BEGIN {
read_comment=0;
fl=0; blah blah

I got a syntax error at line 8!
# 6  
Old 10-12-2012
Try replacing the lines:
Code:
sub("^[ \"]*","");
 sub("[ \"]*$","");

with:
Code:
sub("^[ '\"']*","");
 sub("[ '\"']*$","");
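
If the shell's handling of the quotes is the suspect, another option is to keep the awk program in its own file and run it with awk -f, so the shell never sees the quotes at all. A minimal sketch, assuming a made-up file name strip.awk and covering only the two stripping lines:

```shell
#!/bin/sh
# strip.awk is a hypothetical file name, written to a temp dir for the demo.
dir=$(mktemp -d)
cat > "$dir/strip.awk" <<'EOF'
# Remove leading and trailing blanks and double quotes, then print the line.
{ sub(/^[ "]*/, ""); sub(/[ "]*$/, ""); print }
EOF

# awk reads the files named as arguments, or stdin when none are given,
# so no "cat file |" is needed in front of it.
strip_quotes() { awk -f "$dir/strip.awk" "$@"; }

printf '%s\n' '"Author: nannao y.y.,carloff t.t.,mello.i."' | strip_quotes
```

This should print the Author line without the surrounding quotes. The full script from post #4 can be saved and run the same way, for example awk -f script.awk infile > outfile.csv, where script.awk is again just a placeholder name.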
