Visit Our UNIX and Linux User Community


Need help with extracting data to MySQL format


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Need help with extracting data to MySQL format
# 1  
Old 11-20-2011
Need help with extracting data to MySQL format

Hi guys,

I'm doing a project now and extracting tables from a webpage to MySQL table format.

I dumped the webpage with lynx and it is like this

Code:
   
   id
   Spec
   524543
   Developed especially for seniors
   Spec
   No
   Java
   Spec
   Yes
   Java MIDP
   Spec
   Available using application
   Operating system
   Spec
   Android 2.3 (Gingerbread)
   Qwerty keyboard (buttons)
   Spec
   No
   Qwerty keyboard (screen)
   Spec
   Yes
   SAR
   Spec
    0.34 W/kg
   Stand by time
   Spec
    710 h
   Talk time
   Spec
    18.2 h
   Talk time (3G)
   Spec
    8.4 h
   Stand By-Time (3G)
   Spec
    610 h
   Audio features
   Built-in speakers
   Spec
   Mono
   Camera
   Built-in camera
   Spec
   Yes
   Auto focus
   Spec
   Yes
   Built-in flash
   Spec
   Yes
   Flash type
   Spec
   LED-flash

it is specification of a cellphone. And I got the line number of each 'Spec' word with grep cut and sed and it is like this:

Code:
2 
5 
8 
11
14
17
20
23
26
29
32
35
39
43
46
49
52

now I want to get just the lines after each 'Spec' word and put in the MySQL format like this:

Code:
INSERT INTO `specifications` VALUES (524543,'No','Yes','Available using application','Android 2.3 (Gingerbread)',...);

I know that I need to get the lines with number +1 and put them between ',' in one line, but this part was very hard for me could you please help me about that?

Cheers,
Johanni
# 2  
Old 11-20-2011
The following would create your statement

You could use the DBI and DBD::mysql modules to add the values directly to the database.

WARNING CODE UNTESTED!!
Code:
#!/usr/bin/perl

use strict;
use warnings;

open (my $data, "<", $ARGV[0]); #Supply the spec file as the argument
my $after_spec=0;
my @values;
while(<$data>){
   chomp;
   if ($after_spec){
      push @values $_;
      $after_spec--;
   }
   elsif (/^Spec$/){
      $after_spec++;
   }
}
for (@values){ #add quotes to non numerical values and escape single quotes
   s/'/\\'/g;
   if (! /\d+(\.\d+)/){
       $_="'".$_."'";
   }
}
$values_string=join(', ',@values);
print "INSERT INTO `specifications` VALUES ($values);"

This User Gave Thanks to Skrynesaver For This Post:
# 3  
Old 11-20-2011
Thanks Skrynesaver,

this could is a bit pro for me Smilie

I got these errors

Code:
$ data2.sh
/usr/bin/data2.sh: line 2: use: command not found
/usr/bin/data2.sh: line 2: $'\r': command not found
/usr/bin/data2.sh: line 3: use: command not found
/usr/bin/data2.sh: line 3: $'\r': command not found
/usr/bin/data2.sh: line 4: syntax error near unexpected token `my'
/usr/bin/data2.sh: line 4: `open (my $data, "<", $temp1.dat[0]); #Supply the spec
' file as the argument

I installed perl also.

I don't know what is the problem?

I input the file like this:
Code:
open (my $data, "<", $temp1.dat[0]);

is it correct?
# 4  
Old 11-20-2011
Quote:
Originally Posted by Johanni
I got these errors
Code:
$ data2.sh
/usr/bin/data2.sh: line 2: use: command not found
....

use is a Perl command , try renaming the file data2.pl and then runchmod +x ./data2.pl to make the script runnable.
Quote:
Originally Posted by Johanni
I input the file like this:
Code:
open (my $data, "<", $temp1.dat[0]);

is it correct?
Probably not, the original script is intended to be runnable as follows...
Code:
./datas2.pl PATH_TO_ORIGINAL_DATA_DUMP_FROM_LYNX

$ARGV[0] is the first argument provided to the script on the command line.

And here's the code after fixing up the bugs.
Code:
#!/usr/bin/perl

#These two pragmas catch typos for you.
use strict; 
use warnings;

open (my $data, "<", $ARGV[0]); #Supply the spec file as the argument
my $after_spec=0; #flag which is set if the line is a spec line.
my @values; # an array to hold all the lines that match
while(<$data>){ # loop through the file one line at a time
   chomp; #remove trailing newlines.
   if ($after_spec){ # if we set the flag on the previous line.
      push @values, $_; # add this string to the values array
      $after_spec--; # unset the flag.
   }
   elsif (/^Spec$/){ # if the line matches the pattern 
      $after_spec++; # set the flag
   }
}
for (@values){ #go through the values array cleaning up the data.
   s/^\s*(.+)\s*$/$1/;#remove leading and trailing spaces
   s/'/\\'/g; # escape single quotes
   if (! /^\d+(\.\d+)?$/){ # if this is not a number
       $_="'".$_."'"; # surround the value with quotes
   }
}
my $values_string=join(', ',@values); # create a value string 
print "INSERT INTO `specifications` VALUES ($values_string);\n" # print the result.

Output with snippet of your data above
Code:
~/tmp$ ./tmp.pl tmp.dat 
INSERT INTO `specifications` VALUES (524543, 'No', 'Yes', 'Available using application', 'Android 2.3 (Gingerbread)', 'No', 'Yes', '0.34 W/kg', '710 h', '18.2 h', '8.4 h', '610 h', 'Mono', 'Yes', 'Yes', 'Yes', 'LED-flash');


Last edited by Skrynesaver; 11-20-2011 at 08:38 AM..
This User Gave Thanks to Skrynesaver For This Post:
# 5  
Old 11-20-2011
tnx man for the reply, this seems very complete code, but it seems that my perl is not going to work,

I'm recieving this error now:

Code:
$ data2.pl
bash: /usr/bin/data2.pl: /bin/per1^M: bad interpreter: Permission denied

I,ve googled and they say to put exec in the fstab file in etc folder (I'm using cygwin)
but it didn't worked and I'm recieiving the sam error. Do you have any idea about that?

---------- Post updated at 10:07 PM ---------- Previous update was at 08:59 PM ----------

okay, I fixed the perl Smilie

now I'm receiving this:
Code:
INSERT INTO `specifications` VALUES ();

with no value

I've checked the code, it is reading the file, but I don't know what is the problem?

I input the file like this:

Code:
open (my $data, "<", "temp1.dat");

is it correct? I didn't understand the
Code:
$ARGV[0]

Could you please tell me how should I input the file?

tnxSmilie

---------- Post updated at 10:30 PM ---------- Previous update was at 10:07 PM ----------

worked Smilie)

there were some blank spaces before Spec Smilie

Previous Thread | Next Thread
Test Your Knowledge in Computers #323
Difficulty: Easy
In most programming languages, the operator ++ is equivalent to the statement -= 1;.
True or False?

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Script to generate Excel file or to SQL output data to Excel format/tabular format

Hi , i am generating some data by firing sql query with connecting to the database by my solaris box. The below one should be the header line of my excel ,here its coming in separate row. TO_CHAR(C. CURR_EMP_NO ---------- --------------- LST_NM... (6 Replies)
Discussion started by: dani1234
6 Replies

2. Shell Programming and Scripting

Extracting content from a file in specific format

Hi All, I have the file in this format **** Results Data **** Time or Step 1 2 20 0.000000000e+00 0s 0s 0s 1.024000000e+00 Us 0s 0s 1.100000000e+00 1s 0s 0s 1.100000001e+00 1s 0s 1s 2.024000000e+00 Us Us 1s 2.024000001e+00 ... (7 Replies)
Discussion started by: diehard
7 Replies

3. Shell Programming and Scripting

Need help getting my output from MYSQL query into right format

Good afternoon! I have been lurking in this forum for awhile now. I have just recently started posting. I think this is a really good site. With that being said, I don't like to just run and get an answer before I try my best first. I have poured some blood, sweat and tears into... (4 Replies)
Discussion started by: brianjb
4 Replies

4. UNIX for Dummies Questions & Answers

help in extracting logs in readable format

hello everyone. newbie here in unix. I am trying to extract the logs of a certain job and would like to output it in a readable format, see below the CAT part: cat /var/opt/ctma/ctm/sysout/idwesct_sh30_eng_r6_cdcs_sh.LOG_05l0du_000* | egrep -i 'orderid:|file_name=' | sed... (1 Reply)
Discussion started by: eanne_may
1 Replies

5. Shell Programming and Scripting

Extracting specific lines of data from a file and related lines of data based on a grep value range?

Hi, I have one file, say file 1, that has data like below where 19900107 is the date, 19900107 12 144 129 0.7380047 19900108 12 168 129 0.3149017 19900109 12 192 129 3.2766666E-02 ... (3 Replies)
Discussion started by: Wynner
3 Replies

6. Shell Programming and Scripting

Using Awk for extracting data in specific format

please help me writing a awk script 001_r.pdb 0.0265185 001_r.pdb 0.0437049 001_r.pdb 0.0240642 001_r.pdb 0.0310264 001_r.pdb 0.0200482 001_r.pdb 0.0146746 001_r.pdb 0.0351344 001_r.pdb 0.0347856 001_r.pdb 0.036119 001_r.pdb 1.49 002_r.pdb 0.0281011 002_r.pdb 0.0319908 002_r.pdb... (5 Replies)
Discussion started by: phoenix_nebula
5 Replies

7. Shell Programming and Scripting

Help needed to format mysql output

Hi all, Does anyone know how to format the output from mysql from within a shell script? i.e. RESULT=`mysql command` echo ${RESULT} the ${RESULT} only displays the output on one line instead of how mysql would display it as columns etc (3 Replies)
Discussion started by: muay_tb
3 Replies

8. UNIX for Dummies Questions & Answers

Changing from Excel date format to MySQL date format

I have a list of dates in the following format: mm/dd/yyyy and want to change these to the MySQL standard format: yyyy-mm-dd. The dates in the original file may or may not be zero padded, so April is sometimes "04" and other times simply "4". This is what I use to change the format: sed -i '' -e... (2 Replies)
Discussion started by: figaro
2 Replies

9. Shell Programming and Scripting

convert date format to mysql date format in log file

I have a comma delimited log file which has the date as MM/DD/YY in the 2nd column, and HH:MM:SS in the 3rd column. I need to change the date format to YYYY-MM-DD and merge it with the the time HH:MM:SS. How will I got about this? Sample input 02/27/09,23:52:31 02/27/09,23:52:52... (3 Replies)
Discussion started by: hazno
3 Replies

10. UNIX for Dummies Questions & Answers

converting a tabular format data to comma seperated data in KSH

Hi, Could anyone help me in changing a tabular format output to comma seperated file pls in K-sh. Its very urgent. E.g : username empid ------------------------ sri 123 to username,empid sri,123 Thanks, Hema:confused: (2 Replies)
Discussion started by: Hemamalini
2 Replies

Featured Tech Videos