python - string encoding error


 
# 8  
Old 09-02-2011
Code:
ALTER  DATABASE  `googlecal`  DEFAULT  CHARACTER  SET utf8 COLLATE utf8_unicode_ci

Do I need to do something else? I don't have control over what gets placed on the source Google calendar; it could be anything, so it would be nice to use an encoding that plays nicely with others.
# 9  
Old 09-02-2011
My guess would be that the Python MySQL client has been converting to and sending latin-1, while mysqld has been converting that to UTF-8 before writing to the database (mysqld supports different character sets and encodings per client connection, per database, and as a server default, all at the same time). This works fine until you hit a Unicode character that's not part of latin-1.
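
To make the latin-1 limit concrete, here is a minimal sketch. It assumes Python 3 (the thread itself uses Python 2), and the helper name can_encode is made up for illustration. It only shows which strings fit into each encoding; it does not talk to MySQL at all.

```python
# Python 3 sketch: str is Unicode, and encoding to bytes is an explicit step.
samples = ["caf\u00e9", "\u65e5\u672c\u8a9e"]  # an accented Latin word and a Japanese word


def can_encode(text, encoding):
    """Return True if every character in text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Map each sample to (fits in latin-1, fits in utf-8).
results = {text: (can_encode(text, "latin-1"), can_encode(text, "utf-8"))
           for text in samples}

for text, (latin1_ok, utf8_ok) in results.items():
    # ascii() keeps the output printable even on a terminal without UTF-8 support.
    print(ascii(text), "latin-1:", latin1_ok, "utf-8:", utf8_ok)
```

The accented word fits in both encodings, while the Japanese word fits only in UTF-8, which matches the "works until it doesn't" behaviour described above.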

I suggest fiddling with MySQLdb.connect() parameters. I believe you can set the encoding there and force use of Unicode for returned strings.

Sorry I can't be more specific, but I don't hack with python that much these days and my experience with mysql is quite limited.

Regards,
Alister
# 10  
Old 09-02-2011
Thanks alister: yes, it turns out you can specify this in the connection, so I added it to the code like this:
Code:
#!/usr/bin/python   

from xml.etree import ElementTree 
import gdata.calendar.data 
import gdata.calendar.client 
import gdata.acl.data 
import atom.data 
import time 
import MySQLdb

calendar_client = gdata.calendar.client.CalendarClient()
username = 'user@gmail.com'
visibility = 'public'
projection = 'full'
feed_uri = calendar_client.GetCalendarEventFeedUri(calendar=username, visibility=visibility, projection=projection) 

# define mysql db connection/credentials 
conn = MySQLdb.connect (host = "localhost", user = "test1", passwd = "test1", db = "googlecal", charset = "utf8", use_unicode = True) 
cursor = conn.cursor ()  
feed = calendar_client.GetCalendarEventFeed(uri=feed_uri) 

print 'Events on Primary Calendar: %s' % (feed.title.text,)
for i, an_event in enumerate(feed.entry):
     print '\t%s. %s' % (i, an_event.title.text,)
     data_point = {}
     data_point[ 'title' ] = (i, an_event.title.text,)
     print '\t%s. %s' % data_point[ 'title' ]
     for a_when in an_event.when:
         print '\t\t%s. %s' % (i, an_event.content.text,)
         data_point[ 'content' ] = (i, an_event.content.text,)
         print '\t\tStart time: %s' % (a_when.start,)
         data_point[ 'start' ] = (a_when.start,)
         print '\t\tEnd time:   %s' % (a_when.end,)
         data_point[ 'end' ] = (a_when.end,)

         cursor.execute("""insert into events (id, title, content, start, end)
         values (NULL, %s, %s, %s, %s)""",
         (data_point[ 'title' ],
           data_point[ 'content' ],
          data_point[ 'start' ],
          data_point[ 'end' ]))

conn.commit()
cursor.close () 
conn.close ()

but now I get a syntax error:
Code:
Traceback (most recent call last):
  File "/home/unclecameron/Documents/workspace/google_pull_calendar/src/getivdailyview_gmail.cal7.py", line 41, in <module>
    data_point[ 'end' ]))
  File "/usr/lib/pymodules/python2.7/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/pymodules/python2.7/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1064, 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version 
for the right syntax to use near \'), ("\'2011-09-09T21:00:00.000-07:00\'",))\' at line 2')

which I suspect is because the data from 'content' isn't fitting inside data_point[ 'content' ] very well? In awk/sed/bash this would come down to the single/double quotes around content being put into a variable; I'm not sure how that works in Python.

Last edited by unclecameron; 09-02-2011 at 05:35 PM..
# 11  
Old 09-02-2011
Looking at your code, the first thing that stands out is that you are using 2-tuples where I suspect it should be a 1-tuple.

Quote:
Originally Posted by unclecameron
Code:
     data_point[ 'title' ] = (i, an_event.title.text,)
...
         cursor.execute("""insert into events (id, title, content, start, end)
         values (NULL, %s, %s, %s, %s)""",
         (data_point[ 'title' ],
...

data_point[ 'title' ] contains two elements though the intention is to pass only one.

Assuming that the value you want to pass to the query is an_event.title.text, you could either not include i in the tuple or you could index the desired element of the 2-tuple parameter in the execute() call, e.g. data_point['title'][1].

The same goes for data_point[ 'content' ].
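
As a rough illustration of why a tuple in a parameter slot ends up as an SQL syntax error, here is a small Python 3 sketch. It is a deliberately simplified imitation and not MySQLdb's actual code: the function names render_literal and render_query are made up, and the exact text MySQLdb generates differs (see the error message earlier in the thread). The point is only that a sequence passed where a single value is expected turns into an extra parenthesised group in the statement.

```python
def render_literal(value):
    """Very rough stand-in for a driver turning a Python value into an SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, (tuple, list)):
        # A sequence is rendered as a parenthesised group of literals.
        return "(" + ", ".join(render_literal(v) for v in value) + ")"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings: wrap in single quotes and double any embedded single quote.
    return "'" + str(value).replace("'", "''") + "'"


def render_query(sql, params):
    """Substitute one rendered literal per %s placeholder (illustration only)."""
    return sql % tuple(render_literal(p) for p in params)


sql = "insert into events (id, title, start) values (NULL, %s, %s)"

# Tuples in the parameter slots, as in the failing script.
bad = render_query(sql, [(0, "Meeting"), ("2011-09-09T21:00:00",)])
# Plain strings in the parameter slots.
good = render_query(sql, ["Meeting", "2011-09-09T21:00:00"])

print(bad)
print(good)
```

The first statement contains nested parentheses where the server expects single values; the second is the shape the insert was meant to have.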

Regards,
Alister

---------- Post updated at 05:26 PM ---------- Previous update was at 05:06 PM ----------

Skimming over Writing MySQL Scripts with Python DB-API, it seems that the execute function expects a tuple sequence consisting of each value to be bound, in the order they are to be bound, to the %s placeholders.
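
The binding rule can be tried without a MySQL server. The sketch below is Python 3 and uses the standard library's sqlite3 module as a stand-in, so two things differ from the thread: sqlite3 uses ? placeholders where MySQLdb uses %s, and the column names start_time and end_time are my own choice for the example table.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table events (id integer primary key, title text, content text,"
    " start_time text, end_time text)"
)

# One plain value per placeholder, in the same order as the placeholders.
row = (
    "Caf\u00e9 meeting",
    "Bring the \u65e5\u672c\u8a9e notes",
    "2011-09-09T20:00:00.000-07:00",
    "2011-09-09T21:00:00.000-07:00",
)
conn.execute(
    "insert into events (id, title, content, start_time, end_time)"
    " values (NULL, ?, ?, ?, ?)",
    row,
)
conn.commit()

# Read the row back to confirm the non-ASCII text survived the round trip.
stored = conn.execute(
    "select title, content, start_time, end_time from events"
).fetchone()
print(stored == row)
conn.close()
```

The second argument to execute() is one flat sequence of values; none of its elements is itself a tuple.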

Assuming the code you posted above, I believe the proper call to execute() would be:


Code:
         cursor.execute("""insert into events (id, title, content, start, end)
         values (NULL, %s, %s, %s, %s)""",
         (data_point[ 'title' ][1],
          data_point[ 'content' ][1],
          data_point[ 'start' ][0],
          data_point[ 'end' ][0]))



Although if it were my code, I'd modify the assignment statements to store strings instead of tuples in the data_point dictionary. Instead of:


Code:
data_point[ 'title' ] = (i, an_event.title.text,)
data_point[ 'content' ] = (i, an_event.content.text,)
data_point[ 'start' ] = (a_when.start,)
data_point[ 'end' ] = (a_when.end,)


You could try:


Code:
data_point[ 'title' ] = an_event.title.text
data_point[ 'content' ] = an_event.content.text
data_point[ 'start' ] = a_when.start
data_point[ 'end' ] = a_when.end


Which would then result in the cleaner:


Code:
         cursor.execute("""insert into events (id, title, content, start, end)
         values (NULL, %s, %s, %s, %s)""",
         (data_point[ 'title' ],
          data_point[ 'content' ],
          data_point[ 'start' ],
          data_point[ 'end' ]))


Regards,
Alister

Last edited by alister; 09-02-2011 at 06:35 PM..
# 12  
Old 09-02-2011
Wow, okay, this really helps me understand, thanks again Alister. Through the pain, I really have learned something, and I am very thankful for all the help. The final code, in case anyone wants to do something similar, is listed below:
Code:
#!/usr/bin/python   

from xml.etree import ElementTree 
import gdata.calendar.data 
import gdata.calendar.client 
import gdata.acl.data 
import atom.data 
import time 
import MySQLdb

calendar_client = gdata.calendar.client.CalendarClient()
username = 'user@gmail.com'
visibility = 'public'
projection = 'full'
feed_uri = calendar_client.GetCalendarEventFeedUri(calendar=username, visibility=visibility, projection=projection) 

# define mysql db connection/credentials 
conn = MySQLdb.connect (host = "localhost", user = "test1", passwd = "test1", db = "googlecal", charset = "utf8", use_unicode = True) 
cursor = conn.cursor ()  
feed = calendar_client.GetCalendarEventFeed(uri=feed_uri) 

print 'Events on Primary Calendar: %s' % (feed.title.text,)
for an_event in feed.entry:
     # store plain strings, not tuples, so each %s binds to a single value
     data_point = {}
     data_point[ 'title' ] = an_event.title.text

     # an event can have several start/end times; insert one row for each
     for a_when in an_event.when:
         data_point[ 'content' ] = an_event.content.text
         data_point[ 'start' ] = a_when.start
         data_point[ 'end' ] = a_when.end

         cursor.execute("""insert into events (id, title, content, start, end)
         values (NULL, %s, %s, %s, %s)""",
         (data_point[ 'title' ],
          data_point[ 'content' ],
          data_point[ 'start' ],
          data_point[ 'end' ]))

conn.commit()
cursor.close () 
conn.close ()
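
For anyone who wants to try the same loop shape without a Google account or a MySQL server, here is a minimal Python 3 sketch under a few assumptions: the calendar feed is replaced by hypothetical plain dicts, the database is an in-memory sqlite3 one (so ? placeholders instead of %s), and the column names start_time and end_time and the helper event_rows are made up for the example. It collects one row per start/end pair and inserts them in a single executemany() call.

```python
import sqlite3

# Hypothetical stand-in for the calendar feed: plain dicts instead of gdata objects.
events = [
    {"title": "Standup", "content": "Daily sync",
     "when": [("2011-09-05T09:00", "2011-09-05T09:15"),
              ("2011-09-06T09:00", "2011-09-06T09:15")]},
    {"title": "Caf\u00e9 meetup", "content": None,
     "when": [("2011-09-09T20:00", "2011-09-09T21:00")]},
]


def event_rows(events):
    """Yield one (title, content, start, end) tuple per occurrence of each event."""
    for event in events:
        for start, end in event["when"]:
            yield (event["title"], event["content"], start, end)


rows = list(event_rows(events))

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table events (id integer primary key, title text, content text,"
    " start_time text, end_time text)"
)
# Each row is a flat tuple of plain values; None is stored as NULL.
conn.executemany(
    "insert into events (title, content, start_time, end_time) values (?, ?, ?, ?)",
    rows,
)
conn.commit()

count = conn.execute("select count(*) from events").fetchone()[0]
print(count)
conn.close()
```

Building the rows first keeps the database code separate from the feed handling, which makes it easier to check what is about to be inserted.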

# 13  
Old 09-03-2011
You're quite welcome.