Unicode Problem with SQLAlchemy?

I found this article that helped explain my troubles somewhat: amk.ca/python/howto/unicode#reading-and-... I was able to get the desired results by using the 'codecs' module and then changing my program as follows.

When opening the file:

    infile = codecs.open(filename, 'r', encoding='iso-8859-1')

When printing the location:

    print location.encode('ISO-8859-1')

I can now query and manipulate the data from the table without the error from before.

I just have to specify the encoding when I output the text (I still don't entirely understand how this is working so I guess it's time to learn more about Python's unicode handling...).
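What is going on in the fix above can be sketched as "decode on the way in, encode on the way out". The following is a minimal sketch, not the original program: it assumes Python 3 (where str is already Unicode), uses io.open in place of codecs.open, and the helper name read_locations and the sample place names are hypothetical.

```python
import io
import os
import tempfile


def read_locations(path, encoding="iso-8859-1"):
    """Decode the file into text while reading, one location per line."""
    with io.open(path, "r", encoding=encoding) as infile:
        return [line.strip() for line in infile if line.strip()]


# Build a small ISO-8859-1 sample file (hypothetical data).
sample = u"Orl\u00e9ans\nK\u00f6ln\n"
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(sample.encode("iso-8859-1"))

# Inside the program the values are text (Unicode), not bytes.
locations = read_locations(sample_path)
os.remove(sample_path)

# Encode only at the output boundary, naming the encoding explicitly.
encoded = [name.encode("iso-8859-1") for name in locations]
```

The point of the design is that everything between reading and printing works on text, so the database layer never sees raw bytes in an unknown encoding.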


I would try "cp1252" first before "iso-8859-1". And I don't know if the following helps at all: stackoverflow.com/questions/368805/… – tzot Jun 10 '09 at 22:10.
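A small illustration of why the two encodings in that comment are not interchangeable, assuming Python 3 and made-up sample bytes: they agree on accented Latin letters but differ in the 0x80-0x9F range, where cp1252 has printable characters and iso-8859-1 has control codes.

```python
# Accented letters decode identically under both encodings.
e_acute_cp1252 = b"\xe9".decode("cp1252")
e_acute_latin1 = b"\xe9".decode("iso-8859-1")

# Bytes in 0x80-0x9F differ: cp1252 maps 0x80 to the euro sign and
# 0x92 to a curly apostrophe, while iso-8859-1 yields control characters.
euro_cp1252 = b"\x80".decode("cp1252")
euro_latin1 = b"\x80".decode("iso-8859-1")
quote_cp1252 = b"\x92".decode("cp1252")
```

If the HTML files were saved on Windows, curly quotes or euro signs decoded as iso-8859-1 would silently turn into invisible control characters rather than raise an error.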

From sqlalchemy.org, see section 0.4.2: added new flag to String and create_engine(), assert_unicode=(True|False|'warn'|None). Defaults to False or None on create_engine() and String, 'warn' on the Unicode type.

When True, results in all unicode conversion operations raising an exception when a non-unicode bytestring is passed as a bind parameter. 'warn' results in a warning. It is strongly advised that all unicode-aware applications make proper use of Python unicode objects (i.e. u'hello' and not 'hello') so that data round trips accurately.

I think you are trying to input a non-unicode bytestring. Perhaps this might lead you on the right track?

Some form of conversion is needed, compare 'hello' and u'hello'. Cheers.
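The conversion mentioned above could look roughly like this. It is a sketch assuming Python 3, where b'hello' plays the role of the old bytestring 'hello' and a plain str plays the role of u'hello'; the helper name ensure_text is hypothetical and not part of SQLAlchemy.

```python
def ensure_text(value, encoding="utf-8"):
    """Return text: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A byte string and a text string are different objects in Python 3,
# even when they hold the same ASCII characters.
same_before = (b"hello" == u"hello")
same_after = (ensure_text(b"hello") == u"hello")

# Non-ASCII data only decodes correctly with the right encoding.
cafe = ensure_text(b"caf\xe9", encoding="iso-8859-1")
```

Calling such a helper once, right where the data enters the program, avoids passing bytestrings as bind parameters later on.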

Try using a column type of Unicode rather than String for the unicode columns:

    # Imports as they were in SQLAlchemy at the time of this answer
    from sqlalchemy import Column, Date, Float, Integer, String, Time, Unicode
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Point(Base):
        __tablename__ = 'points'

        id = Column(Integer, primary_key=True)
        pdate = Column(Date)
        ptime = Column(Time)
        location = Column(Unicode(32))  # Unicode instead of String
        weather = Column(String(16))
        high = Column(Float)
        low = Column(Float)
        lat = Column(String(16))
        lon = Column(String(16))
        image = Column(String(64))
        caption = Column(String(64))

Edit: Response to comment:

If you're getting warnings about unicode encodings then there are two things you can try:

1. Convert your location to unicode. This would mean having your Point created like this:

    newpoint = Point(filename, pdate, ptime, unicode(location), weather, high, low, lat, lon, image, caption)

The unicode conversion will produce a unicode string when passed either a string or a unicode string. (Note that for a byte string containing non-ASCII bytes you need to pass the encoding explicitly, e.g. unicode(location, 'iso-8859-1'), because the default codec is ASCII.)

2. If that doesn't solve the encoding issues, try calling encode on your unicode objects. That would mean using code like:

    newpoint = Point(filename, pdate, ptime, unicode(location).encode('utf-8'), weather, high, low, lat, lon, image, caption)

This step probably won't be necessary, but what it essentially does is convert a unicode object from unicode code points to a specific byte representation (in this case, utf-8). I'd expect SQLAlchemy to do this for you when you pass in unicode objects but it may not.
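To see the round trip in isolation, here is a dependency-free sketch that uses the standard-library sqlite3 module instead of SQLAlchemy (a deliberate substitution, so it does not exercise the Unicode column type itself). It assumes Python 3; the table layout and sample value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, location TEXT)")

# Bytes as they might come out of an ISO-8859-1 encoded HTML file.
raw_location = b"Orl\xe9ans"

# Decode to text before binding, so the driver receives Unicode.
location = raw_location.decode("iso-8859-1")
conn.execute("INSERT INTO points (location) VALUES (?)", (location,))

# The value comes back as text with the accented character intact.
stored = conn.execute("SELECT location FROM points").fetchone()[0]
conn.close()

# Encoding is a separate, explicit step: the same text gives
# different bytes depending on the target encoding.
as_utf8 = stored.encode("utf-8")
as_latin1 = stored.encode("iso-8859-1")
```

The same principle applies with a Unicode column in SQLAlchemy: hand the ORM text, and leave the byte representation to the database layer.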

Thank you for the suggestion. I think this is heading me in the right direction. I'm now getting warnings about the encoding of the data I'm inserting but I'm unsure of how to fix this.

I've updated my question to reflect your suggestion. – Dave Forgac Jun 8 '09 at 19:39.

I know I'm having a problem with a conversion from Unicode but I'm not sure where it's happening. I'm extracting data about a recent European trip from a directory of HTML files. Some of the location names have non-ASCII characters (such as é, ô, ü).

