#29 closed defect (fixed)
compiler.parse() from std. library only takes a bytestring, not unicode
Reported by: | arnarbi at gmail | Owned by: | cmlenz |
---|---|---|---|
Priority: | major | Milestone: | 0.2 |
Component: | Expression evaluation | Version: | 0.1 |
Keywords: | | Cc: | |
Description
A template expression like this
${tg.fleirtala(subdir.nopages, u'síða', u'síður')}
fails with UnicodeEncodeError.
The error originates from compiler.parse(), which is called in markup.eval:_compile(). This simple test generates the error:
>>> from compiler import parse
>>> parse(u"u'\xfe'")
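A minimal sketch of the likely cause, on the assumption that the Python 2 parser implicitly encodes unicode input with the default ASCII codec. The helper name `fails_ascii_encoding` is hypothetical and only makes that encoding step explicit; it does not call compiler.parse() itself.

```python
def fails_ascii_encoding(text):
    """Return True if `text` cannot be encoded with the ASCII codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # Same exception type as the one reported in this ticket.
        return True
    return False


# The expression from the test above contains the non-ASCII character \xfe.
failing = fails_ascii_encoding(u"u'\xfe'")
# A pure-ASCII expression encodes without error.
passing = fails_ascii_encoding(u"1 + 1")
```

Under that assumption, any expression containing a non-ASCII character triggers the error, while pure-ASCII expressions are unaffected.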
The string has to be converted to a bytestring, with its encoding declared to parse() either via a '# -*- encoding: xxx -*-' line or a UTF-8 byte order mark (BOM). Like this:
>>> parse("# -*- encoding: UTF-8 -*-\nu'\xc3\xbe'")
Module(u'\xc3\xbe', Stmt([]))
or this
>>> parse("\xef\xbb\xbfu'\xc3\xbe'")  # the \xef\xbb\xbf is the UTF-8 BOM
Module(u'\xc3\xbe', Stmt([]))
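The BOM variant could be wrapped in a small helper along these lines. This is only a sketch: the function name `to_marked_bytestring` is hypothetical, and it is not the patch that was applied in [211]. It uses `codecs.BOM_UTF8` from the standard library for the marker bytes.

```python
import codecs


def to_marked_bytestring(source):
    """Encode a unicode expression as UTF-8 and prepend the UTF-8 BOM."""
    if isinstance(source, bytes):
        # Already a bytestring: return it unchanged.
        return source
    # codecs.BOM_UTF8 is the byte sequence \xef\xbb\xbf.
    return codecs.BOM_UTF8 + source.encode("utf-8")


marked = to_marked_bytestring(u"u'\xfe'")
# On Python 2 the result could then be handed to the parser:
#     from compiler import parse
#     parse(marked)
```

For the expression from the failing test, the helper produces the same bytestring that is passed to parse() in the BOM example above.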
Attachments (2)
Change History (3)
Changed 18 years ago by arnarbi at gmail
Patch that converts unicode expressions to byte-strings and adds BOM
Changed 18 years ago by arnarbi at gmail
Same as above, but only sends marked string to parse() and doesn't store it
comment:1 Changed 18 years ago by cmlenz
- Resolution set to fixed
- Status changed from new to closed
Applied slightly modified version of the patch in [211]. Thanks!