Opened 19 years ago

Closed 19 years ago

Last modified 18 years ago

#29 closed defect (fixed)

compiler.parse() from std. library only takes a bytestring, not unicode

Reported by: arnarbi at gmail Owned by: cmlenz
Priority: major Milestone: 0.2
Component: Expression evaluation Version: 0.1
Keywords: Cc:


A template expression like this

${tg.fleirtala(subdir.nopages, u'síða', u'síður')}

fails with UnicodeEncodeError.

The error originates from compiler.parse(), which is called in markup.eval:_compile(). This simple test generates the error:

>>> from compiler import parse
>>> parse(u"u'\xfe'")

The string has to be converted to a bytestring, and the encoding has to be declared to parse() either via a '# -*- encoding: xxx -*-' line or via a UTF-8 byte order mark. Like this:

>>> parse("# -*- encoding: UTF-8 -*-\nu'\xc3\xbe'")
Module(u'\xc3\xbe', Stmt([]))

or this

>>> parse("\xef\xbb\xbfu'\xc3\xbe'")    # the \xef\xbb\xbf is the UTF-8 BOM
Module(u'\xc3\xbe', Stmt([]))
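The conversion step could be wrapped in a small helper along these lines. This is only a sketch of the BOM approach described above, not the patch that was applied: to_parsable_bytes is a hypothetical name, and the built-in compile() stands in for compiler.parse() so the snippet also runs on interpreters that no longer ship the compiler module.

```python
import codecs


def to_parsable_bytes(source):
    """Hypothetical helper: return `source` as a UTF-8 bytestring with a BOM.

    Bytestrings are passed through unchanged, on the assumption that the
    caller has already dealt with their encoding.
    """
    if isinstance(source, bytes):
        return source
    # The UTF-8 BOM tells the parser how to decode the bytes that follow.
    return codecs.BOM_UTF8 + source.encode("utf-8")


expr = u"u'\xfe'"                # the expression from the failing test
data = to_parsable_bytes(expr)   # BOM followed by the UTF-8 encoded source

# compile() is used here as a stand-in for compiler.parse() (an assumption).
code = compile(data, "<expression>", "eval")
value = eval(code)               # evaluates back to the original character
```

With a helper like this, the expression is encoded exactly once, immediately before it is handed to the parser, so the rest of the template code can keep working with unicode strings.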

comment:1 Changed 19 years ago by cmlenz

  • Resolution set to fixed
  • Status changed from new to closed

Applied a slightly modified version of the patch in [211]. Thanks!

