[mkgmap-dev] Character decoding used by mkgmap when reading style files

Sun Mar 30 14:25:49 BST 2014

Dear mkgmap developers,

   I am developing some string substitution rules in the lines file.  Some
of our street name prefixes contain non-ASCII characters, so I have the
following rule:

highway=* & name ~ '(?i)pra[cçÇ]a\s+.*' { add
streettype:movedend='${name|subst:(?i)pra[cçÇ]a\s+~>}, Pc.'} #

   This does not work, apparently because mkgmap reads the style files with
a character decoding that corrupts the non-ASCII characters in my regexes.
I tried saving the files from my text editor as UTF-8 and as Latin-1
(ISO-8859-1), to no avail.
    I ended up Unicode-escaping the accented characters to make the rule
work:

highway=* & name ~ '(?i)pra[c\u00E7\u00C7]a\s+.*' { add
streettype:movedend='${name|subst:(?i)pra[c\u00E7\u00C7]a\s+~>}, Pc.'} #
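
For anyone who wants to check the escaped expression outside mkgmap, here is a minimal sketch in plain java.util.regex.  It assumes the style regexes are ordinary Java patterns, which the post does not confirm.  The class and the movedEnd helper are hypothetical; they only mimic the match-then-substitute step of the rule and are not mkgmap code.

```java
import java.util.regex.Pattern;

final class EscapedPrefixDemo {
    // The rule's condition, written with doubled backslashes so that the
    // regex engine (not the Java compiler) resolves the escapes and the
    // pattern text itself stays pure ASCII.
    static final Pattern RULE =
            Pattern.compile("(?i)pra[c\\u00E7\\u00C7]a\\s+.*");

    // The prefix that the subst filter in the rule removes.
    static final Pattern PREFIX =
            Pattern.compile("(?i)pra[c\\u00E7\\u00C7]a\\s+");

    // Hypothetical helper: returns the value the rule would assign,
    // or null when the name does not match the condition.
    static String movedEnd(String name) {
        if (!RULE.matcher(name).matches()) {
            return null;
        }
        return PREFIX.matcher(name).replaceAll("") + ", Pc.";
    }

    public static void main(String[] args) {
        // The accented letters are given as escapes so this file is
        // independent of the encoding it is saved in.
        System.out.println(movedEnd("Pra\u00E7a da S\u00E9"));
        System.out.println(movedEnd("PRACA XV"));
        System.out.println(movedEnd("Rua Augusta"));
    }
}
```

Because the pattern text contains only ASCII, it survives whatever decoding is applied to the file, which is why the escaped rule works where the literal one does not.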

    My suggestion is to make mkgmap read the style files using a broader
character encoding (e.g. UTF-8).
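
To illustrate why the decoder matters, the sketch below saves the character class as UTF-8 bytes and then decodes those bytes twice: once as UTF-8 and once as ISO-8859-1.  The second charset is only an assumed example of a mismatch; I do not know which decoder mkgmap actually uses.  The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

final class StyleDecodingDemo {
    // The regex fragment as the editor would write it to disk in UTF-8.
    static byte[] savedAsUtf8() {
        return "pra[c\u00E7\u00C7]a".getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the saved bytes with the given charset, as a file reader
    // configured with that charset would.
    static String decode(Charset cs) {
        return new String(savedAsUtf8(), cs);
    }

    // True if the decoded fragment, used as a regex, still matches the
    // lower-case accented spelling of the prefix.
    static boolean stillMatches(Charset cs) {
        return Pattern.compile(decode(cs)).matcher("pra\u00E7a").matches();
    }

    public static void main(String[] args) {
        System.out.println(decode(StandardCharsets.UTF_8).length());      // 9
        System.out.println(decode(StandardCharsets.ISO_8859_1).length()); // 11
        System.out.println(stillMatches(StandardCharsets.UTF_8));         // true
        System.out.println(stillMatches(StandardCharsets.ISO_8859_1));    // false
    }
}
```

With the mismatched decoder each accented letter turns into two unrelated characters inside the class, so the class no longer contains the letter it was meant to match.  When reading a real file, the equivalent fix is to pass the charset explicitly, for example with Files.newBufferedReader(path, StandardCharsets.UTF_8).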

    A side note: the (?u) modifier, which enables Unicode-aware case
folding, is not working.  Without it I must list both cases of every
accented letter by hand, e.g. expand "ç" into a character class such as
[cCçÇ].  With (?iu), "ç" would match both "ç" and "Ç" (the unaccented "c"
and "C" would still need a class such as [cç]).  The same applies to the
dozens of other accented characters found throughout the world.  Given the
multitude of street name prefixes (Brazil, my country, has 50+), this
results in quite large and complex style files.
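
For reference, this is how the two flags behave in plain java.util.regex; I am assuming mkgmap's style regexes end up there.  The sketch only shows what Java itself does and does not explain why (?u) has no effect inside a style file.  Note that Unicode case folding does not strip accents, so the unaccented "c" still needs to be listed.  The class name is hypothetical.

```java
import java.util.regex.Pattern;

final class UnicodeCaseDemo {
    // Full-string match of input against regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String lower = "pra\u00E7a";   // lower-case, with cedilla
        String upper = "PRA\u00C7A";   // upper-case, with cedilla

        // (?i) alone folds case for US-ASCII letters only.
        System.out.println(matches("(?i)" + lower, upper));   // false

        // (?iu) folds case for non-ASCII letters as well.
        System.out.println(matches("(?iu)" + lower, upper));  // true

        // Case folding is not accent folding.
        System.out.println(matches("(?iu)" + lower, "PRACA")); // false

        // A two-member class covers all four spellings under (?iu).
        String both = "(?iu)pra[c\u00E7]a";
        System.out.println(matches(both, "PRACA"));            // true
        System.out.println(matches(both, upper));              // true
    }
}
```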

    This is low priority.  I can get by with Unicode escapes, but they make
the rules harder to read and to maintain.

Good job on your latest stable release!

Paulo