<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
1. Yes - had I set notepad to default to UTF-8, I probably would have
avoided the bug (as long as you don't create the file via the "New" entry
in the Windows right-click menu - files created that way are always ANSI
unless you apply some registry hacks).<br>
And yes - the mkgmap style-file is in UTF-8 - but as a Windows user
you usually don't notice, because it comes without a BOM. As long as
there is no umlaut or other special character in it, notepad++ (and
probably most Windows users) will open the file as ANSI, because a
file without any such character is byte-for-byte identical in both
encodings. Were the mkgmap style-file in UTF-8 with BOM, it would
be clearer... (but I don't want to start a with-or-without-BOM
discussion here).<br>
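To illustrate why the missing BOM hides the problem, here is a minimal Python sketch. It is not part of mkgmap; the helper names and the sample rule are made up for illustration, and "ANSI" is assumed to mean Windows-1252:<br>
<pre>
```python
import codecs


def looks_identical_in_ansi(text):
    """True if text encodes to the same bytes in UTF-8 and Windows-1252."""
    try:
        return text.encode("utf-8") == text.encode("cp1252")
    except UnicodeEncodeError:
        # The text cannot be written as Windows-1252 at all.
        return False


def has_utf8_bom(data):
    """True if the raw bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical style rule containing only ASCII characters.
ascii_rule = "highway=path [0x16 resolution 22]"
# "wojewodztwo" with an o-acute, written as an escape to keep this snippet ASCII.
polish_word = "wojew\u00f3dztwo"

print(looks_identical_in_ansi(ascii_rule))   # ASCII only: same bytes either way
print(looks_identical_in_ansi(polish_word))  # special character: bytes differ
```
</pre>
So an editor has nothing to go on for an ASCII-only file without BOM; only the first special character (or a BOM) makes the encoding visible.<br>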
<br>
So right now only the address file in the style is reasonably safe,
because some special characters were recently added to it:<br>
<i>mkgmap:country=POL & mkgmap:region!=* &
mkgmap:admin_level4=* { set
mkgmap:region='${mkgmap:admin_level4|subst:województwo =>}' }</i><br>
<br>
<br>
But as long as there is no working check, and the mkgmap default
style-file comes in UTF-8 without BOM, there is quite a big danger that
the bug will happen to others too... For my style I have now set the
encoding to UTF-8, and for added security (though it won't matter) I added a line
: <i>#this is a UTF-8 check - ÖÄÜè</i><br>
so should any editor actually change the encoding to ANSI, I would
notice directly... Such a line at the start could be an
alternative to UTF-8 with BOM.<br>
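Such a check line could also be verified by a small script before running mkgmap. This is only a rough sketch in Python under my own assumptions: the function name is hypothetical, the check line is the one from my style, and "ANSI" again means Windows-1252:<br>
<pre>
```python
# The check line, with the special characters written as escapes (O-umlaut,
# A-umlaut, U-umlaut, e-grave) so this snippet itself stays pure ASCII.
SENTINEL = "#this is a UTF-8 check - \u00d6\u00c4\u00dc\u00e8"


def sentinel_intact(data):
    """Return True if the raw bytes are valid UTF-8 and contain the check line."""
    try:
        # "utf-8-sig" accepts files both with and without a BOM.
        text = data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # The file was most likely saved as ANSI.
        return False
    return any(line.strip() == SENTINEL for line in text.splitlines())


style_text = SENTINEL + "\nhighway=path [0x16 resolution 22]\n"
print(sentinel_intact(style_text.encode("utf-8")))   # saved as UTF-8
print(sentinel_intact(style_text.encode("cp1252")))  # saved as ANSI
```
</pre>
To use it on a real file, read the file in binary mode and pass the bytes to the function.<br>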
<br>
<br>
2. About the patch:<br>
Mmmh - that patch goes a bit too far... I think it also stops on
errors in the input file (not just the style). Note the time stamp 30
seconds later:<br>
14:49:25 china cn 6555 this is run101 starting to compile openmtmbap
with mkgmap<br>
Exception in thread "main"
uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089):
Bad character in input, file probably not in utf-8<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:105)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:97)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)<br>
at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)<br>
at
uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)<br>
at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)<br>
at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)<br>
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img<br>
14:49:55 china cn 6555 Finished Compiling Openmtbmap - this is
run101<br>
mapsetbuilding failed - to few maxnodes??<br>
Press any key to continue . . .<br>
<br>
<br>
vs (input file in ANSI):<br>
15:11:38 china cn 6555 this is run101 starting to compile openmtmbap
with mkgmap<br>
Exception in thread "main"
uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089):
Bad character in input, file probably not in utf-8<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:105)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:97)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)<br>
at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)<br>
at
uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)<br>
at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)<br>
at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)<br>
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img<br>
15:11:42 china cn 6555 Finished Compiling Openmtbmap - this is
run101<br>
mapsetbuilding failed - to few maxnodes??<br>
<br>
<br>
<br>
However, now that I once had a file in ANSI (even though I changed it
back to UTF-8), some residue in memory means I always get the error
right away - even with the default style...<br>
<br>
C:\OpenMTBMap\maps>start /low /b /wait java -jar
-XX:StringTableSize=100003 -Xms6000M -Xmx10300M
c:\openmtbmap\mkgmap.jar --max-jobs=8 "--generate-sea"
"--code-page=65001" "--precomp-sea=c:\openmtbmap\maps\sea.zip"
--nsis --index --levels="0:24, 1:23, 2:22, 3:21, 4:20, 5:19, 6:18"
--overview-levels="7:17, 8:16,
9:15, 10:14, 11:13, 12:12" --adjust-turn-headings
--add-pois-to-areas --reduce-point-density=3.4
--reduce-point-density-polygon=6 --housenumbers --link-pois-to-ways
--ignore-turn-restrictions --polygon-size-limits="24:16, 23:14, 22:12, 21:11, 20:10,
19:9, 18:8, 17:7, 16:6, 15:5, 14:4, 13:3, 12:2, 11:0, 10:0"
--description=openmtbmap_gcc --show-profiles=1
--location-autofill=bounds,is_in,nearest
--bounds=c:\openmtbmap\maps\bounds.zip --route --country-abbr=gcc --country-name=gcc-states
--mapname=65560000 --family-id=6556 --product-id=1
--series-name=openmtbmap_gcc-states_27.07.2014
--family-name=mtbmap_gcc_27.07.2014 --tdbfile
--overview-mapname=mapsetc --keep-going
--area-name="gcc-states_27.07.2014_openmtbmap.org" -c
e:\openmtbmap\maps\template.gcc-states 7*.img 1>NUL<br>
Exception in thread "main"
uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089):
Bad character in input, file probably not in utf-8<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)<br>
at
uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:105)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.&lt;init&gt;(SrtTextReader.java:97)<br>
at
uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)<br>
at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)<br>
at
uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)<br>
at
uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)<br>
at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)<br>
at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)<br>
<br>
<br>
<div class="moz-cite-prefix">On 27.07.2014 12:32, Steve Ratcliffe
wrote:<br>
</div>
<blockquote cite="mid:53D4D54B.6000403@parabola.me.uk" type="cite">On
26/07/14 18:43, Felix Hartmann wrote:
<br>
<blockquote type="cite">Okay - I used ANSI. Could there maybe be a
check for this in the check
<br>
styles routine, or in general?
<br>
I do suppose that must have been the problem.
<br>
</blockquote>
<br>
Although it is not always possible to tell if a file is in the
wrong
<br>
encoding, it should have been in this case. I see that the ì
<br>
character gets converted to a unicode replacement character
(0xfffd)
<br>
<br>
If you had done:
<br>
echo 'Shì'
<br>
<br>
it would have come out something like: Sh� (hope that works in
email)
<br>
and shown the problem.
<br>
</blockquote>
yes - clearly. (and works in email somehow).<br>
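For anyone who wants to reproduce the effect outside mkgmap, here is a short Python sketch of the same behaviour. It is only an illustration, not the code from the patch; the function name is made up and "ANSI" is assumed to be Windows-1252:<br>
<pre>
```python
# "Sh" followed by i-grave, as an ANSI (Windows-1252) editor would save it.
ansi_bytes = "Sh\u00ec".encode("cp1252")

# Lenient decoding silently swaps the bad byte for U+FFFD,
# the Unicode replacement character.
lenient = ansi_bytes.decode("utf-8", errors="replace")


def decodes_strictly(data):
    """Return True only if the bytes are valid UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


print(lenient == "Sh\ufffd")
print(decodes_strictly(ansi_bytes))
```
</pre>
Lenient decoding hides the problem until the replacement character shows up somewhere, while strict decoding fails immediately.<br>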
<blockquote cite="mid:53D4D54B.6000403@parabola.me.uk" type="cite">
<br>
There are a couple of ways to make bad characters an error, rather
<br>
than getting replaced. The attached patch allows them to
<br>
be replaced and then throws an error when seen. This has the
<br>
advantage of giving you file name and line number of the error.
<br>
It might interfere with something valid, so give it a try.
<br>
<br>
I don't use notepad++, but these links might be useful:
<br>
<br>
<a class="moz-txt-link-freetext" href="http://superuser.com/questions/292086/how-can-i-enforce-so-notepad-uses-utf-8-every-time-i-create-a-new-file">http://superuser.com/questions/292086/how-can-i-enforce-so-notepad-uses-utf-8-every-time-i-create-a-new-file</a>
<br>
<br>
<a class="moz-txt-link-freetext" href="http://stackoverflow.com/questions/5090845/change-the-default-encoding-for-notepad">http://stackoverflow.com/questions/5090845/change-the-default-encoding-for-notepad</a>
<br>
<br>
..Steve
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
keep on biking and discovering new trails
Felix
openmtbmap.org & <a class="moz-txt-link-abbreviated" href="http://www.velomap.org">www.velomap.org</a></pre>
</body>
</html>