[mkgmap-dev] Name Substitution not correctly working
From Felix Hartmann extremecarver at gmail.com on Sun Jul 27 14:54:04 BST 2014
1. Yes - had I set notepad to default to UTF-8, I would probably have avoided the bug (as long as you don't use the "create new document" entry on right click in Windows - those files will always be in ANSI unless you do some registry hacks).

And yes - the mkgmap style-file is in UTF-8 - but as a Windows user you usually don't notice, because it comes without BOM. As long as there is no umlaut or other special character in it, notepad++, and probably most Windows users, will open the file as ANSI, because without any such character the file is actually identical in both encodings. Were the mkgmap style-file in UTF-8 with BOM, it would be clearer... (but I don't want to start a with-or-without-BOM discussion here).

So right now only the address file in the style is quite safe, because some special characters were recently added to it:

    mkgmap:country=POL & mkgmap:region!=* & mkgmap:admin_level4=* { set mkgmap:region='${mkgmap:admin_level4|subst:województwo =>}' }

But as long as there is no working check, and the mkgmap default style-file comes in UTF-8 without BOM, there is quite a big danger that the bug will happen to others too... For my style I have now set the encoding to UTF-8, and for added security (though it won't matter) I added a line:

    #this is a UTF-8 check - ÖÄÜè

so should any editor actually change the encoding to ANSI, I would notice directly. Such a line at the start could be an alternative to UTF-8 with BOM.

2. About the patch: Mmmh - that patch goes a bit too far...
- it actually stops at errors on input file (not style) too I think (note the time stamp 30 seconds later):

14:49:25 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
    at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
    at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
    at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
    at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
    at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
    at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
    at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
    at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
    at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
    at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
    at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
14:49:55 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??
Press any key to continue . . .
vs (input file in ANSI):

15:11:38 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
    at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
    at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
    at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
    at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
    at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
    at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
    at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
    at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
    at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
    at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
    at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
15:11:42 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??

However, now that I once had a file in ANSI (even though I changed it back to UTF-8), some residue in memory means I always get the error directly - even on the default style...
C:\OpenMTBMap\maps>start /low /b /wait java -jar -XX:StringTableSize=100003 -Xms6000M -Xmx10300M c:\openmtbmap\mkgmap.jar --max-jobs=8 "--generate-sea" "--code-page=65001" "--precomp-sea=c:\openmtbmap\maps\sea.zip" --nsis --index --levels="0:24, 1:23, 2:22, 3:21, 4:20, 5:19, 6:18" --overview-levels="7:17, 8:16, 9:15, 10:14, 11:13, 12:12" --adjust-turn-headings --add-pois-to-areas --reduce-point-density=3.4 --reduce-point-density-polygon=6 --housenumbers --link-pois-to-ways --ignore-turn-restrictions --polygon-size-limits="24:16, 23:14, 22:12, 21:11, 20:10, 19:9, 18:8, 17:7, 16:6, 15:5, 14:4, 13:3, 12:2, 11:0, 10:0" --description=openmtbmap_gcc --show-profiles=1 --location-autofill=bounds,is_in,nearest --bounds=c:\openmtbmap\maps\bounds.zip --route --country-abbr=gcc --country-name=gcc-states --mapname=65560000 --family-id=6556 --product-id=1 --series-name=openmtbmap_gcc-states_27.07.2014 --family-name=mtbmap_gcc_27.07.2014 --tdbfile --overview-mapname=mapsetc --keep-going --area-name="gcc-states_27.07.2014_openmtbmap.org" -c e:\openmtbmap\maps\template.gcc-states 7*.img 1>NUL
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
    at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
    at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
    at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
    at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
    at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
    at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
    at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
    at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
    at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
    at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
    at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
    at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)

On 27.07.2014 12:32, Steve Ratcliffe wrote:
> On 26/07/14 18:43, Felix Hartmann wrote:
>> Okay - I used ANSI. Could there maybe be a check for this in the check
>> styles routine, or in general?
>> I do suppose that must have been the problem.
>
> Although it is not always possible to tell if a file is in the wrong
> encoding, it should have been in this case. I see that the ì
> character gets converted to a unicode replacement character (0xfffd)
>
> If you had done:
>    echo 'Shì'
>
> it would have come out something like: Sh� (hope that works in email)
> and shown the problem.

Yes - clearly. (And it works in email somehow.)

>
> There are a couple of ways to make bad characters an error, rather
> than getting replaced. The attached patch allows them to
> be replaced and then throws an error when seen. This has the
> advantage of giving you file name and line number of the error.
> It might interfere with something valid, so give it a try.
>
> I don't use notepad++, but these links might be useful:
>
> http://superuser.com/questions/292086/how-can-i-enforce-so-notepad-uses-utf-8-every-time-i-create-a-new-file
>
> http://stackoverflow.com/questions/5090845/change-the-default-encoding-for-notepad
>
> ..Steve

--
keep on biking and discovering new trails

Felix
openmtbmap.org & www.velomap.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20140727/1a40afef/attachment.html>
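The approach Steve describes in the quoted text (let bad bytes be replaced with U+FFFD, then raise an error that carries file name and line number) can be sketched roughly as follows. This is only an illustration in Python, not the attached patch and not mkgmap's Java code; the function name `find_bad_utf8` and the message format are hypothetical.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character inserted for undecodable bytes


def find_bad_utf8(data: bytes, name: str = "<input>") -> list[str]:
    """Decode leniently as UTF-8, then report each line that contains U+FFFD.

    Hypothetical helper: it mimics "replace first, complain later" so that
    every problem can be reported with a file name and a line number.
    """
    text = data.decode("utf-8", errors="replace")
    problems = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        if REPLACEMENT in line:
            problems.append(
                f"{name}:{lineno}: bad character in input, file probably not in utf-8"
            )
    return problems


# The same two lines saved once as UTF-8 and once as "ANSI" (cp1252 here).
sample = "# ascii only\nname=Shì\n"
utf8_bytes = sample.encode("utf-8")
ansi_bytes = sample.encode("cp1252")
```

With these inputs, `find_bad_utf8(utf8_bytes)` finds nothing, while `find_bad_utf8(ansi_bytes, "points")` flags line 2. A pure-ASCII file is byte-identical in both encodings, so no such check can flag it, which is why a canary line with a few non-ASCII characters helps. As Steve notes, a check like this might interfere with something valid: a file that legitimately contains U+FFFD would be flagged as well.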