[mkgmap-dev] New assertion, now with code-page=632 and Japan tile
From Gerd Petermann gpetermann_muenchen at hotmail.com on Tue Nov 16 15:48:58 GMT 2021
Hi all,

this small patch would be my approach. It replaces those characters which don't fit into a byte with '?'.
This fixes the problems with Japanese code page 932.

Gerd

BTW: SparseTransliterator is very sparse. We could add a few more character mappings; for example, there is a house number that contains "1237−1" instead of "1237-1".
https://www.fontspace.com/unicode/analyzer#e=77yR77yS77yT77yX4oiS77yR

________________________________________
From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
Sent: Monday, 15 November 2021 15:59
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] New assertion, now with code-page=632 and Japan tile

Hi

How about something like:

If the full string fails to encode in the target charset, process it a char at a time. If a char can't be represented, try transliteration on it and, if none is defined, use "?". Then go through the resultant string a char at a time and, if a char can't be represented, drop it.

Maybe a final warning at the end if there was no transliteration for a char or the transliteration couldn't be represented.

Ticker

On Mon, 2021-11-15 at 13:04 +0000, Gerd Petermann wrote:
> Hi all,
>
> > Maybe we should simply stop transliteration when this happens and
> > return an empty string for the label?
>
> any thoughts on this?
>
> Gerd
>
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Gerd Petermann <gpetermann_muenchen at hotmail.com>
> Sent: Wednesday, 10 November 2021 11:17
> To: Development list for mkgmap
> Subject: Re: [mkgmap-dev] New assertion, now with code-page=632 and Japan tile
>
> Hi devs,
>
> the problem occurs with node https://www.osm.org/node/5692472121
> name=키타가키 고로케
> Google translate says the name is Korean. The (UTF-8) name cannot be
> translated into code-page 932 (Japanese) and thus mkgmap converts the
> internal UTF-16 representation of the name to bytes.
> This happens in method AnyCharsetEncoder.encodeText(String text) in this loop:
>     for (int i = 0; i < s.length(); i++)
>         outBuf.put((byte) s.charAt(i));
> The name 키타가키 고로케 ends with 케 and the char value is \ucf00, so it is
> converted to 0x00.
> Maybe we should simply stop transliteration when this happens and
> return an empty string for the label?
>
> If mkgmap is executed without the -ea run time option the map shows
> name 、タ for the restaurant which is just wrong.
> Gerd
>
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Gerd Petermann <gpetermann_muenchen at hotmail.com>
> Sent: Wednesday, 10 November 2021 09:43
> To: Development list for mkgmap
> Subject: Re: [mkgmap-dev] New assertion, now with code-page=632 and Japan tile
>
> Hi Carlos,
>
> I'll try to debug this.
>
> BTW: I see you use *.o5m for the tiles (output from splitter). I
> think this is no longer a good choice, pbf is a lot smaller and
> almost as fast. Esp. when it comes to the goal of reducing disk I/O
> (as with --gmapi-minimal)
>
> Gerd
>
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Carlos Dávila <carlos at alternativaslibres.org>
> Sent: Tuesday, 9 November 2021 22:54
> To: mkgmap-dev at lists.mkgmap.org.uk
> Subject: Re: [mkgmap-dev] New assertion, now with code-page=632 and Japan tile
>
> Hi Ticker
>
> Not sure if relevant, but note in this case the assertion occurs while
> compiling the tile, not the index. In fact, --index is not included in
> the command.
>
> On 9/11/21 at 21:55, Ticker Berkin wrote:
> > Hi
> >
> > I think this assertion could be removed from the code.
> >
> > Looking through the definition of Shift-JIS, I read it as saying the
> > second byte shouldn't be zero, so I don't know why this happens.
> >
> > As with the Chinese code-pages, mkgmap has places where multi-byte
> > encodings are not handled correctly in the --index generation and
> > unknown meanings of flags to the Garmin software.
> >
> > Ticker
> >
> > On 09/11/2021 19:43, Carlos Dávila wrote:
> > > code-page=932, sorry for the typo.
> > >
> > > On 9/11/21 at 20:36, Carlos Dávila wrote:
> > > > The command below produces an assertion while compiling this tile
> > > > <https://files.mkgmap.org.uk/download/526/31191025.o5m> from Japan.
> > > > Process continues with remaining tiles and finishes without "Number
> > > > of MapFailedExceptions: 1" as expected. This is with r4813, but I
> > > > also tried with an old version of mkgmap with the same result.
> > > >
> > > > java -Xmx27G -ea -jar mkgmap.jar --code-page=632 31191025.o5m
> > > > Mkgmap version 4813
> > > > Time started: Tue Nov 09 20:18:16 CET 2021
> > > > WARNING (global): Setting max-jobs to 8
> > > > Exception in thread "main" java.lang.AssertionError: found trailing 0 in chars
> > > >     at uk.me.parabola.imgfmt.app.labelenc.EncodedText.<init>(EncodedText.java:39)
> > > >     at uk.me.parabola.imgfmt.app.labelenc.AnyCharsetEncoder.encodeText(AnyCharsetEncoder.java:112)
> > > >     at uk.me.parabola.imgfmt.app.lbl.LBLFile.newLabel(LBLFile.java:132)
> > > >     at uk.me.parabola.imgfmt.app.lbl.PlacesFile.createPOI(PlacesFile.java:253)
> > > >     at uk.me.parabola.imgfmt.app.lbl.LBLFile.createPOI(LBLFile.java:172)
> > > >     at uk.me.parabola.mkgmap.build.MapBuilder.processPOIs(MapBuilder.java:670)
> > > >     at uk.me.parabola.mkgmap.build.MapBuilder.makeMap(MapBuilder.java:325)
> > > >     at uk.me.parabola.mkgmap.main.MapMaker.makeMap(MapMaker.java:114)
> > > >     at uk.me.parabola.mkgmap.main.MapMaker.makeMap(MapMaker.java:62)
> > > >     at uk.me.parabola.mkgmap.main.Main.lambda$processFilename$1(Main.java:291)
> > > >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > > >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > > >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > > >     at java.base/java.lang.Thread.run(Thread.java:829)
> > > >
> > > > _______________________________________________
> > > > mkgmap-dev mailing list
> > > > mkgmap-dev at lists.mkgmap.org.uk
> > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cs932.patch
Type: application/octet-stream
Size: 739 bytes
Desc: cs932.patch
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20211116/29cc1dbb/attachment.obj>
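
For readers without the attachment, here is a minimal Java sketch of the two things discussed in this thread: why the (byte) cast turns the last character of the name into a zero byte, and roughly what the proposed per-character fallback (transliterate, else "?", then drop whatever still cannot be represented) could look like. This is not the content of cs932.patch and not mkgmap code. The class LabelEncodingSketch, the methods truncateToBytes and encodeLabel, and the translit map are all hypothetical names made up for illustration. It also assumes the JRE provides the "windows-31j" charset, which is Java's name for code page 932.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in mkgmap.
class LabelEncodingSketch {

    /** Same cast as in the quoted loop: only the low 8 bits of each UTF-16 unit survive. */
    static byte[] truncateToBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++)
            out[i] = (byte) s.charAt(i);
        return out;
    }

    /**
     * Sketch of the proposed fallback. translit is an assumed lookup from
     * code point to replacement text; it may be empty.
     */
    static byte[] encodeLabel(String s, Charset cs, Map<Integer, String> translit) {
        CharsetEncoder enc = cs.newEncoder();
        if (enc.canEncode(s))
            return s.getBytes(cs); // whole string fits, nothing to do

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            String ch = new String(Character.toChars(cp));
            if (enc.canEncode(ch)) {
                sb.append(ch);
                continue;
            }
            // Not representable: use the transliteration if one is defined, else "?"
            String repl = translit.getOrDefault(cp, "?");
            // Drop any replacement char that the target charset cannot represent either
            for (int j = 0; j < repl.length(); j++) {
                char c = repl.charAt(j);
                if (enc.canEncode(c))
                    sb.append(c);
            }
        }
        return sb.toString().getBytes(cs);
    }

    public static void main(String[] args) {
        String last = "\uCF00"; // U+CF00, the final character of the name in the thread
        System.out.println(truncateToBytes(last)[0]); // prints 0, the trailing zero byte

        Charset cp932 = Charset.forName("windows-31j"); // assumed to be available
        byte[] label = encodeLabel("1237\u22121 " + last, cp932, Map.of(0x2212, "-"));
        System.out.println(new String(label, cp932));
    }
}
```

Checking each code point with CharsetEncoder.canEncode keeps multi-byte characters that code page 932 does support intact, so only the unsupported ones are replaced. A real change would presumably also log the warning suggested above.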