[mkgmap-dev] Commit r4809: fix java.lang.AssertionError while building index from unicode tiles

From Ticker Berkin rwb-mkgmap at jagit.co.uk on Wed Oct 27 10:11:36 BST 2021

Hi

There are a lot of problems relating to --index with all multi-byte
character sets except Unicode.

1/ The Unicode case sets miscellaneous flags here and there in the MDR
logic. It is unclear whether these relate to fixed-width multi-byte,
variable-length, or Unicode encodings specifically.

2/ There is some logic that works out the positions of name components
in the final output encoding, and this only handles single-byte or
Unicode.

3/ The sort/collation tables are set for cp1252, so all but Western
European characters will be ignored, resulting in what should be
different names existing only once in the indexing structures.
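
A minimal sketch of the effect in 3/, assuming a sort table that simply
skips any character it has no entry for. LossySortDemo, TABLE, sortKey
and buildIndex are made-up names for illustration, not mkgmap's Sort
code, and the three-letter table stands in for a full cp1252 table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a code-page sort table; not mkgmap's Sort class.
final class LossySortDemo {
    // Characters that have a defined sort position. Three entries are
    // enough to show the effect; a real table would cover far more.
    private static final Map<Character, Integer> TABLE = new HashMap<>();
    static {
        TABLE.put('a', 1);
        TABLE.put('b', 2);
        TABLE.put('c', 3);
    }

    // Build a sort key, silently skipping characters with no table entry.
    static String sortKey(String name) {
        StringBuilder key = new StringBuilder();
        for (char ch : name.toCharArray()) {
            Integer pos = TABLE.get(Character.toLowerCase(ch));
            if (pos != null) {
                key.append((char) pos.intValue());
            }
        }
        return key.toString();
    }

    // Index names by sort key; names with equal keys overwrite each other.
    static Map<String, String> buildIndex(String... names) {
        Map<String, String> index = new TreeMap<>();
        for (String n : names) {
            index.put(sortKey(n), n);
        }
        return index;
    }

    public static void main(String[] args) {
        // Two different Chinese street names (written as Unicode escapes)
        // plus one ASCII name.
        Map<String, String> index =
                buildIndex("\u5317\u4eac\u8def", "\u4e0a\u6d77\u8def", "abc");
        System.out.println(index.size());
    }
}
```

Both Chinese names get an empty key, so an index built from three names
ends up with only two entries.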

The fix I'm working on for "r4809 crashes by buildung mdr" will stop
this crash, but won't change any of the above.
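
To illustrate why 2/ matters, here is a rough sketch of how a character
position and a byte position drift apart once the output encoding uses
more than one byte for some characters. NameOffsetDemo and byteOffset
are invented for this example and do not come from mkgmap. Java's GBK
charset is assumed here as a stand-in for cp936, with UTF-8 as a
fallback if the JVM does not provide it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of mkgmap.
final class NameOffsetDemo {
    // Byte offset, in the encoded output, of the character at charIndex.
    // Assumes charIndex does not split a surrogate pair.
    static int byteOffset(String name, int charIndex, Charset cs) {
        return name.substring(0, charIndex).getBytes(cs).length;
    }

    public static void main(String[] args) {
        // GBK is assumed as a stand-in for cp936; fall back to UTF-8
        // when this JVM has no GBK charset installed.
        Charset multiByte = Charset.isSupported("GBK")
                ? Charset.forName("GBK")
                : StandardCharsets.UTF_8;

        // Two Chinese characters, a space, then "Road".
        String name = "\u4e2d\u6587 Road";
        int charIndex = name.indexOf("Road");

        System.out.println("char index:  " + charIndex);
        System.out.println("byte offset: "
                + byteOffset(name, charIndex, multiByte));
    }
}
```

Code that treats the character index as the byte offset is only right
for single-byte encodings.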

I don't know if there has ever been an attempt to make mkgmap indexing
work for character sets like cp936.

Ticker

On Sun, 2021-10-24 at 18:04 +0100, Ticker Berkin wrote:
> Hi Carlos & Gerd
> 
> When no resources/sort/cp... file matching the code-page exists,
> changing the default sort from cp1252 to cp65001/unicode stops this
> crash. It probably gives a much better index.
> 
> Except in the cases where the requested codepage uses transliterations
> from resources/chars/ascii or latin1, I think the default sort should
> be unicode. I haven't yet investigated how and when these
> transliterations occur.
>  
> Tomorrow I'll look at reasons why the exception happens, even when the
> sort is discarding significant characters.
> 
> Ticker
> 
> On Sun, 2021-10-24 at 18:31 +0200, Carlos Dávila wrote:
> > The reason for using code-pages other than 65001 is that many
> > Garmin 
> > here: 
> > https://openmtbmap.org/download/odbl/#Compatibility_-_Unicode_vs_Non_Unicode_cannot_authenticate_maps
> > 
> > El 24/10/21 a las 18:14, Ticker Berkin escribió:
> > > Hi Carlos
> > > 
> > > When mkgmap doesn't have a resources/sort for the given code page,
> > > it defaults the sort to cp1252 (Western European).
> > > 
> > > As part of building the various indexes, it sorts counties,
> > > regions, cities, streets etc using this sort, but any characters
> > > that don't have a defined sort order are ignored in the ordering.
> > > The result of this is that, using cp1252 on Chinese, all names seem
> > > the same.
> > > 
> > > I suspect that indexes are mostly empty and find is ignoring them.
> > > 
> > > There is some logic that is differentiating the names in these
> > > structures on exact naming, and this inconsistency causes the
> > > assertion crash.
> > > 
> > > The actual output in the map image is cp936, which Basecamp and
> > > Mapsource appear to handle. I don't know how well it is supported
> > > by Garmin devices.
> > > 
> > > Is there a reason for using cp936 rather than cp65001/unicode?
> > > 
> > > Ticker
> > > 
> > > On Sun, 2021-10-24 at 16:22 +0200, Carlos Dávila wrote:
> > > > using copy from JOSM/paste into BaseCamp, I could test address
> > > > searches and they seem to work.
> > > > 
> > > > El 23/10/21 a las 23:50, Ticker Berkin escribió:
> > > > > Hi Carlos
> > > > > 
> > > > > mkgmap doesn't have a resources/sort for code-page 936
> > > > > (Microsoft's character encoding for simplified Chinese). I was
> > > > > surprised it doesn't give any warning about this. I'll look more
> > > > > closely tomorrow to see what happens when it doesn't find the
> > > > > resource file.
> > > > > 
> > > > > I presume this didn't crash before, but did the index work?
> > > > > 
> > > > > I suspect this will have many of the same problems as unicode
> > > > > sort had for unspecified characters.
> > > > > 
> > > > > I'll also investigate the other change relating to collation
> > > > > strength.
> > > > > 
> > > > > Ticker
> > > > > 
> > > > > On Sat, 2021-10-23 at 22:26 +0200, Carlos Dávila wrote:
> > > > > > Hi devs.
> > > > > > 
> > > > > > With this new version I get a new crash, but now with
> > > > > > --code-page=936, not with unicode:
> > > > > > 
> > > > > > Exception in thread "main" java.lang.AssertionError: mdr20 value changed f=5174 t=5180 count=2995
> > > > > >     at uk.me.parabola.imgfmt.app.mdr.Mdr5Record.setMdr20(Mdr5Record.java:134)
> > > > > >     at uk.me.parabola.imgfmt.app.mdr.Mdr20.buildFromStreets(Mdr20.java:84)
> > > > > >     at uk.me.parabola.imgfmt.app.mdr.MDRFile.writeSections(MDRFile.java:335)
> > > > > >     at uk.me.parabola.imgfmt.app.mdr.MDRFile.write(MDRFile.java:270)
> > > > > >     at uk.me.parabola.mkgmap.combiners.MdrBuilder.onFinish(MdrBuilder.java:331)
> > > > > >     at uk.me.parabola.mkgmap.main.Main.endOptions(Main.java:690)
> > > > > >     at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:126)
> > > > > >     at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:147)
> > > > > >     at uk.me.parabola.mkgmap.main.Main.main(Main.java:118)
> > > > > > 
> > > > > > mkgmap command: java -ea -jar mkgmap-r4809.jar --index
> > > > > > --bounds=bounds.zip --housenumbers --code-page=936
> > > > > > 31177013.o5m
> > > > > > 
> > > > > > https://files.mkgmap.org.uk/download/524/31177013.o5m
> > > > > > 
> > > > > > El 22/10/21 a las 9:42, svn commit escribió:
> > > > > > > Version mkgmap-r4809 was committed by gerd on Fri, 22 Oct
> > > > > > > 2021
> > > > > > > 
> > > > > > > fix java.lang.AssertionError while building index from
> > > > > > > unicode
> > > > > > > tiles
> > > > > > > mdrUnicode_v2.patch by Ticker Berkin
> > > > > > > 
> > > > > > > http://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=4809
> > > > > > > _______________________________________________
> > > > > > > mkgmap-dev mailing list
> > > > > > > mkgmap-dev at lists.mkgmap.org.uk
> > > > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev



