[mkgmap-dev] java.lang.AssertionError while building index from unicode tiles
From Steve Ratcliffe steve at parabola.me.uk on Fri Oct 22 14:24:02 BST 2021
Hi Ticker

> Problem is that resources/sort/cp65001.txt doesn't give ordering to
> lots of characters; it looks like it covers only about 10,500 of the
> 1,112,064 possible code-points. Many of these non-ordered characters
> are being used by the names in the tile in question.

I used the program in extra/src/uk/me/parabola/util/CollationRules.java
to generate some of the tables. This uses the file "allkeys.txt" which
can be obtained from
https://www.unicode.org/Public/UCA/latest/allkeys.txt

The document explaining the unicode collation rules that references
that file is: http://www.unicode.org/reports/tr10/
It includes a section for programmatically deriving the weights for
characters that do not have explicit entries in the table.

> Assuming the actual ordering of unspecified code-points doesn't really
> matter, I propose to change the logic slightly so undefined Unicode is
> sorted on its 16-bit value after the range of known sorts.

I think that is a good initial approach to get things working.

Steve
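
For illustration only, the fallback ordering proposed in the quoted text could look roughly like the sketch below. It is not the mkgmap code: the class FallbackSortSketch, its methods and the small in-memory table are hypothetical stand-ins for whatever is loaded from resources/sort/cp65001.txt. The assumption is that each known character has an explicit position, and that any other character is placed after the highest known position, ordered by its 16-bit char value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of mkgmap: characters with an explicit
// entry keep their position; all others sort after them by 16-bit value.
class FallbackSortSketch {
    // Stand-in for the explicit ordering read from a sort description file.
    private final Map<Character, Integer> known = new HashMap<>();
    private int maxKnown = 0;

    // Record an explicit position (assumed to be >= 1) for a character.
    void define(char ch, int position) {
        known.put(ch, position);
        maxKnown = Math.max(maxKnown, position);
    }

    // Position used for comparison. Undefined characters are offset past
    // the known range, so they never collide with an explicit entry and
    // distinct undefined characters always get distinct positions.
    long sortPosition(char ch) {
        Integer pos = known.get(ch);
        if (pos != null) {
            return pos;
        }
        return (long) maxKnown + 1 + ch;
    }

    // Compare two characters using the positions above.
    int compare(char a, char b) {
        return Long.compare(sortPosition(a), sortPosition(b));
    }

    public static void main(String[] args) {
        FallbackSortSketch sort = new FallbackSortSketch();
        sort.define('a', 1);
        sort.define('b', 2);
        // U+4E2D has no explicit entry here, so it sorts after 'b'.
        System.out.println(sort.compare('b', '\u4e2d') < 0);
    }
}
```

This only gives a stable, total order so that index building does not trip over unordered characters. It makes no attempt to follow the implicit-weight derivation described in tr10, and because it works on 16-bit values, characters outside the Basic Multilingual Plane would be ordered by their surrogate halves.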