[mkgmap-dev] New assertion, now with code-page=632 and Japan tile
From Ticker Berkin rwb-mkgmap at jagit.co.uk on Thu Nov 18 17:47:53 GMT 2021
Hi Gerd

For any code-page except Japanese/cp932, AnyCharSetEncoder takes anything that can't be represented, tries to find a reasonable ASCII representation or "?", then writes this to the output. This is a big assumption for far-eastern charsets, most likely generating garbage with possible invalid shift-in/out requests...

SparseTransliterator is a very strange special case, without any explanation. After a bit of searching, I found it was submitted as a change because a user had a map that needed to be in Japanese/cp932 and also contained Latin characters. The characters with macrons couldn't be encoded. Many others could. The rest of Unicode that can't be encoded resulted in garbage.

Your patch fixes the "rest of Unicode" problem for cp932. It misses any ability of the 'latin1' transliterator to provide reasonable replacement chars that can be encoded. It doesn't deal with possible problems for other (non-European) charsets.

I've attached cs932-v3.patch, which addresses both of these issues. SparseTransliterator.java can then be removed.

Ticker

On Wed, 2021-11-17 at 18:00 +0000, Gerd Petermann wrote:
> Hi Ticker,
>
> > For some other character sets the result could be invalid or
> > garbage.
> OK, I assumed that '?' is always at the same position, might be wrong
> with that.
> SparseTransliterator is only used for cs932.
>
> Gerd

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cs932-v3.patch
Type: text/x-patch
Size: 3753 bytes
Desc: not available
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20211118/9b2abae1/attachment.bin>
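For readers following the thread, the fallback behaviour described above (try the target charset first, then a plain ASCII replacement, then "?") can be sketched roughly as follows. This is not mkgmap's AnyCharSetEncoder nor the attached patch: the class FallbackEncoder and its methods are hypothetical names, and Unicode decomposition via java.text.Normalizer is used here only as a stand-in for the 'latin1' transliterator. It also assumes that ASCII is encodable in the target charset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical illustration only; not taken from mkgmap.
public class FallbackEncoder {

    // Stand-in for a transliterator: decompose the character (NFD) and keep
    // only the ASCII parts, so "o with macron" becomes "o". Returns "?" when
    // nothing ASCII remains.
    public static String asciiFallback(int codePoint) {
        String decomposed = Normalizer.normalize(
                new String(Character.toChars(codePoint)), Normalizer.Form.NFD);
        StringBuilder sb = new StringBuilder();
        decomposed.codePoints()
                .filter(cp -> cp < 0x80)
                .forEach(sb::appendCodePoint);
        return sb.length() > 0 ? sb.toString() : "?";
    }

    // Encode code point by code point. Anything the charset cannot represent
    // is swapped for its ASCII fallback before encoding, so the charset's own
    // encoder never sees an unmappable character.
    // Assumption: ASCII itself is encodable in the target charset.
    public static byte[] encode(String text, Charset charset) {
        CharsetEncoder probe = charset.newEncoder();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (!probe.canEncode(s)) {
                s = asciiFallback(cp);
            }
            byte[] bytes = s.getBytes(charset);
            out.write(bytes, 0, bytes.length);
        });
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // ISO-8859-1 is used because every JDK provides it; macron letters
        // and kanji are both unmappable there.
        byte[] encoded = encode("T\u014Dky\u014D \u6771\u4EAC", StandardCharsets.ISO_8859_1);
        System.out.println(new String(encoded, StandardCharsets.ISO_8859_1));
    }
}
```

The same method could be called with Charset.forName("MS932") on a JDK that ships that charset, in which case the kanji would be encoded directly and only the macron letters would take the fallback path. Encoding one code point at a time is a simplification; it would be wasteful for charsets that use escape sequences to switch modes.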