[mkgmap-dev] StandardCharsets and try (with-resources)
From Ticker Berkin rwb-mkgmap at jagit.co.uk on Sun Jan 19 18:30:12 GMT 2020
Hi Gerd

Here is new version of patch with line.trim() restored and exception
thrown.

@mike - It is likely that this will fix your problem with the display
of option text with non-ascii characters; with previous code, mkgmap
*read* the text incorrectly unless your local charset was utf-8.

Ticker

On Fri, 2020-01-17 at 17:04 +0000, Ticker Berkin wrote:
> Hi Gerd
>
> The line.trim() deletion wasn't intended - I'll put it back.
>
> I think it best to change sortForCode IOException to throw
> ExitException. Maybe they meant to return some default "Sort", ie
> sortForCodepage(1252), but this seems wrong.
>
> I started looking at CombinedStyleFileLoader. It does its Input and
> Output in the default charset and I don't know if anyone uses it
> anymore, but I didn't want to change any of its behaviour, so I
> thought best not to touch it.
>
> Reg. new class for files that use '#' for comments. Some of these
> already use TokenScanner which can be configured. The only other one
> that a quick grep finds is the character transliteration tables, so I
> don't think it is worth it at the moment.
>
> Ticker
>
> On Fri, 2020-01-17 at 16:20 +0000, Gerd Petermann wrote:
> > Hi Ticker,
> >
> > - I think there is a small change in the handling of lines in
> >   OsmMapDataSource.readDeleteTagsFile. The old code used
> >   line = line.trim();
> >   This is missing now. Is that intended?
> >
> > - I also don't understand the line with your comment "// ??? I
> >   don't understand this". Looks like an endless recursive call?
> >
> > - You sometimes replaced FileReader, but not in
> >   CombinedStyleFileLoader. Why not?
> >
> > We have a few places where we read files which use "#" for comment
> > lines. Would it help to create a class for that?
> >
> > I made a few minor mods, see attachment.
> >
> > Gerd
> >
> > ________________________________________
> > From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on
> > behalf of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > Sent: Friday, 17 January 2020 13:53
> > To: Development list for mkgmap
> > Subject: [mkgmap-dev] StandardCharsets and try (with-resources)
> >
> > Hi Gerd
> >
> > Attached patch
> >
> > - uses StandardCharsets.* where possible.
> >
> > - notes some usage of the java local DefaultCharset.
> >
> > - changed a couple of these to force utf-8 instead.
> >
> > - if --read-config file gives decoding errors, names the charset
> >   used to read the file (ie DefaultCharset) instead of 'utf-8' in
> >   the error message.
> >
> > - accepts/ignores unicode BOM in more files
> >
> > - uses try (open...) {} where possible in files changed for the
> >   above reasons.
> >
> > There is some code in
> > mkgmap/srt/SrtTextReader.java:sortForCodepage()
> > that I don't understand; it would appear to get into a recursive
> > loop on IOException.
> >
> > Ticker
> >
> > On Tue, 2020-01-14 at 09:55 +0000, Gerd Petermann wrote:
> > > Hi Ticker,
> > >
> > > yes, and every missing close() is a brain teaser ;)
> > > We have a few places where files are opened and closed in a
> > > different method. This is likely to cause trouble in unit tests,
> > > esp. on Windows.
> > > Wherever possible we should use try-with-resources instead of
> > > Utils.closeFile() and add a comment like in SeaGenerator, in the
> > > line
> > >   zipFile = new ZipFile(precompSeaDir); // don't close here!
> > > when a file is intentionally kept open.
> > >
> > > Gerd
> > >
> > > ________________________________________
> > > From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on
> > > behalf of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > > Sent: Tuesday, 14 January 2020 10:43
> > > To: Development list for mkgmap
> > > Subject: Re: [mkgmap-dev] TYP files and character encoding
> > >
> > > Hi Gerd
> > >
> > > Here is updated patch that closes the file, although I find many
> > > files in mkgmap that don't have explicit close(), but I presume
> > > .finalize() will close them eventually.
> > >
> > > I'll do another patch for other text file handling, using
> > > StandardCharset where possible and fixing TokenScanner message
> > > for bad characters if not utf-8 and, if reasonable, allowing a
> > > BOM even if the file is opened as utf-8 anyway.
> > >
> > > Ticker
> > >
> > > On Tue, 2020-01-14 at 08:21 +0000, Gerd Petermann wrote:
> > > > Hi Ticker,
> > > >
> > > > thanks for the patch.
> > > >
> > > > Please review TypCompiler.CharsetProbe. BufferedReader br is
> > > > not closed. Is that intended?
> > > >
> > > > I see that we have a mix of "utf-8" and "UTF-8" in the mkgmap
> > > > sources. I think it would be good to use StandardCharsets.UTF_8
> > > > where possible and unify the rest.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: utf8_v3.patch
Type: text/x-patch
Size: 24368 bytes
Desc: not available
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20200119/eede4cbe/attachment-0001.bin>
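
For readers without the attachment, the pattern discussed in this thread might look roughly like the sketch below. It is not taken from utf8_v3.patch: the class Utf8LineReader and its method readLines are hypothetical names, and the handling shown (force UTF-8 via StandardCharsets, close the reader with try-with-resources, ignore a leading BOM, trim each line, skip '#' comment lines) is only an assumption about how such a reader could be written.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of mkgmap.
final class Utf8LineReader {

    /** Returns the trimmed, non-empty, non-comment lines of a UTF-8 stream. */
    public static List<String> readLines(InputStream in) throws IOException {
        List<String> result = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream),
        // also when an exception is thrown while reading.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            boolean first = true;
            String line;
            while ((line = br.readLine()) != null) {
                if (first) {
                    first = false;
                    // Java's UTF-8 decoder keeps a BOM as the char U+FEFF,
                    // so drop it by hand if the file starts with one.
                    if (line.startsWith("\uFEFF")) {
                        line = line.substring(1);
                    }
                }
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue; // blank line or '#' comment line
                }
                result.add(line);
            }
        }
        return result;
    }
}
```

Passing the charset explicitly means the result no longer depends on the platform default charset, which is the behaviour the thread attributes to the earlier code.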