[mkgmap-dev] MDR building out-of-memory
From Gerd Petermann gpetermann_muenchen at hotmail.com on Wed May 12 17:41:45 BST 2021
Hi Ticker, reg. Sort instances: My understanding is that there is only one instance. So, no case for GC, right? Gerd ________________________________________ Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag von Ticker Berkin <rwb-mkgmap at jagit.co.uk> Gesendet: Mittwoch, 12. Mai 2021 17:59 An: Development list for mkgmap Betreff: Re: [mkgmap-dev] MDR building out-of-memory Hi Gerd I did a bit more experimenting with LargeListSorter and Mdr7/11 and have some changes - patch attached. 1/ explicit param to LargeListSorted to useCache 2/ do nothing if 0 or 1 records to sort 3/ reduce chunkSize to reduce memory usage, increase maxDepth so same overall limit 4/ allocate temp "merged" with correct length 5/ useCache parameters and remarks on usage in Mdr7 & 11 6/ prevent a copy of allStreets 7/ share the collator and allow it and "sort" to be garbage collected 8/ A comments that sort of partial streets must use sortkey, even if only sorting a few record. LargeListSorter is a handy way of doing this for a simple/single sortKey per record. Sortkey simply makes a byte array of the string converted into the target charset and sort does a byte comparison on this. collator.compare(str1, str2) ignores the shield chars (and possibly the other prefix/suffix markers) during the comparison and has extra dynamic processing looking to handle char aliases (PRIMARY strength etc). The comment at the top of Sort.java: * found that sorting with the sort keys and the collator gave different results in some * cases. This implementation does not. is not really correct. An alternative collator string comparison could be provided that behaves in the same way as sort.createSortKey. This would speed up 'on -the-fly' sorts, maybe becoming feasible. Ticker On Wed, 2021-05-12 at 09:08 +0000, Gerd Petermann wrote: > Hi Ticker, > > I think the MultiSortKeys were introduced because the on-the-fly solution was far too slow, at least with the normal Java sort. Could well be that the problem was solved with Java 8 or newer releases. > > I think there was a special case with maps containing huge numbers of equally named roads causing extreme run times. > This depends on the style. Some styles add a name like "tr2" for each unnamed track with tracktype=grade2. > > Gerd > > ________________________________________ > Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag von Ticker Berkin <rwb-mkgmap at jagit.co.uk> > Gesendet: Mittwoch, 12. Mai 2021 10:48 > An: Development list for mkgmap > Betreff: Re: [mkgmap-dev] MDR building out-of-memory > > Hi Gerd > > Certainly no cache. Maybe reduce the chunk size, but this might > increase copying. > > It could be improved by doing a linear chunk split/sort then a multi > -way merge. This would avoid lots of copying assuming the following: > > Using the original array to store sorted chunks demands that another > array of the same full size is needed for the final merge. If each > sorted key chunk is converted to a object chunk and these merged, then > although the same total size is needed, it is made of number of smaller > arrays. > > The most space efficient solution might be have Mdr11 "implements > Comparable" and generate pairs of sortkeys on the fly and let the java > sort take care of all the details. > > The other use of LargeListSorter is Mdr7. I get a higher hit-rate > (~50%) for the first/partialSorter (1046096 allstreets). > > However, for the repeated fullNameSorter on the partial results, most > of the lists are just 1 long, with very few more than 10. I guess this > depends on --road-name-config/split-name and use of shields etc, but > LargeListSorted seems overkill. > > Ticker > > > On Tue, 2021-05-11 at 19:15 +0000, Gerd Petermann wrote: > > Hi Ticker, > > > > I think the cache is not meant to improve run time, it is used to > > deduplicate and thus reduce memory. Maybe it would be better to use a > > smaller chunk size and no cache. > > No idea why I didn't use > > merged = new ArrayList<>(len); > > > > Gerd > > > > ________________________________________ > > Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag > > von Ticker Berkin <rwb-mkgmap at jagit.co.uk> > > Gesendet: Dienstag, 11. Mai 2021 19:19 > > An: Development list for mkgmap > > Betreff: Re: [mkgmap-dev] MDR building out-of-memory > > > > Hi Gerd > > > > I've looked at this, and if Mdr5 space becomes a problem again, I'll > > consider converting to it. > > > > A couple of comments: > > > > My map had 2909735 poi so the sort chunk size was ~727433. The cache > > sizes after each chunk were 563095, 603595, 597239 & 605718. Is the > > cache worth-while for this low hit-rate? Just running the Gmapsupp > > combiner on existing tiles (without --route, so no streets), I got a > > run time of 1 min 44 secs with the cache and 1 min 30 without! > > However > > most of this time is copying the tiles into gmapsupp.img so not an > > accurate statistic. > > > > You could pre-allocate List<> "merged" with the correct size. > > > > Ticker > > > > On Tue, 2021-05-11 at 14:40 +0000, Gerd Petermann wrote: > > > Hi Ticker, > > > > > > I've committed the patch as is. I've not seen big changes in > > > performance, but I've used a different (already existing) set of > > > files which was created with my own style. For me, > > > Mdr11.preWriteImpl() is the most problematic part reg. OOM errors. > > > > > > Maybe look at the code which uses LargeListSorter. > > > > > > Gerd > > > > > > ________________________________________ > > > Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag > > > von Ticker Berkin <rwb-mkgmap at jagit.co.uk> > > > Gesendet: Dienstag, 11. Mai 2021 13:27 > > > An: Development list for mkgmap > > > Betreff: Re: [mkgmap-dev] MDR building out-of-memory > > > > > > Hi Gerd > > > > > > Here is updated version of patch. > > > > > > Changes from last: > > > > > > Uses your cache code for region and country (in 2 places). For > > > British > > > Isles, there are 190 regions and 7 countries, so I don't think the > > > extra memory will be a problem and there should be some performance > > > benefit. > > > > > > Delays allocating cities until it can use sortKeys.size() for > > > initial > > > allocation. For above map this is 0.07% too big, so I don't think > > > trimToSize() is worthwhile. > > > > > > Shares the Sort object between the 4 methods. > > > > > > Ticker > > > _______________________________________________ > > > mkgmap-dev mailing list > > > mkgmap-dev at lists.mkgmap.org.uk > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > _______________________________________________ > > mkgmap-dev mailing list > > mkgmap-dev at lists.mkgmap.org.uk > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > _______________________________________________ > > mkgmap-dev mailing list > > mkgmap-dev at lists.mkgmap.org.uk > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev at lists.mkgmap.org.uk > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev at lists.mkgmap.org.uk > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
- Previous message: [mkgmap-dev] MDR building out-of-memory
- Next message: [mkgmap-dev] Commit r4702: show stacktrace when OutOfMemoryError occurs, helps to decide if maxjobs should be decreased or if one of the combiners is the problem
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the mkgmap-dev mailing list