[mkgmap-dev] [Patch v1] reduce peak memory usage for global index

Sat Apr 1 05:36:35 BST 2017

Hi Steve,

thanks. Reg. cache:
I'll check again with a larger set of input tiles. Probably you are right, the more tiles the smaller the percentage of single use keys.
I'm now creating a complete map of Europe for further tests.

Gerd
________________________________________
Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag von Steve Ratcliffe <steve at parabola.me.uk>
Gesendet: Freitag, 31. März 2017 22:42:14
An: mkgmap-dev at lists.mkgmap.org.uk
Betreff: Re: [mkgmap-dev] [Patch v1] reduce peak memory usage for global        index

Hi Gerd,

> 1) always release memory for mdr19, not only if it was filled for the device.
> 2) check if sort key has zero length, if yes, don't allocate new buffer
> 3) don't use MultiSortKey for mdr7 if --x-split-name-index is not used
> 4) count occurences of the generated key strings, cache sort keys for those keys which occur more than 3 times.
> 5) create smaller copy of byte array if number of allocated bytes was too high
>
> @Steve: Please review, I think 1-4 are okay, 5) may cause problems if my understanding of Sort.fillkey() is wrong.

Looks OK.

I agree to add cache in Mdr7, don't know why I omitted it
to begin with.

But does the complex counted cache improve memory use over the simple
case?  As the cache is temporary, after the routine returns there will
be a greater memory use from all the keys that are not de-duplicated.
It seems to me that the temporary memory required during
preWriteImpl() would also be larger unless there are a vast number of
single use keys.

> I was also surprised to see that we require such long sort keys, even with --latin1.
> I see many 0 bytes in the created keys, maybe that can be optimized further?

Its possible, see for example:

   http://www.unicode.org/reports/tr10/#Implementation_Notes

As I was working out how the Srt files worked, I gradually came to
realise that it was pretty much the same way that Java does collation.
I changed the syntax of the resource/sort/*.txt files to more closely
match the language that is used by RuleBasedCollator().  There is also
the icu4j project http://site.icu-project.org/

I can't remember how close either of them were to working as needed
for mkgmap. In the end my code was faster, so I stayed with it.
Its possible that the built in java code could be made to work and
it might be faster now, or perhaps I missed a trick to make it so
at the time.

..Steve

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev at lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev