[mkgmap-dev] Increasing the performance of whole-planet splits.
From: Scott Crosby <scrosby06 at gmail.com>
Date: Fri Jun 11 00:13:42 BST 2010
> It'll be a few days before I have time to look at this in much detail
> and give you my thoughts but from what I can see it sounds promising.
> We actually use protobufs at my job for shunting around vast amounts
> of data so I'm pretty familiar with the pros and cons of them. One
> concern is that even if your binary format becomes reasonably
> standardised in the OSM world we'll still need to maintain .osm
> support for the foreseeable future. Perhaps that can be tackled by
> swapping out the existing cache format with your protobuf format.

I envisioned it as an alternate format that could exist in parallel
with the .osm format, but also be smaller and faster. I still need to
clean up the osmosis patches and turn them into a plugin.

> Even that I'm not sure about in the longer term; tackling some of the
> harder problems with the splitter will require more than just a
> linear protobuf or cache format. Data structures such as B-tree
> indexes and so on will likely be required, though I'm sure we can
> cross that bridge when we come to it.

What harder problems do you think the splitter needs solved?

> I'm not so sure about your lazy HashMap initialisation in
> Element.java, though it does highlight a bug that was introduced by
> the recent r109 multithreaded patch! Prior to patch r109, Element
> objects were never recreated; rather, .reset() was called to just
> reset their state for each new node/way/rel. The reason for this was
> that the costs of initialization were showing up in the profile. r109
> changed that to create new instances each time, hence some of the
> object churn you were probably seeing that led you to your patch.
> I'll have a think about what to do here and probably run a few
> benchmarks to see what works best (e.g. keeping one current
> Node/Way/Rel per thread might be best).

May I suggest holding off on this for a bit? The branch that adds
support for reading the binary file format refactors this interface so
that node/way/rel objects are passed directly instead of through an
interface that looks like XML callbacks.

> [As an aside, note that in your lazy HashMap patch
> Collections.EMPTY_SET results in an unchecked assignment warning.
> That of course is no big deal, easily fixed with
> Collections.<String, String>emptyMap().entrySet(). The sting though
> is that the call to Collections.EMPTY_SET.iterator() actually creates
> a new iterator object each time(!). Better would be to create a
> single static immutable iterator and return that.]

Good catch. Should I respin the patch or will you? A sketch of the
shared iterator I have in mind follows below.

> > My multithread changes reduced the real and user CPU time by 30-50%
> > on a UK extract. Other tests, e.g., a whole planet, show almost no
> > effect. I think this is because a lot of my changes increase the
> > efficiency of code that is parallelized, which paradoxically ends
> > up reducing the relative benefits of parallelization.
>
> This sounds promising, I'll look at what you've done further in the
> next couple of days and hopefully get it committed this weekend.

> > The single biggest remaining slowdown is that for each point, the
> > splitter sends each Node to each area (represented by an OSMWriter),
> > which writes it in the output file if it is in the bounds of the
> > region (with overlap), or does nothing otherwise. On a whole planet,
> > with 840 areas and 550M nodes, that is over 450 billion
> > bounds-checks that are not run in parallel. An index that can map a
> > node to the regions that may contain it could avoid 99% of these
> > checks.
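[Returning to the empty-iterator aside: here is roughly what I would
respin the patch with. This is a minimal sketch only; the class name
and the null-tags convention are mine, not from the splitter source.

    import java.util.Iterator;
    import java.util.Map;
    import java.util.NoSuchElementException;

    // One immutable, stateless empty iterator shared by every Element
    // that has no tags, so iterating an untagged node/way/rel
    // allocates nothing. (Hypothetical sketch; the name is not from
    // the splitter source.)
    final class EmptyTagIterator implements Iterator<Map.Entry<String, String>> {
        static final Iterator<Map.Entry<String, String>> INSTANCE =
                new EmptyTagIterator();

        private EmptyTagIterator() { }

        public boolean hasNext() { return false; }

        public Map.Entry<String, String> next() {
            throw new NoSuchElementException();
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

The lazy accessor would then return EmptyTagIterator.INSTANCE while
the tag map is still null, and tags.entrySet().iterator() otherwise,
so untagged elements never trigger either a HashMap or an iterator
allocation.]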
> This is something I've been aware of for a long time but haven't
> found the time to tackle yet. One idea I have is to generate an
> r-tree in memory and run the node comparisons against that to find
> out what area(s) each node falls into.

JOSM already includes a quadtree implementation. I wish it could be
disentangled and re-used across the different OSM applications. It'd
be a perfect solution here.

> I'd estimate it would reduce the number of comparisons required from
> 840/node to less than 10. I suspect that would be a win over
> generating a node-to-area index, which would be expensive to create
> initially, plus expensive in terms of disk access (or alternatively,
> require lots of RAM). Also, with a bit of effort these r-tree
> comparisons could be parallelised too.

Perfection isn't necessary. How about a 64x64 array? Divide the world
into 4096 regions. Each region contains a list of all areas (including
their overlap) that intersect it. Then, for each node, you only need
to iterate over the OSMWriters in that region. Getting the round-offs
right would be tricky, but it should be good enough to filter out
almost all non-matching areas. A rough sketch follows below, after my
sig.

> I'll follow up in a while once I've had a bit more time to
> investigate and respond. Many many thanks for your work on all this,
> very much appreciated!
>
> Chris

Thanks for a useful program. Glad I could help make it better.

Scott
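P.S. To make the 64x64 grid concrete, here is a rough sketch. The
BoundedWriter interface is a hypothetical stand-in for however
OSMWriter exposes its overlap-extended bounds, and I assume plain
degree coordinates; treat it as the shape of the idea, not working
splitter code.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical minimal view of an output area: its bounds,
    // including the overlap, in degrees.
    interface BoundedWriter {
        double getMinLat();
        double getMaxLat();
        double getMinLon();
        double getMaxLon();
    }

    // A 64x64 coarse grid over the whole world. Each cell lists the
    // writers whose extended bounds touch that cell, so a node only
    // needs exact bounds checks against the writers in its own cell.
    final class AreaGrid<W extends BoundedWriter> {
        private static final int N = 64;
        private final List<W>[][] cells;

        @SuppressWarnings("unchecked")
        AreaGrid(List<W> writers) {
            cells = new List[N][N];
            for (int x = 0; x < N; x++)
                for (int y = 0; y < N; y++)
                    cells[x][y] = new ArrayList<W>();
            for (W w : writers) {
                // Register the writer in every cell its bounds touch,
                // inclusive at both ends, so a borderline node never
                // loses a candidate writer to round-off.
                int x0 = lonCell(w.getMinLon()), x1 = lonCell(w.getMaxLon());
                int y0 = latCell(w.getMinLat()), y1 = latCell(w.getMaxLat());
                for (int x = x0; x <= x1; x++)
                    for (int y = y0; y <= y1; y++)
                        cells[x][y].add(w);
            }
        }

        // The only writers that can possibly contain this node.
        List<W> candidates(double lat, double lon) {
            return cells[lonCell(lon)][latCell(lat)];
        }

        private static int lonCell(double lon) {
            return clamp((int) Math.floor((lon + 180.0) / 360.0 * N));
        }

        private static int latCell(double lat) {
            return clamp((int) Math.floor((lat + 90.0) / 180.0 * N));
        }

        private static int clamp(int c) {
            return Math.max(0, Math.min(N - 1, c));
        }
    }

The per-node loop would then iterate grid.candidates(lat, lon),
typically a handful of writers, instead of all 840; the exact bounds
check each writer already performs stays as the final filter.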