[mkgmap-dev] Increasing the performance of whole-planet splits.
From: Scott Crosby <scrosby06 at gmail.com>
Date: Fri Jun 11 00:13:42 BST 2010
> It'll be a few days before I have time to look at this in much detail
> and give you my thoughts but from what I can see it sounds promising.
> We actually use protobufs at my job for shunting around vast amounts
> of data so I'm pretty familiar with the pros and cons of them. One
> concern is that even if your binary format becomes reasonably
> standardised in the OSM world we'll still need to maintain .osm
> support for the foreseeable future. Perhaps that can be tackled by
> swapping out the existing cache format with your protobuf format.

I envisioned it as an alternate format that could exist in parallel
with the .osm format, but also be smaller and faster. I still need to
clean up the osmosis patches and turn them into a plugin.

> Even that I'm not sure about in the longer term; tackling some of the
> harder problems with the splitter will require more than just a
> linear protobuf or cache format. Data structures such as B-tree
> indexes and so on will likely be required, though I'm sure we can
> cross that bridge when we come to it.

What harder problems do you think the splitter needs solved?

> I'm not so sure about your lazy HashMap initialisation in
> Element.java, though it does highlight a bug that was introduced by
> the recent r109 multithreaded patch! Prior to patch r109, Element
> objects were never recreated; rather, .reset() was called to just
> reset their state for each new node/way/rel. The reason for this was
> that the costs of initialization were showing up in the profile. r109
> changed that to create new instances each time, hence some of the
> object churn you were probably seeing that led you to your patch.
> I'll have a think about what to do here and probably run a few
> benchmarks to see what works best (e.g. keeping one current
> Node/Way/Rel per thread might be best).

May I suggest holding off on this for a bit? The branch that adds
support for reading the binary file format refactors this interface so
that node/way/rel objects are passed directly instead of through an
interface that looks like XML callbacks.

> [As an aside, note that in your lazy HashMap patch
> Collections.EMPTY_SET results in an unchecked assignment warning.
> That of course is no big deal, easily fixed with
> Collections.<String, String>emptyMap().entrySet(). The sting though
> is that the call to Collections.EMPTY_SET.iterator() actually creates
> a new iterator object each time(!). Better would be to create a
> single static immutable iterator and return that.]

Good catch. Should I respin the patch or will you? A sketch of the
shared iterator I have in mind follows below.

> > My multithread changes reduced the real and user CPU time by 30-50%
> > on a UK extract. Other tests, e.g., a whole planet, show almost no
> > effect. I think this is because a lot of my changes increase the
> > efficiency of code that is parallelized, which paradoxically ends
> > up reducing the relative benefits of parallelization.
>
> This sounds promising, I'll look at what you've done further in the
> next couple of days and hopefully get it committed this weekend.

> > The single biggest remaining slowdown is that for each point, the
> > splitter sends each Node to each area (represented by an OSMWriter),
> > which writes it in the output file if it is in the bounds of the
> > region (with overlap), or does nothing otherwise. On a whole planet,
> > with 840 areas and 550M nodes, that is over 450 billion
> > bounds-checks that are not run in parallel. An index that can map a
> > node to the regions that may contain it could avoid 99% of these
> > checks.
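[Returning to the empty-iterator aside: here is roughly what I would
respin the patch with. This is a minimal sketch only; the class name
and the null-tags convention are mine, not from the splitter source.

    import java.util.Iterator;
    import java.util.Map;
    import java.util.NoSuchElementException;

    // One immutable, stateless empty iterator shared by every Element
    // that has no tags, so iterating an untagged node/way/rel
    // allocates nothing. (Hypothetical sketch; the name is not from
    // the splitter source.)
    final class EmptyTagIterator implements Iterator<Map.Entry<String, String>> {
        static final Iterator<Map.Entry<String, String>> INSTANCE =
                new EmptyTagIterator();

        private EmptyTagIterator() { }

        public boolean hasNext() { return false; }

        public Map.Entry<String, String> next() {
            throw new NoSuchElementException();
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

The lazy accessor would then return EmptyTagIterator.INSTANCE while
the tag map is still null, and tags.entrySet().iterator() otherwise,
so untagged elements never trigger either a HashMap or an iterator
allocation.]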
> This is something I've been aware of for a long time but haven't
> found the time to tackle yet. One idea I have is to generate an
> r-tree in memory and run the node comparisons against that to find
> out what area(s) each node falls into.

JOSM already includes a quadtree implementation. I wish it could be
disentangled and re-used across the different OSM applications. It'd
be a perfect solution here.

> I'd estimate it would reduce the number of comparisons required from
> 840/node to less than 10. I suspect that would be a win over
> generating a node-to-area index, which would be expensive to create
> initially, plus expensive in terms of disk access (or alternatively,
> require lots of RAM). Also, with a bit of effort these r-tree
> comparisons could be parallelised too.

Perfection isn't necessary. How about a 64x64 array? Divide the world
into 4096 regions. Each region contains a list of all areas (including
their overlap) that intersect it. Then, for each node, you only need
to iterate over the OSMWriters in that region. Getting the round-offs
right would be tricky, but it should be good enough to filter out
almost all non-matching areas. A rough sketch follows below, after my
sig.

> I'll follow up in a while once I've had a bit more time to
> investigate and respond. Many many thanks for your work on all this,
> very much appreciated!
>
> Chris

Thanks for a useful program. Glad I could help make it better.

Scott
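P.S. To make the 64x64 grid concrete, here is a rough sketch. The
BoundedWriter interface is a hypothetical stand-in for however
OSMWriter exposes its overlap-extended bounds, and I assume plain
degree coordinates; treat it as the shape of the idea, not working
splitter code.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical minimal view of an output area: its bounds,
    // including the overlap, in degrees.
    interface BoundedWriter {
        double getMinLat();
        double getMaxLat();
        double getMinLon();
        double getMaxLon();
    }

    // A 64x64 coarse grid over the whole world. Each cell lists the
    // writers whose extended bounds touch that cell, so a node only
    // needs exact bounds checks against the writers in its own cell.
    final class AreaGrid<W extends BoundedWriter> {
        private static final int N = 64;
        private final List<W>[][] cells;

        @SuppressWarnings("unchecked")
        AreaGrid(List<W> writers) {
            cells = new List[N][N];
            for (int x = 0; x < N; x++)
                for (int y = 0; y < N; y++)
                    cells[x][y] = new ArrayList<W>();
            for (W w : writers) {
                // Register the writer in every cell its bounds touch,
                // inclusive at both ends, so a borderline node never
                // loses a candidate writer to round-off.
                int x0 = lonCell(w.getMinLon()), x1 = lonCell(w.getMaxLon());
                int y0 = latCell(w.getMinLat()), y1 = latCell(w.getMaxLat());
                for (int x = x0; x <= x1; x++)
                    for (int y = y0; y <= y1; y++)
                        cells[x][y].add(w);
            }
        }

        // The only writers that can possibly contain this node.
        List<W> candidates(double lat, double lon) {
            return cells[lonCell(lon)][latCell(lat)];
        }

        private static int lonCell(double lon) {
            return clamp((int) Math.floor((lon + 180.0) / 360.0 * N));
        }

        private static int latCell(double lat) {
            return clamp((int) Math.floor((lat + 90.0) / 180.0 * N));
        }

        private static int clamp(int c) {
            return Math.max(0, Math.min(N - 1, c));
        }
    }

The per-node loop would then iterate grid.candidates(lat, lon),
typically a handful of writers, instead of all 840; the exact bounds
check each writer already performs stays as the final filter.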