[mkgmap-dev] Splitter --cache parameter
From Chris Miller chris.miller at kbcfp.com on Sun Aug 23 17:49:57 BST 2009
I've just checked in some changes to the splitter that add a new --cache parameter. This is designed to speed up the splitting process, especially on large splits that require multiple passes over the .osm file, or in situations where you run the splitter several times on the same .osm file with different parameters each time.

If you want to enable the disk caching, you specify the cache location as follows:

  --cache=<directory>

This will cause the splitter to generate several files in the specified directory during the first stage of the split (the areas.list calculation). These files contain the same information as the source .osm file(s), but in an optimised format that allows subsequent passes over the data to happen much more quickly. The more passes that happen in the second stage of the split, the greater the speedup you will see. Some benchmarks on my PC have shown the following speed improvements when running against uncompressed .osm files:

  1 pass   -  5% faster
  2 passes - 25% faster
  3 passes - 35% faster
  4 passes - 40% faster
  5 passes - 45% faster

If you are using compressed .osm files (bz2 compression especially), the speed improvement should be greater still, since the decompression will only need to happen once rather than on each pass. Note however that these figures are very approximate; the actual performance will vary depending on your disk and CPU speed, the particular map being processed, and what other disk and CPU activity is taking place on your PC at the same time. In some cases you might find that splits that only require a single pass will run faster without the disk cache enabled.

The disk cache can also be used across multiple runs of the splitter, as long as you are splitting the same .osm file(s) each time. For example, suppose you ran the splitter as follows:

  java -Xmx4000m -jar splitter.jar --cache=. --max-nodes=1500000 europe.osm

If you then run mkgmap and discover the max-nodes setting is too high, you can run the splitter again with a lower max-nodes value like so:

  java -Xmx4000m -jar splitter.jar --cache=. --max-nodes=1200000

Because the cache files already exist for europe.osm as a result of the first run, there's no need to specify europe.osm on the rerun. The data will be loaded from the cache instead and the split will run much faster.

Be careful to delete the cache files if you want to rerun the splitter on a different .osm file, otherwise the previously cached data from the original .osm file will be used instead. (I'll probably add a check for this situation, but there's nothing in place to prevent it just yet.)

Note that the disk cache can require a lot of disk space, typically about 20-25% of the space the uncompressed .osm file takes up. For example, the 27GB europe.osm file generates a cache of just over 5GB.

The --cache parameter is entirely optional. If you don't specify it, the splitter will work in exactly the same way it did previously.

I hope the above explanation makes sense. Any questions, comments or suggestions are welcome.

Cheers,
Chris
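One way to sidestep the stale-cache problem described above is to give each input file its own cache directory. The sketch below is only an illustration of that idea, not part of the splitter: the `splitter_cmd` helper and the `splitter-cache/` directory name are hypothetical, and the only splitter options used are the ones shown in this message (`--cache` and `--max-nodes`). It prints the command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the splitter): build a splitter command
# line that uses a separate cache directory per input data set, so that
# cached data from one .osm file is never picked up for another.
#
# Usage: splitter_cmd <cache-name> <max-nodes> [input.osm]
# Leave out the input file on a rerun so the data comes from the cache.
splitter_cmd() {
    name=$1
    max_nodes=$2
    input=${3:-}

    # Options taken from the examples in this message; the directory
    # layout "splitter-cache/<name>" is an assumption for illustration.
    set -- java -Xmx4000m -jar splitter.jar \
        "--cache=splitter-cache/$name" "--max-nodes=$max_nodes"

    # Append the input file only when one was given (first run).
    if [ -n "$input" ]; then
        set -- "$@" "$input"
    fi

    # Print the command instead of executing it.
    echo "$*"
}

# First run: parse europe.osm and populate its cache directory.
splitter_cmd europe.osm 1500000 europe.osm
# Rerun with a lower max-nodes value, reading from the same cache.
splitter_cmd europe.osm 1200000
```

To actually run a command, the printed line can be pasted into a shell. Whether the splitter creates a missing cache directory itself is not covered here, so creating it beforehand with `mkdir -p` is probably the safer choice.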