[mkgmap-dev] splitter r254
From Gerd Petermann gpetermann_muenchen at hotmail.com on Mon Dec 10 11:12:14 GMT 2012
Hello Klaus,

here is a comparison to the trunk version r202 (I hope it is complete without being too complex). It would be great if someone else could put this into a more readable format once the changes are in the trunk version.

Corrections:
- Prevent overflow in node counters, reported here: http://gis.19327.n5.nabble.com/Bug-in-splitter-tp5610856.html
- Missing data because of rounded/trimmed bounding boxes, reported here: http://gis.19327.n5.nabble.com/mkgmap-splitter-or-mkgmap-leave-out-information-on-luxembourg-osm-pbf-from-geofabrik-tp5731208.html

Added debugging features:
+ Parameter stop-after allows you to stop execution after a given program phase has been executed. This saves time when debugging / testing the new split algorithm (see below).
+ Parameter output=simulate allows you to simulate the whole split process without writing data to tiles. I use this to avoid writing masses of data to my SSD.
+ If no split-file is given, splitter will write a file densities-out.txt containing the density data that was used to calculate the tile areas. When debugging, you can rename this file to densities.txt and place it in the same directory as splitter.jar. If splitter finds such a file, it will read the content of the file instead of parsing the input file. This saves a lot of time (the densities.txt for the whole planet is just ~47 MB).

Other new features (less important first):
+ In addition to the areas.list file, splitter writes an area.poly in the osmosis polygon file format.
+ o5m format supported for reading and writing: For input, the file name has to end with .o5m; for output you have to specify the parameter --output=o5m. The o5m format requires more disk space but is faster to read. This is especially true on slower CPUs.
+ polygon file handling: With parameter --polygon-file you can pass a bounding polygon to splitter.
  This is probably only useful when you want to use an input file that contains much more data than the map that you want to create; for example, you may create a polygon file covering Scandinavia and use Europe or planet as input. The polygon file is only used when splitter has to calculate the areas (no --split-file parameter given), and it is only used to calculate the areas. With a given polygon file, a special split algorithm is used which tries to create tiles that cover the bounding polygon completely, but do not extend too far outside of the polygon. The parameter no-trim is ignored if --polygon-file is used.
+ A new split algorithm was implemented to address two problems:
  ++ r202 may create tiles with only a few nodes; this leads to serious problems described here: http://gis.19327.n5.nabble.com/Serious-Bug-Mkgmap-creating-map-that-puts-news-GPS-confirmed-etrex-30-Oregon-550-into-bootloop-tp5508055p5512646.html
  ++ r202 with no-trim=true may create huge, almost empty tiles; this leads to problems in mkgmap. Details were described here: http://www.mkgmap.org.uk/pipermail/mkgmap-dev/2012q1/013611.html
  The new algorithm tries to optimize the created tiles so that
  - the number of tiles is small
  - the aspect ratio is near 1 (values between 0.25 and 4 are considered to be good)
  - no tile contains fewer than max-nodes/3 nodes
  - no tile is larger than 90° in longitude and 85° in latitude
  It is not always possible to find a split that meets all these goals, especially not if you provide a bounding polygon. A few users reported problems in mkgmap with the results of the new algorithm (higher memory needs, smaller max-jobs parameter needed). It is not yet clear whether these problems are to be solved in splitter or in mkgmap.
+ problem-list handling: Two new approaches have been implemented to solve the frequently reported problem of flooding: http://gis.19327.n5.nabble.com/Still-problems-with-lakes-tp5725668.html
  These problems are caused by the split process.
  Splitter r202 simply divides multipolygon relations into parts that lie within one tile. Later, mkgmap has to guess how the original polygon was closed. This guessing fails from time to time. The solution in r202 is to specify a large enough overlap value.

  Approach 1) The new parameter --problem-file allows you to specify a list of known problem relations and ways. A list containing many problem cases can be found here: http://wiki.openstreetmap.org/wiki/Mkgmap/help/problematic_polygons
  To use such a file you have to specify --problem-file=<path to file>
  A way or relation listed in this file is treated specially by splitter:
  - ways:
    ++ if the way is closed (first and last node reference are equal), splitter calculates the bounding box of the way and writes the complete way to each tile that intersects with the bounding box (complete means with all referenced nodes that were found in the input file)
    ++ if the way is not closed, splitter calculates the tiles that are crossed by the way and writes the complete data to those tiles
  - relations: A relation is completely written to all tiles that
    - contain one or more nodes listed as members of the relation
    - contain one or more nodes listed as members of the ways of the relation
    - are crossed by one or more ways of the relation
    - are enclosed by one or more ways of a type=multipolygon relation
  Note that a relation with type=multipolygon is treated similarly to a single closed way: splitter calculates the bounding box of each area enclosed by one or more ways building a closed polygon. The complete relation is written to each tile that intersects with any of the calculated bounding boxes.
  The problem file still has some disadvantages:
  - it is not up to date unless someone maintains it
  - the user has to verify the resulting map; if something is wrong, they have to find the id of the way or relation that causes the problem, add it to the problem file and restart the whole process of map creation.
    This can be very time consuming and it is still likely that the user will not find all broken polygons.

  The solution is Approach 2) the parameter --keep-complete, which should be used instead of --problem-file. With keep-complete, splitter reads the input file multiple times to detect those polygons that are divided during the split process. Splitter thus creates the list of problem cases itself and handles them exactly the same way as described above.
  Advantage of --keep-complete compared to --problem-file:
  - no need to maintain a list of problem cases
  Advantages of --keep-complete and --problem-file compared to a large --overlap value:
  - makes sure that problem polygons are complete (of course only if the input file is complete)
  - doesn't write a lot of "noise" like houses or roads which are in the overlap area, but not at all related to the bounding box of the tile
  Drawbacks of --keep-complete compared to --problem-file:
  - Splitter is slower because it has to read the input file more often, and the processing of all problem ways and relations requires additional memory on the heap. On a 32-bit system, it is not possible to split the whole planet with --keep-complete, because you need around 4 GB of heap to process all the problem cases. On the other hand, on a 64-bit system with at least 8 GB you can split planet using e.g.
      java -Xmx7000m -jar splitter.jar --max-areas=2048 --keep-complete --output=xml planet.o5m
    (note that I use output=xml because both the o5m and pbf writers require too much heap to write ~1500 areas in one pass. Each open *.o5m or *.pbf file requires more than 1 MB for string tables and other stuff; the xml writer needs almost no fixed storage)
  - For some tiles, unneeded data is written if they lie within the bounding box of a huge multipolygon relation, but not within any of the polygons described by the relation.
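To illustrate the bounding-box rule for closed ways on the problem list, and why it can write unneeded data to some tiles, here is a minimal Python sketch. It is not splitter's actual code: the names Box, bounding_box and tiles_for_closed_way are hypothetical, coordinates are plain floating-point degrees, and counting tiles that merely touch the bounding box as intersecting is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (lat, lon) in degrees; an assumption for this sketch


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle; hypothetical stand-in for a tile or bounding box."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def intersects(self, other: "Box") -> bool:
        # Assumption: boxes that only share an edge count as intersecting.
        return (self.min_lat <= other.max_lat and other.min_lat <= self.max_lat
                and self.min_lon <= other.max_lon and other.min_lon <= self.max_lon)


def bounding_box(points: Sequence[Point]) -> Box:
    """Smallest box that contains all nodes of the way."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return Box(min(lats), min(lons), max(lats), max(lons))


def tiles_for_closed_way(points: Sequence[Point], tiles: Sequence[Box]) -> List[int]:
    """Indexes of all tiles that would receive the complete way."""
    bbox = bounding_box(points)
    return [i for i, tile in enumerate(tiles) if tile.intersects(bbox)]


# A 2x2 grid of tiles, each 1 degree wide and high.
tiles = [Box(0, 0, 1, 1), Box(0, 1, 1, 2), Box(1, 0, 2, 1), Box(1, 1, 2, 2)]

# An L-shaped closed way: it never enters tile 3, but its bounding box does.
l_shape = [(0.2, 0.2), (0.2, 1.8), (0.8, 1.8), (0.8, 0.8),
           (1.8, 0.8), (1.8, 0.2), (0.2, 0.2)]

print(tiles_for_closed_way(l_shape, tiles))  # [0, 1, 2, 3]
```

Tile 3 is selected although the L-shaped polygon never reaches it, which is the small-scale equivalent of the unneeded data mentioned in the last drawback.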

Gerd

> Date: Sun, 9 Dec 2012 12:49:31 -0800
> From: easyclasspage at googlemail.com
> To: mkgmap-dev at lists.mkgmap.org.uk
> Subject: Re: [mkgmap-dev] splitter r254
>
> Hi Gerd,
>
> release candidate ... sounds good after all the hard work.
>
> Is it possible for you to write a (short) summary concerning all changes and
> enhancements?
> (I have to admit that I got lost in "all" the splitter threads.)
>
> Regards Klaus