[mkgmap-dev] SeaGenerator patch v1
From Gerd Petermann gpetermann_muenchen at hotmail.com on Fri Jan 11 21:00:37 GMT 2013
Hi, attached is seaGen_v2.patch > > the index for the precomp-sea method stores 131072 * 2 = 262.144 Strings > > with String.intern() in a HashMap (one instance for each job). > > That's a bit over estimated because the Strings are interned. Therefore > 131072+<number of sea tiles>+2= ~143000 Strings are used (over all > jobs). Of course the HashMap remains for each job with 131072 entries. Yes, that's right. I forgot that only for mixed tiles a file name is saved. > > > Attached is a patch that changes the index to use a byte[512][256] array > > > 131072 bytes. > > > > It reduces run time and memory needs in mkgmap (when --precomp-sea is used) > > Why does it reduce the run time? Did you measure it? See also below. think the most important point regarding runtime is that it avoids String.intern() calls. This has two effects: 1) ~ 143000 calls take some time 2) the interned strings make it more likely the further String.intern() calls have collisions in the HashMap that is used by intern(). That means that the storing of the tags is slowed down a bit. > Can you please give numbers. How much memory is saved? I don't want to > commit optimizations that change the readability of code without having > a good reason. I ran mkgmap with 20 tiles from germany (max-nodes 1600000) using the parms and style from toc-rox "Freizeitkarte". java -Xmx7000m -agentlib:hprof=cpu=samples,depth=10 -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -jar d:\mkgmap_trunk\dist\mkgmap.jar --max-jobs=4 -c f:\osm\klaus\t.cfg -c template.args > m The 1st version is r2443 , the 2nd is 2443 with the OSMId patch + the reduce_peak_mem_v1.patch (SeaGeneraor excluded) the 3rd is like the 2nd plus the attached seaGen_v2.patch Numbers: (1st,2nd,3rd) Total time 149 s / 142 s / 135s Garbagecat reported these: 1st: ======================================== SUMMARY: ======================================== # GC Events: 176 GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD Max Heap Space: 4224448K Max Heap Occupancy: 3490537K Max Perm Space: 82688K Max Perm Occupancy: 44234K Throughput: 56% Max Pause: 3279 ms Total Pause: 65358 ms First Timestamp: 343 ms Last Timestamp: 147189 ms 2nd: # GC Events: 185 GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD Max Heap Space: 3681024K Max Heap Occupancy: 3190176K Max Perm Space: 83968K Max Perm Occupancy: 44876K Throughput: 60% Max Pause: 3163 ms Total Pause: 57316 ms First Timestamp: 320 ms Last Timestamp: 142164 ms 3rd: # GC Events: 168 GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD Max Heap Space: 3807680K Max Heap Occupancy: 3156373K Max Perm Space: 34176K Max Perm Occupancy: 31855K Throughput: 60% Max Pause: 2622 ms Total Pause: 54084 ms First Timestamp: 318 ms Last Timestamp: 135052 ms > > > > > One problem: The unpatched version would allow to have different formats > > (*.o5m, *.osm.pbf, .osm.gz) in one sea directory. > > The patched version assumes that all files have the same prefix and > > extension. > > So, if you want to update a precompiled sea tile you have to make sure that > > you keep the name and format. > > I hope this small limit is no problem? > > That's ok. An error should be printed if that's not the case. OK. I've added a check to detect that. > > Please rework the patch a bit. > 1. Instead of instantiating an Area object to get minLat etc. you can > use the Utils.toMapUnit(double) method > 2. Document what's in the byte array. Constants improve readability. > 3. Maybe create an accessor method to the byte array so you avoid to > blow up the other code with index calculations. > 4. Remove unused variables > 5. Instead of x and y use e.g. latIndex and lonIndex as variable names > so that it is easy to read the code. Good points. I did not find unused variables, but tried to do the rest. > > By the way: using the ThreadLocal requires loading the index in each > thread which is not optimal. But at the time implementing it I didn't > want to bother about synchronizing the load procedure. This might be a > point where some run time can be saved. Regarding memory there is no more reason to bother. Regarding performance it would be better to use a binary format to avoid all the pattern.split() calls, but the profiler shows that the patched version requires only ~0.2 % of the total run time, so I think it's okay now. Ciao, Gerd -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130111/be42db45/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: seaGen_v2.patch Type: application/x-download Size: 9537 bytes Desc: not available Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130111/be42db45/attachment-0001.bin
- Previous message: [mkgmap-dev] SeaGenerator patch v1
- Next message: [mkgmap-dev] SeaGenerator patch v1
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the mkgmap-dev mailing list