<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">Hi Gerd,</span></span></p>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">First of all, many thanks for your work. I usually just read your posts and enjoy using mkgmap.</span></span></p>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">On this topic I will try to explain my thoughts, though I may not have understood all of these options.</span></span></p>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">- the original name should always be in the index, because that is what everybody knows and expects to find<br/>
- with --x-mdr7-excl=X we should avoid the entry "X" in the index</span></span></p>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">for example:</span></span></p>
<pre wrap=""><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">Using --x-mdr7-excl=Road,Street,Chemin,des,de in combination with --x-split-name-index</span></span></pre>
<span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">should put the following into the index:<br/>
name="ABC Straße" - in index: "ABC Straße", but not "Straße"<br/>
name="Straße des 17. Juni" - in index: "Straße des 17. Juni", "des 17. Juni", "17. Juni", "Juni"<br/>
name="Chemin de Pierre Froide" - in index: "Chemin de Pierre Froide", "de Pierre Froide", "Pierre Froide", "Froide"</span></span>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">We should think like the user of the map. The user will not necessarily know that street names are split; they just want to find the name, and they know it or at least a part of it. I agree that nobody will search for "des 17. Juni"; everybody will search for the complete name "Straße des 17. Juni" or for "17. Juni". But it is nearly impossible to describe which parts we need and which we don't.</span> </span></p>
<div>
<div>
<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><span style="font-family:verdana,geneva,sans-serif;">Another option would be to allow regular expressions in the exclude list, but that would require more</span>
<div name="quoted-content"><span style="font-family:verdana,geneva,sans-serif;">effort (input file instead of single option) and probably much more run time.</span></div>
</div>
</div>
</div>
<div><span style="font-family:verdana,geneva,sans-serif;">I wouldn't do that.</span></div>
<div> </div>
<div>
<div>
<div>
<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div name="quoted-content"><span style="font-family:verdana,geneva,sans-serif;">I'd prefer to have a logic which first analyses all strings added by the --x-split-name-index option so that<br/>
only those are generated which do not appear more than x % .</span></div>
</div>
</div>
</div>
<div><span style="font-family:verdana,geneva,sans-serif;">That idea is good, but I'm not sure it will work. Think about a map of the Alps, for example. Such a map covers several countries with different languages. I haven't checked, but I assume that most of the streets will be in German-speaking countries (Germany, Austria, Switzerland). How should we then compute a frequency value for strings in France? Another example: "Straße" is very common in Germany, but on a map covering the whole of France and a small part of Germany, "Straße" would get only a low value and would therefore be included. So if we did something like that, we would need a value for each country. </span></div>
<div><span style="font-family:verdana,geneva,sans-serif;">And think about the street names that nearly every town in Germany has, like "Hauptstraße". </span></div>
<div><span style="font-family:verdana,geneva,sans-serif;">So, I wouldn't do that.</span></div>
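<div><span style="font-family:verdana,geneva,sans-serif;">To illustrate the concern about a single threshold for a mixed-language map, here is a minimal Python sketch. It is not mkgmap code: the function <code>common_words</code>, the 50 % threshold and the street lists are hypothetical toy values chosen only for illustration, and counting single words per name is a simplifying assumption.</span></div>
<pre>
```python
from collections import Counter


def common_words(names, max_share):
    """Return the words that occur in more than max_share of the names."""
    counts = Counter()
    for name in names:
        # Count each word once per name, even if it is repeated in the name.
        counts.update(set(name.split()))
    return {word for word, n in counts.items() if n / len(names) > max_share}


# Hypothetical toy data, not taken from any real map.
german = ["Berliner Straße", "Münchner Straße", "Kölner Straße", "Am Markt"]
french = [
    "Rue Pasteur", "Rue Victor Hugo", "Rue de la Gare", "Rue du Moulin",
    "Rue Voltaire", "Rue des Lilas", "Rue Carnot", "Rue Gambetta",
    "Rue de la Paix", "Rue Jean Jaurès", "Chemin de Pierre Froide", "Avenue Foch",
]

# One threshold for the whole map: "Straße" is in only 3 of 16 names.
print(common_words(german + french, 0.5))

# Per-country thresholds: "Straße" is in 3 of 4 German names.
print(common_words(german, 0.5))
print(common_words(french, 0.5))
```
</pre>
<div><span style="font-family:verdana,geneva,sans-serif;">With these toy numbers the combined map flags only "Rue", while the German part on its own flags "Straße". That is the reason a per-country value would be needed.</span></div>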
<div> </div>
</div>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">Overall, I would prefer a rule that is easy to understand.</span></span></p>
<p><span style="font-family:verdana,geneva,sans-serif;"><span style="font-size: 12px;">best regards,</span><br/>
<span style="font-size: 12px;">Gert</span></span></p>
<div> </div>
<div>
<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div style="margin:0 0 10px 0;"><b>Sent:</b> Tuesday, 04 April 2017 at 17:13<br/>
<b>From:</b> "Gerd Petermann" &lt;GPetermann_muenchen@hotmail.com&gt;<br/>
<b>To:</b> "mkgmap-dev@lists.mkgmap.org.uk" &lt;mkgmap-dev@lists.mkgmap.org.uk&gt;<br/>
<b>Subject:</b> [mkgmap-dev] Meaning of option --x-mdr7-excl</div>
<div name="quoted-content">Hi all,<br/>
<br/>
I have not documented this option yet because I don't think it is useful as currently implemented.<br/>
I think it works fine for English-speaking countries with road names like "Abc Street" and "Xyz Road".<br/>
Using --x-mdr7-excl=Road,Street in combination with --x-split-name-index will work fine.<br/>
<br/>
A French-speaking country is a different picture.<br/>
Let's look at an example. Assume you have options --index and --x-split-name-index<br/>
The road name "Chemin de Pierre Froide" is added to the index as<br/>
"Chemin de Pierre Froide"<br/>
and because of --x-split-name-index the following entries are also added:<br/>
"de Pierre Froide"<br/>
"Pierre Froide"<br/>
"Froide"<br/>
Now, would you expect a change if you use option --x-mdr7-excl=Chemin,Rue,Avenue ?<br/>
And what would you expect with --x-mdr7-excl=de,du,la ?<br/>
<br/>
With the current implementation there would be no change in output, because<br/>
the check works this way:<br/>
Build the string that should be added to the index<br/>
Check if that string is in the exclude list, if not, add it to the index.<br/>
<br/>
I might change that like this:<br/>
Build the string that should be added to the index<br/>
Check if the first word in that string is in the exclude list, if not, add it to the index.<br/>
<br/>
With this change the option --x-mdr7-excl=Chemin,Rue,Avenue<br/>
would exclude the entry<br/>
"Chemin de Pierre Froide"<br/>
and --x-mdr7-excl=de,du,la would exclude<br/>
"de Pierre Froide"<br/>
<br/>
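The two checks described above can be sketched roughly as follows. This is a minimal Python illustration, not the mkgmap implementation: the function names are hypothetical, and splitting on whitespace and case-sensitive comparison are assumptions.<br/>
<pre>
```python
def index_candidates(name):
    """Full name plus every suffix that starts at a word boundary."""
    words = name.split()
    return [" ".join(words[i:]) for i in range(len(words))]


def filter_exact(candidates, exclude):
    """Check as currently described: drop a candidate only if the
    whole string is in the exclude list."""
    return [c for c in candidates if c not in exclude]


def filter_first_word(candidates, exclude):
    """Possible change: drop a candidate if its first word is in the
    exclude list."""
    return [c for c in candidates if c.split()[0] not in exclude]


candidates = index_candidates("Chemin de Pierre Froide")

# Whole-string check: no candidate equals "de", "du" or "la",
# so all four entries stay in the index.
print(filter_exact(candidates, {"de", "du", "la"}))

# First-word check: "de Pierre Froide" is dropped.
print(filter_first_word(candidates, {"de", "du", "la"}))

# First-word check: "Chemin de Pierre Froide" is dropped.
print(filter_first_word(candidates, {"Chemin", "Rue", "Avenue"}))
```
</pre>
<br/>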
Another option would be to allow regular expressions in the exclude list, but that would require more<br/>
effort (input file instead of single option) and probably much more run time.<br/>
<br/>
I'd prefer to have a logic which first analyses all strings added by the --x-split-name-index option so that<br/>
only those are generated which do not appear more than x % .<br/>
<br/>
Comments?<br/>
<br/>
Gerd<br/>
_______________________________________________<br/>
mkgmap-dev mailing list<br/>
mkgmap-dev@lists.mkgmap.org.uk<br/>
<a href="http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev" target="_blank">http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev</a></div>
</div>
</div>
</div>
<div> </div>
<div class="signature"> </div></div></body></html>