<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 12pt;
font-family:Calibri
}
--></style></head>
<body class='hmmessage'><div dir='ltr'>Hi Steve,<br><br>I fear I don't understand what problem you see<br>with roads like 'The Avenue'.<br>My understanding is that we put the full name into the<br>index, so the road can be found. On the other hand,<br>nobody would expect to find this road by typing<br>just 'avenue', right?<br><br>Gerd<br><br><div>> Date: Mon, 16 Feb 2015 00:21:26 +0000<br>> From: steve@parabola.me.uk<br>> To: mkgmap-dev@lists.mkgmap.org.uk<br>> Subject: Re: [mkgmap-dev] mixed index branch merge<br>> <br>> <br>> Hi<br>> <br>> There are some interesting comments here.<br>> <br>> I did have code to count the number of times certain words appeared in<br>> a name in an attempt to automatically create a stop word list for a map.<br>> It turned out that it wasn't all that useful, for England at least.<br>> <br>> From the numbers you get stop words such as 'The', 'Avenue' and<br>> 'Road' as you would expect. However, many streets have names such as<br>> 'The Avenue', 'Avenue Road' and so on that consist entirely of<br>> likely stop words. And these are not theoretical names that occur<br>> infrequently; these are names of streets that I know.<br>> <br>> I think we really need to be able to identify which parts of the<br>> name are useful to index, rather than which parts are not.<br>> <br>> So for England I think that the only rule required is to index from<br>> the beginning of the name, as now.<br>> <br>> For places where streets are named after people and there is<br>> no word for 'street' included, and the street is generally<br>> referred to by the second name, then probably adding entries<br>> for all parts of the name will work.<br>> <br>> For places where there is a word for street at the beginning<br>> then we have to step over that word and any following<br>> prepositions etc. 
So for France, not just<br>> "Rue", but any following "de", "des", "d'" etc.<br>> <br>> The required action does of course depend on language rather than<br>> country, but we don't in general have the language, so we will have to<br>> start out using the country (or perhaps region) and see how that goes.<br>> I suspect it will work quite well, but if not we can think of<br>> something else when the problems are better known.<br>> <br>> I guess we will start out having configurable rule types and<br>> word lists, but we need to gather sensible defaults once<br>> a working system is developed for each country.<br>> <br>> ..Steve<br>> _______________________________________________<br>> mkgmap-dev mailing list<br>> mkgmap-dev@lists.mkgmap.org.uk<br>> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev<br></div></div></body>
</html>