Implementing GetGeneralizedCanonicalURL() in TopSitesCache.
Suppose that TopSites contains only "http://www.google.com/a". GetGeneralizedCanonicalURL() would then allow us to find it using "http://www.google.com/a/b" as the search key.
Our previous prefix match routine (renamed to GetSpecializedCanonicalURL()) works the other way: it allows us to find "http://www.google.com/a" using "http://www.google.com/" as the search key.
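For illustration only, the "specialized" direction can be sketched over a sorted set of plain strings. FindSpecialized() is a hypothetical stand-in, not the actual TopSitesCache code, which works on GURLs and canonical URL entries:

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a specialized lookup: returns the first stored
// URL that has |key| as a prefix, or an empty string if there is none.
std::string FindSpecialized(const std::set<std::string>& stored,
                            const std::string& key) {
  // In sorted order, every string that starts with |key| sorts at or after
  // |key|, and those strings form one contiguous run.
  std::set<std::string>::const_iterator it = stored.lower_bound(key);
  if (it != stored.end() && it->compare(0, key.size(), key) == 0)
    return *it;
  return std::string();
}
```

With only "http://www.google.com/a" stored, the key "http://www.google.com/" finds it, while the longer key "http://www.google.com/a/b" finds nothing. That second case is the gap the generalized lookup is meant to fill.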
There are several ways to implement GetGeneralizedCanonicalURL():
1. Go through successive prefixes of the input URL and look each one up in TopSites. E.g., "http://www.google.com/a/b/c/d" forms the queries:
http://www.google.com/a/b/c/d
http://www.google.com/a/b/c
http://www.google.com/a/b
http://www.google.com/a
http://www.google.com
2. Go through the URLs stored in TopSites and look for prefixes of the input URL.
3. Similar to 2, but applying binary search, using GetSpecializedCanonicalURL() to bound calls.
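As a rough sketch of how the queries in method #1 could be generated, the hypothetical helper below trims one path component at a time. It works on plain strings and ignores "?query#ref" handling, which is an assumption made for brevity:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the lookup keys that method #1 would try,
// from the full URL down to "scheme://host[:port]".
std::vector<std::string> SuccessivePrefixes(const std::string& url) {
  std::vector<std::string> queries;
  const std::string::size_type scheme_end = url.find("://");
  if (scheme_end == std::string::npos)
    return queries;
  // First character after "scheme://".
  const std::string::size_type host_start = scheme_end + 3;
  std::string current = url;
  while (true) {
    queries.push_back(current);
    const std::string::size_type slash = current.rfind('/');
    // Stop once the only slashes left belong to "://".
    if (slash == std::string::npos || slash < host_start)
      break;
    current.erase(slash);  // Drop the last "/component".
  }
  return queries;
}
```

Each generated string would then need its own lookup in TopSites, so a URL with many path components produces many queries.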
Method #1 can lead to O(n^2) behaviour, depending on the input URL size. Method #3 probably introduces unnecessary complexity. We therefore chose method #2.
An optimization is that we "bracket" the range of TopSites URLs that share the same scheme+host+port as the input URL, then perform a search in the reverse direction.
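A minimal sketch of the bracketed reverse search, assuming the stored URLs are canonical strings kept in a sorted std::set and that the bracket starts at the "scheme://host[:port]/" base of the input URL. FindGeneralized() and IsPathPrefix() are hypothetical names; the real code operates on GURLs and also has to deal with "?query#ref":

```cpp
#include <set>
#include <string>

// True if |prefix| equals |url| or is a prefix of it ending on a '/' boundary.
bool IsPathPrefix(const std::string& prefix, const std::string& url) {
  if (prefix.empty() || prefix.size() > url.size() ||
      url.compare(0, prefix.size(), prefix) != 0) {
    return false;
  }
  return prefix.size() == url.size() || prefix[prefix.size() - 1] == '/' ||
         url[prefix.size()] == '/';
}

// Returns the longest stored URL that is a path prefix of |url|, or an empty
// string if there is none.
std::string FindGeneralized(const std::set<std::string>& stored,
                            const std::string& url) {
  const std::string::size_type scheme_end = url.find("://");
  if (scheme_end == std::string::npos)
    return std::string();
  const std::string::size_type path_start = url.find('/', scheme_end + 3);
  if (path_start == std::string::npos)
    return std::string();
  // "scheme://host[:port]/" is the smallest key any match can have.
  const std::string base = url.substr(0, path_start + 1);
  std::set<std::string>::const_iterator it_lo = stored.lower_bound(base);
  std::set<std::string>::const_iterator it_hi = stored.upper_bound(url);
  // Walk downward from just past |url|; the first hit is the longest prefix.
  while (it_hi != it_lo) {
    --it_hi;
    if (IsPathPrefix(*it_hi, url))
      return *it_hi;
  }
  return std::string();
}
```

Searching downward means a longer stored prefix is seen before a shorter one, so the loop can return on the first match. URLs for other hosts fall outside the bracket and are never compared.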
Some refactoring is also done in TopSitesCache:
- Added helper routine GetURLFromIterator().
- Added helper inner class CanonicalURLQuery() to factor out repetitive code for querying |top_sites_|.
BUG=304954
Review URL: https://codereview.chromium.org/26032002
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@228084 0039d316-1c4b-4281-b951-d872f2087c98