Posted on 2008-07-05
I like to end my URLs with a file name (just my personal preference). For SEO, I want to make sure that the default page (index.html
in my case) for a directory is indexed only once, so I wrote a little RedirectIndex stub that I had mapped to the various possible directories.
Internally, I always use the page name, so I figured that was good enough. However, GoogleBot seems to guess that index.html
is the default page, and I caught it trying to access a page with a query string without the file name. In other words, it tried to hit /tag/square/?page=2
instead of /tag/square/index.html?page=2.
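The RedirectIndex stub itself isn't shown in this post. As a rough sketch of the redirect target such a stub has to produce, query string included, something like the following would do. The function name, its parameters, and the index.html default are assumptions made for illustration, not the actual code.

```python
def index_redirect_target(path, query_string="", index_name="index.html"):
    """Build the URL that a bare-directory request should be redirected to.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Make sure the directory path ends with a slash before appending the page.
    if not path.endswith("/"):
        path += "/"
    target = path + index_name
    # Carry the query string over so ?page=2 is not lost in the redirect.
    if query_string:
        target += "?" + query_string
    return target


print(index_redirect_target("/tag/square/", "page=2"))
# prints /tag/square/index.html?page=2
```

Keeping the query string on the redirect matters here, since the request GoogleBot made was for a specific page of the tag listing.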
Changing the mappings from /
to /(?:[?].*)?
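solved it. The group has to be non-capturing because webapp passes every captured group to the handler as a positional argument (the handler.get(*groups) call in the traceback below). With a capturing group you get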
Traceback (most recent call last):
  File "/path/to/google_appengine/google/appengine/ext/webapp/__init__.py", line 499, in __call__
    handler.get(*groups)
TypeError: get() takes exactly 1 argument (2 given)
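To see why the capturing version blows up, here is a small stand-alone sketch that imitates the dispatch step from the traceback using only the re module. It does not import webapp: the dispatch function and the no-argument RedirectIndex.get are stand-ins I made up for illustration, and the /tag/square/ patterns are just the example URL from above, not my real mappings.

```python
import re


class RedirectIndex(object):
    """Stand-in handler whose get() takes no URL arguments."""

    def get(self):
        return "redirect"


def dispatch(pattern, url, handler):
    """Imitate the handler.get(*groups) call seen in the traceback.

    Hypothetical simplification, not the real webapp code.
    """
    match = re.match("^" + pattern + "$", url)
    if match is None:
        return None
    # Every captured group becomes a positional argument to get().
    return handler.get(*match.groups())


non_capturing = r"/tag/square/(?:[?].*)?"
capturing = r"/tag/square/([?].*)?"
url = "/tag/square/?page=2"

# Non-capturing group: groups() is empty, so get() is called with no extras.
print(dispatch(non_capturing, url, RedirectIndex()))

# Capturing group: groups() has one entry, so get() receives an extra argument.
try:
    dispatch(capturing, url, RedirectIndex())
except TypeError as err:
    print("TypeError:", err)
```

Note that an optional capturing group still counts even when it matches nothing: groups() then contains None, and get() is still handed one argument too many.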
Tags: appengine