Brendel Consulting

Apr 18, 2012

Random primary keys for Django models

The automatically generated primary keys for Django models tend to be just sequentially increasing integer numbers. If you expose those IDs at any point to your users, for example in the URLs referring to particular IDs, then your users can oftentimes easily guess how many users your site has, or how many objects of a certain kind your database holds.

You could follow the approach of never exposing those primary keys to the user, and always use an additional, randomly generated key instead whenever a user needs to refer to an object in your database. However, you then need to implement this random key creation, need to maintain two fields, etc. It would be nice if we didn't have to worry about this and could just use the primary keys of our models to refer to them, internally as well as externally.

For this purpose, I have created a new base class, which you can use instead of models.Model when you create your own Django models. This base class is called RandomPrimaryIdModel. With this base class, your primary IDs will start to look random, similar to what you know from URL shorteners.

Here's an example. Let's say you have defined a Django model like this (in the example, the only change you have to make to your normal model definition is to replace the models.Model base class with RandomPrimaryIdModel):

from random_primary import RandomPrimaryIdModel

class MyModel(RandomPrimaryIdModel):
    # Normally define your Django model
    ...

for i in xrange(3):
    # save() returns None, so keep a reference to the instance itself
    m = MyModel(... parameters ...)
    m.save()
    print m.id

As output you might get something like this:

Q68mfU
zjvsx3
VNuL0Lp

You can tune the key length as well as the characters that are used to construct the key. The docstring of the class is pretty extensive, so please have a look.
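To give a feel for what tuning the key length and the character set means, here is a minimal, standalone sketch of random key generation. It is not the code from RandomPrimaryIdModel: the names make_random_key and make_unique_key, the exists callback, and the strategy of retrying a few times before growing the key are all assumptions made for illustration.

```python
import random
import string

# Assumed default alphabet: letters and digits, like the sample output above.
ALPHABET = string.ascii_letters + string.digits


def make_random_key(length=6, alphabet=ALPHABET):
    """Return a random string of `length` characters drawn from `alphabet`."""
    rng = random.SystemRandom()  # uses the operating system's randomness source
    return "".join(rng.choice(alphabet) for _ in range(length))


def make_unique_key(exists, min_length=5, max_length=9, tries_per_length=3):
    """Try a few keys per length and grow the key when collisions persist.

    `exists` is a hypothetical callback that reports whether a key is
    already taken (for a model, this could be a database lookup).
    """
    for length in range(min_length, max_length + 1):
        for _ in range(tries_per_length):
            key = make_random_key(length)
            if not exists(key):
                return key
    raise RuntimeError("could not find an unused key")
```

A shorter alphabet or a shorter key shrinks the number of possible keys, so collisions become more likely as the table fills up. That is why this sketch lets the key grow.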

The code for the new model base class is free to use for anyone and can be found in this Github repository.

Hopefully, this can be useful to you. I'd welcome any feedback or comment.

Labels: django, models, primary keys, python

Sep 23, 2010

How to customize Django's default messages

Ever tried to customize a message produced by one of Django's generic views or standard field validators? This is a problem I run into pretty often.

Custom validator messages

In some cases Django offers you some pretty straightforward options, such as passing an error_messages dictionary as an argument to a form field:

email = forms.EmailField(label = "Email address",
                         error_messages = {'required' : "Please enter an email address.",
                                           'invalid'  : "Malformed address. Please correct."})

Hard-coded messages

Sometimes, however, easy options like this do not exist. For example, take the generic password_reset view offered by django.contrib.auth. If you look at the source code for this view function, you find that by default it uses PasswordResetForm - also contained in the same package - as its standard form. The problem is that this form has a few hard-coded messages in it, so there is no obvious way to modify those if you just want to continue to use the standard views and forms.

The password_reset view gives you the option of specifying your own form - with the messages you want - by setting the password_reset_form parameter. For example, by doing this in your urls.py file:

(r'^password/reset/$', 'django.contrib.auth.views.password_reset',
                       { 'template_name' : 'my_pw_reset_form.html',
                         'email_template_name' : 'my_pw_reset.txt',
                         'password_reset_form' : MyForm }),

I don't know about you, but the thought of essentially copying an entire form implementation just so that I can change some message text is not very attractive to me. Some of these forms consist of more than just field definitions and I don't want to be in the business of creating and maintaining 'mini-forks' of various Django components.

A better way

It seems that there should be a better way and I think there is: Let's use Django's built-in translation system for the task.

Django's translation is capable of taking any specially marked text in any of your files (including Python source code) and looking up a translated string for it, based on the locale of your site or user. For example, in your Python source code, you could write this:

from django.utils.translation import ugettext_lazy as _
msg = _("Here is some text")

The _() function here is the special marker and also performs the on-the-fly translation at runtime. In templates you use the {% trans ... %} template tag. You can read more about it here.

Fortunately, the hard-coded messages in the standard Django views and templates are prepared for translation. Therefore, the idea now is simple: Even if otherwise you are not interested in translation and internationalization, you simply use this mechanism to provide 'custom translations' for the selected messages you are interested in!

In my settings.py file, the LANGUAGE_CODE setting is 'en-us'. You may have a different setting, but that doesn't matter. While in your settings file, make sure that USE_I18N is set to True and that you include django.contrib.messages.middleware.MessageMiddleware in your MIDDLEWARE_CLASSES.

In your project directory, make sure you have a 'locale' directory (create it if it's missing) and then run this command:

% django-admin.py makemessages -l en

This utility examines all your files for strings that are marked for translation. In this case, I am asking it to create translation files for 'en' (English), but you can use whatever locale matches your LANGUAGE_CODE setting.

Have a look at the locale/[locale-id]/LC_MESSAGES/django.po file that was created ([locale-id] in my case is 'en'; it may be different for you). If you don't have any of your own messages marked for translation, this file will be mostly empty. A .po file contains the messages that are recognized at runtime, along with the translations that should be applied to them. It's quite straightforward.

Now with an editor open the standard .po file for your locale that comes with a Django install. Django may be in your Python install's site-packages directory. The path therefore should be something like this:

site-packages/django/conf/locale/[locale-id]/LC_MESSAGES/django.po

In that file, search for the message you would like to customize. For our password-reset example, I don't like the standard message it displays when the email address cannot be found. In the standard django.po we find this message represented like this:

#: contrib/auth/forms.py:110
msgid ""
"That e-mail address doesn't have an associated user account. Are you sure "
"you've registered?"
msgstr ""

This shows us that currently no translation is provided for this message (msgstr is empty). If you don't have an English language locale then you probably already see a translation specified right here.

Now instead of editing the message in the standard .po file that comes with Django - I don't like editing standard files like that - copy these lines into your own django.po file in your project directory. Use msgstr to define whatever text you want to be displayed instead. Single-line messages can be written right between the two quotes after msgstr; multi-line messages are written the way you see it done for msgid. In our example, I just say:

msgstr "Unknown email. Are you sure you have registered with that address?"

So, you now have provided a second translation for your default locale. Since Django looks first in your project's directory for translations of individual messages, it finds this translation before it falls back on the default provided by Django itself.

The last step is to compile your message file:

% django-admin.py compilemessages

Now you can restart your server and you should see your customized default message in action!
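The lookup order described above (your project's catalog first, then the one that ships with Django) can be pictured with a small plain-Python sketch. This is only an illustration of the fallback idea, not how Django's translation machinery is implemented; the translate function and the two dictionaries are hypothetical stand-ins for compiled message catalogs.

```python
def translate(msgid, catalogs):
    """Return the first non-empty translation found, else the msgid itself."""
    for catalog in catalogs:
        msgstr = catalog.get(msgid, "")
        if msgstr:  # an empty msgstr means "no translation in this catalog"
            return msgstr
    return msgid


DJANGO_MSG = ("That e-mail address doesn't have an associated user account. "
              "Are you sure you've registered?")

# Hypothetical stand-in for the project's own django.po entries.
project_catalog = {
    DJANGO_MSG: "Unknown email. Are you sure you have registered with that address?",
}
# Hypothetical stand-in for Django's stock English catalog (empty msgstr).
django_catalog = {DJANGO_MSG: ""}

customized = translate(DJANGO_MSG, [project_catalog, django_catalog])
untouched = translate("Some other message", [project_catalog, django_catalog])
print(customized)
print(untouched)
```

Messages without a custom entry fall through both catalogs and come back unchanged, which is why only the messages you copy into your own .po file are affected.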

Maintenance

An annoying habit of the makemessages utility is that it comments out your custom messages in your .po file every time it is run. The utility is pretty good at not losing any changes you make in that file, but every translation you add for which makemessages does not find the text in your project's files is going to be commented out. And since you are providing a translation for a standard Django message - and Django most likely is not part of your project - this will apply to the custom messages we are talking about here.

You could just manually remove all the comment identifiers ("#~ "). However, since I'm lazy, I wrote a little script, which not only removes those comments for me, but also calls compilemessages. Just create a file called translate.sh in your project directory with the following content:

#!/bin/bash

echo "### Translating all messages..."
django-admin.py makemessages -a
echo "### Removing commented-out manual messages..."
find locale -name 'django.po' -exec sed s/^\#\~\ // -i {} \;
echo "### Compiling messages..."
django-admin.py compilemessages
echo "### Done!"

Once you give execute permissions to that file, you can get all your project's messages discovered and compiled, while also retaining any of your customizations for Django's default messages, simply by running ./translate.sh in your project directory.

Pretty convenient, isn't it?
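If you would rather see what the sed step in that script does without touching any files, the same substitution can be sketched in Python. The function name uncomment_obsolete and the sample .po text are made up for illustration; the only thing taken from the script above is the rule itself, which strips a leading "#~ " from each line.

```python
import re


def uncomment_obsolete(po_text):
    """Strip the leading '#~ ' marker from every line that carries it."""
    return re.sub(r"^#~ ", "", po_text, flags=re.MULTILINE)


# Made-up .po content: one normal entry, one entry commented out by makemessages.
sample = (
    '#: myapp/views.py:10\n'
    'msgid "Hello"\n'
    'msgstr "Hi there"\n'
    '\n'
    '#~ msgid "Goodbye"\n'
    '#~ msgstr "See you"\n'
)

restored = uncomment_obsolete(sample)
print(restored)
```

Ordinary comment lines such as the "#:" source reference are left alone, because the pattern only matches the exact "#~ " prefix at the start of a line.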

If you liked this little article, please consider following me on Twitter.

Labels: django, forms, i18n, internationalization, messages, standard, views

Jul 4, 2009

Setting the initial value for Django's ModelChoiceField

Recently, I worked with a Django form that utilized the ModelChoiceField. That is a convenient field that normally is rendered as an HTML select tag, which usually appears as a drop-down menu on a web-page.

A ModelChoiceField is specified as:

    class MyForm(forms.Form):
        my_field = forms.ModelChoiceField(queryset = MyModel.objects.all())

As you can see, this field is used to easily specify a drop-down for all items in a table (or whatever the queryset specifies). Each model instance is represented by the output of its __unicode__() function. If you evaluate the form after it was posted, the value for the field is going to be an instance of the actual model whose unicode representation was selected.

The problems for me started when I tried to set an initial value for the field: In an 'edit' form, I wanted all fields to reflect whatever had been saved for a particular model instance, of course. As I said above, if you evaluate the form after it has been submitted you get an actual model instance. Naturally, you would think that initial values would be set in a symmetric manner, by specifying a model instance:

    form = MyForm(initial = { 'my_field' : some_model_instance })

Sadly, this doesn't work. And try as I might, I couldn't find an answer to this on the Internet either (more on that in a moment). So, after looking at the Django code, it finally dawned on me that you need to specify the ID of the model instance as the initial value:

    form = MyForm(initial = { 'my_field' : some_model_instance.id })

That works now. It's a bit unfortunate that the retrieval value (a model instance) and the initial value (the ID of a model instance) are of a different type. It's an inconsistency in the Django API, I think. But in the end, I probably should have at least tried that one a bit sooner.
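For an 'edit' form with several fields, one way to avoid repeating the ".id" detail is a small helper that builds the initial dictionary and substitutes the primary key wherever a value looks like a model instance. This is a hedged sketch in plain Python: to_initial, build_initial, and the two Fake classes are hypothetical names used only so the example runs without a Django project. The one thing it relies on from Django is that model instances expose their primary key as pk.

```python
def to_initial(value):
    """Return the primary key for model-like objects, the value itself otherwise."""
    return getattr(value, "pk", value)


def build_initial(instance, field_names):
    """Build an `initial` dict for a form from the named attributes of `instance`."""
    return dict((name, to_initial(getattr(instance, name))) for name in field_names)


# Stand-ins for real models, so the sketch runs on its own.
class FakeCategory(object):
    def __init__(self, pk):
        self.pk = pk


class FakeArticle(object):
    def __init__(self, title, category):
        self.title = title
        self.category = category


article = FakeArticle("Hello", FakeCategory(pk=7))
initial = build_initial(article, ["title", "category"])
print(initial)
```

The resulting dictionary could then be passed as MyForm(initial=initial), with plain values passed through and related objects reduced to their IDs.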

The surprising thing is that I couldn't find any discussion of this anywhere on the Internet. I should mention here that I am using Yahoo as my default search engine. Shortly after I finally found the solution, it occurred to me to try Google. And wouldn't you know it? Right there, third hit from the top, I had the answer.

Why did Yahoo not give me this result? Well, the answer was discussed on Google Groups. Is Yahoo not indexing those? Or is Google not letting them index it?

Either way, that small issue between Yahoo and Google cost me a few hours of frustration. So, I'm posting the solution here on a non-Google page, so that Yahoo users may also find the answer to that problem in the future.


Labels: api, django, forms, google, initial, ModelChoiceField, orm, python, yahoo

Jun 1, 2009

How to do "count" with "group by" in Django

Faced with the 'age old' problem of having to do a group by query in Django, I finally came across a solution. In short, you use the lower-level query interface directly:

    q = MyModel.objects.all()
    q.query.group_by = ['field_name']

When you then iterate over 'q', you get the results grouped by the value of whatever was in 'field_name'. Great! So far, so good.

Well, a word of caution at this point: Using the low-level query API is not really something Django wants you to do, apparently. In fact, try this with sqlite and it works. Try it with PostgreSQL and it does not (all sorts of error messages). So, your mileage may vary, depending on which database you are using. Ok, let's assume that your database is fine with this...

The only challenge for me now was that I had to do a count for this. If you form your query using the usual methods of Django's ORM, you will be disappointed. For example, this will not work:

    q = MyModel.objects.all()
    q.query.group_by = ['field_name']
    q.count()

It appears as if this returns only the count of the first group, not a list of counts, as you would expect.

The solution is to wade a bit deeper into the low-level query API. We can instruct the query to add a count-column. In fact, this results in merely a single column being returned, just like COUNT(*). It goes like this:

    q = MyModel.objects.all()
    q.query.group_by = ['field_name']
    q.query.add_count_column()

Since this returns counts, rather than complete objects, we now need to get the individual group counts as a list of values. We add one more line:

    q.values_list('id')

The values_list() function gives you not instantiated objects but instead the tuples of values for each object. The tuples contain only the fields you specify by name in the call to values_list(). Except, we have manually added the count column with the add_count_column() function. That count column is always returned first. So, what you get as result is something like this:

    [ (3,1), (19,8), ... ]

The first value of each tuple is the count, the second value is the value of the 'id' field of the last occurrence of the grouped model in the table. If you specify something other than 'id', you get the same thing: First the count and then the value of that other field.

But that's not what we want, right? We want a list of counts. We could manually extract the first element in each tuple, but Django offers us a shortcut:

    q.values_list('id', flat=True)

Setting flat=True tells the values_list() function to just return the first element of each tuple in a plain list (not a list of tuples). And since the first element now is the count column, we finally get what we want: A list of the counts for each group.

Note that we could have specified any other fields in the call to values_list(), not just 'id'. Because we specified flat=True, those fields are ignored. It seems, though, that at least one field needs to be specified here, even though it won't be considered as part of the output.
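To make the goal concrete, here is what "count with group by" computes, sketched in plain Python with collections.Counter instead of a database. The rows list is made-up sample data standing in for the contents of the table; nothing here touches Django's query API.

```python
from collections import Counter

# Made-up rows standing in for MyModel's table.
rows = [
    {"id": 1, "field_name": "red"},
    {"id": 2, "field_name": "blue"},
    {"id": 3, "field_name": "red"},
    {"id": 4, "field_name": "green"},
    {"id": 5, "field_name": "red"},
]

# Equivalent in spirit to: SELECT field_name, COUNT(*) ... GROUP BY field_name
counts = Counter(row["field_name"] for row in rows)

# Sort by group value so the order of the result is predictable.
grouped = sorted(counts.items())
group_counts = [count for _, count in grouped]
print(grouped)
print(group_counts)
```

If your Django version includes the aggregation API (django.db.models.Count), a query along the lines of MyModel.objects.values('field_name').annotate(Count('id')) may be a more portable way to get the same numbers. That is a different route from the low-level approach described in this post.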


Labels: count, django, group by, python

May 30, 2009

Retrieving full objects with custom SQL in Django

While working with Django, I recently had to retrieve some model objects via custom SQL. Looking for examples on how to do this, I noticed that most tutorials describe how to retrieve specific values via SQL, but not entire model objects. The solution is actually very simple. But since I couldn't find it described anywhere, I thought I'd share it here.

The usual approach goes something like this:

   from django.db import connection, models
   class CustomManager(models.Manager):
       ...
       query = "SELECT id FROM mymodel WHERE ..."
       cursor = connection.cursor()
       cursor.execute(query)
       return [i[0] for i in cursor.fetchall()]

This then returns the IDs of the objects that were selected. If you want to return actual objects, you can modify the last line into this:

    return self.filter(id__in=[i[0] for i in cursor.fetchall()])

However, the problem with this approach is that we are executing two queries now: The custom SQL query, followed by the self.filter() query.

Here then is a very simple way to get complete objects, with just a single, custom SQL query:

   class CustomManager(models.Manager):
       ...
       query = "SELECT * FROM mymodel WHERE ..."
       cursor = connection.cursor()
       cursor.execute(query)
       return [MyModel(*i) for i in cursor.fetchall()]

Rather than making a list of IDs, which is then used to query for the actual objects, we make a list of the complete objects. We can do that, because we changed the custom SQL: Instead of selecting just the ID column, we are now selecting all columns (SELECT * ...).

With custom SQL queries we get a list for each row. Fortunately, the order of columns in each row reflects the order of arguments to the model's __init__() function. As a result, we can use each row's list representation as positional arguments for __init__().
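The positional-argument trick can be tried outside of Django with the standard library's sqlite3 module. In this sketch, Person is an ordinary Python class standing in for a model, and the table and its data are made up. The only point is that each row from SELECT * lines up with the parameters of __init__() when the columns and parameters are declared in the same order.

```python
import sqlite3


class Person(object):
    """Plain class standing in for a model; parameters follow the column order."""

    def __init__(self, id, name, age):
        self.id = id
        self.name = name
        self.age = age


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person (name, age) VALUES (?, ?)",
    [("Ann", 31), ("Bob", 17), ("Cy", 45)],
)

cursor = conn.cursor()
cursor.execute("SELECT * FROM person WHERE age >= ? ORDER BY id", (18,))

# Each row is (id, name, age), so it can be unpacked straight into __init__().
people = [Person(*row) for row in cursor.fetchall()]
print([p.name for p in people])
```

The weak spot is the reliance on column order: if the table gains a column, or the columns are not in the order the constructor expects, the unpacking silently puts values in the wrong place or fails. Listing the columns explicitly instead of using SELECT * is a safer variant.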


Labels: custom manager, custom sql, django, python, sql