Django South raw data migration

At $dayjob, one of our Django applications used to have PickleFields as model attributes. They were mostly used to store metadata in the form of dicts. But as unpickling data that may have been tampered with is considered a security risk, we wanted to get rid of them. Since all of this information has to be translated to JSON for the API anyway, it made sense to store it as JSON as well. This is easy using the implementation provided by JSONField.

Since we use South in our application, changing the field type was easy.

from django.db import models
from jsonfield import JSONField


class Dog(models.Model):
    # Used to be a PickleField; the values are now stored as JSON text.
    metadata = JSONField(default=[])

Then create the migration script:

python manage.py schemamigration app --auto

However this won’t actually convert the data stored in the database to the new format. So with every request the result would be:

ValueError: No JSON object could be decoded

This happens because the raw database value is still a base64-encoded pickle, which clearly is not valid JSON.
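To make the failure concrete, here is a minimal standard-library sketch of what goes wrong. It assumes the old field stored plain base64 over the pickle bytes (no compression), and `try_json` is a hypothetical helper name used only for this illustration.

```python
import base64
import json
import pickle

# Roughly what a pickle-based field keeps in its text column
# (assumption: plain base64 over the pickle bytes, no compression).
metadata = {"breed": "labrador", "tags": ["friendly"]}
stored = base64.b64encode(pickle.dumps(metadata)).decode("ascii")


def try_json(raw):
    """Return the decoded value, or None if raw is not valid JSON."""
    try:
        return json.loads(raw)
    except ValueError:  # on Python 3, json.JSONDecodeError subclasses ValueError
        return None


# The old column value cannot be parsed as JSON...
decoded_pickle = try_json(stored)
# ...while the same data serialised as JSON parses fine.
decoded_json = try_json(json.dumps(metadata))
```

The JSON field runs into the same `ValueError` whenever it tries to load a row that has not been converted yet.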

One approach to solve this issue would have been to:

  • create a schema migration to rename the Pickle field type attribute to a temporary name.
  • create a schema migration to re-add the attribute with a JSON field type.
  • create a data migration to read the temporary Pickle field values into the new JSON field.
  • create a schema migration to remove the old Pickle field type attribute.

But as I had already pushed out this code, I wanted to know if there was another solution. After some research I found just that.

With the Django ORM it is possible to perform raw queries and redefine the column names. So you could do something like this:

m = Model.objects.raw("SELECT id, attribute AS attribute_old, '[]' AS attribute FROM appname_model")

Since the model doesn’t have a definition for attribute_old, it is attached to each returned instance as a plain attribute holding the raw database value for that record.
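The aliasing trick itself is plain SQL, so it can be tried outside Django. The sketch below uses the standard-library `sqlite3` module as a stand-in for the real database; the `app_dog` table layout is an assumption made for illustration.

```python
import base64
import pickle
import sqlite3

# In-memory stand-in for the real table (assumed layout: id + text column).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE app_dog (id INTEGER PRIMARY KEY, metadata TEXT)")

# Store a value the way the old pickle-based field would have.
stored = base64.b64encode(pickle.dumps({"name": "Rex"})).decode("ascii")
conn.execute("INSERT INTO app_dog (metadata) VALUES (?)", (stored,))

# Rename the real column and put a valid JSON literal in its place.
row = conn.execute(
    "SELECT id, metadata AS metadata_old, '[]' AS metadata FROM app_dog"
).fetchone()

raw_value = row["metadata_old"]   # untouched base64-encoded pickle
placeholder = row["metadata"]     # the '[]' literal from the query
```

The JSON field only ever sees the `'[]'` placeholder, so loading the row no longer fails, while the original value stays reachable under the alias.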

Using this method the following data migration can be created:

python manage.py datamigration app convert_picklefield_to_jsonfield
import pickle
from base64 import b64decode, b64encode

from south.v2 import DataMigration


class Migration(DataMigration):

    def forwards(self, orm):
        # Alias the stored column to metadata_old so the raw pickle stays
        # readable, and hand the JSON field a valid placeholder ('[]').
        dogs = orm.Dog.objects.raw(
            "SELECT id, metadata AS metadata_old, '[]' AS metadata FROM app_dog")
        for dog in dogs:
            dog.metadata = pickle.loads(b64decode(dog.metadata_old))
            dog.save()

    # rest of generated migration

The reverse migration can be written in the same way:

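    # requires "import json" at the top of the migration
    def backwards(self, orm):
        # metadata_old now holds the raw JSON text, so decode it before pickling.
        dogs = orm.Dog.objects.raw(
            "SELECT id, metadata AS metadata_old, '[]' AS metadata FROM app_dog")
        for dog in dogs:
            dog.metadata = b64encode(pickle.dumps(json.loads(dog.metadata_old)))
            dog.save()

    # rest of generated migration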
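Before running a migration like this against real data, it can help to check the two conversions in isolation. The following is a minimal sketch using only the standard library; `pickled_to_json` and `json_to_pickled` are hypothetical helper names, and it assumes the stored format is plain base64 over the pickle bytes.

```python
import base64
import json
import pickle


def pickled_to_json(raw):
    """Decode a base64-encoded pickle and return the value as JSON text."""
    # pickle.loads can execute arbitrary code; only feed it trusted data.
    return json.dumps(pickle.loads(base64.b64decode(raw)))


def json_to_pickled(raw):
    """Parse JSON text and return the value as a base64-encoded pickle."""
    return base64.b64encode(pickle.dumps(json.loads(raw))).decode("ascii")


original = {"name": "Rex", "tricks": ["sit", "roll"]}

# Forwards direction: old storage format -> JSON text.
stored = json_to_pickled(json.dumps(original))
as_json = pickled_to_json(stored)

# Backwards direction: JSON text -> old storage format.
restored = pickle.loads(base64.b64decode(json_to_pickled(as_json)))
```

Keep in mind that JSON cannot represent everything pickle can (tuples come back as lists, and dict keys become strings), so a round trip is only lossless for JSON-friendly metadata like the dicts described above.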