Migration from Django's SQLite to AWS MySQL


Posted on January 26, 2019 at 08:35

Inspiration here.

To think that migrating a database was a simple task… As part of the Tarteel project, I’m trying to migrate our old SQLite DB to an AWS RDS MySQL one.
Although I learned alot about relational and NoSQL databases, it took me a while to wrap my head around why I couldn’t just simply migrate our sqlite3 DB to MySQL. It turns out theres a lot of caveats here and there making this a tough cookie that apparently is always depenedent on how our DB was configured. So no matter what solutions exist out there on the interwebs, our mileage will vary.

Here’s what worked for me to port our DB over to Amazon’s RDS.

Putting SQLite under duress

  1. Install mysql using sudo apt-get install mysql-server

  2. Use the command line to create a database:
    mysql -u root -p
    > CREATE DATABASE prodDB;
    > USE 'prodDB';
    > GRANT ALL PRIVILEGES ON prodDB.* TO 'admin'@'localhost' IDENTIFIED BY 'password';
    
  3. Get all the old sqlite3 data from django.
    python manage.py dumpdata > datadump.json
    

Here’s the fun part, we had these issues with our data: 1. One entry was corrupt and so I had to fix it manually (Some unicode characters weren’t converted properly). 2. MySQL doesn’t play well with Foreign Keys, specifically those related with the django-auth module. I kept getting the following error:

django.db.utils.IntegrityError: 
Problem installing fixtures: The row in table 'auth_permission' with primary key '1' has an invalid foreign key: 
auth_permission.content_type_id contains a value '1' that does not have a corresponding value in django_content_type.id.

Here’s a sample model that was giving me this nightmare:

{"model": "auth.permission", "pk": 1, "fields": {"name": "Can add annotated recording", "content_type": 1, 
"codename": "add_annotatedrecording"}

The solution (see: hack) was to manually remove all auth_permission models. These weren’t even being used so didn’t have to worry about them. How to legitemetly go about solving it - I have no idea. If anyone knows, feel free to comment below.
Next, we quickly check our mysql credentials to see if we can access the DB.

mysql -u admin -p
> SHOW DATABASES;

Run the django migration.

$ python manage.py migrate --run-syncdb

Add the following code to urls.py or run it in a shell (python manage.py shell)

>>> from django.contrib.contenttypes.models import ContentType
>>> ContentType.objects.all().delete()

Finally, load our data to the DB:

python manage.py loaddata datadump.json

Spin up a shell to test this baby out.

>>> from restapi.models import AnnotatedRecording
>>> a = AnnotatedRecording.objects.all()
>>> len(a)
9002

Ayyyy! We got the data! Lets also take a look at the MySQL DB.

mysql -u root -p
> use prodDB;
> show tables;
+--------------------------------+
| Tables_in_prodDB               |
+--------------------------------+
| our_tables                     |
+--------------------------------+
> SHOW FULL COLUMNS FROM restapi_annotatedrecordings \G;

*************************** 1. row ***************************
Some data....

Drop Table If Exists

Now its time to actually port this over to Amazon. This assumes you already setup a MySQL DB over at AWS RDS

  1. Get a dump of the sql file locally.
    mysqldump -u root -p prodDB > prod.sql
    
  2. Upload to AWS
    mysql -h <rds_host> -u <rds_user> -p <db_name> < prod.sql
    
  3. Check if the database successfully transferred
    mysql -h <rds_host> -u <rds_user> -p
    > show databases;
    databases
    > use prodDB;
    > show tables; 
    tables
    

And that is the end of my Saturday afternoon… If there’s a better way to go about this please let me know.

TODO: VPC

You thought you were done?