Belchak.com

Technology and Awesome

Django Database Migrations With South

I have been using Django for web development for almost a year now, and I just recently started using South to do database migrations. To be fair, most of my database work has centered on MongoDB and schema-less document stores rather than a traditional RDBMS. Since Django does not ship with any database migration tools, my standard approach was to make sure my models were completely thought out before running the manage.py syncdb command. The lack of a good database migration tool was one of the things that originally turned me off to Django.

Enter South. South lets you manage your database schema in a way very similar to Ruby on Rails migrations.

Converting a project to a South-managed project is very easy:

  1. Ensure that your database and models are completely synced up. (i.e. your models are not ahead of your database or vice-versa)

  2. Install South by running [sudo] pip install south

  3. Add South to your INSTALLED_APPS list in the settings.py for your Django project.

  4. Run ./manage.py syncdb in your project root directory to add the South database tables to your database.

  5. If you have an existing application that you would like to convert to a South-managed application, run the following command: ./manage.py convert_to_south YOUR_APP_NAME. If not, go to the next step!

  6. Now you are ready to go! Change one of your models (see the sketch after this list) and then proceed to the next step.

  7. Run the following command to get South to create an automatic migration for you: ./manage.py schemamigration YOUR_APP_NAME --auto

  8. Now you can apply your newly created migration to your database: ./manage.py migrate YOUR_APP_NAME

  9. Congratulations, you have performed your first database migration using South!
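
For reference, the model change in step 6 can be as small as adding a single field. Here is a minimal sketch, assuming a hypothetical app with a Profile model (the model and field names are made up for illustration):

    # myapp/models.py
    from django.db import models

    class Profile(models.Model):
        name = models.CharField(max_length=100)
        # the new field -- South's --auto flag detects this addition
        bio = models.TextField(blank=True, default='')

With that change saved, steps 7 and 8 generate and apply a migration that adds the bio column to the table.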

South lets you migrate forward or back to any migration point by running a command like ./manage.py migrate YOUR_APP_NAME 0001 (that command would take you back to your initial migration point). You can get a list of all your migrations, with a description of each one, by running ./manage.py migrate YOUR_APP_NAME --list. This lists all of the migrations you have available and denotes with a (*) which ones have been applied.
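
To illustrate, the listing looks something like this (the migration names will vary with your app and the changes you have made):

     myapp
      (*) 0001_initial
      (*) 0002_auto__add_field_profile_bio
      ( ) 0003_auto__add_field_profile_website

The starred migrations have been applied; the rest are still pending.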

South is great for working in a team. All migrations are stored in YOUR_APP_NAME/migrations, so you can simply add them to your VCS and all of your team members will get all of your migrations. If there is a conflict between migrations that you and a team member have been working on, South will detect it and let you merge the conflicts.

All in all, I am really loving South. It makes working with an RDBMS and Django much more pleasant!

Removing Old SSH Fingerprints From Your known_hosts the Quick and Easy Way

Ever have this problem? You just rebuilt a machine, and when you go to SSH into it, you get the following message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

Many people just go edit their ~/.ssh/known_hosts file and carry on. But there is a faster/better way!

OpenSSH comes with a command called ssh-keygen that allows you to generate and manage all of your keys, including the host fingerprints recorded in your known_hosts file.

Simple usage for this would be:

ssh-keygen -R HOSTNAME
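
For example, to clear the stale entry for a freshly rebuilt web server (the hostname here is illustrative):

    ssh-keygen -R web01.example.com

This removes any matching entries from ~/.ssh/known_hosts and saves the original file to known_hosts.old, so your next connection will prompt you to accept the new fingerprint. If you also connect to the machine by IP address, run the same command against the IP as well.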

Automatic MongoDB Backups to S3

One of the big problems with hosting your own database is that you have to back it up on a regular basis. Not only do you need regular backups, you also need to keep copies offsite. Luckily, Amazon S3 provides a cheap and easy home for those offsite backups.

I found a shell script solution for handling MongoDB backups, but it only does local backups. It keeps a nice history of recent backups and rotates off the oldest ones once an age threshold is reached. I modified it to call a Python script that synchronizes each newly created backup file to S3. I haven’t wired up any purging functionality yet, and I don’t know if I am going to; S3 storage is so cheap that it hardly matters. A complete solution would, of course, keep your local files and your remote off-site backups in S3 in sync, but there is also a case to be made for keeping a rich history of backups in the “cloud” so that you can revert to any point in history if necessary.
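
The hook itself is just one line at the end of the backup routine; something along these lines (the script path and variable names are illustrative, not the actual AutoMongoBackup internals):

    # after the archive is written, push the fresh backup file to S3
    python /opt/scripts/mongo_s3_backup.py set "$BACKUPDIR/$FILE.tgz"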

The script that does the magic to synchronize and purge old backups is written in Python, and uses the boto library to quickly do the work.

from boto.s3.connection import S3Connection
from boto.s3.key import Key

ACCESS_KEY = 'YOUR_ACCESS_KEY'
SECRET = 'YOUR_SECRET_KEY'
BUCKET_NAME = 'YOUR_BACKUPS_BUCKET'  # note that you need to create this bucket first

def save_file_in_s3(filename):
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    k = Key(bucket)
    k.key = filename
    k.set_contents_from_filename(filename)

def get_file_from_s3(filename):
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    k = Key(bucket)
    k.key = filename
    k.get_contents_to_filename(filename)

def list_backup_in_s3():
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    for i, key in enumerate(bucket.get_all_keys()):
        print "[%s] %s" % (i, key.name)

def delete_all_backups():
    #FIXME: add a confirmation prompt; this removes every backup in the bucket
    conn = S3Connection(ACCESS_KEY, SECRET)
    bucket = conn.get_bucket(BUCKET_NAME)
    for i, key in enumerate(bucket.get_all_keys()):
        print "deleting %s" % (key.name)
        key.delete()

if __name__ == '__main__':
    import sys
    if len(sys.argv) < 2:
        print 'Usage: %s <set|get|list|delete> [filename]' % sys.argv[0]
    elif sys.argv[1] == 'set' and len(sys.argv) > 2:
        save_file_in_s3(sys.argv[2])
    elif sys.argv[1] == 'get' and len(sys.argv) > 2:
        get_file_from_s3(sys.argv[2])
    elif sys.argv[1] == 'list':
        list_backup_in_s3()
    elif sys.argv[1] == 'delete':
        delete_all_backups()
    else:
        print 'Usage: %s <set|get|list|delete> [filename]' % sys.argv[0]
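
Saved as mongo_s3_backup.py (the filename is my choice; call it whatever you like), typical usage looks like this:

    python mongo_s3_backup.py set /var/backups/mongodb/2011-01-15.tgz
    python mongo_s3_backup.py list
    python mongo_s3_backup.py get /var/backups/mongodb/2011-01-15.tgz

Note that the S3 key is the filename exactly as given, so retrieve a backup with the same path you used to store it.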

There is obviously a lot more work to be done on this script, but it’s a start.

The appropriate setup for using this script and the AutoMongoBackup utility is to create a slave MongoDB node that replicates from the master and run the backups there. If you can tolerate having your Mongo instance locked for reads/writes while a backup runs (i.e. you have a small database that backs up quickly), then you more than likely do not need the slave setup.
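
Once the pieces are in place, scheduling is a one-line cron job; here is a sketch, assuming the backup script lives at /opt/scripts/automongobackup.sh (the path and schedule are illustrative):

    # run the MongoDB backup (and S3 sync) every night at 2:00 AM
    0 2 * * * /opt/scripts/automongobackup.sh >> /var/log/mongo_backup.log 2>&1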

Anyway, hope this helps! I’d love to hear other ideas about how else this can be done.

Problems With Facebook API and M2Crypto

After doing some crypto updates to a Django application that I am working on, I discovered that the Facebook API was dog slow for retrieving any query over HTTPS. It turns out that the M2Crypto library hijacks urllib's SSL handling and mucks everything up. Thanks to this handy blog post, I was able to fix my Python implementation of the Facebook API and get things speeding along again.

The fix is basically to add the following lines before any urllib.urlopen() call (in my case, I only have two - one for GET and one for POST):

        urllib._urlopener = urllib.FancyURLopener()
        urllib._urlopener.addheader('Connection', 'close')
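
For context, here is roughly where those lines sit in my wrapper; a minimal sketch of the GET path (the function name and structure are mine, not part of the Facebook library):

    import urllib

    def facebook_get(url):
        # swap in a fresh opener to undo M2Crypto's monkey-patching of
        # urllib's HTTPS handling, and force the connection closed
        urllib._urlopener = urllib.FancyURLopener()
        urllib._urlopener.addheader('Connection', 'close')
        return urllib.urlopen(url).read()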