A panel involving Mike Malone, Waldemar Kornewald, Adam Lowry, Ted Nyman, Jacob Kaplan-Moss, Eric Florenzano, and Alex Gaynor on NoSQL and Django.
We're pleased to announce that DjangoDose will be bringing you another exciting panel. This time our topic will be NoSQL and Django, and our guests will be:
We're also hoping to have someone representing CouchDB, but we don't know who it will be yet. We'll let you know when we do.
Like last time we've set up a Google Moderator for the panel, and we hope you'll submit and vote on questions that are interesting to you.
Apologies for the soft static noise on the podcast. Alex's new mic needs some more tweaking. We're working on it.
A panel involving Brian Rosner, Eric Florenzano, Alex Gaynor, James Tauber, Jannis Leidel, Kevin Fricovsky and James Bennett on reusable apps in Django. Or shall we say, well-written code that happens to be reusable.
We're happy to announce that this week we'll be recording a panel all about reusable applications. We're very happy to say that James Tauber, James Bennett, Kevin Fricovsky, and Jannis Leidel will be joining Brian, Eric, and me in bringing this to you. We want this to be about both the philosophical design points of reusable applications and the practical everyday issues developers face when working with and developing them. Most of all, we want to talk about the issues that matter to the community. To that end we've set up a Google Moderator where you can voice your own opinions on what you want us to talk about. You can find that here (keep in mind you don't need to have specific questions, you can simply vote on the ones you like best).
Finally, I want to say that this type of content is the kind of thing we're very excited to be doing with DjangoDose. With the rebranding of This Week in Django as DjangoDose we've got a great opportunity to do the kind of discussion that doesn't necessarily fit the format of a weekly podcast. If you've got any ideas for content, whether it's a roundtable discussion like this or an article, please let us know using our contact form.
The 1.1 release of Django brought several major new features. Prominent among them is aggregation, one of the most frequently requested additions to Django's ORM. The way this feature was added and implemented highlights several of Django's fundamental philosophies, and it opens up new possibilities. Fundamentally, aggregate support in Django is a way to answer queries such as "show me all the authors who have published more than 1 book" or "show me the most expensive book".
One of the most important things to understand about the aggregates API is that Django's ORM does not try to replace SQL, or to provide an API to SQL in Python. The purpose of Django's ORM is to represent some persistent datastore, and to provide a way to put objects in and get them out. As such, nothing in the API is SQL-specific, and you will never see a discussion of how to get a GROUP BY or HAVING clause in your query. Instead, the discussion around aggregates centers on answering two different kinds of questions. One is "What is the maximal (or minimal, or average, or count, etc.) value of some field in this group of objects?" The other is "For each of the objects in this group, what is the maximal (or minimal, or average, or count, etc.) value over some other group of related objects?" We can think of these types of queries in terms of what they return: the first returns a scalar value, while the second returns a group of objects (in our case a QuerySet) that carry some additional information with them. In the case of the second type of query, since all that additional data exists in our backend, we can do all the operations we would expect, such as further filtering, ordering, or even computing another aggregate over those values.
Let's consider the first type of query, since it is slightly simpler. For the purpose of our examples we are going to work with two models:
    from django.db import models


    class Author(models.Model):
        name = models.CharField(max_length=100)

        def __unicode__(self):
            return self.name


    class Book(models.Model):
        authors = models.ManyToManyField(Author)
        title = models.CharField(max_length=200)
        price = models.DecimalField(decimal_places=2, max_digits=6)

        def __unicode__(self):
            return self.title
These are two relatively simple models with which we can demonstrate most of the features of aggregation. For example, a simple question might be how much the most expensive book costs:
    >>> from django.db.models import Max
    >>> Book.objects.aggregate(max_price=Max('price'))
    {'max_price': Decimal('56.49')}
There are a few important things to note here. First, aggregate() is a terminal method on a QuerySet, which means that unlike a lot of other QuerySet methods you can't chain more methods afterwards. Secondly, aggregate() returns a dictionary, mapping each alias provided to aggregate() to its result value. aggregate() itself takes any number of keyword arguments, with the keyword being the alias and the value being the aggregate itself. You can also give aggregate() positional arguments, in which case the alias is a default one constructed from the field being aggregated on and the type of aggregate being performed. For example, Max('price') passed positionally gets the alias price__max.
Out of the box Django provides support for seven types of aggregates: sum, maximum, minimum, average, count, standard deviation, and variance. It is also possible to create your own aggregation classes, which modules like GeoDjango (django.contrib.gis) take advantage of. Each aggregate class is instantiated with a string that refers to the field the aggregation should be performed over, with full support for the "__" syntax to refer to related fields, as seen elsewhere in Django's ORM.
The second type of query performs an operation for each item in the QuerySet. So, for example, we might ask how many authors each book has:
    >>> from django.db.models import Count, Max
    >>> books = Book.objects.annotate(num_authors=Count('authors'))
    >>> books
    [<Book: Pro Django>, <Book: Practical Django Projects>, <Book: The Definitive Guide to Django>]
    >>> [book.num_authors for book in books]
    [1, 1, 2]
We might only want books that have more than one author:
    >>> Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1)
    [<Book: The Definitive Guide to Django>]
Or we might want to know the greatest number of authors any book has:
    >>> Book.objects.annotate(num_authors=Count('authors')).aggregate(max_authors=Max('num_authors'))
    {'max_authors': 2}
There are several features that distinguish the annotate() method from the aggregate() one. First, it returns another QuerySet. Second, each object it returns is a normal Model instance, except that it has extra attributes corresponding to the aggregates that were requested. However, like the aggregate() method, it takes any number of keyword or positional arguments, which are handled in the same way. The QuerySet can be further manipulated, but fundamentally what annotate() does is give us access to an extra value on each object.
Strictly speaking, these were all computations that we could have done before in pure Python. However, there are a number of distinct advantages to doing them at the datastore level. First, it's going to be faster. Compared to our datastore, Python is going to be very slow at these calculations. Our datastore is built to do these computations over large numbers of records, so we should let it do its job. Second, it saves bandwidth. To perform an annotation, sort by its result, and take a subset of that data in Python, we would need to pull in every single record. Depending on the size of our dataset this could mean pulling in millions of records, which is infeasible.
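To make the comparison concrete, here is a rough pure-Python sketch of what the earlier queries compute. It uses plain dictionaries as stand-ins for Book rows; the prices and author placeholders are made up for illustration, and only the titles and author counts mirror the examples above. Note that every row has to be in memory before any of this can run, which is exactly the cost the datastore-level approach avoids.

```python
from decimal import Decimal

# Stand-ins for Book rows. Prices and author names are illustrative
# placeholders, not data from a real database.
books = [
    {"title": "Pro Django", "price": Decimal("49.99"),
     "authors": ["author_a"]},
    {"title": "Practical Django Projects", "price": Decimal("44.99"),
     "authors": ["author_b"]},
    {"title": "The Definitive Guide to Django", "price": Decimal("56.49"),
     "authors": ["author_c", "author_d"]},
]

# Roughly Book.objects.aggregate(max_price=Max('price')):
# one result value per alias, returned as a dictionary.
max_price = {"max_price": max(book["price"] for book in books)}

# Roughly Book.objects.annotate(num_authors=Count('authors')):
# every row gains an extra value.
annotated = [dict(book, num_authors=len(book["authors"])) for book in books]

# Roughly .filter(num_authors__gt=1) applied to the annotated rows.
multi_author_titles = [b["title"] for b in annotated if b["num_authors"] > 1]

# Roughly .aggregate(max_authors=Max('num_authors')) over the annotated rows.
max_authors = {"max_authors": max(b["num_authors"] for b in annotated)}

print(max_price)
print(multi_author_titles)
print(max_authors)
```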
The important point to take away is that when figuring out how to write a query using the Django ORM, it is best not to think of the query in terms of what the SQL would look like, but in terms of what question we are trying to answer. From there we can work out what aggregations or annotations we need, what filtering or ordering we need to perform, and what slicing we need to do. By following these steps it becomes much easier to put together queries that answer the questions we need answered.
It's been a long time since the last episode of This Week in Django, where we had our last Tracking Trunk segment. In this episode, we talk about what we envision the Tracking Trunk of Django Dose will look like. We also talk about some of the history and a little about why the format is different from how it used to be.
When you're developing a personal website, a blog, or a website for a small client, you typically don't spend a whole lot of time worrying about setting up dedicated development, staging, and production environments. When that small client becomes bigger, your personal website becomes a brand, or you accidentally release broken code, that's when it makes sense to start separating out those environments.
From a server standpoint, it's fairly straightforward to set up development and staging environments: you either buy new servers, set up virtual servers, or change the configuration on existing servers to deploy a different checkout of your Django project. Essentially, whatever you did to deploy your project to production, you repeat that process to prepare the server for development and staging environments.
The more interesting question for our purposes is how to deal with these new environments from the Django side. The solution that I use is simple: an environment variable and some settings overrides files. Here's the code that I put at the bottom of my settings.py file to make it all work:
    import sys
    import os

    FLAVOR = os.environ.get('FLAVOR', 'localdev')

    def override_settings(dottedpath):
        try:
            _m = __import__(dottedpath, fromlist=[None])
        except ImportError:
            pass
        else:
            _thismodule = sys.modules[__name__]
            for _k in dir(_m):
                setattr(_thismodule, _k, getattr(_m, _k))

    override_settings('settings_overrides.' + FLAVOR)
    override_settings('local_settings')
There's a lot packed into this small snippet of code. Firstly, the FLAVOR setting is what determines which type of environment we want to run. As you can see, it defaults to localdev which is for the case where you're developing on your local machine. You can set this environment variable to whatever you want. Typical values for FLAVOR include "dev", "staging", "prod", and "test".
The next part of this snippet is the override_settings function. It takes a dotted path to a settings file and imports everything in that file into the current settings file. If there are duplicate settings, the one from the overrides file wins. If the file can't be imported, it is silently skipped. As you can see, the settings from e.g. 'settings_overrides.staging' are imported first, according to the current FLAVOR, and then finally local settings are imported if they are found.
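The layering order can be seen in a small, self-contained sketch. It is not the snippet above: it builds hypothetical in-memory modules (demo_settings, demo_overrides.staging) instead of reading files, uses importlib.import_module rather than __import__, and copies only upper-case names rather than everything in dir(). Those are assumptions made to keep the example runnable on its own.

```python
import importlib
import sys
import types


def register_module(name, **values):
    """Create an in-memory module so the sketch needs no files on disk."""
    module = types.ModuleType(name)
    module.__dict__.update(values)
    sys.modules[name] = module
    return module


def override_settings(target, dottedpath):
    """Copy upper-case names from the module at dottedpath onto target.

    Returns False when the module cannot be imported, mirroring the way
    a missing overrides file is silently skipped.
    """
    try:
        source = importlib.import_module(dottedpath)
    except ImportError:
        return False
    for key in dir(source):
        if key.isupper():
            setattr(target, key, getattr(source, key))
    return True


# Hypothetical base settings and a staging overrides module.
base = register_module("demo_settings", DEBUG=True, SITE_NAME="example")
register_module("demo_overrides.staging", DEBUG=False)

flavor = "staging"
applied_flavor = override_settings(base, "demo_overrides." + flavor)
# No such module exists, so this layer is skipped.
applied_local = override_settings(base, "demo_local_settings")

print(base.DEBUG, base.SITE_NAME, applied_flavor, applied_local)
```

The overridden value (DEBUG) comes from the staging module, while the setting it doesn't mention (SITE_NAME) keeps its base value.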
As an example of how this all would work, here's how the directory structure could look:
    settings.py
    settings_overrides/
        dev.py
        staging.py
        prod.py
        test.py
Now that we have this setup, we can simply set this environment variable before running any manage.py commands:
FLAVOR=dev python manage.py syncdb
The above would run syncdb on the development database (the DATABASE_* options from settings_overrides/dev.py). You can also use the export command to set the environment variable for the rest of the terminal session:
    export FLAVOR=dev
    python manage.py cleanup
    python manage.py syncdb
Also it's easy to set the environment variable using the os.environ dictionary, so if you're using mod_wsgi and Apache, you might have a production wsgi file that looked something like this:
    import os
    import django.core.handlers.wsgi

    os.environ['FLAVOR'] = 'prod'
    os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

    application = django.core.handlers.wsgi.WSGIHandler()
This technique is a simple, straightforward, and effective way of splitting out development, staging, and production environments for Django. It makes it easy to keep the settings isolated and cleanly separated. Watch out though, because sometimes it can be easy to forget what your FLAVOR environment variable is set to!
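One way to guard against forgetting is to report the active flavor whenever settings load. The helper below is a hypothetical sketch, not part of the setup described above: describe_flavor and KNOWN_FLAVORS are names invented for illustration, and the list of usual values is simply the set mentioned in this post.

```python
import os

# The values mentioned in this post; adjust to whatever flavors you use.
KNOWN_FLAVORS = ("localdev", "dev", "staging", "prod", "test")


def describe_flavor(environ):
    """Return a one-line summary of the FLAVOR found in a mapping like os.environ."""
    flavor = environ.get("FLAVOR", "localdev")
    note = "" if flavor in KNOWN_FLAVORS else " (not one of the usual values)"
    return "FLAVOR=%s%s" % (flavor, note)


# In a settings file you might pass os.environ; explicit dictionaries are
# used here so the output doesn't depend on the current shell.
print(describe_flavor({}))
print(describe_flavor({"FLAVOR": "prod"}))
print(describe_flavor({"FLAVOR": "stagin"}))
```

Since FLAVOR can be set to anything, the sketch only flags unfamiliar values instead of rejecting them.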
How do you manage development, staging, and production environments? Share your tips and tricks in the comments below.
We are pleased to announce that Django Dose, a Django content site, is now online! We couldn't be happier to have the terrific DjangoCon conference as the place to launch this new site. If you're familiar with This Week in Django, you may notice some similarities, and with good reason: Django Dose is the successor to TWiD. It has only taken us way too long to get Django Dose up and running. One of our goals with Django Dose was to completely revamp the internals of the site. We wanted to add much more automation to the process, which will help streamline content and make it easier for us to get you all of your Django-related content. We still have a little way to go, but we think we're off to a great start.
This Week in Django had the fundamental problem of being tied to a weekly format. While we don't want to change that drastically, we felt we needed some change. One of the big things we've changed is that we've broken the podcast into two. We now have Tracking Trunk and Community Catchup. The reason for this was that the two couldn't co-exist very well. There would be weeks with either no changes or only documentation commits. That didn't bode well for a weekly podcast that tried to keep the content going. We hated having to skip the Tracking Trunk segment because there weren't any commits. Breaking it off now gives us the ability to record and produce the podcast when important content is available.
In addition to breaking off Tracking Trunk and Community Catchup, we decided to make the callcasts a bigger part of our site. They are now a team effort and we hope to keep them coming. As the site matures and grows, we expect that it won't stay limited to Tracking Trunk, Community Catchup, and Callcasts. But we can't do it all alone! It is our profound hope that with Django Dose, more people in the community will get involved. If you've got a great idea for an article, audio content, or video content, don't hesitate to drop us a line. We also love guest authors, so we may contact you to ask permission to re-post your great articles on Django Dose, where hopefully they can reach a larger audience.
An obvious change has been the site design. With a new brand, we wanted a new look. Greg Newman graciously stepped up to help us out. We cannot thank him enough for the work he has put in to making Django Dose look good and work well. The site has been tweaked a bit to feature the content we feel is important.
We've enjoyed DjangoCon so far and we hope you have too! We're really looking forward to hearing back from you about this new version of the site, so if you're at DjangoCon, then make sure to grab us and tell us about your ideas. Oh, and make sure to subscribe, so you can get the latest Django Dose content!
Alex Gaynor has been contributing more and more to Django and its community. This summer, he even worked as a student in Google Summer of Code to implement Multiple Database support for Django. We finally got a chance to catch up with him and talk to him about his involvement with Django, his work on Multiple Databases, and more.
An introduction to what Django Dose is and what it will be in the future. We talk about the impetus for changing formats from that of This Week in Django into what it is today. We also talk about our hopes for the podcast as it evolves and grows, and we ask for your feedback and input on a variety of topics. We're very pleased to have the opportunity to bring Django content to you, but we can't do it without your input. Please let us know what you think about this new format in the comments and as feedback to us.