Django annotate() multiple times causes wrong answers

Be careful when feeding multiple arguments to annotate() method(Django ORM) or using it in chain… it seems it does not return correct answers.
See the link below from stackoverflow:

As an example from Mosh course(Django p1):
Consider the customer with id 214, named ‘Carlin Doggrell’, her order ids are: 954, 936, 890, 382, 205(as you can see there are 5 orders placed by her)
When trying to calculate total number of orders placed by her, you can use annotate() this way:

customer = Customer.objects.all().annotate(total_orders=Count(F(‘order__id’)))

and this returns a correct answer… but when you feed another argument to annotate(), it seems it returns a wrong answer. for example:
customers = Customer.objects.all().annotate(total_orders=Count(F(‘order__id’)),
total_order_items=Count(F(‘order__orderitem__id’)))
here the total number of orders(for customer with id=214) changes to 11(being 5 and 5 is the correct answer), 9 is the total order items in her orders and it is correct(checked them inside the database). It seems for solving this problem, we have to set Count() distinct keyword to True as follows:
customers=Customer.objects.all().annotate(total_orders=Count(F(‘order__id’),
distinct=True),
total_order_items=Count(F(‘order__orderitem__id’)))
Is it a good idea or we would be better not to use annotate() with multiple arguments?

The problem is not multiple arguments passed to annotate(). Rather, it is when annotate() involves multiple tables.

If you inspect the query generated by Django ORM, it includes two left joins. So, if one guy ordered three items in one order, there will be three rows. For example

customer_id | order_id | orderitem_id
     1      |    1      |    1
     1      |    1      |    2
     1      |    1      |    3

That’s why you get an inflated result without using distinct.

However, distinct only works with Count(). What if you need to include sum? There is no solution to get the correct result without changing how you structure the SQL.

So, it’s not a good idea to use annotate when it results in multiple left joins. But there are still valid use cases for annotating multiple things.

1 Like

Thanks for your response. So if I understood correctly, annotate() returns correct answer when we have only one left joins and we would be better avoid more than one left joins.

Yea, just be careful when your annotate() spans over multiple tables. But as long as you test the result and inspect the query until you get the hang, you will be fine. Normally, in a situation like this, there are options like issuing multiple queries, using subqueries, or using raw SQL.

1 Like

Yes, big thank you for letting me know about the alternatives.