Double sorting - list of Tuples

Here’s my code for taking an example string and creating a list of tuples of the alphabetic characters in the string along with their frequencies of occurrence. Sorting the list of tuples by frequency is easy. The harder task for me was sorting the characters from earliest in the alphabet to latest while keeping the frequency order, that is, only sorting alphabetically among characters that share the same frequency. I did this by creating a new, empty list, then for each frequency building a list of the matching characters sorted alphabetically, and finally appending each sorted list in turn to the new list. Is there a simpler, more elegant way of accomplishing this result? (I feel like there should be, but I can’t think of it.)

My code:

from pprint import pprint

phrase = "This is a common interview question. To be or not to be, that is the question."
char_frequency = dict()
for char in phrase.lower():
    if char.isalpha():
        char_frequency.setdefault(char, 0)
        char_frequency[char] += 1

max_frequency = max(char_frequency.values())
char_frequency_sorted = sorted(char_frequency.items(),
                               key=lambda kv: kv[1],
                               reverse=True)
pprint(char_frequency_sorted, width=12)

print()
for char, count in char_frequency_sorted:
    if count != max_frequency:
        break
    else:
        print(f"Maximum Occurring Character: {char}, Count = {count}")

print()
counts = list(set(char_frequency.values()))
counts.sort(reverse=True)
updated_sort = []
for count in counts:
    filtered_count_list = list(filter(lambda item: item[1] == count, char_frequency_sorted))
    updated_sort += sorted(filtered_count_list, key=lambda item: item[0])
pprint(updated_sort, width=12)

Is there a simpler, more elegant way of accomplishing this result?

The word you are looking for is “pythonic” (as in, “a more pythonic way to do this”), and the answer is yes. Python can sort by two criteria at once if the key function passed to sorted() returns a tuple.

So for this case you would do something like:

result = sorted(
  char_frequency.items(),
  key=lambda kv: (-kv[1], kv[0]))

NOTE: the negative sign reverses the sort order of the numeric key, so higher counts come first. This trick only works on numeric values.
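For anyone who wants to try this end to end, here is a minimal sketch that reuses the phrase from the question. Using collections.Counter for the counting step is my own substitution for the original dict loop, not something the question requires.

```python
from collections import Counter

phrase = "This is a common interview question. To be or not to be, that is the question."

# Count only alphabetic characters, ignoring case.
char_frequency = Counter(ch for ch in phrase.lower() if ch.isalpha())

# Primary key: count, descending (hence the negation).
# Secondary key: the character itself, ascending.
result = sorted(char_frequency.items(), key=lambda kv: (-kv[1], kv[0]))

for char, count in result:
    print(char, count)
```

A single call replaces the loop over distinct counts, because ties on the first tuple element are broken by the second.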

Thank you very much. This is very helpful!

I created some new code with a list of 3-item tuples to extend my understanding further. I sort first by the middle item of the tuple (first integer, greatest to least), then by the last item of the tuple (second integer, greatest to least), then finally by the first item, the character, from latest in the alphabet to earliest. What I noticed is that the negative sign in the lambda expression does not work for sorting the letters in reverse order, which fits your note that it applies to numeric types. Instead, if I want to reverse the order of the letters, I need to pass the keyword argument reverse=True to sorted(). Is this the most “Pythonic” way to accomplish this effect? Also, do you have a reference I can read with more details and examples of this type of sorting on a list of tuples with multiple items in the tuple?

New code:

new_list = [
    ("p", 3, 5),
    ("i", 2, 7),
    ("d", 1, 8),
    ("b", 7, 4),
    ("v", 4, 9),
    ("t", 8, 2),
    ("i", 4, 1),
    ("d", 1, 9),
    ("p", 3, 7),
    ("c", 7, 1),
    ("f", 7, 5),
    ("x", 4, 1),
    ("e", 1, 9),
    ("h", 7, 4),
    ("a", 7, 4),
    ("k", 8, 2),
    ("c", 8, 2),
    ("r", 8, 1),
]
new_list_sorted = sorted(
    new_list, key=lambda item: (item[1], item[2], item[0]), reverse=True
)
pprint(new_list_sorted, width=20)
# Lists the letters in reverse order for tuples with the same item[1] and item[2]

Output:

[('t', 8, 2),
 ('k', 8, 2),
 ('c', 8, 2),
 ('r', 8, 1),
 ('f', 7, 5),
 ('h', 7, 4),
 ('b', 7, 4),
 ('a', 7, 4),
 ('c', 7, 1),
 ('v', 4, 9),
 ('x', 4, 1),
 ('i', 4, 1),
 ('p', 3, 7),
 ('p', 3, 5),
 ('i', 2, 7),
 ('e', 1, 9),
 ('d', 1, 9),
 ('d', 1, 8)]

When the whole thing needs to be reversed, yes. However, if you only need to reverse one of the sort keys, then you would use a different approach. For example, suppose you had a list of strings and you wanted them sorted by length (shortest to longest) and reverse alphabetical (after sorting by length).

As noted in an answer like this one on Stack Overflow:

You could create a reversor class and use it to decorate the key

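The negative sign only works for reversing numeric sorting keys, but we can use this reversor class to reverse any comparable keys. (As far as I know, sorting only compares with <, and tuple comparison also uses == to find the first differing element, so __eq__ and __lt__ should be enough. I have added __gt__ below for completeness.)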

class reversor:
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
        return other.obj < self.obj

    def __gt__(self, other):
        return other.obj > self.obj

Which would be used like this to accomplish the sort:

list_of_strings = ["just", "some", "simple", "strings", "here"]
result = sorted(
  list_of_strings,
  key=lambda item: (len(item), reversor(item)))
print(result)  # ["some", "just", "here", "simple", "strings"]
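For the 3-item tuples in the question, a wrapper class may not be needed at all. The sketch below uses a shortened copy of that list (the name records is mine) and assumes the letters are always single characters, so that ord() can turn them into integers that accept a minus sign. For longer strings this does not work and something like the reversor class is still required.

```python
records = [("p", 3, 5), ("i", 2, 7), ("b", 7, 4), ("h", 7, 4), ("a", 7, 4)]

# Both integers descending (negated), letters ascending (left as-is).
numbers_desc_letters_asc = sorted(
    records, key=lambda item: (-item[1], -item[2], item[0])
)
print(numbers_desc_letters_asc)
# [('a', 7, 4), ('b', 7, 4), ('h', 7, 4), ('p', 3, 5), ('i', 2, 7)]

# Both integers ascending, letters descending.
# ord() maps a one-character string to its code point, which can be negated.
numbers_asc_letters_desc = sorted(
    records, key=lambda item: (item[1], item[2], -ord(item[0]))
)
print(numbers_asc_letters_desc)
# [('i', 2, 7), ('p', 3, 5), ('h', 7, 4), ('b', 7, 4), ('a', 7, 4)]
```

Each element of the key tuple gets its own direction, so reverse=True is only needed when every key should be flipped together.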

I suppose the authoritative guide is the Python documentation on sorting: Sorting HOW TO — Python 3.12.0 documentation

Interestingly, the guide points out that Python's sort is stable and shows how to build a multi-key sort from several passes, sorting by the secondary key first and the primary key last. Each pass is an O(N log N) sort, and because sorted() returns a new list, each pass also allocates space for N items. Depending on the size of the input, this may or may not be acceptable. I would view using a tuple for the sort key as more pythonic than the multi-pass approach in the documentation, but YMMV.
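For comparison, here is a rough sketch of the multi-pass approach applied to the same list of strings. The intermediate name by_alpha_desc is only for illustration.

```python
list_of_strings = ["just", "some", "simple", "strings", "here"]

# Pass 1: sort by the secondary key first (reverse alphabetical).
by_alpha_desc = sorted(list_of_strings, reverse=True)

# Pass 2: sort by the primary key last (length, shortest first).
# The sort is stable, so strings of equal length keep their pass-1 order.
result = sorted(by_alpha_desc, key=len)
print(result)  # ['some', 'just', 'here', 'simple', 'strings']
```

This gives the same result as the reversor version without a helper class, at the cost of one extra sort and one extra list.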

Really we are just taking advantage of the fact that the sortable key can be any comparable object and tuples are comparable so we can use them.
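A quick way to convince yourself of this is to compare a few tuples directly. The values below are made up for illustration.

```python
# Tuples compare element by element, left to right.
first_decides = (1, "b") < (2, "a")     # first elements differ, so they decide
tie_broken = (1, "a") < (1, "b")        # first elements tie, so the second decides
negated_count = (-8, "c") < (-7, "a")   # negating a count puts larger counts first
print(first_decides, tie_broken, negated_count)  # True True True

pairs = sorted([(2, "a"), (1, "b"), (1, "a")])
print(pairs)  # [(1, 'a'), (1, 'b'), (2, 'a')]
```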
