Terrible Python performance

Recently I created small Python program for myself which can generate sudoku matrixes (initially I designed it to solve ken-ken puzzles), and decided to compare its performance in different languages. I have implemented exactly the same algorithms in C#, Java and JavaScript (javascript was essential here to kill the “Python is interpreted” excuse). So, the algorithm generates 15x15 sudoku matrix and the result is exactly the same in all 4 languages. But I was stunned by the performance comparison:

Java, OpenJDK 17 → 2.7 seconds
C#, .net 8 → 7.3 seconds
JavaScript, NodeJS 21 ( same result in Chrome 121) → 4.6 seconds
Python, Cpython 3.12 → 106.5 seconds
Python, Jython 2.7 → 288.7 seconds

Note, unlike the other languages, Python performance wasn’t stable, 106 seconds was the best result I could achieve, but it differs significantly from run to run, sometimes is was 200+ seconds. Note that the Algorithms doesn’t have any randoms, and do exactly the same operations each run. I tried the best for the algorithm to be optimal for Python, e.g. I tried using array.array instead of lists for the matrix, but nothing helped. Anyway, it is not about the algorithm itself, but about the performance comparison, why the hell Python is that bad?

1 Like
"""Just in case I put the code I used here, if you need other languages, let me know"""
"""sudoku gen"""
from datetime import datetime
from array import array

class SudokuGenerator:
    """Generates sudoku of the given size"""

    def __init__(self, matrix_size=5, verification_method=None) :
        self.matrix_size = matrix_size
        # self.matrix = [[0 for i in range(matrix_size)]
        #                 for i in range(matrix_size)]
        self.matrix = [array('i', (0 for _ in range(matrix_size))) for _ in range(matrix_size)]
        self.recursive_calls_counter = 0

    def start_generation(self):
        """kicks off generation"""
        return self.generate_tile(0)

    def generate_tile(self, current_tile):
        """generating tile according to the previously created ones"""
        self.recursive_calls_counter += 1
        if current_tile == self.matrix_size * self.matrix_size:
            return True
            
        for i in range(1, self.matrix_size + 1):
            # calculating current tile in 2 dimension matrix
            y = current_tile // self.matrix_size
            x = current_tile % self.matrix_size
            # checking against existing horizontal tiles
            found = False
            for xx in range(0, x):
                if self.matrix[xx][y] == i:
                    found = True
                    break
            # checking against existing vertical tiles
            if not found:
                for yy in range(0, y):
                    if self.matrix[x][yy] == i:
                        found = True
                        break
            if not found:
                self.matrix[x][y] = i
                result = self.generate_tile(current_tile + 1)
                if result:
                    return True
        return False

    def print_matrix(self):
        """printing matrix"""
        print("-" * self.matrix_size * 3)
        for n in range(0, self.matrix_size):
            msg = ""
            for m in range(0, self.matrix_size):
                msg += ", " + str(self.matrix[n][m])
            print(msg)
        print("-" * self.matrix_size * 3)

# sudokuGenerator = SudokuGenerator(5, verify_matrix)
start_time = datetime.now()
# do your work here

sudokuGenerator = SudokuGenerator(15)
res = sudokuGenerator.start_generation()
print("Found solution? ", res)
sudokuGenerator.print_matrix()
print("Recursive calls: ", sudokuGenerator.recursive_calls_counter)
print("type of matrix: ", type(sudokuGenerator.matrix))
end_time = datetime.now()
print("Duration: ", (end_time - start_time).total_seconds() * 1000, " ms")`Preformatted text`

That’s interesting. I am surprised that C# did it slower than Java and even JS. It might be because the code wasn’t as optimised as it could have been.

Python can be slow but almost doubling its execution time? It might be due to some background processes on your system.

Doubling? No, it is about 20 times(!) slower than others.
I was also surprised that Java was the fastest one and JS was faster than C# in this specific case. As for the optimization, it doesn’t matter here since it is just for comparison, and I’m not aware of any optimization of this algorithm which can be done specifically for the Python.

Also, I ran these tests multiple times without any order, so any background process running should have impacted all the languages, but others gave very stable performance, unlike Python.