Python vs C Performance Part 2

Part 1 compared Python vs C, but left out optimizing C. So, here’s part 2:

GCC

The program I use to compile C programs is GCC (the GNU project C and C++ compiler). I had been using it with the default options, but after searching the internet for ways to optimize C with GCC, I came across this. Basically, GCC can apply certain optimizations that make the compiled program run faster. So, since I gave Python a chance with PyPy, it’s time to give C a chance with optimization. Let’s go!
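
As a quick (and simplified) reference, GCC groups most of its speed optimizations behind the -O flags. The file name below is just a placeholder:

gcc program.c -O0 -o program
gcc program.c -O1 -o program
gcc program.c -O2 -o program
gcc program.c -O3 -o program

-O0 is the default and applies essentially no optimization (which is what I had been using), while -O1 through -O3 enable progressively more aggressive optimizations. There is also -Os, which optimizes for size instead of speed.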

Testing

I have a program that finds prime numbers, written once in Python and once in C. Both use the same method of finding primes, but as in part 1, this is a general overview, and there are many other variables that can affect performance.

I will be using the following arguments for both programs:

5000 "/dev/null"

Since CPython was way slower than C in part 1, and C in turn was slower than PyPy, I will just be leaving CPython out of this comparison. That allows me to test with a higher number, 5000, because I don’t have to wait minutes for each attempt with CPython.
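
I won’t paste the real primeMine.c here, but to give an idea of the shape of such a program, here is a minimal trial-division sketch that takes a limit and an output file as arguments. Treat it as an illustration of the interface only, not the exact code I timed; the actual programs may use a different method:

/* Illustration only: a simple trial-division prime finder with the
   same command-line interface as primeMine (limit, output file). */
#include <stdio.h>
#include <stdlib.h>

static int is_prime(long n) {
    if (n < 2) return 0;
    for (long d = 2; d * d <= n; d++) {
        if (n % d == 0) return 0;
    }
    return 1;
}

int main(int argc, char *argv[]) {
    if (argc < 3) {
        fprintf(stderr, "usage: %s <limit> <output file>\n", argv[0]);
        return 1;
    }
    long limit = atol(argv[1]);       /* e.g. 5000 */
    FILE *out = fopen(argv[2], "w");  /* e.g. "/dev/null" */
    if (out == NULL) {
        perror("fopen");
        return 1;
    }
    for (long n = 2; n <= limit; n++) {
        if (is_prime(n)) {
            fprintf(out, "%ld\n", n);
        }
    }
    fclose(out);
    return 0;
}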

PyPy

If you want to make your own benchmark, here is the command I’m using for this one:

time pypy primeMine.py 5000 "/dev/null"

Attempt 1

real    0m17.291s
user    0m15.485s
sys 0m0.184s

Ok, this seems rather slow. I had deleted the __pycache__ folder before this run to see how it affects performance. Hopefully, it will be better next time.

Attempt 2

The __pycache__ folder is back, having been recreated sometime during the first run.

real    0m15.222s
user    0m15.096s
sys 0m0.088s

Ok, so it did perform a bit better. Maybe it’ll be even faster in the next attempt?

Attempt 3

Come on PyPy, I know you can do this!

real    0m15.340s
user    0m15.230s
sys 0m0.069s

Averages

  • Real: 15.95
  • User: 15.27
  • Sys: 0.11
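
(Each average is just the mean of the three attempts; for example, real = (17.291 + 15.222 + 15.340) / 3 ≈ 15.95 seconds.)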

C with no optimization

The command I will be using to compile is:

gcc primeMine.c -o primeMine

and I will run it with:

time ./primeMine 5000 "/dev/null"

I decided not to count compilation time, even for the first run. This may seem unfair, as I deleted the __pycache__ folder for PyPy before its first run. However, when distributing the Python code, PyPy will have to generate the cache on the first run. With C, I can distribute the binary, which means I can cut out compilation time.

Attempt 1

real    0m40.906s
user    0m40.787s
sys 0m0.065s

Wow, much slower than even PyPy’s first run.

Attempt 2

As it’s already compiled, I doubt there will be a substantial difference here.

real    0m41.276s
user    0m41.179s
sys 0m0.040s

Wow, it was actually a bit slower, although that is likely due to higher system load, or something like that.

Attempt 3

Staring at numbers gets boring real fast.

real    0m41.245s
user    0m41.128s
sys 0m0.044s

Averages

  • Real: 41.14
  • User: 41.03
  • Sys: 0.05

Optimizing C

So, according to this (or as far as I got into it 🙂), -O3 should give the best performance. Let’s compile:

gcc primeMine.c -O3 -o primeMine
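
If you’re curious about exactly what -O3 turns on, GCC can print the list of optimization options enabled at a given level:

gcc -Q -O3 --help=optimizers

The output is long, but it gives a sense of how much more -O3 does compared to the default.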

So, let’s do the above tests once more!

Attempt 1

real    0m10.145s
user    0m10.065s
sys 0m0.032s

Wow, just 10 seconds. That’s significantly faster than PyPy. Let’s see if the other tests go the same way.

Attempt 2

real    0m12.055s
user    0m11.671s
sys 0m0.056s

Ok, slightly slower, but that likely has nothing to do with the actual program.

Attempt 3

real    0m10.161s
user    0m10.069s
sys 0m0.036s

This makes more sense.

Averages

  • Real: 10.79
  • User: 10.60
  • Sys: 0.04

Ranking

Ok, so, now let’s rank in order of speed.

First place goes to optimized C, with an average of 10.79 seconds. This isn’t really surprising: unoptimized C was already quite close to PyPy in part 1, so it makes sense that optimizing it puts it at #1.

In second place is PyPy, with an average of 15.95 seconds. That isn’t far behind optimized C, which is impressive, especially considering it’s interpreted and was not compiled beforehand (or at all).

Finally, third place goes to normal C, with an average of 41.14 seconds. This is actually surprising, as the difference was much smaller in part 1. I suspect that after the initialization time, the difference becomes bigger and bigger the larger the numbers get. Still, it’s considerably faster than CPython.

So, what’s best?

Optimized C gives pretty much the best performance you can get without resorting to writing assembly or machine code. However, PyPy is not actually that much slower, and Python is one of the simplest programming languages.

I use Python to turn my ideas into programs. If the performance is good enough, then I just leave it at that. If I need more performance, then the next step is PyPy, which may or may not give a speed boost. If the majority of compute time is in Python modules written in C, and not Python itself, then PyPy may not give that much of a performance boost. Finally, if even PyPy is not fast enough, then it’s time to rewrite the program in C.

There is no best language for everything. But, the best language is one which performs well enough, but is also easy to maintain and improve. I very rarely write the first version of a program in C, and usually opt for Python. Python helps me see if the logic works, and how to improve it. Then, if needed, it can be rewritten into other languages for better performance.

Often, however, you will not need to choose just one programming language. There are many good programs that use a large number of languages, each one for a different purpose. One great example is the Linux kernel:

  • 96.3% C
  • 1.4% C++
  • 1.4% Assembly
  • 0.3% Objective C
  • 0.3% Makefile
  • 0.1% Perl
  • 0.2% Other

There are at least 7 languages in use, and that’s just the kernel itself. The OS likely uses even more; Ubuntu even has some Python (I’m pretty sure).

Basically

Just use what you want, and change it if your needs change. Don’t use a lower level language if it’s hard to understand, and difficult to maintain. The “best” language is up to you, but I’d recommend Python.

TL;DR

CPython is slow. C is faster than CPython. PyPy is faster than C. Optimized C is faster than PyPy.