Python: why are * and ** faster than / and sqrt()?

#1
While optimising my code I realised the following:

>>> from timeit import Timer as T
>>> T(lambda : 1234567890 / 4.0).repeat()
[0.22256922721862793, 0.20560789108276367, 0.20530295372009277]
>>> from __future__ import division
>>> T(lambda : 1234567890 / 4).repeat()
[0.14969301223754883, 0.14155197143554688, 0.14141488075256348]
>>> T(lambda : 1234567890 * 0.25).repeat()
[0.13619112968444824, 0.1281130313873291, 0.12830305099487305]

and also:

>>> from math import sqrt
>>> T(lambda : sqrt(1234567890)).repeat()
[0.2597470283508301, 0.2498021125793457, 0.24994492530822754]
>>> T(lambda : 1234567890 ** 0.5).repeat()
[0.15409398078918457, 0.14059877395629883, 0.14049601554870605]

I assume it has to do with the way Python is implemented in C, but I wonder if anybody would care to explain why this is so?

#2
The (somewhat unexpected) reason for your results is that Python seems to fold constant expressions involving floating-point multiplication and exponentiation, but not division. `math.sqrt()` is a different beast altogether since there's no bytecode for it and it involves a function call.

On Python 2.6.5, the following code:

x1 = 1234567890.0 / 4.0
x2 = 1234567890.0 * 0.25
x3 = 1234567890.0 ** 0.5
x4 = math.sqrt(1234567890.0)

compiles to the following bytecodes:

# x1 = 1234567890.0 / 4.0
  4           0 LOAD_CONST               1 (1234567890.0)
              3 LOAD_CONST               2 (4.0)
              6 BINARY_DIVIDE
              7 STORE_FAST               0 (x1)

# x2 = 1234567890.0 * 0.25
  5          10 LOAD_CONST               5 (308641972.5)
             13 STORE_FAST               1 (x2)

# x3 = 1234567890.0 ** 0.5
  6          16 LOAD_CONST               6 (35136.418286444619)
             19 STORE_FAST               2 (x3)

# x4 = math.sqrt(1234567890.0)
  7          22 LOAD_GLOBAL              0 (math)
             25 LOAD_ATTR                1 (sqrt)
             28 LOAD_CONST               1 (1234567890.0)
             31 CALL_FUNCTION            1
             34 STORE_FAST               3 (x4)

As you can see, multiplication and exponentiation cost nothing at run time since they're done when the code is compiled: each is reduced to loading a precomputed constant. Division takes longer since it happens at run time. Square root is not only the most computationally expensive operation of the four, it also incurs various overheads that the others do not (attribute lookup, function call, etc.).
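If you want to check which expressions your own interpreter folds, something along these lines should work. It is only a sketch for Python 3 (the listing above is from 2.6.5): `is_folded` is a helper name I made up, and `x` is a hypothetical variable used to show an expression that cannot be folded. The check relies on the assumption that every binary-operation opcode name starts with `BINARY_`, which holds for the CPython versions I know of. Since Python 3 no longer has the `-Qnew` flag, I would expect the division to be reported as folded there too.

```python
import dis


def is_folded(expression):
    """Return True if no binary-operation instruction survives compilation."""
    code = compile(expression, "<example>", "eval")
    return not any(
        ins.opname.startswith("BINARY_") for ins in dis.get_instructions(code)
    )


for expression in (
    "1234567890.0 / 4.0",
    "1234567890.0 * 0.25",
    "1234567890.0 ** 0.5",
    "x * 0.25",  # x is a hypothetical variable, so nothing can be folded
):
    print(expression, "->", "folded" if is_folded(expression) else "run time")
```

An expression reported as "folded" is loaded as a ready-made constant, so timing it measures almost nothing beyond the lambda call.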

If you eliminate the effect of constant folding, there's little to separate multiplication and division:

In [16]: x = 1234567890.0

In [17]: %timeit x / 4.0
10000000 loops, best of 3: 87.8 ns per loop

In [18]: %timeit x * 0.25
10000000 loops, best of 3: 91.6 ns per loop

`math.sqrt(x)` is actually a little bit faster than `x ** 0.5`, presumably because it's a special case of the latter and can therefore be done more efficiently, in spite of the overheads:

In [19]: %timeit x ** 0.5
1000000 loops, best of 3: 211 ns per loop

In [20]: %timeit math.sqrt(x)
10000000 loops, best of 3: 181 ns per loop
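Outside IPython, roughly the same measurement can be made with the standard `timeit` module. This is a sketch, not the code I used for the numbers above: `best_ns` and `SETUP` are names I picked, and the loop count is an arbitrary choice. The key point is that `x` is assigned in the setup string, so the timed statements operate on a variable and cannot be constant-folded.

```python
import timeit

# The operand lives in a variable, so the compiler cannot fold the expression.
SETUP = "import math; x = 1234567890.0"


def best_ns(statement, number=200_000, repeat=3):
    """Best-of-`repeat` time per evaluation of `statement`, in nanoseconds."""
    timings = timeit.repeat(statement, setup=SETUP, repeat=repeat, number=number)
    return min(timings) / number * 1e9


for statement in ("x / 4.0", "x * 0.25", "x ** 0.5", "math.sqrt(x)"):
    print(f"{statement:<14}{best_ns(statement):8.1f} ns")
```

The absolute numbers depend heavily on the machine and the Python version, so only the relative ordering is worth reading into.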

**edit 2011-11-16:** Constant expression folding is done by Python's peephole optimizer. The source code (`peephole.c`) contains the following comment that explains why constant division isn't folded:

case BINARY_DIVIDE:
    /* Cannot fold this operation statically since
       the result can depend on the run-time presence
       of the -Qnew flag */
    return 0;

The `-Qnew` flag enables "true division", as defined in PEP 238.
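To make the difference concrete, here is a small Python 3 illustration, where `/` always behaves the way it did in Python 2 under `-Qnew`. The two function names are mine, purely for illustration. In Python 2 without the flag, `/` on two ints gave the floor result instead, which is why the optimizer could not know at compile time which value to store.

```python
def true_div(a, b):
    """PEP 238 true division: what "/" means in Python 3 (and under -Qnew)."""
    return a / b


def floor_div(a, b):
    """Floor division: what classic Python 2 "/" did for two ints."""
    return a // b


print(true_div(7, 2))
print(floor_div(7, 2))
```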




