Why does switch_to use push+jmp+ret to change EIP, instead of jmp directly?

#1
In `linux/arch/x86/include/asm/switch_to.h` there is the definition of the macro `switch_to`. The key lines that perform the actual thread switch read like this (until Linux 4.7, when this changed):

    asm volatile("pushfl\n\t"                   /* save flags */   \
                 "pushl %%ebp\n\t"              /* save EBP */     \
                 "movl %%esp,%[prev_sp]\n\t"    /* save ESP */     \
                 "movl %[next_sp],%%esp\n\t"    /* restore ESP */  \
                 "movl $1f,%[prev_ip]\n\t"      /* save EIP */     \
                 "pushl %[next_ip]\n\t"         /* restore EIP */  \
                 __switch_canary                                   \
                 "jmp __switch_to\n"            /* regparm call */ \
                 "1:\t"                                            \
                 "popl %%ebp\n\t"               /* restore EBP */  \
                 "popfl\n"                      /* restore flags */ \


The named operands have memory constraints like `[prev_sp] "=m" (prev->thread.sp)`. `__switch_canary` is defined to nothing unless `CONFIG_CC_STACKPROTECTOR` is defined (then it's a load and store using `%ebx`).
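To make the named-operand syntax concrete, here is a minimal sketch (not kernel code) of a single `[prev_sp] "=m"` output operand. The names `struct toy_task` and `save_stack_pointer` are hypothetical, and the non-x86 fallback via `__builtin_frame_address` is an assumption added only so the sketch builds elsewhere with GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-thread state. */
struct toy_task {
    uintptr_t sp; /* saved stack pointer */
};

/* Store the current stack pointer into t->sp through a named "=m" operand. */
static void save_stack_pointer(struct toy_task *t)
{
    assert(t != 0);
#if defined(__x86_64__) && !defined(__ILP32__)
    /* 64-bit counterpart of: movl %%esp,%[prev_sp] */
    __asm__ volatile("movq %%rsp,%[prev_sp]" : [prev_sp] "=m" (t->sp));
#elif defined(__i386__)
    __asm__ volatile("movl %%esp,%[prev_sp]" : [prev_sp] "=m" (t->sp));
#else
    /* Assumption: on other targets, approximate with the frame address. */
    t->sp = (uintptr_t)__builtin_frame_address(0);
#endif
}
```

Because the constraint is `"=m"`, the compiler passes a memory reference for `t->sp` and the instruction writes to it directly, without an intermediate register.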


I understand how it works: the kernel stack pointer is saved and restored, and `push next->eip` followed by `jmp __switch_to` acts as a "fake" `call` instruction. The real `ret` at the end of `__switch_to` then pops the pushed value, which effectively makes `next->eip` the return point of the next thread.

What I don't understand is why the hack is needed. Why not just `call __switch_to` and, after it returns, `jmp` to `next->eip`? That would be cleaner and more reader-friendly.
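The mechanism described in this post can be modelled in portable C. The following is a toy sketch, not the kernel's implementation: "addresses" are plain integers, the stack grows upward for simplicity, the flags/EBP saves and the body of `__switch_to` are left out, and every name (`toy_thread`, `toy_cpu`, `toy_switch_to`) is hypothetical.

```c
#include <assert.h>

#define TOY_STACK_SLOTS 8

/* Hypothetical stand-in for a task: a private stack plus saved sp/ip. */
struct toy_thread {
    int stack[TOY_STACK_SLOTS]; /* holds toy "addresses" */
    int sp;                     /* saved stack pointer (number of used slots) */
    int ip;                     /* saved resume address */
};

/* Toy CPU state: which stack is live and where its top is. */
struct toy_cpu {
    int *stack;
    int sp;
};

static void toy_push(struct toy_cpu *cpu, int value)
{
    assert(cpu->sp < TOY_STACK_SLOTS);
    cpu->stack[cpu->sp++] = value;
}

static int toy_pop(struct toy_cpu *cpu)
{
    assert(cpu->sp > 0);
    return cpu->stack[--cpu->sp];
}

/* Returns the toy address at which execution continues after the switch. */
static int toy_switch_to(struct toy_cpu *cpu, struct toy_thread *prev,
                         struct toy_thread *next, int resume_label)
{
    prev->sp = cpu->sp;        /* movl %%esp,%[prev_sp]  */
    cpu->stack = next->stack;  /* movl %[next_sp],%%esp  */
    cpu->sp = next->sp;
    prev->ip = resume_label;   /* movl $1f,%[prev_ip]    */
    toy_push(cpu, next->ip);   /* pushl %[next_ip]       */
    /* jmp __switch_to: body omitted; its final ret pops the pushed value. */
    return toy_pop(cpu);
}
```

Switching from a thread `a` to a thread `b` returns `b.ip` and records the resume label in `a.ip`, so a later switch back to `a` continues at that label, which is the role of `1:` in the macro.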



#2
There are two reasons for doing it this way.

One is to allow complete flexibility of operand/register allocation for `[next_ip]`. If you want to be able to do the `jmp *%[next_ip]` _after_ the `call __switch_to`, then `%[next_ip]` must be allocated to a _nonvolatile register_ (i.e. one that, by the ABI definitions, will _retain its value_ across a function call).

That restricts the compiler's ability to optimize, and the resulting code for `context_switch()` (the 'caller', where `switch_to()` is used) might not be as good as it could be. But for what benefit?

Well, that's where the second reason comes in: none, really, because `call __switch_to` followed by the jump would be equivalent to:

    pushl $1f
    jmp __switch_to
    1: jmp *%[next_ip]

That is, `call` pushes the return address, so you would end up with the sequence `push`/`jmp` (together `== call`)/`ret`/`jmp`. If you do not want to return to this place (and this code doesn't), you save a branch by "faking" the call, because you then only need `push`/`jmp`/`ret`. In effect, the code makes a _tail call_ here.

Yes, it's a small optimization, but avoiding a branch reduces latency and latency is critical for context switches.
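
The branch count can be checked with a small model. This is a sketch under simplifying assumptions, not a measurement: the "machine" is just a stack of integer addresses and a counter of control transfers, and all names (`toy_machine`, `transfers_with_real_call`, `transfers_with_fake_call`) are made up for illustration.

```c
#include <assert.h>

/* Toy code addresses; the values are arbitrary. */
enum { SWITCH_TO_ENTRY = 10, RETURN_SITE = 20, NEXT_IP = 30 };

struct toy_machine {
    int stack[4];
    int depth;     /* number of used stack slots */
    int pc;        /* current toy program counter */
    int transfers; /* control transfers taken so far */
};

static void machine_push(struct toy_machine *m, int addr)
{
    assert(m->depth < 4);
    m->stack[m->depth++] = addr;
}

static void machine_jump(struct toy_machine *m, int target)
{
    m->pc = target;
    m->transfers++;
}

/* ret: pop an address and transfer control to it. */
static void machine_ret(struct toy_machine *m)
{
    assert(m->depth > 0);
    machine_jump(m, m->stack[--m->depth]);
}

/* call __switch_to, then jmp to next_ip after it returns. */
static int transfers_with_real_call(void)
{
    struct toy_machine m = {0};
    machine_push(&m, RETURN_SITE);     /* call: push return address ...  */
    machine_jump(&m, SWITCH_TO_ENTRY); /* ... and jump to the callee     */
    machine_ret(&m);                   /* ret back to the call site      */
    assert(m.pc == RETURN_SITE);
    machine_jump(&m, NEXT_IP);         /* extra jmp to next_ip           */
    return m.transfers;
}

/* push next_ip, jmp __switch_to, and let its ret land on next_ip. */
static int transfers_with_fake_call(void)
{
    struct toy_machine m = {0};
    machine_push(&m, NEXT_IP);         /* pushl %[next_ip]               */
    machine_jump(&m, SWITCH_TO_ENTRY); /* jmp __switch_to                */
    machine_ret(&m);                   /* ret goes straight to next_ip   */
    assert(m.pc == NEXT_IP);
    return m.transfers;
}
```

In this model the real `call` variant takes three control transfers and the faked call takes two, which is the saved branch described above.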