Hi all, welcome back. It's been a long time since I've written a blog post, and it feels so good to be back doing it again.
So, today I wanna talk about a way of calling an ASM function from Python. This is part 0 (maybe). Why maybe? Well, the approach we're discussing today covers ASM written as inline assembly, not ASM written in its own file. I might (might) cover that soon; I just haven't figured out a 'beautiful' way of doing it yet. I wish I was good at programming.
Coming back to today's topic, I will try my level best to explain every line of code that I write and make sure that you understand everything that's written here.
Let's begin.
Let's have an objective, what are we building today?
We are going to build a function that adds two numbers, but with a twist (yep, that's it). The twist is that instead of the usual a + b we will do a + b + 1700. Why 1700? It's the first number that came to my mind.
Here is a list of all the files that we have to 'fill out'.
add.c
add.h
builder.py
add.c
The add.c file contains the function that Python calls and the function containing the ASM code that adds two numbers (with a twist, of course).
So, here is the code, and I will explain what's happening.
__attribute__((naked)) int add_asm(int a, int b) {
    __asm__(
        ".intel_syntax noprefix\n"
        "xor rax, rax\n"
        "add rax, rdi\n"
        "add rax, rsi\n"
        "add rax, 1700\n" // add_asm twist
        "ret\n"
        ".att_syntax prefix\n");
}

int add_c(int a, int b) {
    return add_asm(a, b);
}
Let's "decode" what's happening.
The add_c function is just a simple wrapper that calls the add_asm function. The add_c function is what we will be calling from Python.
You might have wondered what __attribute__((naked)) is. Well, it's our way of instructing the C compiler (GCC, in my case) not to add any prologue or epilogue to the final compiled assembly. In other words, the ASM that I write must be the only ASM code that exists when this function is compiled. That's also why we have to write the ret ourselves.
__asm__ allows us to write inline assembly in C.
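Now let's understand the assembly code that's written. The first thing we do is tell the assembler, with the .intel_syntax noprefix directive, that we will be writing assembly in Intel syntax (because I don't like AT&T syntax). At the end, we switch back to AT&T syntax with .att_syntax prefix, because that's the default GCC uses for the rest of the code it generates.
Now, when it comes to adding, we do the following steps.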
xor rax, rax ; set rax to 0
add rax, rdi ; add rdi (parameter a) to rax
add rax, rsi ; add rsi (parameter b) to rax
add rax, 1700 ; add 1700 to rax
ret ; return the control back to the caller
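You might be wondering how I know where the parameters are stored. Well, it's just a convention that everyone follows, namely the System V AMD64 calling convention used on 64-bit Linux and macOS (Windows uses a different one, so the registers would change there). According to this convention, the first two integer arguments are passed in the rdi and rsi registers, and the return value of a function must be stored in the rax register.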
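If you'd like to see the register arithmetic without touching a compiler, here is a minimal Python sketch that mimics the five instructions. The names simulate_add_asm and to_signed32 are made up for this illustration, and the sketch assumes the arguments arrive sign-extended in rdi and rsi; it is a model of the idea, not what the CPU literally executes.

```python
MASK64 = (1 << 64) - 1  # registers such as rax hold 64 bits


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed C int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def simulate_add_asm(a, b):
    """Hypothetical Python model of the instructions in add_asm."""
    rdi = a & MASK64             # first integer argument (assumed sign-extended)
    rsi = b & MASK64             # second integer argument (assumed sign-extended)
    rax = 0                      # xor rax, rax
    rax = (rax + rdi) & MASK64   # add rax, rdi
    rax = (rax + rsi) & MASK64   # add rax, rsi
    rax = (rax + 1700) & MASK64  # add rax, 1700
    return to_signed32(rax)      # ret; the int result is the low 32 bits of rax


print(simulate_add_asm(10, 10))
print(simulate_add_asm(0, -700))
```

Because the function returns an int, only the low 32 bits of rax matter to the caller, which is why negative arguments still come out right even though the additions are done on 64-bit registers.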
add.h
For simplicity's sake, I'm just gonna say that the add.h header file contains the declaration of the function that we want to call from Python.
#ifndef AD_ADD_H
#define AD_ADD_H
int add_c(int, int);
#endif
The AD_ADD_H include guard can be named anything you want. The way I like to name it is the first two letters of my first name followed by the name of the header file.
builder.py
This file will contain all the instructions on how to build the python module that we will finally import the function from.
from cffi import FFI

CDEF = '''
int add_c(int, int);
'''

ffibuilder = FFI()
ffibuilder.cdef(CDEF)
ffibuilder.set_source(
    '_add_lib',
    '''
    #include "add.h"
    ''',
    include_dirs=['.'],
    sources=['add.c']
)

ffibuilder.compile(verbose=True)  # compile with verbose output
We will be using cffi to interface with the C code. It will build a Python wrapper for the C function based on the info that we provide.
You can install the package with the following command.
pip3 install cffi
Now, let's break down the code.
First, we create an instance of the FFI() class, then we pass it the C declarations that we will be using from Python.
Then we pass on the information required to build the Python module.
The first parameter to set_source is the name of the module that it will build, so that when we finally import the module we can do this.
import _add_lib
You can think of the next parameter as the extra C code that gets compiled into the generated module. In our case, and in most use cases, it just includes the header files that we created.
The include_dirs argument takes a list of directories in which the compiler will look for the header files that we've included.
The sources argument takes a list of the C files to compile and link into the module; that's where the definition of the function we declared lives.
Let's run it.
To compile, run the following command.
python3 builder.py
We can test it using the python REPL.
>>> from _add_lib.lib import add_c
>>>
>>> add_c(10, 10)
1720
>>> add_c(0, 0)
1700
>>> add_c(0, -700)
1000
>>>
Yay, that works.
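If you'd rather not eyeball REPL output, here is a small sketch of a checker that compares any add function against a pure-Python reference. The helpers expected and check_add and the CASES list are made up for this illustration. To keep the snippet runnable without the compiled module, it checks the reference against itself; after building, you would pass add_c from _add_lib.lib instead, and I'd expect the same empty list.

```python
def expected(a, b):
    """Pure-Python reference: a + b + 1700, wrapped to a signed 32-bit C int."""
    total = (a + b + 1700) & 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total


def check_add(add_fn, cases):
    """Return the (a, b) pairs for which add_fn disagrees with the reference."""
    return [(a, b) for a, b in cases if add_fn(a, b) != expected(a, b)]


# Illustrative inputs; the last one wraps past the largest 32-bit int.
CASES = [(10, 10), (0, 0), (0, -700), (-1700, 0), (2**31 - 1, 0)]

# Stand-in call so the snippet runs on its own; swap in add_c after building.
print(check_add(expected, CASES))  # an empty list means every case matched
```

The reference wraps to 32 bits on purpose: the C function returns an int, so a sum that no longer fits should wrap around rather than grow the way Python integers do.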
The End.
Well, that's it. If you have any doubts or questions, feel free to ask me through the comments, DM, or mail. If you want more complicated examples, like using strings and whatnot, just comment down below and I might make a dev post on it.
Byeee... Happy Hacking 🤪