Calling Assembly Function From Python


Hi all, welcome back! It's been a long time since I've written a blog post, and it feels so good to be back doing it again.

So, today I wanna talk about a way of calling an ASM function from Python. This is part 0 (maybe). Why maybe? Well, the approach we are discussing today is about ASM written as inline assembly inside C, not ASM written in its own file. I might (might) cover that soon (I just haven't figured out a 'beautiful' way of doing it. I wish I was good at programming 😅.)

Coming back to today's topic, I will try my level best to explain every line of code that I write and make sure that you understand everything that's written here.

Let's begin 🤓

Let's have an objective, what are we building today?

We are going to build a function that adds two numbers, but with a twist (yep, that's it). The twist is that instead of the usual a + b we will do a + b + 1700. Why 1700? It's the first number that came to my mind.

Here is a list of all the files that we have to 'fill out': add.c, add.h, and builder.py.

add.c

The add.c file contains two functions: the one that Python calls, and the one that holds the ASM code that adds two numbers (with a twist, of course).

So, here is the code, and I will explain what's happening.

__attribute__((naked)) int add_asm(int a, int b) {
    __asm__(
        ".intel_syntax noprefix\n"

        "xor rax, rax\n"
        "add rax, rdi\n"
        "add rax, rsi\n"
        "add rax, 1700\n"  // add_asm twist
        "ret\n"

        ".att_syntax prefix\n");
}

int add_c(int a, int b) {
    return add_asm(a, b);
}

Let's "decode" what's happening.

The add_c function is just a simple wrapper that calls the add_asm function. The add_c function is what we will be calling from Python.

__attribute__((naked)): you might have wondered what this is. It's our way of instructing the C compiler (GCC, in my case) not to add any prologue or epilogue to the compiled function. In other words, the ASM that I write is the only code that exists in this function once it is compiled, which is also why the ret has to be written by hand.

__asm__ allows us to write inline assembly in C.

Now let's understand the assembly code that's written. The first thing we do is tell the assembler that we will be writing in Intel syntax (because I don't like AT&T syntax). At the end, we switch back to AT&T syntax, which is GCC's default, so that the rest of the compiler-generated assembly still assembles correctly.

Now, when it comes to adding, we do the following steps.

xor rax, rax        ; set rax to 0
add rax, rdi        ; add rdi (parameter a) to rax
add rax, rsi        ; add rsi (parameter b) to rax
add rax, 1700       ; add 1700 to rax
ret                 ; return the control back to the caller
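To make those steps concrete, here is a rough Python model of what the four instructions do. It is only a sketch: the dict standing in for the registers and the names add_asm_model and to_c_int are made up for illustration, and it assumes 64-bit registers with the result read back as a 32-bit int (since add_asm is declared as returning int).

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def to_c_int(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit C int."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


def add_asm_model(a, b):
    """Replay the four instructions on a dict that stands in for the registers."""
    regs = {
        "rdi": a & MASK64,   # parameter a (two's complement if negative)
        "rsi": b & MASK64,   # parameter b
        "rax": 0xDEADBEEF,   # whatever junk the caller left behind
    }
    regs["rax"] ^= regs["rax"]                           # xor rax, rax
    regs["rax"] = (regs["rax"] + regs["rdi"]) & MASK64   # add rax, rdi
    regs["rax"] = (regs["rax"] + regs["rsi"]) & MASK64   # add rax, rsi
    regs["rax"] = (regs["rax"] + 1700) & MASK64          # add rax, 1700
    return to_c_int(regs["rax"])                         # ret; caller reads the low 32 bits


print(add_asm_model(10, 10))   # 1720
print(add_asm_model(0, -700))  # 1000
```

The xor at the start matters because rax can hold anything when the function is entered. Only the low 32 bits of the sum end up in the int result, so whatever sits in the upper halves of the registers does not change the answer.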

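You might be wondering how I know where the parameters are stored. Well, it's a calling convention: on x86-64 Linux and macOS, the System V AMD64 ABI passes the first two integer arguments in the rdi and rsi registers, and the return value of a function must be stored in the rax register. (64-bit Windows uses a different convention, so this exact code targets Linux/macOS.)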
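The convention can be pictured as a small lookup table. The sketch below assumes the System V AMD64 ABI (what GCC uses on x86-64 Linux); the helper name assign_registers is hypothetical and exists only to illustrate which register each integer argument would land in.

```python
# Integer/pointer argument registers, in order, under the System V AMD64 ABI.
SYSV_INT_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")
RETURN_REG = "rax"  # integer return values come back in rax


def assign_registers(*args):
    """Map positional integer arguments to the registers they would arrive in."""
    if len(args) > len(SYSV_INT_ARG_REGS):
        raise ValueError("arguments beyond the sixth are passed on the stack")
    return dict(zip(SYSV_INT_ARG_REGS, args))


print(assign_registers(10, 10))  # {'rdi': 10, 'rsi': 10}
```

So for add_asm(a, b), a arrives in rdi and b in rsi, which is exactly what the assembly reads.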

add.h

For simplicity's sake, I'm just gonna say that the add.h header file contains only the declaration of the function that we want to call from Python.

#ifndef AD_ADD_H
#define AD_ADD_H

int add_c(int, int);
#endif

The include guard name AD_ADD_H can be anything you want. I like to use the first two letters of my first name followed by the name of the header file.

builder.py

This file will contain all the instructions on how to build the Python module that we will finally import the function from.

from cffi import FFI


CDEF = '''
int add_c(int, int);
'''

ffibuilder = FFI()
ffibuilder.cdef(CDEF)
ffibuilder.set_source(
    '_add_lib',
    '''
    #include "add.h"
    ''',
    include_dirs=['.'],
    sources=['add.c']
)
ffibuilder.compile(verbose=True)  # compile with verbose output

We will be using cffi to interface with the C code. It will build a Python wrapper for the C function based on the info that we provide.

You can install the package with the following command.

pip3 install cffi

Now, let's break down the code.

First, we create an instance of the FFI() class, then we pass it the C declarations that we will be using from Python.

Then we pass on the information required to build the Python module.

The first parameter to set_source is the name of the module that it will build, so that when we finally import the module we can do this.

import _add_lib

You can think of the next parameter as the extra C code that goes into the generated module. In our case, and in most use cases, it's just an #include of the header files that we created.

The include_dirs argument takes a list of directories in which the compiler will look for the header files that we've included.

The sources argument takes a list of the C files that are compiled and linked into the module, which is where the definitions of the functions we've declared come from.

Let's run it.

To compile, run the following command.

python3 builder.py

We can test it using the Python REPL.

>>> from _add_lib.lib import add_c
>>>
>>> add_c(10, 10)
1720
>>> add_c(0, 0)
1700
>>> add_c(0, -700)
1000
>>>

Yay, that works.
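If you'd rather not retype the REPL session every time, the same checks can live in a small script. This is just a sketch: CASES and run_checks are names I made up, and the pure-Python stand-in is only there so the script still runs when _add_lib hasn't been built yet.

```python
# The same three cases as the REPL session, with their expected results.
CASES = [((10, 10), 1720), ((0, 0), 1700), ((0, -700), 1000)]


def run_checks(add_fn):
    """Return (args, expected, got) for every case where add_fn disagrees."""
    failures = []
    for args, expected in CASES:
        got = add_fn(*args)
        if got != expected:
            failures.append((args, expected, got))
    return failures


try:
    from _add_lib import lib  # only importable after `python3 builder.py` has run
    add_fn = lib.add_c
except ImportError:
    # Assumption: fall back to plain Python so the script is runnable anywhere.
    def add_fn(a, b):
        return a + b + 1700

failures = run_checks(add_fn)
print("all good" if not failures else failures)
```

An empty failures list means the function agrees with a + b + 1700 on every case.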

The End.

Well, that's it. If you have any doubts or questions, feel free to ask me through the comments, DM, or mail. If you want more complicated examples, like using strings and whatnot, just comment down below and I might make a dev post on it.

Byeee... Happy Hacking 🤪