Guide: Function Calling Conventions

www.delorie.com/djgpp/doc/ug/asm/calling.html

GCC follows certain rules in generating and calling its functions. If you are writing portable C or C++ code, you never need to know about these rules. However, if you are writing assembly language or nonportable code that depends on these rules, you need to know what they are. This document attempts to describe them, and gives some examples.

Notes

This document assumes a familiarity with assembly language. The assembler code used here is written in the AT&T syntax, as used by GNU as. If you're using an Intel-syntax assembler, like nasm, you'll have to translate appropriately.

What's described here are GCC's standard calling conventions. Many can be changed by using options like -mregparm, but that's outside the scope of this document.

These conventions apply to C. C++ introduces several additional complications (such as class pointers and name mangling), some of which can change between compiler versions. Thus, I suggest that asm functions called from C++ code be declared as extern "C". This will cause C calling conventions to be used.
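
As a rough sketch of that suggestion, a header shared between C and C++ callers might look like the following. The header name asmfuncs.h and its include guard are hypothetical; i_avg is the example function developed later in this document, assumed here to be implemented in assembly.

```c
/* asmfuncs.h (hypothetical name): declarations for functions written in asm. */
#ifndef ASMFUNCS_H
#define ASMFUNCS_H

#ifdef __cplusplus
extern "C" {    /* C++ callers: use C naming and C calling conventions */
#endif

/* Declared here; the body is assumed to live in an assembly file as _i_avg. */
int i_avg(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* ASMFUNCS_H */
```

A C file sees only the plain declaration, while a C++ file sees it wrapped in extern "C", so both end up referring to the same unmangled symbol.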

Writing Assembly-Language Functions

Naming

In DJGPP, a function's assembly-language name is the same as its C name, with an underscore ("_") prepended. Thus, the C function foo would be named _foo in assembly language. (This is in fact true for all symbol names, such as variables.) C++ has some much more complicated rules.
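
If you want to check the prefix from C rather than take it on faith, GCC predefines the macro __USER_LABEL_PREFIX__, which expands to whatever is prepended to C symbol names on the current target. The sketch below simply turns it into a string; the helper name user_label_prefix and the STRINGIFY macros are mine, not part of any library.

```c
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Returns the prefix the compiler prepends to C symbol names.
   Expected to be "_" under DJGPP; many ELF targets use "" instead. */
const char *user_label_prefix(void)
{
#ifdef __USER_LABEL_PREFIX__
    return STRINGIFY(__USER_LABEL_PREFIX__);
#else
    return "?";   /* this compiler does not provide the macro */
#endif
}
```

Printing the result with printf("[%s]\n", user_label_prefix()) is a quick way to see which convention a given compiler uses.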

Registers

GCC requires that some registers not change across a function call. If you want to use these registers in an assembly function, you must save and restore their values. They are:

%ebx
%esi
%edi
%ebp (Footnote)
The segment registers %ds, %es and %ss

Other registers are available for your use (though some have other special uses; read on).

Return Value

Integers (of any size up to 32 bits) and pointers are returned in the %eax register.
Floating point values are returned in the 387 top-of-stack register, st(0).
Return values of type long long int are returned in %edx:%eax (the most significant word in %edx and the least significant in %eax).
Returning a structure is complicated and rarely useful; try to avoid it. (Note that this is different from returning a pointer to a structure.)

If your function returns void (i.e. no value), the contents of these registers are not used.
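
To make the %edx:%eax split concrete, here is a small C sketch that computes which half of a long long would land in which register. The struct reg_pair and the function split_long_long are hypothetical names used only for illustration; no real registers are touched.

```c
#include <stdint.h>

/* The two halves of a 64-bit return value, named after the registers
   that would carry them. */
struct reg_pair {
    uint32_t eax;   /* least significant 32 bits */
    uint32_t edx;   /* most significant 32 bits  */
};

struct reg_pair split_long_long(long long value)
{
    uint64_t bits = (uint64_t)value;   /* work on the raw bit pattern */
    struct reg_pair r;
    r.eax = (uint32_t)(bits & 0xFFFFFFFFu);
    r.edx = (uint32_t)(bits >> 32);
    return r;
}
```

For example, the value 0x0000000100000002 would be returned with 1 in %edx and 2 in %eax.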

Memory Model

Very simple; all pointers and addresses are near. You need not worry about segments (unless your asm code has a specific need to do so). Your function should end with a simple ret.

Stack Layout

When GCC calls your function, it pushes all its arguments onto the stack, starting with the last one, then issues a call. This means that, on entry to your function, the stack is laid out like this:

          Last argument
          ...
4(%esp)   First argument
(%esp)    Return address

Sizes and layouts of individual arguments are as follows:

Integers up to 32 bits and pointers are pushed as a single longword.
long long int is pushed as two longwords; the least significant is pushed last (and so is located first in memory).
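double is pushed as a double-precision value, occupying 8 bytes. float is pushed as a single-precision value, occupying 4 bytes, when the function has a prototype; in the variable part of a variadic call, or when no prototype is in scope, it is promoted to double and occupies 8 bytes.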
long double is pushed as an extended-precision value followed by 2 bytes of padding, totalling 12 bytes.
As before, structures are more complicated and best avoided.
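
The sizes above determine where each argument sits relative to %esp on entry. The following C sketch works out those offsets from a list of argument sizes. It assumes every argument occupies a whole number of 4-byte longwords, which matches the cases listed here; the function names are hypothetical.

```c
#include <stddef.h>

/* Space an argument takes on the stack: rounded up to a multiple of 4 bytes. */
size_t stack_slot_size(size_t arg_size)
{
    return (arg_size + 3u) & ~(size_t)3u;
}

/* Offset from %esp (on entry to the function) of argument number `index`,
   given the sizes in bytes of all arguments in declaration order. */
size_t arg_offset(const size_t *arg_sizes, size_t index)
{
    size_t offset = 4;   /* (%esp) itself holds the return address */
    for (size_t i = 0; i < index; i++)
        offset += stack_slot_size(arg_sizes[i]);
    return offset;
}
```

For a function taking (int, long long, long double, int), the sizes 4, 8, 12 and 4 put the arguments at 4(%esp), 8(%esp), 16(%esp) and 28(%esp).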

These rules also apply to functions which take a variable number of arguments (like printf). As with any variadic function, the function must find its own way of determining how many arguments were actually passed (usually based on one of the required args).
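
As an illustration of the "required argument tells you the count" approach, here is a minimal variadic function in C. The name sum_ints is hypothetical; the stdarg macros are standard.

```c
#include <stdarg.h>

/* Adds up `count` int arguments. `count` is the required argument that
   tells the function how many optional arguments follow it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) pushes 3, 2, 1 and then the count, so count is the first argument the function finds above the return address.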

The stack below the return address is available for temporary storage, but you must first decrement %esp to reserve the space you use: memory below %esp may be overwritten asynchronously, by interrupt handlers and such. Restore %esp to its entry value before exiting, so that the return works correctly. You may also push and pop at will, provided the pushes and pops balance.

You may modify your arguments in place if you wish; they will not be reused by the caller. Do not, however, attempt to pop them; the caller handles this.

Calling C Functions From Assembly Language

An assembly language function may wish to call a function written in C, either your own or one from the standard library. The same rules already explained apply; you just see them from the other side.

First, you push the function's arguments (if any) onto the stack, last argument first. See above for the formats used. (Floating point values are usually most easily handled by making space on the stack and then executing a store instruction, e.g. subl $8,%esp; fstpl (%esp).)

Use a simple call instruction to call the function.

You are responsible for removing the arguments you have pushed. The called function may have changed them, so do not rely on their values afterwards. You need not, however, discard them at once; when calling several functions, it may be more convenient to leave the arguments on the stack and remove them all together at the end. addl $n,%esp, where n is the total number of bytes pushed, is an efficient way to do this. It may also be convenient in this case to use %ebp as a frame pointer, since it stays fixed while %esp changes. (The C compiler does this.)

The return value may be found as detailed above.

Expect the registers %eax, %ecx, and %edx, as well as the floating-point stack, to have changed. Standard library functions may modify the %gs register, and the _far* functions may modify %fs. Other registers will be preserved.

Conclusion

These are the basic calling conventions used by GCC; however, there are special cases, optional modifications, etc. that can apply in situations not covered here. In such cases, gcc -S is your best friend: from the assembly output, you can usually figure out the rules. Also helpful is the GCC source: see i386.h and i386.md in config/i386. They are well commented.

Examples

These examples show how some C functions might be rewritten in assembly language. While the functions here are pretty useless themselves, hopefully they demonstrate the principles involved.

i_avg

This function finds the average of two ints.

In C:

int i_avg (int a, int b)
{
  return (a + b) / 2;
}

In assembler:

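# Stack layout on entry:
#
# 8(%esp)  b
# 4(%esp)  a
# (%esp)   return address

.globl _i_avg
_i_avg:
        movl 4(%esp), %eax
        addl 8(%esp), %eax    # Add the args
        movl %eax, %edx
        shrl $31, %edx        # %edx = 1 if the sum is negative, else 0
        addl %edx, %eax       # Bias negative sums so the shift rounds toward zero, as C's / does
        sarl $1, %eax         # Divide by 2
        ret                   # Return value is in %eax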
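
One thing worth checking in any shift-based version of this function: an arithmetic right shift rounds toward negative infinity, while C's / rounds toward zero, so the two disagree for negative odd sums. The C sketch below shows both behaviours side by side. It assumes a 32-bit int and that >> on a negative int is an arithmetic shift, as it is with GCC; the function names are hypothetical.

```c
/* Plain arithmetic shift: rounds toward negative infinity.
   (Assumes >> on a negative int is arithmetic, as with GCC.) */
int half_floor(int sum)
{
    return sum >> 1;
}

/* Shift with a fix-up: add 1 to negative sums first, so the result
   rounds toward zero like sum / 2. (Assumes int is 32 bits wide.) */
int half_trunc(int sum)
{
    unsigned int sign = (unsigned int)sum >> 31;   /* 1 if sum < 0, else 0 */
    return (sum + (int)sign) >> 1;
}
```

For a sum of -3, half_floor gives -2 while half_trunc gives -1, which is what -3 / 2 evaluates to in C.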

ull_avg

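This function finds the average of two unsigned long longs. (The unsigned-ness is a cop-out to make the division easier: an unsigned divide by 2 is a plain right shift across the two halves, whereas a signed divide would also need a rounding correction for negative values.)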

In C:

unsigned long long ull_avg (unsigned long long a, unsigned long long b)
{
  return (a + b) / 2;
}

In assembler:

# Stack layout on entry:
#
# 16(%esp) b (high half)
# 12(%esp) b (low half)
# 8(%esp)  a (high half)
# 4(%esp)  a (low half)
# (%esp)   return address

.globl _ull_avg
_ull_avg:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        addl 12(%esp), %eax    # Add low halves
        adcl 16(%esp), %edx    # Add high halves, with carry
        shrdl $1, %edx, %eax   # Shift low half right, pulling in bit 0 of the high half
        shrl $1, %edx          # Shift high half; together these divide by 2
        ret                    # Return value is in %edx:%eax
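
If the add-with-carry and double-shift steps are unfamiliar, the following C sketch mimics them using 32-bit halves. It is only a model of what the instructions do, not code you would write in practice; the function name ull_avg_halves and the local variable names are hypothetical.

```c
#include <stdint.h>

/* Computes (a + b) / 2 on 64-bit values using only 32-bit operations,
   following the assembly version step by step. */
uint64_t ull_avg_halves(uint64_t a, uint64_t b)
{
    uint32_t eax  = (uint32_t)a;           /* low half of a  */
    uint32_t edx  = (uint32_t)(a >> 32);   /* high half of a */
    uint32_t b_lo = (uint32_t)b;
    uint32_t b_hi = (uint32_t)(b >> 32);

    eax += b_lo;                           /* addl: add low halves */
    uint32_t carry = eax < b_lo;           /* carry out of the low add */
    edx += b_hi + carry;                   /* adcl: add high halves plus carry */

    eax = (eax >> 1) | (edx << 31);        /* shrdl $1: bit 0 of edx enters bit 31 of eax */
    edx >>= 1;                             /* shrl $1 */

    return ((uint64_t)edx << 32) | eax;
}
```

Like the C original, this discards any carry out of the high half, so the sum wraps modulo 2^64 before it is halved.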

ld_avg

This function finds the average of two long doubles.

In C:

long double ld_avg (long double a, long double b)
{
  return (a + b) / 2.0;
}

In assembler:

# Stack layout on entry:
#
# 16(%esp) b (12 bytes)
# 4(%esp)  a (12 bytes)
# (%esp)   return address

two:
        .double 0f2.0  # The number 2.0

.globl _ld_avg
_ld_avg:
        fldt 4(%esp)
        fldt 16(%esp)
        faddp %st, %st(1)    # Add and pop; the sum ends up in %st(0)
        fdivl two            # Divide %st(0) by 2.0
        ret                  # Result is in %st(0)

array_of_42

This function prints a message, allocates an array of a given size, and fills it with 42.

In C:

#include <stdio.h>
#include <stdlib.h>

int *array_of_42 (int n)
{
  int *p;
  int i;
  printf("Creating array of %d elements\n", n);
  p = malloc(n * sizeof(int));
  if (!p)
      return NULL;
  for (i = 0; i < n; i++)
      p[i] = 42;
  return p;
}

In assembler:

# Stack layout (after the prologue):
#
# 8(%ebp)  n
# 4(%ebp)  return address
# (%ebp)   pushed %ebp
# -4(%ebp) pushed %edi

format:
        .string "Creating array of %d elements\n"

.globl _array_of_42
_array_of_42:
        # We will use a frame pointer, since %esp will be changing.
        pushl %ebp
        movl %esp, %ebp
        pushl %edi           # Save %edi, since we'll be using it.
        # First, print the message.
        pushl 8(%ebp)
        pushl $format
        call _printf
        addl $8, %esp        # Remove printf args from the stack
        # Allocate the array.
        movl 8(%ebp), %ecx
        shll $2, %ecx        # Multiply by 4, which is sizeof(int)
        pushl %ecx
        call _malloc
        popl %ecx            # Remove malloc args from stack
        orl %eax, %eax       # Test return value
        jz finished          # malloc failed; %eax is already NULL
        # Fill the array, using stosl.
        movl %eax, %edi      # Address
        movl %eax, %edx      # and save a copy
        movl 8(%ebp), %ecx   # Count
        movl $42, %eax       # Fill value
        rep
        stosl
        movl %edx, %eax      # Return value
finished:
        popl %edi            # Restore it
        popl %ebp
        ret

Footnotes

About using %ebp: Note that using -fomit-frame-pointer does not release you from the requirement to preserve %ebp. With this option enabled, the compiler may use %ebp for something else, but it still expects it to be saved across function calls. Furthermore, some functions cannot be compiled by GCC without a frame pointer.

Copyright © 1999	Updated Jun 1999