stdio.h fcntl.h io.h/unistd.h
signal.h process.h dir.h/dirent.h/direct.h
The heap memory functions malloc(), calloc(), realloc() and
free(), in stdlib.h, may also need to be changed.
In a well-written library, all of these changes will be confined to a relatively small number of files where the libc-to-OS interface occurs.
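As a rough illustration of that confinement, the sketch below routes all output through a single hook. The names `os_write()`, `my_puts()`, and the buffer-backed "console" are invented for this example and do not come from any real libc; a real port would replace the body of `os_write()` with a system call or driver call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical libc-to-OS boundary: the only function that would need
   rewriting for a new OS. This stand-in just appends to a buffer. */
char   console[64];
size_t console_len;

long os_write(const char *buf, size_t len)
{
    if (len > sizeof(console) - console_len)
        len = sizeof(console) - console_len;   /* truncate when full */
    memcpy(console + console_len, buf, len);
    console_len += len;
    return (long)len;
}

/* Portable layer: knows nothing about the OS, only about os_write(). */
int my_puts(const char *s)
{
    if (os_write(s, strlen(s)) < 0 || os_write("\n", 1) < 0)
        return -1;
    return 0;
}
```

Everything above `os_write()` stays unchanged when the library moves to a new OS.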
| Language | Portable code? | Fast code? | Small functions can be inlined? | Preprocessor? |
|---|---|---|---|---|
| C | Yes | No | Yes, with GCC | Yes |
| Inline assembler | No; same CPU and compiler only | Yes | Yes, with GCC | Same preprocessor as compiler |
| Non-inline assembler | No; same CPU and compatible linker only | Yes | No | No preprocessor; or different preprocessor than compiler [*] |
[*] With care, the GNU C preprocessor can be used with almost any assembler. Your makefiles must be rewritten so that assembly is a two-step process, with separate filename extensions for the initial asm file and the preprocessed file. GNU software frequently uses the extension '.S' for asm that has not yet been preprocessed, and '.s' for preprocessed asm. This causes problems on DOS systems, where filenames are not case-sensitive and the two extensions cannot be told apart.
| | 32-bit code | 16-bit code, TINY, SMALL, or COMPACT memory models | 16-bit code, MEDIUM, LARGE, or HUGE memory models |
|---|---|---|---|
| Create standard stack frame, allocate 16 bytes for local variables, save registers | `push ebp`<br>`mov ebp,esp`<br>`sub esp,16`<br>`push edi`<br>`push esi`<br>`...` | `push bp`<br>`mov bp,sp`<br>`sub sp,16`<br>`push di`<br>`push si`<br>`...` | `push bp`<br>`mov bp,sp`<br>`sub sp,16`<br>`push di`<br>`push si`<br>`...` |
| Restore registers, destroy stack frame, and return | `...`<br>`pop esi`<br>`pop edi`<br>`mov esp,ebp`<br>`pop ebp`<br>`ret` | `...`<br>`pop si`<br>`pop di`<br>`mov sp,bp`<br>`pop bp`<br>`ret` | `...`<br>`pop si`<br>`pop di`<br>`mov sp,bp`<br>`pop bp`<br>`retf` |
| Size of 'slots' in stack frame, i.e. stack width | 32 bits | 16 bits | 16 bits |
| Location of stack frame 'slots' | `[ebp + 8]` `[ebp + 12]` `[ebp + 16]`... | `[bp + 4]` `[bp + 6]` `[bp + 8]`... | `[bp + 6]` `[bp + 8]` `[bp + 10]`... |
If an argument passed to a function is wider than the stack, it will occupy more than one 'slot' in the stack frame. A 64-bit value passed to a function (long long or double) will occupy 2 stack slots in 32-bit code or 4 stack slots in 16-bit code.
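Function arguments are accessed with positive offsets from the BP or EBP registers. Local variables are accessed with negative offsets. The previous value of BP or EBP is stored at [bp + 0] or [ebp + 0]. The return address (IP or EIP) is stored at [bp + 2] or [ebp + 4]. In the MEDIUM, LARGE, and HUGE memory models the return address is a far address (CS:IP) occupying [bp + 2] through [bp + 5], which is why the first argument is at [bp + 6] in those models.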
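The slot arithmetic above can be sketched in C. The helper `arg_offset()` is a hypothetical name invented for this example. It assumes the C convention used in the table: the first argument sits at the lowest offset, and every argument is rounded up to a whole number of slots.

```c
#include <stddef.h>

/* Offset from BP/EBP of argument number `index` (0-based).
   first_arg: offset of the first argument (8 for 32-bit code,
              4 for near 16-bit code, 6 for far 16-bit code).
   slot:      stack width in bytes (4 or 2).
   sizes[]:   size in bytes of each argument, in declaration order. */
unsigned arg_offset(unsigned first_arg, unsigned slot,
                    const unsigned sizes[], size_t index)
{
    unsigned off = first_arg;
    for (size_t i = 0; i < index; i++) {
        /* Each argument is rounded up to a whole number of slots. */
        unsigned slots = (sizes[i] + slot - 1) / slot;
        off += slots * slot;
    }
    return off;
}
```

For a function taking a 64-bit value followed by an int, `arg_offset(8, 4, sizes, 1)` with `sizes = {8, 4}` gives 16, i.e. the second argument is at [ebp + 16] in 32-bit code.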
| | 32-bit code | 16-bit code, all memory models |
|---|---|---|
| 8-bit return value | AL | AL |
| 16-bit return value | AX | AX |
| 32-bit return value | EAX | DX:AX |
| 64-bit return value | EDX:EAX | space for the return value is allocated on the stack of the calling function, and a 'hidden' pointer to this space is passed to the called function |
| 128-bit return value | hidden pointer | hidden pointer |
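The 'hidden pointer' rows can be pictured with a pair of C functions. This is only a conceptual sketch: the type `u128` and both function names are made up for illustration, and the exact code a given compiler emits will differ.

```c
#include <stdint.h>

/* A 128-bit value, too wide for any register pair in the table above. */
typedef struct { uint32_t w[4]; } u128;

/* What the programmer writes: return the value directly. */
u128 make_u128(uint32_t lo)
{
    u128 v = { { lo, 0, 0, 0 } };
    return v;
}

/* Roughly what happens underneath: the caller allocates space for the
   result and passes its address as an extra, hidden argument. */
void make_u128_hidden(u128 *ret, uint32_t lo)
{
    ret->w[0] = lo;
    ret->w[1] = ret->w[2] = ret->w[3] = 0;
}
```

An assembly routine that returns such a value has to follow the second form and write its result through the pointer it receives.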
EBX, EDI, ESI, EBP, DS, ES, SS
You need not save these registers:
EAX, ECX, EDX, FS, GS, EFLAGS, floating point registers
In some OSes, FS or GS may be used as a pointer to thread local storage (TLS), and must be saved if you modify it.
EXTERN _conv_mem_size ; NASM syntax
mov [_conv_mem_size],ax
Linux ELF does NOT use underscores. Watcom C uses trailing
underscores for function names, and leading underscores for global
variables.
If your GCC supports it, leading underscores can be turned off with the compiler option -fno-leading-underscore.
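Another possibility, on GCC and compatible compilers, is to pin the object-file name of a single symbol with an asm label, a GNU extension. The sketch below is an assumption-laden example rather than a recommendation: the macro `ASM_NAME` and the initial value 640 are invented here, and compilers without the extension fall back to their default naming.

```c
/* Assumes GCC or a compatible compiler; asm labels are a GNU extension. */
#if defined(__GNUC__)
#define ASM_NAME(name) __asm__(name)
#else
#define ASM_NAME(name)   /* other compilers: keep the default symbol name */
#endif

/* With the asm label, the symbol is emitted under exactly this name,
   so assembly code can refer to _conv_mem_size whether or not the
   target normally adds a leading underscore. */
unsigned short conv_mem_size ASM_NAME("_conv_mem_size") = 640;
```

C code still uses the plain identifier `conv_mem_size`; only the name seen by the assembler and linker changes.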
In C, the calling function must 'clean up the stack' (remove function arguments from the stack after the called function returns). In Pascal, the called function must do this, before returning.
Pascal identifiers are case-insensitive. MyKewlProc() will be stored in the object code file as MYKEWLPROC.
Watcom C uses a register-based calling convention. See sections 7.4, 7.5, 10.4, and 10.5 in cuserguide.pdf in the Watcom documentation. Individual functions can be declared to use the normal, stack-based calling convention.
GCC can be made to use a register calling convention by compiling with
gcc -mregparm=NNN ...
See the GCC documentation for details.
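The same convention can also be requested for one function at a time with GCC's `regparm` function attribute, which applies to 32-bit x86 targets only. In this sketch the macro `REGPARM3` and the function `add3()` are hypothetical names; on other targets or compilers the macro expands to nothing and the default convention is used.

```c
/* regparm is specific to GCC-compatible compilers targeting 32-bit x86. */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3   /* elsewhere: use the platform's default convention */
#endif

/* With regparm(3), up to three integer arguments are passed in
   EAX, EDX, and ECX instead of on the stack. */
REGPARM3 int add3(int a, int b, int c)
{
    return a + b + c;
}
```

Any assembly routine that calls or implements such a function must read its arguments from those registers rather than from [ebp + 8] onward.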
; C prototype ('extern' and parameter names 'arg1' and 'arg2' are optional):
; extern unsigned long long shr64(unsigned long long arg1, int arg2);
BITS 32
SECTION .text
GLOBAL _shr64 ; omit the underscores for Linux ELF
_shr64: push ebp
mov ebp,esp
; push ecx ; ECX is 'caller-save' for GCC
mov ecx,[ebp + 16] ; ECX=arg2, at slot #3
mov eax,[ebp + 8] ; EDX:EAX=arg1, at slot #1...
mov edx,[ebp + 12] ; ...and slot #2
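jecxz done ; shift count of 0: return arg1 unchanged
again: shr edx,1 ; shift high dword; its low bit goes to CF
rcr eax,1 ; rotate CF into the top of the low dword
loop again ; repeat ECX times: EDX:EAX >>= arg2
; pop ecx
done: pop ebp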
ret ; 64-bit return value in EDX:EAX
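For checking an assembly routine like this one, a plain C model of the same one-bit-at-a-time shift can be handy. The function name `shr64_model()` is invented for this sketch, and treating a count of zero or less as "no shift" is a choice made here, not something taken from the listing.

```c
#include <stdint.h>

/* C model of the SHR/RCR loop: shift EDX:EAX right one bit per pass. */
uint64_t shr64_model(uint64_t arg1, int arg2)
{
    uint32_t eax = (uint32_t)arg1;          /* low dword  */
    uint32_t edx = (uint32_t)(arg1 >> 32);  /* high dword */

    while (arg2-- > 0) {
        uint32_t carry = edx & 1u;          /* SHR EDX,1 moves bit 0 into CF */
        edx >>= 1;
        eax = (eax >> 1) | (carry << 31);   /* RCR EAX,1 rotates CF into bit 31 */
    }
    return ((uint64_t)edx << 32) | eax;
}
```

Comparing the results of `shr64()` and `shr64_model()` over a range of inputs is a quick way to confirm that the assembly version reads its arguments from the right stack slots.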
; C prototype:
; extern unsigned long shr32(unsigned long arg1, int arg2);
SEGMENT _TEXT PUBLIC CLASS=CODE
GLOBAL _shr32
_shr32: push bp
mov bp,sp
push cx
mov cx,[bp + 8] ; CX=arg2, at slot #3
mov ax,[bp + 4] ; DX:AX=arg1, at slot #1...
mov dx,[bp + 6] ; ...and slot #2
jcxz done ; shift count of 0: return arg1 unchanged
again: shr dx,1 ; DX:AX >>= CL
rcr ax,1
loop again
done: pop cx
pop bp
ret ; 32-bit return value in DX:AX
| as | NASM |
|---|---|
| `.ifdef UNDERBARS`<br>`.macro EXP sym`<br>`.global \sym`<br>`\sym:`<br>`.global _\sym`<br>`_\sym:`<br>`.endm`<br>`.macro IMP sym`<br>`.extern _\sym`<br>`.equ \sym,_\sym`<br>`.endm`<br>`.else`<br>`.macro EXP sym`<br>`.global \sym`<br>`\sym:`<br>`.endm`<br>`.macro IMP sym`<br>`.extern \sym`<br>`.endm`<br>`.endif` | `%ifdef UNDERBARS`<br>`%macro EXP 1`<br>`GLOBAL _$%1`<br>`_$%1:`<br>`GLOBAL $%1`<br>`$%1:`<br>`%endmacro`<br>`%macro IMP 1`<br>`EXTERN _$%1`<br>`%define %1 _$%1`<br>`%endmacro`<br>`%else`<br>`%macro EXP 1`<br>`GLOBAL $%1`<br>`$%1:`<br>`%endmacro`<br>`%macro IMP 1`<br>`EXTERN $%1`<br>`%endmacro`<br>`%endif` |
nasm -dUNDERBARS=1 ...
as --defsym UNDERBARS=1 ...
ELF systems (e.g. Linux) do not require leading underscores.
A good C (and C++) standard reference is at: http://www.dinkumware.com/htm_cl/index.html
The Better String library for C (bstrlib): http://bstring.sf.net/
- The unit of linkage is the module. For C, module == file. Put each function into its own file to prevent bloat (linking of unrelated and unnecessary functions).