
14. Backend Interface

14.1 Introduction

This chapter is under construction!

This chapter describes some of the internals of vbcc and tries to explain what has to be done to write a code generator for vbcc. However, if you want to write one, I suggest contacting me first, so that it can be integrated into the source tree.

You have to create a new directory for the new target named ‘machines/<target-name>’ and write the files ‘machine.c’, ‘machine.h’ and ‘machine.dt’. The compiler for this target will be called vbcc<target-name> and can be built with make TARGET=<target-name> bin/vbcc<target-name>.

From now on, integer means any of char, short, int, long, long long or their unsigned counterparts. Arithmetic means integer or float or double or long double. Elementary type means arithmetic or pointer.

Note that this documentation may mention explicit values when introducing symbolic constants. This is due to copying and pasting from the source code. These values may not be up to date and in some cases can be overridden. Therefore, never use the absolute values; always use the symbolic representations.

14.2 Building vbcc

This section deals with the steps necessary to build the typical vbcc executables from the sources.

14.2.1 Directory Structure

The vbcc-directory contains the following important files and directories:

‘vbcc/’: The main directory containing the compiler sources.
‘vbcc/Makefile’: The Makefile used to build vbcc.
‘vbcc/frontend/’: Directory containing the source to vc, the compiler driver.
‘vbcc/machines/<target>/’: Directory for the <target> backend.
‘vbcc/machines/ucpp/’: Directory containing the builtin preprocessor.
‘vbcc/vsc/’: Directory containing source to vsc, the instruction scheduler.
‘vbcc/bin/’: Directory the executables will be placed in.

All compiling is done from the main directory. The frontend vc is not target-dependent and therefore only one version is created.

Every available target has at least one subdirectory with its name in ‘vbcc/machines’, which contains at least the files ‘machine.h’, ‘machine.c’ and ‘machine.dt’. Target-specific object-files will also be stored in that directory.

The executables will be placed in ‘vbcc/bin/’. The main compiler will be called vbcc<target>.

14.2.2 Adapting the Makefile

Before building anything you have to insert correct values for CC, NCC, LDFLAGS and NLDFLAGS in the ‘Makefile’.

CC: Here you have to insert a command that invokes the ANSI C compiler you want to use to build vbcc. It must support ‘-D’, ‘-I’, ‘-c’ and ‘-o’ in the same way as, e.g., vc or gcc. Additional options should also be inserted here. E.g. if you are compiling for the Amiga with vbcc you should add ‘-DAMIGA’.
LDFLAGS: Here you have to add options which are necessary for linking. E.g. some compilers need special libraries for floating-point.
NCC, NLDFLAGS: These are similar to CC and LDFLAGS but they must always describe a native compiler, i.e. programs compiled with NCC/NLDFLAGS must be executable on the host system. This is needed because during the build programs may have to be executed on the host.

An example for the Amiga using vbcc would be:

      CC = vc -DAMIGA -c99
      LDFLAGS = -lmieee
      NCC = $(CC)
      NLDFLAGS = $(LDFLAGS)

An example for a typical Unix-installation would be:

      CC = cc
      LDFLAGS = -lm
      NCC = $(CC)
      NLDFLAGS = $(LDFLAGS)

The following settings are probably necessary for Open/Free/Any BSD i386 systems:

      CC = gcc -D_ANSI_SOURCE
      LDFLAGS = -lm
      NCC = $(CC)
      NLDFLAGS = $(LDFLAGS)

14.2.3 Building vc

Note to users of Open/Free/Any BSD i386 systems: You will probably have to use GNU make instead of BSD make, i.e. in the following examples replace "make" with "gmake".

Type:

      make bin/vc

14.2.4 Building vsc

Type:

      make TARGET=<target> bin/vsc<target>

For example:

      make TARGET=alpha bin/vscalpha

Omit this step if there is no file ‘machines/<target>/schedule.c’.

14.2.5 Building vbcc

Type:

      make TARGET=<target> bin/vbcc<target>

For example:

      make TARGET=alpha bin/vbccalpha

During the build the program dtgen will be generated and executed on the host-system. First it will ask you whether you are building a cross-compiler.

Answer y only if you are building a cross-compiler (i.e. a compiler which does not produce code for the same machine it is running on).

Note that it does _not_ matter if you are cross-building a compiler, i.e. if you are running on system A and building a B->B compiler by using an A->B compiler then you can answer n.

If you answered y you will be asked if your system/compiler offers certain datatypes. This refers to the compiler you described with CC in the Makefile. E.g. if CC is an A->B cross-compiler you have to answer the questions according to B. To each question answer y or n depending on whether such a datatype is available on that compiler. If you answered y you have to type in the name of that type on the compiler (e.g. signed int, unsigned char etc.). If there are not enough datatypes available to build vbcc, an error message will be printed and the build aborts.

14.2.6 Configuring

Consult the vbcc-documentation for information on how to create the necessary config-files.

14.2.7 Building Cross-Compilers

As there is often confusion when it comes to cross-building compilers or building cross-compilers, here is what has to be done to cross-build a B->C cross-compiler on system A with only a native A->A compiler available.

This is done by first building an A->B compiler and then cross-building the B->C compiler using the A->B compiler.

For the first step you use the A->A compiler for CC as well as NCC. Now you type:

      make bin/vc
      make TARGET=B bin/vscB   # omit if there is no machines/B/schedule.c
      make TARGET=B bin/vbccB

The questions about datatypes are answered according to A. Then you should write a ‘vc.config’ for the vbccB cross-compiler.

Now create a second directory containing all the sources to vbcc and set CC/LDFLAGS to vc using the config-file for vbccB and NCC/NLDFLAGS to the A->A compiler. Type:

      make bin/vc
      make TARGET=C bin/vscC   # omit if there is no machines/C/schedule.c
      make TARGET=C bin/vbccC

14.3 The Intermediate Code

vbcc will generate intermediate code for every function and pass this code to the code generator which has to convert it into the desired output.

In the future there may be a code generator generator which reads a machine description file and generates a code generator from that, but it is not clear whether this could simplify much without taking penalties in the generated code. Anyway, this would be a layer on top of the current interface to the code generator, so that the interface described in this document would still be valid and accessible.

14.3.1 General Format

The intermediate code is represented as a doubly linked list of quadruples (I am calling them ICs from now on) consisting mainly of an operator, two source operands and a target. They are represented like this:

struct IC{
    struct IC *prev;
    struct IC *next;
    int code;
    int typf;
    int typf2;
    [...]
    struct obj q1;
    struct obj q2;
    struct obj z;
    [...]
    struct ext_ic ext;  /* optional */
};

The only members relevant to the code generator are prev, next, code, typf, typf2, q1, q2, z and (optionally) ext.

prev and next are pointers to the previous and next IC. The first IC has prev==0 and the last one has next==0.
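The list structure can be walked with a plain loop. The following is a minimal sketch, assuming a stripped-down stand-in for struct IC (the real definition has more members and comes from the vbcc headers); count_ics_with_code is a hypothetical helper used only for illustration.

```c
#include <stddef.h>

/* Stand-in for the real struct IC; only the members needed here are kept. */
struct IC {
    struct IC *prev;
    struct IC *next;
    int code;
    int typf;
};

/* Hypothetical helper: count the ICs in a list whose operation is `code`.
   The walk starts at the first IC (prev == 0) and stops when next == 0. */
int count_ics_with_code(const struct IC *first, int code)
{
    int n = 0;
    const struct IC *p;

    for (p = first; p != NULL; p = p->next)
        if (p->code == code)
            n++;
    return n;
}
```

A code generator would typically use the same loop shape and dispatch on p->code inside the body.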

typf and typf2 are the types of the operands of this IC. In most ICs all operands have the same type and therefore only typf is used. However, some ICs have operands of different types (e.g. converting an operand to another type or adding an integer to a pointer). typf2 is used in these cases.

Macros are provided which yield the type of an operand. q1typ(), q2typ() and ztyp() return the type of the first source operand, the second source operand and the destination, respectively. They have to be passed a pointer to a valid IC as argument. The results are undefined if the IC does not contain the specified operand (e.g. q2typ() for an IC with only a single operand).

The standard types which are defined by default are:

    #define CHAR 
    #define SHORT 
    #define INT 
    #define LONG 
    #define LLONG
    #define FLOAT 
    #define DOUBLE 
    #define LDOUBLE
    #define VOID 
    #define POINTER 
    #define ARRAY 
    #define STRUCT 
    #define UNION 
    #define ENUM          /*  not relevant for code generator     */
    #define FUNKT

and can additionally be or’ed with

    #define UNSIGNED
    #define CONST
    #define VOLATILE
    #define UNCOMPLETE

However, only UNSIGNED is of real importance for the code generator. typf&NQ yields the type without any qualifiers, typf&NU yields the type without any qualifiers but UNSIGNED.
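The effect of the NQ and NU masks can be sketched as follows. All numeric values here are assumptions chosen for illustration (the text above deliberately gives none), and is_unsigned_int and is_any_int are hypothetical helpers; real code should rely on the symbolic names from the vbcc headers.

```c
/* Hypothetical values, for illustration only. */
#define INT       3
#define UNSIGNED  64
#define CONST     128
#define VOLATILE  256
#define NQ        63              /* mask: strips all qualifiers             */
#define NU        (NQ | UNSIGNED) /* mask: strips qualifiers, keeps UNSIGNED */

/* Hypothetical helper: nonzero if typf denotes an unsigned int,
   regardless of const/volatile. */
int is_unsigned_int(int typf)
{
    return (typf & NU) == (UNSIGNED | INT);
}

/* Hypothetical helper: nonzero if typf denotes an int of either signedness. */
int is_any_int(int typf)
{
    return (typf & NQ) == INT;
}
```

Comparing against typf&NU is the usual choice when signed and unsigned variants need different instructions; typf&NQ is enough when only the size class matters.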

It is possible for backends to define additional types. See section Target-specific extended Types for documentation on how to extend the type system.

14.3.2 Operands

q1, q2 and z are the source1 (quelle1 in German), source2 and target (ziel) operands, respectively. If a result has to be computed, it always will be stored in the object z and the objects q1 and q2 usually may not be destroyed during this operation (unless they are aliased with the destination).

The objects are described by this structure:

struct obj{
    int flags;
    int reg;
    int dtyp;
    struct Var *v;
    struct AddressingMode *am;
    union atyps{
        zchar vchar;
        zuchar vuchar;
        zshort vshort;
        zushort vushort;
        zint vint;
        zuint vuint;
        zlong vlong;
        zulong vulong;
        zllong vllong;
        zullong vullong;
        zmax vmax;
        zumax vumax;
        zfloat vfloat;
        zdouble vdouble;
        zldouble vldouble;
    } val;
};

flags specifies the kind of object. It can be a combination of

#define KONST 1: The object is a constant. Its value is in the corresponding (to typf or typf2) member of val.
#define VAR 2: The object is a variable. The pointer to its struct Var is in v. val.vlong contains an offset that has to be added to it. For further details, see section Variables.
#define DREFOBJ 32: The content of the location in memory the object points to is used. dtyp contains the type of the pointer. In systems with only one pointer type, this will always be POINTER.
#define REG 64: The object is a register. reg contains its number.
#define VARADR 128: The address of the object is to be used. Only together with static variables (i.e. storage_class == STATIC or EXTERN).

The possible combinations of these flags should be:

0 (no object)
KONST
KONST|DREFOBJ
REG
VAR
VAR|REG
REG|DREFOBJ
VAR|DREFOBJ
VAR|REG|DREFOBJ
VAR|VARADR

Also some other bits which are not relevant to the code generator may be set.
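A backend usually has to classify each operand before emitting code for it. The sketch below shows one way to do that, using the flag values quoted above; the enum and the function operand_kind are hypothetical names, and the short descriptions in the comments are an interpretation of the flag combinations, not text from the vbcc sources.

```c
/* Flag values as quoted in the text; prefer the symbolic names. */
#define KONST     1
#define VAR       2
#define DREFOBJ  32
#define REG      64
#define VARADR  128

/* Hypothetical classification of the documented flag combinations. */
enum operand_kind {
    OP_NONE,            /* 0: no object                               */
    OP_CONST,           /* KONST: constant value                      */
    OP_DEREF_CONST,     /* KONST|DREFOBJ: memory at constant address  */
    OP_REG,             /* REG: plain register                        */
    OP_VAR,             /* VAR: variable in memory                    */
    OP_VAR_IN_REG,      /* VAR|REG: variable living in a register     */
    OP_DEREF_REG,       /* REG|DREFOBJ: memory addressed by register  */
    OP_DEREF_VAR,       /* VAR|DREFOBJ: memory addressed by variable  */
    OP_DEREF_VAR_IN_REG,/* VAR|REG|DREFOBJ                            */
    OP_VAR_ADDRESS,     /* VAR|VARADR: address of a static variable   */
    OP_UNEXPECTED       /* anything else                              */
};

/* Bits other than the five documented ones are masked off first,
   because they are not relevant to the code generator. */
enum operand_kind operand_kind(int flags)
{
    switch (flags & (KONST | VAR | DREFOBJ | REG | VARADR)) {
    case 0:                   return OP_NONE;
    case KONST:               return OP_CONST;
    case KONST | DREFOBJ:     return OP_DEREF_CONST;
    case REG:                 return OP_REG;
    case VAR:                 return OP_VAR;
    case VAR | REG:           return OP_VAR_IN_REG;
    case REG | DREFOBJ:       return OP_DEREF_REG;
    case VAR | DREFOBJ:       return OP_DEREF_VAR;
    case VAR | REG | DREFOBJ: return OP_DEREF_VAR_IN_REG;
    case VAR | VARADR:        return OP_VAR_ADDRESS;
    default:                  return OP_UNEXPECTED;
    }
}
```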

Constants will usually be in q2 if possible. At least one of the sources is always non-constant and the target is always an lvalue. The types of the operands can be queried using the macros q1typ(), q2typ() and ztyp(). In most cases (i.e. when not explicitly stated) the type is an elementary type (i.e. arithmetic or pointer).

am can be used to store information on special addressing modes. This has to be handled by the code generator. However, when the code generator returns, am has to be 0 or has to point to a struct AddressingMode that was allocated using malloc(). struct AddressingMode has to be defined in ‘machine.h’.

val stores either the value of the object if it is a constant or an offset if it is a variable.

code specifies the operation. For further details see section Operations.

14.3.3 Variables

A struct Var looks like:

    struct Var{
        int storage_class;
        [...]
        char *identifier;
        [...]
        zmax offset;
        struct Typ *vtyp;
        [...]
        char *vattr;
        unsigned long tattr;   /* optional */
    };

The relevant entries are:

identifier

The name of the variable. Usually only of interest for variables with external-linkage.

storage_class

One of:

            #define AUTO 1
            #define REGISTER 2
            #define STATIC 3
            #define EXTERN  4
            #define TYPEDEF 5       /*  not relevant    */

The backend should use the macros isauto(), isstatic() and isextern() to check which category a variable falls into.

offset

Contains an offset relative to the beginning of the variable’s storage. Used, for example, when accessing members of structures.

vtyp

The type of the variable (see section Composite Types).

vattr

A string with attributes used in the declaration of the variable. See section Target-specific Attributes for further details.

tattr

Flags used when declaring the variable. See section Target-specific Attributes for further details.

If the variable is not assigned to a register (i.e. bit REG is not set in the flags of the corresponding struct obj) then the variable can be addressed in the following ways (with examples of 68k-code):

isauto(storage_class) != 0

offset contains the offset inside the local-variables section. The code generator must decide how it’s going to handle the activation record. If offset < 0 then the variable is a function argument on the stack. In this case the offset in the parameter-area is - (offset + maxalign).

The code generator may have to calculate the actual offset to a stack- or frame-pointer from the value in offset.

                offset + val.vlong(sp)

Note that storage_class == REGISTER is equivalent to AUTO - whether the variable is actually assigned a register is specified by the bit REG in the flags of the struct obj.

isextern(storage_class) != 0

The variable can be addressed through its name in identifier.

                val.vlong + '_'identifier

isstatic(storage_class) != 0

The variable can be addressed through a numbered label. The label number is stored in offset.

                val.vlong+'l'offset
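The three addressing cases above can be sketched as a single formatting routine. This is only an illustration: format_var is a hypothetical helper, the isauto()/isstatic()/isextern() definitions are stand-ins for the real macros, host long is used in place of zmax, and the special handling of function arguments (offset < 0) and of the actual frame layout is left out.

```c
#include <stdio.h>

/* Storage-class values as quoted in the text. */
#define AUTO     1
#define REGISTER 2
#define STATIC   3
#define EXTERN   4

/* Stand-ins for the real isauto()/isstatic()/isextern() macros. */
#define isauto(sc)   ((sc) == AUTO || (sc) == REGISTER)
#define isstatic(sc) ((sc) == STATIC)
#define isextern(sc) ((sc) == EXTERN)

/* Hypothetical helper: write a 68k-style operand for a variable into buf.
   var_offset plays the role of Var.offset, obj_offset that of val.vlong.
   Returns the snprintf() result, or -1 for an unknown storage class. */
int format_var(char *buf, size_t len, int storage_class,
               const char *identifier, long var_offset, long obj_offset)
{
    if (isauto(storage_class))
        return snprintf(buf, len, "%ld(sp)", var_offset + obj_offset);
    if (isextern(storage_class))
        return snprintf(buf, len, "%ld+_%s", obj_offset, identifier);
    if (isstatic(storage_class))
        return snprintf(buf, len, "%ld+l%ld", obj_offset, var_offset);
    return -1;
}
```

For example, an extern variable foo accessed at offset 4 would be formatted as 4+_foo, and a static variable with label number 17 at offset 0 as 0+l17.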

14.3.4 Composite Types

The C language offers types which are composed out of other types, e.g. structures or arrays. Therefore, a C type can be an arbitrarily complex structure. Usually the backend does not need to deal with those structures. The ICs contain only the simple type flags, e.g. INT or STRUCT, but not the members of a structure (instead the size is given).

Most backends do not have to deal with complex types at all, but there are some ways to access them, if needed (for example, they may be needed when generating debug information). Therefore, this chapter describes the representation of such full types.

Types are represented by the following structure:

struct Typ {
  int flags;
  struct Typ *next;
  struct struct_declaration *exact;
  zmax size;
  char *attr;
};

flags is the simple type as it is generally used in the backend. The meaning of the other members depends on flags. attr is an attribute that can be added to the type using the syntax __attr("...") (which is parsed like a type-qualifier, e.g. const). If several attributes are specified for a type, the strings will be concatenated, separated by semicolons.

If the type is a pointer (ISPOINTER(flags) != 0), then next will point to the type the pointer points to.

If the type is an array (ISARRAY(flags) != 0), then size contains the number of elements and next points to a type structure representing the type of each array element.

If the type is a structure (ISSTRUCT(flags) != 0), a union (ISUNION(flags) != 0) or a function (ISFUNC(flags) != 0), then exact is a pointer to a struct_declaration (which is also used to represent unions and function prototypes) that looks like this:

struct struct_declaration {
  int count;
  int label;
  int typ;
  ...
  struct struct_list (*sl)[];
  char *identifier;
};

count is the number of members; label can be used to store a label when generating debug information. typ is either STRUCT, UNION or FUNKT to denote whether it applies to a structure, union or function-prototype.

identifier is only available for struct- and union-tags.

sl points to an array of struct struct_lists which contain information on each member/parameter:

struct struct_list {
  char *identifier;
  struct Typ *styp;
  zmax align;
  int bfoffset;
  int bfsize;
  int storage_class;
  int reg;
};

identifier is the identifier of the member/parameter, if available. styp denotes the full type, align the alignment in bytes (only for struct/union), bfoffset and bfsize the offset and size of bitfield-members, storage_class the storage class of function parameters (may be AUTO or REGISTER) and reg denotes the register a parameter is passed in.

Example: If struct Typ *t points to a structure-type, then the type of the second structure member can be accessed through (*t->exact->sl)[1].styp.
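The access pattern from the example can be wrapped in a small bounds-checked accessor. The sketch below repeats the structure layouts from this section with host long standing in for zmax and the elided members left out; member_type is a hypothetical helper, not part of vbcc.

```c
#include <stddef.h>

typedef long zmax;   /* host stand-in; the real zmax comes from dt.h */

struct Typ;

struct struct_list {
    char *identifier;
    struct Typ *styp;
    zmax align;
    int bfoffset;
    int bfsize;
    int storage_class;
    int reg;
};

struct struct_declaration {
    int count;
    int label;
    int typ;
    struct struct_list (*sl)[];   /* pointer to an array of members */
    char *identifier;
};

struct Typ {
    int flags;
    struct Typ *next;
    struct struct_declaration *exact;
    zmax size;
    char *attr;
};

/* Hypothetical helper: full type of member/parameter number idx (0-based),
   or NULL if t has no declaration or idx is out of range. */
struct Typ *member_type(const struct Typ *t, int idx)
{
    if (t == NULL || t->exact == NULL || idx < 0 || idx >= t->exact->count)
        return NULL;
    return (*t->exact->sl)[idx].styp;
}
```

With this helper, member_type(t, 1) yields the same result as (*t->exact->sl)[1].styp whenever the structure has at least two members.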

A prototyped function will have a last argument of type VOID unless it is a function accepting a variable number of arguments. If a function was declared without a prototype, it will have no parameters; a function declared with a prototype accepting no arguments will have one parameter of type VOID.

Also, in the case of a function type, the next-member of a struct Typ points to the return type of the function.

14.3.5 Operations

This section lists all the different operations allowed in the intermediate code passed to the backend. It lists the symbolic name of the code value (the value should not be used), a template of the operands and a description. The description sometimes contains internals (e.g. which types are stored in typf and which in typf2), but they should not be used. Access them using the macros provided (e.g. q1typ,q2typ,ztyp) whenever possible.

#define ASSIGN 2

Copy q1 to z. q1->z.

q2.val.vmax contains the size of the objects (this is necessary if it is an array or a struct). It should be accessed using the opsize()-macro. typf does not have to be an elementary type!

The only case where typf == ARRAY should be in automatic initializations.

It is also possible that (typf&NQ) == CHAR but the size is != 1. This is created for an inline memcpy/strcpy where the type is not known.

#define OR 16

#define XOR 17

#define AND 18

Bitwise boolean operations. q1,q2->z.

All operands are integers.

#define LSHIFT 25

#define RSHIFT 26

Bit shifting. q1,q2->z.

q2 is the number of bit positions to shift by. All operands are integers.

#define ADD 27

#define SUB 28

#define MULT 29

#define DIV 30

Standard arithmetic operations. q1,q2->z.

All operands are of arithmetic types (integers or floating point).

#define MOD 31

Modulo (%). q1,q2->z.

All operands are integers.

#define KOMPLEMENT 33

Bitwise complement. q1->z.

All operands are integers.

#define MINUS 38

Unary minus. q1->z.

All operands are of arithmetic types (integers or floating point).

#define ADDRESS 40

Get the address of an object. q1->z.

z is always a pointer and q1 is always an auto variable.

#define CALL 42

Call the function q1. q1.

q2.val.vmax contains the number of bytes pushed on the stack as function arguments for this call (use the pushedargsize()-macro to access this size). Those may have to be popped from the stack after the function returns depending on the calling mechanism.

A CALL IC has a member arg_cnt which contains the number of arguments to this function call. arg_list[i] (with i in the range 0...arg_cnt-1) contains the pointer to the IC passing the i-th argument.
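The argument list of a CALL can be inspected with a simple loop over arg_list. The following sketch assumes a reduced stand-in for struct IC and assumes arg_list is an array of IC pointers; count_stack_args is a hypothetical helper that counts how many arguments are passed by a PUSH IC (arguments passed by some other kind of IC, for instance when register parameters are used, are not counted).

```c
/* Reduced stand-in for struct IC with only the members used here. */
struct IC {
    int code;
    int arg_cnt;            /* number of arguments of a CALL   */
    struct IC **arg_list;   /* ICs passing the i-th argument   */
};

#define CALL 42   /* values as quoted in the text */
#define PUSH 78

/* Hypothetical helper: number of arguments of `call` passed via PUSH. */
int count_stack_args(const struct IC *call)
{
    int i, n = 0;

    for (i = 0; i < call->arg_cnt; i++)
        if (call->arg_list[i]->code == PUSH)
            n++;
    return n;
}
```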

#define CONVERT 50

Convert one type to another. q1->z.

z is always of the type typf, q1 of type typf2.

Conversions between floating point and pointers do not occur, neither do conversions to and from structs, unions, arrays or void.

#define ALLOCREG 65

Allocate a register. q1.

From now on the register q1.reg is in use. No code has to be generated for this, but it is probably useful to keep track of the registers in use to know which registers are available for the code generator at a certain time and which registers are trashed by the function.

#define FREEREG 66

Release a register. q1.

From now on the register q1.reg is free.

Also it means that the value currently stored in q1.reg is not used any more and provides a little bit of data flow information. Note however, if a FREEREG follows a branch, the value of the register may be used at the target of the branch.
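The bookkeeping suggested for ALLOCREG and FREEREG can be sketched with two arrays. Everything here is illustrative: the array names and track_reg are hypothetical, MAXR is given an arbitrary value, and registers are assumed to be numbered from 1 with 0 meaning "no register".

```c
#define MAXR     16   /* hypothetical; normally defined in machine.h */
#define ALLOCREG 65   /* values as quoted in the text */
#define FREEREG  66

int regs_in_use[MAXR + 1];    /* nonzero while a register is allocated    */
int regs_touched[MAXR + 1];   /* nonzero once a register was ever in use  */

/* Hypothetical helper: update the bookkeeping for one ALLOCREG or
   FREEREG IC. Returns 0 on success, -1 if reg is out of range or
   code is neither of the two operations. */
int track_reg(int code, int reg)
{
    if (reg < 1 || reg > MAXR)
        return -1;
    if (code == ALLOCREG) {
        regs_in_use[reg] = 1;
        regs_touched[reg] = 1;
        return 0;
    }
    if (code == FREEREG) {
        regs_in_use[reg] = 0;
        return 0;
    }
    return -1;
}
```

regs_in_use tells the code generator which registers it may borrow as scratch registers at a given IC, while regs_touched can be consulted in the function prologue and epilogue to decide which registers have to be saved.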

#define COMPARE 77

Compare and set condition codes. q1,q2(->z).

Compare the operands and set the condition code, so that BEQ, BNE, BLT, BGE, BLE or BGT works as desired. If z.flags == 0 (this is always the case unless the backend sets multiple_ccs to 1 and ‘-no-multiple-ccs’ is not used) the condition codes will be evaluated only by an IC immediately following the COMPARE, i.e. the next instruction (except possible FREEREGs) will be a conditional branch.

However, if a target supports several condition code registers and sets the global variable multiple_ccs to 1, vbcc might use those registers and perform certain optimizations. In this case z may be non-empty and the condition codes have to be stored in z.

Note that even if multiple_ccs is set, a backend must nevertheless be able to deal with z.flags == 0.

#define TEST 68

Test q1 against 0 and set condition codes. q1(->z)

This is equivalent to COMPARE q1,#0 but only the condition code for BEQ and BNE has to be set.

#define LABEL 69

Generate a label. typf specifies the number of the label.

#define BEQ 70

#define BNE 71

#define BLT 72

#define BGE 73

#define BLE 74

#define BGT 75

Branch on condition codes. (q1).

typf specifies the label where program execution shall continue, if the condition code is true (otherwise continue with next statement). The condition codes mean equal, not equal, less than, greater or equal, less or equal and greater than. If q1 is empty (q1.flags == 0), the codes set by the last COMPARE or TEST must be evaluated. Otherwise q1 contains the condition codes.

On some machines the type of operands of a comparison (e.g. unsigned or signed) is encoded in the branch instructions rather than in the comparison instructions. In this case the code generator has to keep track of the type of the last comparison.

Similarly, in some architectures, the compare and the branch can be combined.
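Keeping track of the type of the last comparison can be as simple as remembering one flag. The sketch below uses 68k branch mnemonics as an example; note_compare, branch_mnemonic and last_cmp_unsigned are hypothetical names, and the value of UNSIGNED is an assumption made only so the example is self-contained.

```c
#define UNSIGNED 64   /* hypothetical value, for illustration only */

#define BEQ 70        /* values as quoted in the text */
#define BNE 71
#define BLT 72
#define BGE 73
#define BLE 74
#define BGT 75

int last_cmp_unsigned;   /* set by the most recent COMPARE or TEST */

/* To be called when a COMPARE or TEST IC with operand type typf is seen. */
void note_compare(int typf)
{
    last_cmp_unsigned = (typf & UNSIGNED) != 0;
}

/* Select a 68k branch mnemonic for a conditional branch IC, depending on
   whether the preceding comparison was unsigned. Returns 0 for codes
   that are not conditional branches. */
const char *branch_mnemonic(int code)
{
    switch (code) {
    case BEQ: return "beq";
    case BNE: return "bne";
    case BLT: return last_cmp_unsigned ? "bcs" : "blt";
    case BGE: return last_cmp_unsigned ? "bcc" : "bge";
    case BLE: return last_cmp_unsigned ? "bls" : "ble";
    case BGT: return last_cmp_unsigned ? "bhi" : "bgt";
    default:  return 0;
    }
}
```

This works because, with z.flags == 0, the branch follows its COMPARE or TEST with only FREEREGs in between, so a single remembered flag is sufficient.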

#define BRA 76

Branch always. typf specifies the label where program execution continues.

#define PUSH 78

Push q1 on the stack (for argument passing). q1.

q2.val.vmax contains the size of the object (should be accessed using the opsize()-macro), z.val.vmax contains the size that has to be pushed (access it using the pushsize()-macro). These sizes may differ due to alignment issues.

q1 does not have to be an elementary type (see ASSIGN). Also, q1 can be empty. This is used for ABIs which require stack-slots to be omitted.

Depending on ORDERED_PUSH the PUSH ICs are generated starting with the first or the last argument. The direction of the stack-growth can be chosen by the backend. Note that this is only used for function-arguments; they can be pushed in the opposite direction of the real stack.

#define ADDI2P 81

Add an integer to a pointer. q1,q2->z.

q1 and z are always pointers (of type typf2) and q2 is an integer of type typf. z has to be q1 increased by q2 bytes.

#define SUBIFP 82

Subtract an integer from a pointer. q1,q2->z.

q1 and z are always pointers (of type typf2) and q2 is an integer of type typf. z has to be q1 decreased by q2 bytes.

#define SUBPFP 83

Subtract a pointer from a pointer. q1,q2->z.

q1 and q2 are pointers (of type typf2) and z is an integer of type typf. z has to be q1 - q2 in bytes.

#define GETRETURN 93

Get the return value of the last function call. ->z.

If the return value is in a register, its number will be q1.reg. Otherwise q1.reg will be 0. GETRETURN immediately follows a CALL IC (except possible FREEREGs).

#define SETRETURN 94

Set the return value of the current function. q1.

If the return value is in a register, its number will be z.reg. Otherwise z.reg will be 0. SETRETURN is immediately followed by a function exit (i.e. it is the last IC or followed by an unconditional branch to a label which is the last IC - always ignoring possible FREEREGs).

#define MOVEFROMREG 95

Move a register to memory. q1->z.

q1 is always a register and z an array of size regsize[q1.reg].

#define MOVETOREG 96

Load a register from memory. q1->z.

z is always a register and q1 an array of size regsize[z.reg].

#define NOP 97

Do nothing.

14.4 Type System

14.4.1 Target Data Types

As the compiler should be portable, we must not assume anything about the data types of the host system which is not guaranteed by ANSI/ISO C. In particular, do not assume that the data types of the host system correspond to the ones of the target system.

Therefore, vbcc will provide typedefs which can hold a data type of the target machine and (as there is no operator overloading in C) functions or macros to perform arithmetic on these types.

The typedefs for the basic target’s data types (they can be extended by additional types) are:

zchar: Type char on the target machine.
zuchar: Type unsigned char on the target machine.
zshort: Type short on the target machine.
zushort: Type unsigned short on the target machine.
zint: Type int on the target machine.
zuint: Type unsigned int on the target machine.
zlong: Type long on the target machine.
zulong: Type unsigned long on the target machine.
zllong: Type long long on the target machine.
zullong: Type unsigned long long on the target machine.
zmax: A type capable of storing (and correctly doing arithmetic on) every signed integer type. Defaults to zllong.
zumax: A type capable of storing (and correctly doing arithmetic on) every unsigned integer type. Defaults to zullong.
zfloat: Type float on the target machine.
zdouble: Type double on the target machine.
zldouble: Type long double on the target machine.
zpointer: A byte pointer on the target machine. Not really used.

These typedefs and arithmetic functions to work on them will be generated by the program dtgen when compiling vbcc. It will create the files ‘machines/$(TARGET)/dt.h’ and ‘dt.c’.

These files are generated from ‘machines/$(TARGET)/machine.dt’ which must describe what representations the code generator needs. dtgen will then ask for available types on the host system and choose appropriate ones and/or install emulation functions, if available.

In ‘machine.dt’, every data type representation gets a symbol (the ones which are already available can be looked up in ‘datatypes/datatypes.h’ - new ones will be added when necessary). The first 14 lines must contain the representations for the following types:

signed char
unsigned char
signed short
unsigned short
signed int
unsigned int
signed long
unsigned long
signed long long
unsigned long long
float
double
long double
void *

If the code generator can use several representations, these can be added on the same line separated by spaces. E.g. the code generator for m68k does not care if the integers are stored big-endian or little-endian on the host system because it only accesses them through the provided arithmetic functions. It does, however, access floats and doubles through byte-pointers and therefore requires them to be stored in big-endian-format.

14.4.2 Target Arithmetic

Now you have a lot of functions/macros performing operations using the target machine’s arithmetic. You can look them up in ‘dt.h/dt.c’. E.g. zmadd() takes two zmax and returns their sum as zmax. zumadd() does the same with zumax, zldadd() with long doubles. No functions for smaller types are needed because you can calculate with the wider types and convert the results down if needed.

Therefore, there are also conversion functions which convert between types of the target machine. E.g. zm2zc takes a zmax and returns the value converted to a zchar. Again, look at ‘dt.h/dt.c’ to see which ones are there.

A few functions for converting between target and host types are also there, e.g. l2zm takes a long and returns it converted to zmax.

Finally, there are functions for comparing target data types. E.g. zmleq(a,b) returns true if zmax a <= zmax b and false otherwise. zleqto(a,b) returns true if zlong a == zlong b and false otherwise.
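The following sketch shows the intended style of computing with target values: convert host constants with l2zm, combine them with zmadd and compare with zmleq, never with the host operators directly. The typedef and macros below are host stand-ins written only for this example (the real ones are generated into ‘dt.h/dt.c’ and may be emulation functions), and fits_disp16 is a hypothetical helper.

```c
/* Host stand-ins, for illustration only. */
typedef long long zmax;
#define l2zm(x)     ((zmax)(x))     /* host long -> zmax */
#define zmadd(a, b) ((a) + (b))     /* zmax + zmax       */
#define zmleq(a, b) ((a) <= (b))    /* zmax <= zmax      */

/* Hypothetical helper: nonzero if base + disp fits into a signed
   16-bit displacement, i.e. lies in the range -32768..32767. */
int fits_disp16(zmax base, zmax disp)
{
    zmax sum = zmadd(base, disp);

    return zmleq(l2zm(-32768L), sum) && zmleq(sum, l2zm(32767L));
}
```

Writing the check this way keeps it correct even when zmax is not a native host type and the operations are implemented by emulation functions.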

14.5 ‘machine.h’

This section describes the contents of the file ‘machine.h’. Note that some optional macros/declarations may be described elsewhere in this manual.

#include "dt.h"

This should be the first statement in ‘machine.h’.

struct AddressingMode { ... };

If machine-specific addressing modes (see section Addressing Modes) are used, an appropriate structure can be specified here. Otherwise, just enter the following code:

struct AddressingMode {
    int never_used;
};

#define MAXR <n>

Insert the number of available registers.

#define MAXGF <n>

Insert the number of command line flags that can be used to configure the behaviour of the code generator. This must be at least one even if you do not use any flags.

#define USEQ2ASZ <0/1>

If this is set to zero, vbcc will not generate ICs with the target operand being the same as the 2nd source operand. This can sometimes simplify the code-generator, but usually the code is better if the code-generator allows it.

#define MINADDI2P <type>

Insert the smallest integer type that can be added to a pointer. Smaller types will be automatically converted to type MINADDI2P when they are to be added to a pointer. This may be subsumed by shortcut() in the future.

#define MAXADDI2P <type>

Insert the largest integer type that can be added to a pointer. Larger types will be automatically converted to type MAXADDI2P when they are to be added to a pointer. This may be subsumed by shortcut() in the future.

#define BIGENDIAN <0/1>

Insert 1 if integers are represented in big endian, i.e. the most significant byte is at the lowest memory address, the least significant byte at the highest.

#define LITTLEENDIAN <0/1>

Insert 1 if integers are represented in little endian, i.e. the least significant byte is at the lowest memory address, the most significant byte at the highest.

#define SWITCHSUBS <0/1>

Insert 1 if switch-statements should be compiled into a series of SUB/TEST/BEQ instructions rather than COMPARE/BEQ. This may be useful if the target has a more efficient SUB-instruction which sets condition codes (e.g. 68k).

#define INLINEMEMCPY <n>

Insert the largest size in bytes allowed for inline memcpy. In optimizing compilation, certain library memcpy/strcpy-calls with length known at compile-time will be inlined using an ASSIGN IC if the size is less than or equal to INLINEMEMCPY. The type used for the ASSIGN IC will be UNSIGNED|CHAR.

This may be replaced by a variable of type zmax in the future.

#define ORDERED_PUSH <0/1>

Insert 1 if PUSH ICs for function arguments shall be generated from left to right instead of right to left.

#define HAVE_REGPARMS 1

Insert this line if the backend supports register parameters (see section Register Parameters).

#define HAVE_REGPAIRS 1

Insert this line if the backend supports register pairs (see section Register Pairs).

#define HAVE_INT_SIZET 1

Insert this line if size_t shall be of type unsigned int rather than unsigned long.

#define EMIT_BUF_LEN <n>

Insert the maximum length of a line of code output.

#define EMIT_BUF_DEPTH <n>

Insert the number of output lines that should be buffered. This can be useful for peephole-optimizing the assembly output (see below).

#define HAVE_TARGET_PEEPHOLE <0/1>

Insert 1 if the backend provides an asm_peephole() function (see section Peephole Optimizations on Assembly Output).

#define HAVE_TARGET_ATTRIBUTES 1

Insert this line if the backend provides old target-specific variable-attributes (see section Target-specific Attributes).

#define HAVE_TARGET_PRAGMAS 1

Insert this line if the backend provides target-specific #pragma-directives (see section Target-specific #pragmas).

#define HAVE_REGS_MODIFIED 1

Insert this line if the backend supports inter-procedural register-allocation (see section Inter-procedural Register-Allocation).

#define HAVE_TARGET_RALLOC 1

Insert this line if the backend supports context-sensitive register-allocation (see section Context-sensitive Register-Allocation).

#define HAVE_TARGET_EFF_IC 1

Insert this line if the backend provides a mark_eff_ics() function (see section Marking of efficient ICs).

#define HAVE_EXT_IC 1

Insert this line if the backend provides a struct ext_ic (see section Extended ICs).

#define HAVE_EXT_TYPES 1

Insert this line if the backend supports additional types (see section Target-specific extended Types).

#define HAVE_TGT_PRINTVAL 1

Insert this line if the backend provides its own printval function (see section Target-specific printval).

#define JUMP_TABLE_DENSITY <float>

#define JUMP_TABLE_LENGTH <int>

These values control the creation of jump-tables (see section Jump Tables).

#define ALLOCVLA_REG <reg>

#define ALLOCVLA_INLINEASM <inline-asm>

#define FREEVLA_REG <reg>

#define FREEVLA_INLINEASM <inline-asm>

#define OLDSPVLA_INLINEASM <inline-asm>

#define FPVLA_REG <reg>

Necessary defines for C99 variable-length arrays (see the section on variable-length arrays).

#define HAVE_LIBCALLS 1

Insert this line if the backend wants certain ICs to be replaced with calls to library functions (see the section on library calls).

#define AVOID_FLOAT_TO_UNSIGNED 1

#define AVOID_UNSIGNED_TO_FLOAT 1

Insert these lines to tell the frontend not to generate CONVERT ICs that convert between unsigned integers and floating point. In those cases, additional intermediate code will be generated that implements the conversion using only signed integers.

14.6 ‘machine.c’

This is the main part of the code generator. The first statement should be #include "supp.h" which will include all necessary declarations.

The following variables and functions must be provided by machine.c.

14.6.1 Name and Copyright

The code generator must define a zero-terminated character array char cg_copyright[]; containing name and copyright-notice of the code-generator.

14.6.2 Command Line Options

You can use code generator specific commandline options. The number of flags is specified as MAXGF in ‘machine.h’. Insert the names for the flags as char *g_flags_name[MAXGF]. If an option was specified, (g_flags[i]&USEDFLAG) is not zero. In int g_flags[MAXGF] you can choose how the options are to be used:

0: The option can only be specified. E.g. if g_flags_name[2]=="myflag", the commandline may contain ‘-myflag’ and (g_flags[2]&USEDFLAG)!=0.
VALFLAG: The option must be specified with an integer constant, e.g. ‘-myflag=1234’. This value can be found in g_flags_val[2].l.
STRINGFLAG: The option must be specified with a string, e.g. ‘-myflag=Hallo’. The pointer to the string can be found in g_flags_val[2].p.
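A minimal sketch of how these arrays might look for a backend with three options. The option names, the value of MAXGF and the two helper functions are made up for illustration. The stand-in definitions at the top (including the shape of g_flags_val[] and all numeric values) are assumptions that only exist so the fragment compiles on its own; a real backend gets them from supp.h and ‘machine.h’.

```c
/* Stand-ins so the sketch compiles on its own. In a real backend these
   come from supp.h and machine.h; the numeric values are placeholders. */
#define MAXGF      3
#define USEDFLAG   1
#define STRINGFLAG 2
#define VALFLAG    4
union ppi { long l; char *p; };   /* assumed shape of g_flags_val[] entries */
union ppi g_flags_val[MAXGF];

/* Statically initialized: the option names and how each one is used. */
char *g_flags_name[MAXGF] = { "no-delayed-popping", "cpu", "stack-size" };
int   g_flags[MAXGF]      = { 0, STRINGFLAG, VALFLAG };

/* Hypothetical helper: was option i given on the commandline? */
static int flag_given(int i)
{
  return (g_flags[i] & USEDFLAG) != 0;
}

/* Hypothetical helper: value of -stack-size=<n>, or a default. */
static long stack_size_option(long dflt)
{
  return flag_given(2) ? g_flags_val[2].l : dflt;
}
```

With these definitions the commandline could contain ‘-no-delayed-popping’, ‘-cpu=<string>’ and ‘-stack-size=<n>’, and init_cg() could query them through the helpers.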

14.6.3 Data Types

The following variables have to be initialized to describe the representation of the data types.

MAX_TYPE

This macro contains the number of different types. In case of target-specific extended types (see section Target-specific extended Types) this is set by the backend, otherwise the frontend will use a default.

zmax char_bit;

The number of bits in a char on the target (usually 8).

zmax align[MAX_TYPE+1];

This array must contain the necessary alignments for every type in bytes. Some of the entries in this array are not actually used, but align[type&NQ] must yield the correct alignment for every type. align[CHAR] must be 1.

The alignment of a structure depends not only on align[STRUCT] but also on the alignment of the members. The alignment of any particular structure is the maximum of align[STRUCT] and the alignments of all its members, i.e. align[STRUCT] is only a minimum alignment.

The same applies to unions and arrays.

zmax maxalign;

This variable must be set to an alignment in bytes that is used when pushing arguments on the stack. (FIXME: describe stackalign)

zmax sizetab[MAX_TYPE+1];

This array must contain the sizes of every type in bytes.

zmax t_min[MAX_TYPE+1];

This array must contain the smallest representable number for every signed integer type.

zumax t_max[MAX_TYPE+1];

This array must contain the largest representable number for every signed integer type.

zumax tu_max[MAX_TYPE+1];

This array must contain the largest representable number for every unsigned integer type.

As zmax and zumax may be no elementary types on the host machine, those arrays have to be initialized dynamically (in init_cg()). It is recommended to use explicit typenames, e.g. sizetab[INT]=l2zm(4L); to keep it portable and allow later extensions of the type system.

Also note that those values may not be representable as constants by the host architecture and have to be calculated using the functions for arithmetic on the target’s data types. E.g. the smallest representable value of a 32-bit two’s-complement data type is not guaranteed to be a valid constant on every ANSI C implementation.

You may not use simple operators on the target data types but you have to use the functions or convert them to an elementary type of the host machine before (if you know that it is representable as such).
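The following is a minimal sketch of such a dynamic initialization for a hypothetical 32-bit target. The typedef, the two macros and the type numbers are stand-ins so that the fragment compiles on its own (a real backend gets them from supp.h, where zmax need not be a host integer type), and init_types() is a made-up helper that init_cg() might call.

```c
/* Stand-ins so the sketch compiles on its own; a real backend gets
   these from supp.h. Type numbers and MAX_TYPE are placeholders. */
typedef long zmax;
#define l2zm(x)    ((zmax)(x))
#define zmsub(a,b) ((zmax)((a)-(b)))
enum { CHAR = 1, SHORT, INT, LONG, MAX_TYPE = 16 };

zmax char_bit, maxalign;
zmax sizetab[MAX_TYPE+1], align[MAX_TYPE+1], t_min[MAX_TYPE+1];

/* Hypothetical helper called from init_cg() on a 32-bit target. */
static void init_types(void)
{
  char_bit = l2zm(8L);
  sizetab[CHAR]  = l2zm(1L); align[CHAR]  = l2zm(1L);
  sizetab[SHORT] = l2zm(2L); align[SHORT] = l2zm(2L);
  sizetab[INT]   = l2zm(4L); align[INT]   = l2zm(4L);
  sizetab[LONG]  = l2zm(4L); align[LONG]  = l2zm(4L);
  maxalign = l2zm(4L);

  /* -2147483648 need not be a valid host constant, so it is
     computed with target arithmetic as -2147483647 - 1. */
  t_min[INT]  = zmsub(l2zm(-2147483647L), l2zm(1L));
  t_min[LONG] = t_min[INT];
}
```

The explicit type names as array indices keep the code independent of the actual type numbers.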

14.6.4 Register Set

The following variables have to be initialized to describe the register set of the target.

MAXR

The valid registers are numbered from 1..MAXR. MAXR must be #defined in ‘machine.h’.

char *regnames[MAXR+1]

This array must contain the names for every register. They do not necessarily have to be used in the assembly output but are used for explicit register arguments.

zmax regsize[MAXR+1]

This array must contain the size of each register in bytes. It is used to create storage if registers have to be saved.

int regscratch[MAXR+1]

This array must contain information whether a register is a scratch register, i.e. may be destroyed during a function call (1 or 0). vbcc will generate code to save/restore all scratch registers which are assigned a value when calling a function (unless it knows the register will not be modified). However, if the code generator uses additional scratch registers it has to take care to save/restore them.

Also, the code generator must take care that used non-scratch-registers are saved/restored on function entry/exit.

int regsa[MAXR+1]

This array must contain information whether a register is in use or not at the beginning of a function (1 or 0). The compiler will not use any of those registers for register variables or temporaries, therefore this can be used to mark special registers like a stack- or frame-pointer and to reserve registers to the code-generator. The latter may be reasonable if for many ICs code cannot be generated without using additional registers.

You must set regscratch[i] = 0 if regsa[i] == 1. If you want it to be saved across function calls the code generator has to take care of this.

int reg_prio[MAXR+1];

This array must contain a priority (>=0) for every register. When the register allocator has to choose between several registers which seem to be equal, it will choose the one with the highest priority (if several registers have the same priority it is undefined which one will be taken).

Note that this priority is only the last decision factor if everything else seems to be equal. If one register seems to give a higher cost saving (according to the estimation of the register allocator) but has a lower priority, it will nevertheless be used. The priority can be used to fine-tune the register selection. Some guidelines:

- Scratch registers might have a higher priority than non-scratch registers (although the register-allocator will usually handle this anyway).
- Registers which are more restricted should have a higher priority (if they seem to give the same saving it is usually better to use the restricted registers and try to keep the more versatile ones for situations in which they can give better savings).
- Registers which are used for argument-passing should have lower priority than registers not used for arguments. The priority within the argument-registers should decrease as the frequency of usage as argument increases (typically the register for the first argument is used most frequently, etc.).

Note that for the array zmax regsize[] the same comments mentioned in the section on data types regarding initialization apply.
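As an illustration, the register description of a hypothetical target with three general registers and a stack pointer might look like the sketch below. The register names, priorities and the helper init_regs() are assumptions, not taken from any existing backend; MAXR and zmax are defined here only so the fragment compiles on its own.

```c
/* Hypothetical target with registers r0..r2 and sp. MAXR would be
   #defined in machine.h; zmax normally comes from supp.h. */
#define MAXR 4
typedef long zmax;

/* Index 0 is unused because valid registers are numbered 1..MAXR. */
char *regnames[MAXR+1] = { "noreg", "r0", "r1", "r2", "sp" };
int regscratch[MAXR+1] = { 0, 1, 1, 0, 0 };  /* r0,r1 destroyed by calls */
int regsa[MAXR+1]      = { 0, 0, 0, 0, 1 };  /* sp is never allocated */
int reg_prio[MAXR+1]   = { 0, 2, 3, 1, 0 };  /* r0 passes the first argument,
                                                so r1 is preferred over it */
zmax regsize[MAXR+1];                         /* filled in dynamically */

/* Hypothetical helper: fill regsize[] and verify that reserved
   registers are not marked as scratch. Returns 1 on success. */
static int init_regs(void)
{
  int i;
  for (i = 1; i <= MAXR; i++) {
    regsize[i] = 4;   /* a real backend would use l2zm(4L) */
    if (regsa[i] && regscratch[i])
      return 0;
  }
  return 1;
}
```

The priorities are only one possible choice; they matter solely when the register allocator sees no other difference between candidates.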

14.6.5 Functions

The following functions have to be implemented by the code generator. There may be optional additional functions described in other sections.

int init_cg(void);

This function is called after the commandline arguments are parsed. It can set up certain internal data, etc. The arrays regarding the data types and the register set can be set up at this point rather than with a static initialization; however, the arrays regarding the commandline options have to be statically initialized. The results of the commandline options are available at this point.

If something goes wrong, 0 has to be returned, otherwise 1.

void cleanup_cg(FILE *f);

This function is called before the compiler exits. f is the output file which must be checked against 0 before using.

int freturn(struct Typ *t);

This function has to return the number of the register return values of type t are passed in. If the type is not passed in a register, 0 must be returned. Usually the decision can be made only considering t->flags, ignoring the full type (see section Composite Types).
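A minimal sketch of such a function for a hypothetical target that returns integers and pointers in register 1 and floating point values in register 5. The type numbers, the NQ value and the reduced struct Typ are stand-ins so the fragment compiles on its own; the register numbers are assumptions.

```c
/* Stand-ins; the real definitions come from supp.h. */
enum { CHAR = 1, SHORT, INT, LONG, FLOAT, DOUBLE, POINTER, STRUCT };
#define NQ 15
struct Typ { int flags; struct Typ *next; };

int freturn(struct Typ *t)
{
  int f = t->flags & NQ;   /* strip qualifiers, look at the basic type only */

  if (f == FLOAT || f == DOUBLE)
    return 5;              /* hypothetical floating point register */
  if ((f >= CHAR && f <= LONG) || f == POINTER)
    return 1;              /* hypothetical integer return register */
  return 0;                /* structures etc. are not returned in a register */
}
```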

int regok(int r, int t, int mode);

Check whether the type t can be stored in register r and whether the usual operations (for this type) can be generated. Return 0, if not.

If t is a pointer and mode==0 the register only has to be able to store the pointer and do arithmetic, but if mode!=0 it has to be able to dereference the pointer.

mode==-1 is used with context-sensitive register-allocation (see section Context-sensitive Register-Allocation). If the backend does not support it, this case can be handled equivalent to mode==0.

If t==0 return whether the register can be used to store condition codes. This is only relevant if multiple_ccs is set to 1.
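The rules above can be sketched for a hypothetical target with separate data, address and floating point registers, where only address registers can dereference a pointer. The register numbering, the type numbers and the simplified type macros are stand-ins chosen for this example only.

```c
/* Stand-ins; the real definitions come from supp.h. */
enum { CHAR = 1, SHORT, INT, LONG, FLOAT, DOUBLE, POINTER };
#define NQ 15
#define ISINT(t)     (((t)&NQ) >= CHAR && ((t)&NQ) <= LONG)
#define ISFLOAT(t)   (((t)&NQ) == FLOAT || ((t)&NQ) == DOUBLE)
#define ISPOINTER(t) (((t)&NQ) == POINTER)

/* Hypothetical register numbering: 1..4 data, 5..6 address, 7..8 FPU. */
int regok(int r, int t, int mode)
{
  if (t == 0)
    return 0;                 /* no register holds condition codes */
  if (ISFLOAT(t))
    return r >= 7 && r <= 8;
  if (ISPOINTER(t)) {
    if (mode <= 0)            /* mode==-1 is treated like mode==0 */
      return r >= 1 && r <= 6;
    return r >= 5 && r <= 6;  /* only address registers can dereference */
  }
  if (ISINT(t))
    return r >= 1 && r <= 4;
  return 0;
}
```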

int dangerous_IC(struct IC *p);

Check if this IC can raise exceptions or is otherwise dangerous. Movement of ICs which are dangerous is restricted to preserve the semantics of the program.

Typical dangerous ICs are divisions or pointer dereferencing. On certain targets floating point or even signed integer arithmetic can raise exceptions, too.

int must_convert(int from,int to,int const_expr);

Check if code must be generated to convert from type from to type to. E.g. on many machines certain types have identical representations (integers of the same size or pointers and integers of the same size).

If const_expr != 0, return whether a conversion would be necessary in a constant expression.

For example, a machine may have identical pointers and integers, but different sets of registers (one set supports integer operations and the other pointer operations). Therefore, must_convert() would return 1 (we need a CONVERT IC to move the value from one register set to the other).

This would imply that vbcc would not allow a cast from a pointer to an integer or vice-versa in constant expressions (as it will not generate code for static initializations). However, in this case, a static initialization would be ok as the representation is identical and registers are not involved. Therefore, the backend can return 1 if const_expr == 0 and 0 otherwise.
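The case described above could be sketched as follows. The type numbers, the simplified macros and the statically initialized sizetab[] are stand-ins so that the fragment compiles on its own; a real backend must compare zmax values with the functions for target arithmetic instead of ==.

```c
/* Stand-ins; the real definitions come from supp.h. */
typedef long zmax;
enum { CHAR = 1, SHORT, INT, LONG, POINTER, MAX_TYPE = 16 };
#define NQ 15
#define ISINT(t)     (((t)&NQ) >= CHAR && ((t)&NQ) <= LONG)
#define ISPOINTER(t) (((t)&NQ) == POINTER)
zmax sizetab[MAX_TYPE+1] = { 0, 1, 2, 4, 4, 4 };

int must_convert(int from, int to, int const_expr)
{
  int f = from & NQ, t = to & NQ;

  if (f == t)
    return 0;
  /* integers of equal size share one representation */
  if (ISINT(f) && ISINT(t) && sizetab[f] == sizetab[t])
    return 0;
  /* pointer <-> integer of equal size: identical bits, but a different
     register set, so only non-constant expressions need a CONVERT */
  if (((ISPOINTER(f) && ISINT(t)) || (ISINT(f) && ISPOINTER(t))) &&
      sizetab[f] == sizetab[t])
    return const_expr == 0;
  return 1;
}
```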

int shortcut(int code, int t);

In C no operations are done with chars and shorts because of integral promotion. However sometimes vbcc might see that an operation could be performed with the short types yielding the same result.

Before generating such an instruction with short types vbcc will ask the code generator by calling shortcut() to find out whether it should do so. Return true iff it is a win to perform the operation code with type t rather than promoting the operands and using e.g. int.

void gen_code(FILE *f, struct IC *p, struct Var *v, zmax offset);

This function has to emit code for a function to stream f. v is the function being generated, p is a pointer to the list of ICs, that has to be converted. offset is the space needed for local variables in bytes.

This function has to take care that only scratch registers are destroyed by this function. The array regused contains information about the registers that have been used by vbcc in this function. However if the code generator uses additional registers it has to take care of them, too.

The regs[] and regused[] arrays may be overwritten by gen_code() as well as parts of the list of ICs. However the list of ICs must still be a valid list of ICs after gen_code() returns.

All assembly output should be generated using the available emit functions. These functions are able to keep several lines of assembly output buffered and allow peephole optimizations on assembly output (see section Peephole Optimizations on Assembly Output).

void gen_ds(FILE *f, zmax size, struct Typ *t);

Has to emit output that generates size bytes of type t initialized with proper 0.

t is a pointer to a struct Typ which contains the precise type of the variable. On machines where every type can be initialized to 0 by setting all bits to zero, the type does not matter. Otherwise see section Composite Types.

All assembly output should be generated using the available emit functions.

void gen_align(FILE *f, zmax align);

Has to emit output that ensures the following data to be aligned to align bytes.

All assembly output should be generated using the available emit functions.

void gen_var_head(FILE *f, struct Var *v);

Has to print the head of a static or external variable v. This includes the label and necessary information for external linkage etc.

Typically variables will be generated by a call to gen_align() followed by gen_var_head() and (a series of) calls to gen_dc() and/or gen_ds(). It may be necessary to keep track of the information passed to gen_var_head().

All assembly output should be generated using the available emit functions.

void gen_dc(FILE *f, int t, struct const_list *p);

Emit initialized data. t is the basic type that has to be emitted. p points to a struct const_list.

If p->tree != 0 then p->tree->o is a struct obj which has to be emitted. This will usually be the address of a variable of storage class static or extern, possibly with an offset added (see section Operands for further details).

If p->tree == 0 then p->val is a union atyps which contains (in the member corresponding to t) the constant value to be emitted.

All assembly output should be generated using the available emit functions.

void init_db(FILE *f);

If debug-information is requested, this function is called after init_cg(), but before any code is generated. See also Debug Information.

void cleanup_db(FILE *f);

If debug-information is requested, this function is called prior to cleanup_cg(). See also Debug Information.

14.7 Available Support Functions, Macros and Variables

This section lists a series of general variables, macros and functions which are available to the backend and may prove useful. Note that there may be additional support specific to certain features which will be mentioned at appropriate sections in this manual.

MAXINT

A constant for the largest target integer type (zmax). It is outside the range of the other types and cannot be accessed by an application (although there will usually be an accessible type with the same representation).

MAX_TYPE

The type number of the last type.

NQ

A mask. t & NQ will delete all type-qualifiers of a type.

NU

A mask. t & NU will delete all type-qualifiers but UNSIGNED of a type.

q1typ(p)

Yields the type of the first source operand of IC p. Undefined if the operand is not used!

q2typ(p)

Yields the type of the second source operand of IC p. Undefined if the operand is not used!

ztyp(p)

Yields the type of the destination operand of IC p. Undefined if the operand is not used!

iclabel(p)

Returns the label of an IC. Only defined if p->code is LABEL, BEQ, BNE, BLT, BGT, BLE or BGE.

opsize(p)

Returns the size of the operand of an ASSIGN or PUSH IC as zmax.

pushsize(p)

Returns the stack-adjustment value of a PUSH IC as zmax. It is always greater than or equal to opsize(p).

pushedargsize(p)

Returns the space occupied by arguments passed on the stack as parameters for a function call. Only valid for CALL ICs.

isstatic(sc)

Tests whether the storage-class sc denotes a variable with static storage and no external linkage.

isextern(sc)

Tests whether the storage-class sc denotes a variable with static storage and external linkage.

isauto(sc)

Tests whether the storage-class sc denotes a variable with automatic storage-duration.

t_min(t)

t_max(t)

These macros yield the smallest and largest representable value of any target integer type, e.g. t_min(INT) or t_max(UNSIGNED|LONG).

ISPOINTER(t)

ISINT(t)

ISFLOAT(t)

ISFUNC(t)

ISSTRUCT(t)

ISUNION(t)

ISARRAY(t)

ISSCALAR(t)

ISARITH(t)

These macros test whether the simple type t is a pointer type, an integral type, a floating point type, a function, a structure type, a union type, an array type, a scalar type (integer, floating point or pointer) or an arithmetic type (integer or floating point), respectively.

int label;

The number of the last label used so far. For a new label number, use ++label.

zmax falign(struct Typ *t);

This function returns the alignment of a full type. Contrary to the align[] array provided by the backend (which is used by this function), it will yield correct values for composite types like structures and arrays.

zmax szof(struct Typ *t);

This function returns the size in bytes of a full type. Contrary to the sizetab[] array provided by the backend (which is used by this function), it will yield correct values for composite types like structures and arrays.

void *mymalloc(size_t size);

void *myrealloc(void *p,size_t size);

void myfree(void *p);

Memory allocation functions similar to malloc(), realloc() and free(). They will automatically clean up and exit if an allocation fails. Also, some debugging facilities are available.

void emit(FILE *f,const char *fmt,...);

void emit_char(FILE *f,int c);

void emitval(FILE *f,union atyps *p,int t);

void emitzm(FILE *f,zmax x);

void emitzum(FILE *f,zumax x);

All output produced by the backend should be produced using these functions. emit() uses a format like printf(); emitval(), emitzm() and emitzum() are suitable for writing target integers as decimal text. Currently, emitting floating point constants has to be done by the backend.

int is_const(struct Typ *);

Tests whether a full type is constant (e.g. to decide whether it can be put into a ROM section).

int is_volatile_obj(struct obj *);

int is_volatile_ic(struct IC *);

Tests whether an object or IC is volatile. Only of interest to the backend in rare cases.

int switch_IC(struct IC *p);

This function checks whether p->q2 and p->z use the same register (including register pairs). If they do, it will try to swap p->q1 and p->q2 (only possible if the IC is commutative). It is often possible to generate better code if p->q2 and p->z do not collide. Note however, that it is not always possible to eliminate a conflict and the code generator still has to be able to handle such a case.

The function returns 0 if no modification took place and non-zero if the IC has been modified.

union atyps gval;

void eval_const(union atyps *p,int t);

void insert_const(union atyps *p,int t);

For every target data type there is a corresponding global variable of that type, e.g. zchar vchar, zuchar vuchar, zmax vmax etc. These two functions simplify handling of target data types by transferring between a union atyps and these variables.

eval_const() reads the member of the union corresponding to the type t and converts it into all the global variables while insert_const() takes the global variable according to t and puts it into the appropriate member of the union atyps.

The global variable gval may be used as a temporary union atyps by the backend.

void printzm(FILE *f,zmax x);

void printzum(FILE *f,zumax x);

void printval(FILE *f,union atyps *p,int t);

void printtype(FILE *o,struct Typ *p);

void printobj(FILE *f,struct obj *p,int t);

void printic(FILE *f,struct IC *p);

void printiclist(FILE *f,struct IC *first);

This is a series of functions which print a more or less human readable version of the corresponding type to a stream. These functions are to be used only for debugging purposes, not for generating code. Also, the arguments must contain valid values.

bvtype

BVSIZE(n)

vbcc provides macros and functions for handling bit-vectors which may also be used by the backend. bvtype is the basic type to create bit-vectors of. BVSIZE(n) yields the number of bytes needed to implement a bit-vector with n elements.

bvtype *mybv = mymalloc(BVSIZE(n));

BSET(bv,n)

BCLR(bv,n)

BTST(bv,n)

Macros which set, clear and test the n-th bit in bit-vector bv.

void bvunite(bvtype *dest,bvtype *src,size_t len);

void bvintersect(bvtype *dest,bvtype *src,size_t len);

void bvdiff(bvtype *dest,bvtype *src,size_t len);

These functions calculate the union, intersection and difference of two bit-vectors. dest is the first operand as well as the destination. len is the length of the bit-vectors in bytes, not in bits.

void bvcopy(bvtype *dest,bvtype *src,size_t len);

void bvclear(bvtype *dest,size_t len);

void bvsetall(bvtype *dest,size_t len);

These functions copy, clear and fill bit-vectors.

int bvcmp(bvtype *bv1,bvtype *bv2,size_t len);

int bvdointersect(bvtype *bv1,bvtype *bv2,size_t len);

These functions test whether two bit-vectors are equal or have a non-empty intersection, respectively. They do not modify the bit-vectors.
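A short usage sketch for the bit-vector support. The definitions of bvtype, BVSIZE, BSET, BTST and bvunite() given here are simplified stand-ins so that the fragment compiles without supp.h; the real definitions may differ. The function used_in_either() is a made-up example that shows the length being passed in bytes.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for what supp.h provides; the real definitions may differ. */
typedef unsigned int bvtype;
#define BVBITS     (sizeof(bvtype)*CHAR_BIT)
#define BVSIZE(n)  ((((n)+BVBITS-1)/BVBITS)*sizeof(bvtype))
#define BSET(bv,n) ((bv)[(n)/BVBITS] |= 1u<<((n)%BVBITS))
#define BTST(bv,n) ((bv)[(n)/BVBITS] & (1u<<((n)%BVBITS)))
static void bvunite(bvtype *dest, bvtype *src, size_t len)
{
  size_t i;
  for (i = 0; i < len/sizeof(bvtype); i++)
    dest[i] |= src[i];
}

/* Hypothetical use: collect the registers (1..nregs) used by two code
   regions into one freshly allocated set. */
static bvtype *used_in_either(bvtype *a, bvtype *b, int nregs)
{
  size_t len = BVSIZE(nregs+1);   /* length in bytes, not in bits */
  bvtype *u = malloc(len);        /* a backend would use mymalloc() */

  if (!u) return 0;
  memcpy(u, a, len);              /* bvcopy() in a real backend */
  bvunite(u, b, len);
  return u;
}
```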

14.8 Hints for common Optimizations

While it is no easy job to produce a stable code generator for a new target architecture, there is a huge difference between a simple backend and a highly optimized code generator which produces small and efficient high quality code. Although vbcc is able to do a lot of machine-independent global optimizations for every target automatically, it is still common for an optimized backend to produce code up to twice as fast on average as a simple one.

Sometimes, a simple backend is sufficient and the work required to produce high-quality code is not worthwhile. However, this section lists a series of common backend optimizations which are often done in case that good code-quality is desired. Note that neither are all of these optimizations applicable (without modifications or at all) to all architectures nor is this an exhaustive list. It is just a list of recommendations to consider. You have to make sure that the optimization is safe and beneficial for the architecture you are targeting.

14.8.1 Instruction Combining

While ICs are often a bit more powerful than instructions of a typical microprocessor, sometimes several of them can be implemented by a single instruction or more efficient code can be generated when looking at a few of them rather than at each one separately.

In the simple case, this can be done by looking at the current IC, deciding whether it is a candidate for combining and then testing whether the next IC (or ICs) are suitable for combining. This is relatively easy to perform; however, some care has to be taken to verify that the combination is indeed legal (e.g. what happens if the first IC modifies a value which is used by the following IC).

A more sophisticated implementation might look at a larger sequence of instructions to find more possibilities for optimization. Detecting whether the combination is legal becomes much more difficult then.

Sometimes the IC might compute a temporary result which would be eliminated by the complex machine instruction. Then it is necessary to verify that it was indeed a temporary result which is not used anywhere else. As long as the result is in a register, this can be done by checking for a FREEREG IC.

Examples for instruction combining are multiply-and-add or bit-test instructions which are available on many architectures. Special cases are complex addressing modes and instructions which can automatically set condition codes which are described in the following sections.
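The FREEREG check mentioned above might be sketched like this. The reduced struct obj and struct IC contain only the members used here, the constant values are placeholders, and result_is_temporary() is a made-up helper name; it only covers the simple case where the temporary is consumed by the directly following IC.

```c
/* Stand-ins: only the fields and constants this sketch touches.
   The real definitions come from supp.h. */
#define REG     1
#define DREFOBJ 2
#define FREEREG 100
struct obj { int flags; int reg; };
struct IC  { struct IC *next; int code; struct obj q1, q2, z; };

/* Hypothetical helper: does IC p write a register that is released by
   a FREEREG directly after the IC that consumes it (p->next)? */
static int result_is_temporary(struct IC *p)
{
  struct IC *use, *fr;

  if ((p->z.flags & (REG|DREFOBJ)) != REG)
    return 0;                      /* result is not held in a register */
  use = p->next;
  if (!use)
    return 0;
  fr = use->next;
  return fr && fr->code == FREEREG && fr->q1.reg == p->z.reg;
}
```

If the helper returns non-zero, p and p->next are candidates for being combined into one machine instruction, provided the combination itself is legal.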

14.8.2 Addressing Modes

The intermediate code generated by vbcc does not use any addressing-modes a target might offer. Therefore the code generator must find a way to combine several statements if it wants to make use of these modes. E.g. on the m68k the intermediate code

        ADDI2P  int     a0,#20->a1
        ASSIGN  int     #10->(a1)
        FREEREG         a1

could be translated to

        move.l  #10,20(a0)

(notice the FREEREG which is important).

To aid in this there is a pointer to a struct AddressingMode in every struct obj. A code generator could e.g. do a pass over the intermediate code, find possible uses for addressing-modes, allocate a struct AddressingMode and store a pointer in the struct obj, effectively replacing the obj.

If the code generator supports extended addressing-modes, you have to think of a way to represent them and define the struct AddressingMode so that all modes can be stored in it. The machine independent part of vbcc will not use these modes, so your code generator has to find a way to combine several statements to make use of these modes.

A possible implementation of a structure to handle the addressing mode described above as well as a register-indirect mode could be:

#define IMM_IND 1
#define REG_IND 2

struct AddressingMode {
  int flags;   /* either IMM_IND or REG_IND */
  int base;    /* base register */
  zmax offset; /* offset in case of IMM_IND */
  int idx;     /* index register in case of REG_IND */
};

When the code generator is done, that pointer in every struct obj must either be zero or point to a mymalloced struct AddressingMode which will be freed by vbcc.

Following is an example of a function which traverses a list of ICs and inserts addressing modes with constant offsets where possible.

/* search for possible addressing-modes */
static void find_addr_modes(struct IC *p)
{
  int c,c2,r;
  struct IC *p2;
  struct AddressingMode *am;

  for(;p;p=p->next){
    c=p->code;

    if(IMM_IND&&(c==ADDI2P||c==SUBIFP)&&
       isreg(z)&&(p->q2.flags&(KONST|DREFOBJ))==KONST){
      /* we have found addi2p q1,#const->reg */
      int base;zmax of;struct obj *o;

      eval_const(&p->q2.val,p->typf);
      /* handle sub instead of add */
      if(c==SUBIFP)
        of=zmsub(l2zm(0L),vmax);
      else
        of=vmax;

      /* Is the offset suitable for an addressing mode? */
      if(ISVALID_OFFSET(of)){
        r=p->z.reg;
        /* If q1 is a register, we use it as base-register,
           otherwise q1 is loaded in the temporary register
           and this one used as base register. */
        if(isreg(q1))
          base=p->q1.reg;
        else
          base=r;

        o=0;
        /* Now search the following instructions. */
        for(p2=p->next;p2;p2=p2->next){
          c2=p2->code;

          /* End of a basic block. We have to abort. */
          if(c2==CALL||c2==LABEL||(c2>=BEQ&&c2<=BRA)) break;

          /* The temporary register is used. We have to abort. */
          if(c2!=FREEREG&&(p2->q1.flags&(REG|DREFOBJ))==REG&&
             p2->q1.reg==r)
              break;
          if(c2!=FREEREG&&(p2->q2.flags&(REG|DREFOBJ))==REG&&
             p2->q2.reg==r) 
              break;

          if(c2!=CALL&&(c2<LABEL||c2>BRA)&&c2!=ADDRESS){
            /* See, if we find a valid use (dereference) of the
               temporary register. */
            if(!p2->q1.am&&(p2->q1.flags&(REG|DREFOBJ))==(REG|DREFOBJ)&&
               p2->q1.reg==r){
              if(o) break;
              o=&p2->q1;
            }
            if(!p2->q2.am&&(p2->q2.flags&(REG|DREFOBJ))==(REG|DREFOBJ)&&
               p2->q2.reg==r){
              if(o) break;
              o=&p2->q2;
            }
            if(!p2->z.am&&(p2->z.flags&(REG|DREFOBJ))==(REG|DREFOBJ)&&
               p2->z.reg==r){
              if(o) break;
              o=&p2->z;
            }
          }
          if(c2==FREEREG||(p2->z.flags&(REG|DREFOBJ))==REG){
            int m;
            if(c2==FREEREG)
              m=p2->q1.reg;
            else
              m=p2->z.reg;
            if(m==r){
            /* The value of the temporary register is not used any more
               (either due to FREEREG or because it is overwritten).
               If we have found exactly one dereference, we can use
               a target addressing mode. */
              if(o){
                o->am=am=mymalloc(sizeof(*am));
                am->flags=IMM_IND;
                am->base=base;
                am->offset=of;
                if(isreg(q1)){
                  /* The base already was in a register. We can
                     eliminate the ADDI2P IC. */
                  p->code=c=NOP;p->q1.flags=p->q2.flags=p->z.flags=0;
                }else{
                  /* The base was not in a register.
                     We have to load it . */
                  p->code=c=ASSIGN;p->q2.flags=0;
                  p->typf=p->typf2;p->q2.val.vmax=sizetab[p->typf2&NQ];
                }
              }
              break;
            }
            if(c2!=FREEREG&&m==base) break;
            continue;
          }
        }
      }
    }
  }
}

14.8.3 Implicit setting of Condition Codes

Many architectures have instructions that automatically set the condition codes according to the computed result. For these architectures it is generally a good idea to keep track of the setting of condition codes (e.g. if they reflect the state of some object or register). A subsequent TEST or COMPARE instruction can then often be eliminated.

Care has to be taken to delete this information if either the condition codes may be modified or the object they represent is modified. Also, this optimization is usually hard to do across labels.
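Such bookkeeping could be sketched as follows for the simple case where the condition codes reflect the contents of one register. All names here are made up for illustration and none of them is part of vbcc.

```c
/* Hypothetical bookkeeping: the register whose value the condition
   codes currently describe, or 0 if nothing is known. */
static int cc_reg;

/* An instruction has just set the condition codes according to reg. */
static void cc_set_by(int reg) { cc_reg = reg; }

/* The condition codes were changed in some other way, or a label was
   reached, so nothing is known any more. */
static void cc_clobbered(void) { cc_reg = 0; }

/* Register r was modified without updating the condition codes. */
static void reg_modified(int r) { if (cc_reg == r) cc_reg = 0; }

/* Returns 1 if code for a TEST of reg can be omitted. */
static int test_is_redundant(int reg)
{
  return reg != 0 && cc_reg == reg;
}
```

The code generator would call these helpers while emitting each instruction and consult test_is_redundant() before emitting code for a TEST IC.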

Some architectures provide versions of instructions which set condition codes as well as versions which do not. This obviously enables more optimizations, but it is more difficult to make use of this. One possibility is to search the list of ICs backwards starting from every suitable TEST or COMPARE instruction. If an IC is found which computes the tested object, the IC can be marked (extended ICs can be used for marking, see section Extended ICs).

14.8.4 Register Parameters

While passing of arguments to functions can be done by pushing them on the stack, it is often more efficient to pass them in registers if the architecture has enough registers.

To use register parameters you have to add the line

#define HAVE_REGPARMS 1

to ‘machine.h’ and define a

        struct reg_handle {...}

This struct is used by the compiler to find out which register should be used to pass an argument. ‘machine.c’ has to contain an initialized variable

        struct reg_handle empty_reg_handle;

which represents the default state, and a function

        int reg_parm(struct reg_handle *, struct Typ *, int vararg, struct Typ *);

which returns the number of the register the next argument will be passed in (or 0 if the argument is not passed in a register). Also, it has to update the reg_handle in a way that successive calls to reg_parm() yield the correct register for every argument.

vararg is different from zero, if the argument is part of the variable arguments of a function accepting a variable number of arguments.

It is also possible to return a negative number x. In this case, the argument will be passed in register number -x, but also a stack-slot will be reserved for this argument (i.e. a PUSH IC without an operand will be generated). If ‘-double-push’ is specified, the argument will also be written to the stack-slot (i.e. it will be passed twice, in a register and on the stack).
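A minimal sketch of these pieces for a hypothetical calling convention in which the first three fixed non-floating-point arguments are passed in registers 1 to 3. The type numbers, NQ, the simplified ISFLOAT and the reduced struct Typ are stand-ins so that the fragment compiles on its own; the member gregs is a made-up name, and the fourth parameter is not used in this sketch.

```c
/* Stand-ins; the real definitions come from supp.h. */
enum { INT = 3, DOUBLE = 7 };
#define NQ 15
#define ISFLOAT(t) (((t)&NQ) == DOUBLE)
struct Typ { int flags; struct Typ *next; };

/* Would be defined in machine.h: number of argument registers
   handed out so far. */
struct reg_handle { int gregs; };

/* Default state: no argument register used yet. */
struct reg_handle empty_reg_handle = { 0 };

int reg_parm(struct reg_handle *h, struct Typ *t, int vararg, struct Typ *d)
{
  (void)d;                     /* not needed by this sketch */
  if (vararg || ISFLOAT(t->flags))
    return 0;                  /* passed on the stack */
  if (h->gregs >= 3)
    return 0;                  /* argument registers exhausted */
  return ++h->gregs;           /* registers 1, 2, 3 in turn */
}
```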

14.8.5 Register Pairs

Often, there are types which cannot be stored in a single machine register, but it may be more efficient to store them in two registers rather than in memory. Typical examples are integers which are bigger than the register size or architectures which combine two floating point registers into one register of double precision.

To make use of register pairs, the line

#define HAVE_REGPAIRS 1

has to be added to ‘machine.h’. The register pairs are declared as normal registers (each register pair counts as a register of its own and MAXR has to be adjusted). Usually only adjacent registers are declared as register pairs. Note that regscratch must be identical for both registers of a pair.

Now the function

int reg_pair(int r,struct rpair *p);

must be implemented. If register r is a register pair, the function has to set p->r1 and p->r2 to the first and second register which comprise the pair and return 1. Otherwise, zero has to be returned.
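
A minimal sketch of such a function follows. The register layout is purely hypothetical (registers 1 to 8 are single registers, 9 to 12 are the pairs 1/2, 3/4, 5/6 and 7/8), and struct rpair is declared locally with int members only to make the example stand-alone.

```c
#include <assert.h>

/* Local stand-in for struct rpair (members r1 and r2, types assumed). */
struct rpair { int r1, r2; };

#define DEMO_FIRST_PAIR 9   /* assumed: registers 1..8 are single registers */
#define DEMO_LAST_PAIR  12  /* assumed: 9..12 are the pairs 1/2 ... 7/8 */

int reg_pair(int r, struct rpair *p)
{
  assert(p != 0);
  if (r < DEMO_FIRST_PAIR || r > DEMO_LAST_PAIR)
    return 0;                             /* not a register pair */
  p->r1 = 2 * (r - DEMO_FIRST_PAIR) + 1;  /* first register of the pair */
  p->r2 = p->r1 + 1;                      /* adjacent second register */
  return 1;
}
```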

14.8.6 Elimination of Frame-Pointer

Local variables on the stack can usually be addressed via a so-called frame-pointer which is set to the current stack-pointer at the entry of a function. However, in the code generated by vbcc, the difference between the stack-pointer and the frame-pointer is fixed at any instruction.

Therefore it is possible to keep track of this offset (by counting the bytes every time code for pushing or popping from the stack is generated). Using this offset, local variables can perhaps be addressed using the stack-pointer directly. The benefit would be smaller function entry/exit code as well as an additional free register which can be used for other purposes.

Note that only few debuggers can handle such a situation.
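
The bookkeeping described above could look roughly like the following sketch. It assumes a stack that grows downwards and locals that are addressed at non-negative offsets from the stack-pointer right after the frame has been allocated. All names (demo_pushed, demo_note_push and so on) are made up for the example.

```c
#include <assert.h>

/* Bytes currently pushed on top of the local-variable area (hypothetical). */
static long demo_pushed;

/* Call whenever code that pushes n bytes is generated. */
void demo_note_push(long n)
{
  assert(n >= 0);
  demo_pushed += n;
}

/* Call whenever code that pops n bytes is generated. */
void demo_note_pop(long n)
{
  assert(n >= 0 && n <= demo_pushed);
  demo_pushed -= n;
}

/* Offset of a local variable relative to the current stack-pointer.
   frame_offset is its offset from the stack-pointer as it was right
   after the stack frame was allocated. */
long demo_sp_offset(long frame_offset)
{
  return frame_offset + demo_pushed;
}
```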

14.8.7 Delayed popping of Stack-Slots

In most ABIs arguments which are pushed on the stack are not popped by the called function but the caller pops them by adjusting the stack after the callee returns (otherwise variable arguments would be hard to implement).

If several functions are called in sequence, it is not necessary to adjust the stack after each call but the arguments for several calls can be popped at once. It can be implemented by keeping track of the size to be popped and deferring popping to a point where it has to be done (e.g. a label or a branch). Also, in the case of nested calls, care has to be taken to pop arguments at the right time.

Note that this usually saves code-size and execution time but will increase stack-usage. Therefore, it may not be advisable for small systems.
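
One possible shape for the deferred popping is sketched below. The names are hypothetical, and the sketch deliberately ignores nested calls, which need extra care as noted above.

```c
#include <assert.h>

/* Bytes of stack arguments that have not been popped yet (hypothetical). */
static long demo_pending_pop;

/* After generating a call: remember the argument bytes instead of
   emitting a stack adjustment right away. */
void demo_after_call(long arg_bytes)
{
  assert(arg_bytes >= 0);
  demo_pending_pop += arg_bytes;
}

/* Before a label or branch: returns the number of bytes the backend has
   to pop with one combined stack adjustment (0 if nothing is pending)
   and resets the counter. */
long demo_flush_pops(void)
{
  long n = demo_pending_pop;
  demo_pending_pop = 0;
  return n;
}
```

With two consecutive calls that pushed 8 and 4 bytes, a single adjustment of 12 bytes would be emitted at the next flush point.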

14.8.8 Optimized Return

Return instructions are not explicitly represented in ICs. Rather, they are branches to a label which is the last IC in the list (except possible FREEREGs).

It is possible to generate working code by translating these branches normally, but directly inserting the function exit code instead of a branch is often faster. This is especially advisable if the exit code is small (e.g. no registers have to be restored and no stack-frame removed).

Another common possibility for optimization is a function call as the last IC. If return addresses are pushed on the stack and no function exit code is needed, it is usually possible to generate a jump-instruction, i.e. replace

    call  somefunc
    ret

with

    jmp   somefunc

14.8.9 Jump Tables

An important optimization is the creation of jump-tables for a series of comparisons with constants. Such series are usually created by a C switch construct, but vbcc can also recognize some of them if they are created through if-sequences.

‘supp.c’ provides the function calc_case_table(<IC>,<density>) to check for constructs that can be replaced by a jump table. The arguments are the start IC to look for (it has to be a COMPARE-IC with a constant as q2) and a minimal density. The density reflects the number of cases that are used divided by the range of cases. If the density is high, vbcc will use jump-tables only for sequences that have few unused cases inside. If the case tables occupy multiple ranges, vbcc is able to split them up and create multiple jump-tables.

calc_case_table returns a pointer to a struct case_table with the following content:

num: The number of cases.
typf: The type of the case IDs.
next_ic: The first IC after the list of ICs that can be replaced by the jump-table.
density: The case density.
vals: The values of the case IDs (array containing num entries).
labels: The labels of the code corresponding to the case IDs (array containing num entries).
min: The lowest case ID.
max: The highest case ID.
diff: max-min.

If the backend decides to emit a jump-table, it has to generate code that will check that the control expression lies between min and max. If not, the jump-table must not be executed. Code for the computed jump must then be generated. The actual table can be emitted using emit_jump_table(). Processing can then continue with next_ic.
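
The density and the range check can be illustrated with plain C. This is only a sketch: taking the range as max-min+1 is an assumption, and the function names are invented for the example.

```c
#include <assert.h>

/* Density as described above: used cases divided by the range of cases.
   The range is taken as max-min+1 here, which is an assumption. */
double demo_density(int num, long min, long max)
{
  assert(num > 0 && max >= min);
  return (double)num / (double)(max - min + 1);
}

/* Index into the table for a control value, or -1 if the value lies
   outside [min,max] and the jump-table must not be executed. */
long demo_table_index(long value, long min, long max)
{
  if (value < min || value > max)
    return -1;
  return value - min;
}
```

For the cases 10, 11, 12 and 13 the density is 1, while two cases spread over the range 0 to 3 give a density of 0.5.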

14.8.10 Context-sensitive Register-Allocation

The regok() function is only a simple means of telling the register allocator which registers to use. It works fairly well with orthogonal register and instruction sets. However, it does not really care about the operations performed and it allocates variables to registers only according to their type, not according to the operations performed.

Some architectures provide different kinds of registers which are able to store a type, but not all of them are able to perform all operations or some operations are more expensive with some registers. To do good register allocation for these systems, the operations which are used on variables have to be considered.

If the backend wants to support this kind of register allocation, it has to define HAVE_TARGET_RALLOC and provide the following functions or macros:

int cost_move_reg(int x,int y);

The cost of copying register x to register y.

int cost_load_reg(int r,struct Var *v);

The cost of loading register r from variable v.

int cost_save_reg(int r,struct Var *v);

The cost of storing register r into variable v.

int cost_pushpop_reg(int r);

The cost of storing register r during function prologue and restoring it in the epilogue.

int cost_savings(struct IC *p,int r,struct obj *o);

Estimate the savings which would be obtained if the object o in IC p were assigned to register r (in this IC). If the backend would not be able to emit code in this case, INT_MIN must be returned.

If (o->flags & VKONST) != 0, the register allocator is thinking about putting a constant (or address of a static variable) in a register. In this case, the real object which would be put in a register is found in o->v->cobj.

The unit of the costs can be chosen by the backend, but the values should be reasonably small.

If regok() is called with a third parameter of -1, it is possible to return non-zero for a register which cannot perform all operations. The register allocator will call cost_savings() and returning INT_MIN can be used to prevent this register from being allocated, if the register is not suitable for a certain operation.
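
The idea of vetoing a register through INT_MIN can be sketched as follows. The operation codes, register classes and cost values are all hypothetical, and the function takes plain integers rather than the struct IC * and struct obj * arguments of the real cost_savings().

```c
#include <assert.h>
#include <limits.h>

enum { DEMO_ADD = 1, DEMO_MULT };            /* made-up operation codes */
enum { DEMO_DATA_REG = 1, DEMO_ADDR_REG };   /* made-up register classes */

/* Savings if an operand of operation 'code' lives in a register of
   class 'rc'; INT_MIN means no code could be emitted for this choice. */
int demo_cost_savings(int code, int rc)
{
  assert(rc == DEMO_DATA_REG || rc == DEMO_ADDR_REG);
  if (rc == DEMO_DATA_REG)
    return 2;          /* assumed: data registers support every operation */
  if (code == DEMO_MULT)
    return INT_MIN;    /* assumed: no multiply on address registers */
  return 1;            /* usable, but less attractive than a data register */
}
```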

14.8.11 Inter-procedural Register-Allocation

To support inter-procedural register allocation, the backend must tell the optimizer which registers are used by a function. As the backend might use some registers internally, the frontend can not know this.

Apart from defining HAVE_REGS_MODIFIED in ‘machine.h’, the backend has to mark all registers that are modified in the bitfield regs_modified. A register can be marked with BSET(regs_modified,<reg>). For a call IC, the function calc_regs() (from ‘supp.h’) can be called to mark the registers used by a call IC. It will return 1 if it was able to determine all registers used by this IC.

If the register usage could be determined for the entire function, the backend can set the bit ALL_REGS in the fi-member of the function variable (v->fi->flags|=ALL_REGS;).

14.8.12 Conditional Instructions

FIXME: To be written.

14.8.13 Extended ICs

If the backend defines HAVE_EXT_IC, it has to define a struct ext_ic in ‘machine.h’. This structure will be added to each IC and can be used by the backend for private use.

14.8.14 Peephole Optimizations on Assembly Output

Some optimizations are easier to do on the generated assembly code rather than doing them before emitting code. Therefore it is possible to do peephole optimizations on the emitted code before it is really written to a file.

EMIT_BUF_DEPTH lines will be stored in a ring buffer and are available for examination and modification by a function emit_peephole(). The actual assembly output is stored in emit_buffer, the index of the first line to be output in emit_f and the index of the last one in emit_l (note that you have to calculate modulo EMIT_BUF_DEPTH - it is a ring buffer).

The output may be modified in memory and the first line may be removed using remove_asm(). If a modification took place, a non-zero value has to be returned (0 otherwise). The following example code would combine two consecutive additions to the same register:

int emit_peephole(void)
{
  int entries,i,r1,r2;
  long x,y;
  /* pointer to the lines in order of output */
  char *asmline[EMIT_BUF_DEPTH];
  i=emit_l;
  /* compute number of entries in ring buffer */
  if(emit_f==0)
    entries=i-emit_f+1;
  else
    entries=EMIT_BUF_DEPTH;
  /* the first line */
  asmline[0]=emit_buffer[i];
  if(entries>=2){
    /* we have at least two lines in the buffer */
    /* calculate the next line (modulo EMIT_BUF_DEPTH) */
    i--;
    if(i<0) i=EMIT_BUF_DEPTH-1;
    asmline[1]=emit_buffer[i];
    if(sscanf(asmline[0],"\tadd\tR%d,#%ld",&r1,&x)==2&&
       sscanf(asmline[1],"\tadd\tR%d,#%ld",&r2,&y)==2&&
       r1==r2){
      sprintf(asmline[1],"\tadd\tR%d,#%ld\n",r1,x+y);
      remove_asm();
      return 1;
    }
  }
  return 0;
}

Be very careful when doing such optimizations. Only perform optimizations which are really legal. Assembly code in particular often has side effects such as the setting of flags.

Depending on command line flags inline assembly code may or may not be passed through this peephole optimizer. By default, it will be done, enabling optimizations between generated code and inline assembly.

14.8.15 Marking of efficient ICs

If the backend sets HAVE_EFF_ICS in ‘machine.h’, it has to provide a function void mark_eff_ics(void). This function will be called (possibly multiple times) by the frontend. The function has to set or clear the bit EFF_IC in the member flags of every IC.

The flag should be set when the operation is in a context that suggests it will translate to efficient machine code. The optimizer will transform this IC less aggressively.

As this all happens before register allocation, the decision is of a very heuristic nature.

14.8.16 Function entry/exit Code

At entry and exit of a function, there is usually some code to set up the new environment for this function. For example, registers will have to be saved/restored, a frame pointer may be set up and a stack frame will be created. It is generally worthwhile to optimize this entry/exit code. For example, if no registers need to be saved and no local variables are used on the stack, it may not be necessary to create a stack frame.

The exact possibilities for optimization depend on the architecture and the ABI.

14.8.17 Multiplication/division with Constants

Many architectures do not provide instructions for multiplication, division or modulo calculation. And on most architectures providing such instructions they are rather slow. Therefore, it is recommended to emit cheaper instructions, if possible.

Usually, this can only be done if one operand of the operation is a constant. Multiplications may be replaced by a series of shift and add instructions, for example. The simplest and most important cases to replace are multiplication, division and modulo with a power of two. Multiplication by x can be replaced by a left shift of log2(x), unsigned division by x can be replaced by a logical right shift of log2(x) and unsigned modulo by x can be replaced by anding with x-1.

Note that signed division and modulo can usually not be replaced that simply because most division instructions give different results for some negative values. An additional adjustment would be necessary to get correct results. Whether this is still an improvement depends on the architecture details.

The following function can be used to test whether a value is a power of two:

static long pof2(zumax x)
/*  Yields log2(x)+1 or 0. */
{
  zumax p;int ln=1;
  p=ul2zum(1L);
  while(ln<=32&&zumleq(p,x)){
    if(zumeqto(x,p)) return ln;
    ln++;p=zumadd(p,p);
  }
  return 0;
}
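
The unsigned replacements described above can be tried out with a stand-alone analogue that uses plain unsigned long instead of zumax and the target-arithmetic macros. The demo_ names are invented for this sketch, and a backend would of course emit shift and and instructions rather than compute the results itself.

```c
#include <assert.h>

/* Plain-C analogue of pof2(): log2(x)+1 if x is a power of two, else 0. */
int demo_pof2(unsigned long x)
{
  int ln = 1;
  unsigned long p = 1;
  while (ln <= 32 && p <= x) {
    if (p == x) return ln;
    ln++; p += p;
  }
  return 0;
}

/* x*c as a left shift; c must be a power of two. */
unsigned long demo_mul(unsigned long x, unsigned long c)
{
  int n = demo_pof2(c);
  assert(n != 0);
  return x << (n - 1);
}

/* Unsigned x/c as a logical right shift; c must be a power of two. */
unsigned long demo_div(unsigned long x, unsigned long c)
{
  int n = demo_pof2(c);
  assert(n != 0);
  return x >> (n - 1);
}

/* Unsigned x%c as an and with c-1; c must be a power of two. */
unsigned long demo_mod(unsigned long x, unsigned long c)
{
  assert(demo_pof2(c) != 0);
  return x & (c - 1);
}
```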

14.8.18 Block copying

There are many cases of copying of larger data. For the backend, those will mostly be used in PUSH and ASSIGN ICs. It is very important to implement those as efficiently as possible.

Some things to consider:

- When alignment is known, use word-copy instead of byte-copy.
- Copy small blocks by a series of copy instructions.
- For larger blocks, loading addresses in registers may help.
- For large blocks, use a loop. Implement it efficiently and try to unroll the loop a few times.
- For very large blocks, calling a library function may be useful. While this creates some overhead, the function can dynamically check the alignment or perhaps even use special hardware, if available.
- Set INLINEMEMCOPY to reasonable values. Set it to a very high value if you implement very good block copying.
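
The considerations above could be turned into a decision function along the following lines. This is only a sketch: the thresholds, the maximum unit of 4 bytes and all names are assumptions, and sensible values depend heavily on the target.

```c
#include <assert.h>

enum demo_copy { DEMO_COPY_SERIES, DEMO_COPY_LOOP, DEMO_COPY_LIBCALL };

#define DEMO_MAX_SERIES 4    /* assumed: at most 4 single copy instructions */
#define DEMO_MAX_LOOP   256  /* assumed: larger blocks go to a library call */

/* Largest unit (in bytes) usable for the copy, given size and alignment. */
int demo_copy_unit(unsigned long size, unsigned long align)
{
  assert(align > 0);
  if (align % 4 == 0 && size % 4 == 0) return 4;
  if (align % 2 == 0 && size % 2 == 0) return 2;
  return 1;
}

/* Pick a strategy: a series of copies, an inline loop or a library call. */
enum demo_copy demo_copy_strategy(unsigned long size, unsigned long align)
{
  unsigned long units = size / demo_copy_unit(size, align);
  if (units <= DEMO_MAX_SERIES) return DEMO_COPY_SERIES;
  if (size <= DEMO_MAX_LOOP) return DEMO_COPY_LOOP;
  return DEMO_COPY_LIBCALL;
}
```

A 16-byte block with 4-byte alignment needs only four word copies and is handled as a series, whereas the same block with unknown alignment falls back to a loop.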

14.8.19 Optimized Library Functions

FIXME: To be written.

14.8.20 Instruction Scheduler

FIXME: To be written.

14.9 Hints for common Extensions

This section lists some common extensions to the C language which are often very helpful when using a compiler in practice. Depending on the kind of target system they may range from nobody-really-cares to absolutely essential. For example, consider the ability to specify the section within an object file a variable or function should be placed in. This is rarely of any interest when targeting a Unix-like operating system. On a stand-alone embedded system, however, it may be absolutely necessary.

Therefore, consider this list as a recommendation of ideas that might be helpful.

14.9.1 Inline Assembly

The possibility to insert assembly code into C source is very handy in many cases. It can be used in headers to implement specially optimized versions of time-critical library routines or enable access to CPU features which are not otherwise accessible by normal C constructs.

In general, almost all work is done by the frontend and only a few lines have to be inserted in the backend to make it work. Therefore, it is recommended to always support this important feature.

All that has to be done is to check a certain condition when code for a CALL IC is generated. Instead of emitting a normal call instruction, call the emit_inline_asm() function:

if((p->q1.flags & (VAR|DREFOBJ)) == VAR &&
   p->q1.v->fi &&
   p->q1.v->fi->inline_asm){
  emit_inline_asm(f,p->q1.v->fi->inline_asm);
}else{
  emit(f,"\tcall\t");
  emit_obj(f,&p->q1,t);
  emit(f,"\n");
}

Note that argument-passing, adjusting the stack after a CALL IC etc. is not affected. Only the actual emitting of call code is changed in the case of inline assembly.

14.9.2 -speed/-size

Often it is desired to generate code which runs as fast as possible, but sometimes small code is needed. The command line options ‘-speed’ and ‘-size’ are provided for the user to specify which of the two is preferred.

These options already may change the intermediate code produced by the frontend, but the backend should also respect these switches, if possible. The variables optspeed and optsize can be queried to see if these options were specified.

If e.g. optspeed was specified, the backend should choose faster code-sequences, even if code-size is increased significantly. Vice-versa, if optsize is specified, it should always choose the shorter code if there is a trade-off between size and speed.

Typical cases for such trade-offs are block-copy ICs (ASSIGN and PUSH). Often it is possible to call a library function or generate a simple short loop for small code, but an unrolled inlined loop for fast code.

14.9.3 Target-specific Macros

A backend is able to provide macro definitions which are automatically active. It is recommended to define macros which allow applications to query the target architecture and the selected chip (if possible). Also, it is recommended to provide internal macros for backend specific attributes using the __attr() and __vattr() attributes.

The definition of these macros can be done in init_cg() (the results of command line parsing are available at this point). There is a variable

char **target_macros;

which can be set to an array of pointers to strings which contain the macro definitions. The array has to be terminated by a null pointer and the syntax of the macro definitions is similar to the command line option ‘-D’:

static char *marray[] = {
  "__TARGET_ARCH__",
  "__section(x)=__vattr(\"section(\"#x\")\")",
  0
};
...
target_macros = marray;

14.9.4 stdarg.h

FIXME: To be written.

14.9.5 Section Specifiers

Especially for embedded systems it can be very important to be able to place variables and functions in specific sections to override default placement. This can relatively easily be done using variable attributes (see section Target-specific Attributes).

14.9.6 Target-specific Attributes

There are two ways of adding target-specific attributes to variables and functions. A general way is the use of __vattr() which will add the string argument to the vattr member of the corresponding struct Var, separating it by a semi-colon. The backend can use this information by parsing the string. The frontend will just build the string, it will not interpret it. If a backend offers attributes using the __vattr() mechanism, it is recommended to provide target-specific macros (see section Target-specific Macros) which expand to the appropriate __vattr()-syntax. Only these macros should be documented.

A second way to specify attributes is enabled by adding

#define HAVE_TARGET_PRAGMAS 1

to ‘machine.h’ and adding an array

char *g_attr_name[];

to ‘machine.c’. This array should point to the strings used for the attributes, terminated by a null-pointer, e.g.:

char *g_attr_name[] = {
  "__far",
  "__near",
  "__interrupt",
  0
};

These attributes can be queried in the member

unsigned long tattr;

of a struct Var. The first attribute is represented by bit 1, the second by bit 2 and so on. Using this mechanism, the frontend will check for redeclarations with different setting of attributes or multiple specification of attributes. However, only boolean attributes are possible. If parameters have to be specified, the __vattr()-mechanism has to be used.

14.9.7 Target-specific #pragmas

FIXME: To be written.

14.9.8 Target-specific extended Types

FIXME: To be written.

14.9.9 Target-specific printval

FIXME: To be written.

14.9.10 Debug Information

Debug information which enables (source level) debugging of compiled programs is an important feature to improve the user-friendliness of a compiler. Depending on the object format and debugger used, the format and capabilities of debug information can vary widely. Therefore, it is the responsibility of each backend to generate debug information. However, for common debug standards there will be modules which can be used by the backends and will do most of the work. Currently there is one such module for the DWARF2 debug standard.

The compiler frontend provides a variable debug_info which can be queried to test whether debug information is desired. Also, the functions init_db() and cleanup_db() are helpful.

Each struct Var contains the members char *dfilename and int dline which specify the file and line number of the variable’s definition. Also, every IC contains the members char *file and int line with the file name and line number this IC belongs to. Note however, that there may be ICs with file == 0 - not all ICs can be assigned a certain code location. Also, ICs do not always have increasing line numbers and line numbers may repeat. Not all debuggers may be able to deal with this.

14.9.10.1 DWARF2

There is support for the DWARF2 debug standard which can be added to a backend rather easily. The following additions are necessary:

Add the line

#include "dwarf2.c"

to ‘machine.c’.

Add the following lines to init_db():

dwarf2_setup(sizetab[POINTER],
             ".byte",
             ".2byte",
             ".4byte",
             ".4byte",
             labprefix,
             idprefix,
             ".section");
dwarf2_print_comp_unit_header(f);

The arguments to dwarf2_setup() have the following meanings:

The size of an address on the target.
An assembler directive to create one byte of initialized storage.
An assembler directive to create two bytes of initialized storage (without any padding for alignment).
An assembler directive to create four bytes of initialized storage (without any padding for alignment).
An assembler directive to create initialized storage representing a target address (without any padding for alignment).
A prefix which is used for emitting numbered labels (or empty string).
A prefix which is used for emitting external identifiers (or empty string).
An assembler directive to switch to a new named section.

Add the line

dwarf2_cleanup(f);

to cleanup_db().

Write the function

static int dwarf2_regnumber(int r);

which returns the DWARF2 regnumber for a vbcc register number.

Write the function

static zmax dwarf2_fboffset(struct Var *v);

which returns the offset of variable v from the DWARF2 frame-pointer.

Write the function

static void dwarf2_print_frame_location(FILE *f,struct Var *v);

which prints a DWARF2 location of the frame pointer. It can use the function

void dwarf2_print_location(FILE *f,struct obj *o);

to output the location. For example, if the frame pointer is a simple register, it might look like this:

static void dwarf2_print_frame_location(FILE *f,struct Var *v)
{
  struct obj o;
  o.flags=REG;
  o.reg=frame_pointer_register;
  o.val.vmax=l2zm(0L);
  o.v=0;
  dwarf2_print_location(f,&o);
}

Before emitting code for an IC p, execute the code

if(debug_info) dwarf2_line_info(f,p);

After emitting code for a function v, a new numbered label has to be emitted after the function code and the function

void dwarf2_function(FILE *f,struct Var *v,int endlabel);

must be called.

Note that the DWARF2 standard supports use of location lists which can be used to describe a variable whose location changes during the program (e.g. in a register for some time, then in memory and again in a register) as well as a moving frame pointer (very useful if no separate frame pointer is used but all local variables are accessed through a moving stack pointer). Unfortunately, none of the debuggers I have tried so far could handle these location lists. Therefore, the current DWARF2 module does not output location lists, but future versions will probably offer them as an option.

Without location lists, accessing local variables will only work with a fixed frame pointer and no register variables. Even with these restrictions, function parameters which are passed in registers will not be correctly displayed during the function entry code.

14.9.11 Interrupt Handlers

Especially for embedded systems, support for writing interrupt handlers in C is a common feature. Variable attributes (see section Target-specific Attributes) can be used to mark functions which are used as interrupt handlers.

Typical changes which might be necessary for interrupt handlers are:

- Using a different return instruction.
- Saving all modified registers, including scratch-registers.
- Creating an entry in the interrupt vector table.

14.9.12 Stack checking

Dynamic checking of the stack used (or possibly extending the stack size if possible) is another useful feature. If the variable stack_check is set, stack-checking code should be emitted, if possible. Every function should call a library function (usually called __stack_check) and pass the maximum size of stack used in this function as argument. This obviously has to be done before allocating the stack-frame.

The library function is responsible for taking its own stack-frame into account.

14.9.13 Profiling

FIXME: To be written.

14.9.14 Variable-length Arrays

With the -c99 option, vbcc supports variable-length arrays that are allocated on the stack. The backend has to take several steps to support this:

vlas: When this variable is non-zero, the current function uses variable-length arrays. The backend may take necessary steps to support this. For example, if local variables are usually addressed via stackpointer, switching to a separate framepointer may be necessary.
ALLOCVLA_INLINEASM: This define must contain inline code that is called when a vla is allocated. It has to create additional room on the stack and return a pointer to the beginning of the new space.
ALLOCVLA_REG: The register in which to pass the size to be allocated on the stack. 0 will pass on the stack.
FREEVLA_INLINEASM: This define must contain inline code that is called when a vla is freed. It has to restore the old stack pointer.
FREEVLA_REG: The register in which to pass the old stack pointer. 0 will pass on the stack.
OLDSPVLA_INLINEASM: This define must contain inline code that is called before the first vla is allocated. It has to return the current stack pointer before any vla has been allocated.
FPVLA_REG: An additional register used in functions containing vlas. The backend can specify a register (usually framepointer) that can not be used in functions with vlas. Therefore, it is possible to use this register in other functions (for example, if local variables are usually addressed directly through the stackpointer).

14.9.15 Library Calls

Sometimes operations may be very complicated to generate code for (e.g. floating-point operations for machines without FPU, multiplication/division on some architectures or big data types like long long). Those are usually implemented by calling library functions.

vbcc can be told to generate calls to library functions for certain ICs. When defining HAVE_LIBCALLS, the backend must provide the function char *use_libcall(<code>,<typf>,<typf2>). This function gets called with the elements code, typf and typf2 of an IC. If use_libcall returns a name, this library function will be called instead of the IC. Otherwise, 0 must be returned.

All library functions have to be declared in init_cg() with declare_builtin(), supporting the following arguments:

name: The name of the library function. Usually, a reserved identifier should be chosen (e.g. starting with __).
ztyp: The return type of the function (only integral and float types are supported).
q1typ: The type of the first parameter (only integral and float types are supported).
q1reg: The register to pass the first argument in (0 passes via stack).
q2typ: The type of the second parameter (only integral and float types are supported). For functions with a single parameter, use 0.
q2reg: The register to pass the second argument in (0 passes via stack).
nosidefx: If this is non-zero, the function will be declared to have no side-effects and allow some more optimizations.
inline_asm: Inline assembly can be specified for the function.

Note that not all ICs can be converted to library calls.
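
A use_libcall()-style function might be structured like the following sketch. The operation and type codes are local stand-ins for the real IC codes and type constants, the library function names are invented, and the function is called demo_use_libcall to make clear that it is not the actual interface function.

```c
#include <assert.h>

enum { DEMO_ADD = 1, DEMO_MULT, DEMO_DIV };  /* stand-ins for IC codes */
enum { DEMO_INT = 1, DEMO_LLONG };           /* stand-ins for type codes */

/* Returns the name of a library function to call instead of the IC,
   or 0 if the backend generates code for the IC itself. */
char *demo_use_libcall(int code, int typf, int typf2)
{
  assert(code != 0);
  (void)typf2;              /* the second type is not needed in this sketch */
  if (typf != DEMO_LLONG)
    return 0;               /* assumed: only long long needs library help */
  switch (code) {
  case DEMO_MULT: return "__demo_mulint64";  /* hypothetical name */
  case DEMO_DIV:  return "__demo_divint64";  /* hypothetical name */
  }
  return 0;                 /* e.g. addition is assumed to be done inline */
}
```

Every name returned this way would also have to be declared with declare_builtin() in init_cg(), as described above.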

14.10 Changes from 0.7 Interface

The backend interface has changed in several ways since vbcc 0.7. The following list mentions most (all?) differences between the old and new interfaces (not including new optional features which do not have to be used):

- There are more types (LLONG, LDOUBLE, MAXINT). Therefore the align[] and sizetab[] arrays have dimension MAX_TYPES+1 rather than 16.
- The representation and access of t_min[] and t_max[] has been changed.
- zmax replaces zlong as largest integer target type. zlong is only used when actually referring to a long on the target. Also, the macros for target arithmetic are available for zmax/zumax instead of zlong/zulong.
- PUSH ICs contain a second size (actual stack-adjustment).
- The second argument of SHIFT ICs has its own type.
- DREFOBJ objects contain the type of the dereferenced pointer (only meaningful if there are different pointer types).
- The new CONVERT IC replaces the series of old ICs (CONVINT etc.).
- emit()-functions have to be used to generate output rather than fprintf().
- The functions init_db() and cleanup_db() have to be provided (they do not have to do anything).
- A new array reg_prio[] is needed and controls the order in which registers are allocated.
- The parameters of must_convert() have changed.
- Static functions must not use identifiers, but have to use numbered labels.

Volker Barthelmann vb@compilers.de


This document was generated by vb on January 3, 2015 using texi2html 1.82.

14. Backend Interface

14.1 Introduction

14.2 Building vbcc

14.2.1 Directory Structure

14.2.2 Adapting the Makefile

14.2.3 Building vc

14.2.4 Building vsc

14.2.5 Building vbcc

14.2.6 Configuring

14.2.7 Building Cross-Compilers

14.3 The Intermediate Code

14.3.1 General Format

14.3.2 Operands

14.3.3 Variables

14.3.4 Composite Types

14.3.5 Operations

14.4 Type System

14.4.1 Target Data Types

14.4.2 Target Arithmetic

14.5 ‘machine.h’

14.6 ‘machine.c’

14.6.1 Name and Copyright

14.6.2 Command Line Options

14.6.3 Data Types

14.6.4 Register Set

14.6.5 Functions

14.7 Available Support Functions, Macros and Variables

14.8 Hints for common Optimizations

14.8.1 Instruction Combining

14.8.2 Adressing Modes

14.8.3 Implicit setting of Condition Codes

14.8.4 Register Parameters

14.8.5 Register Pairs

14.8.6 Elimination of Frame-Pointer

14.8.7 Delayed popping of Stack-Slots

14.8.8 Optimized Return

14.8.9 Jump Tables

14.8.10 Context-sensitive Register-Allocation

14.8.11 Inter-procedural Register-Allocation

14.8.12 Conditional Instructions

14.8.13 Extended ICs

14.8.14 Peephole Optimizations on Assembly Output

14.8.15 Marking of efficient ICs

14.8.16 Function entry/exit Code

14.8.17 Multiplication/division with Constants

14.8.18 Block copying

14.8.19 Optimized Library Functions

14.8.20 Instruction Scheduler

14.9 Hints for common Extensions

14.9.1 Inline Assembly

14.9.2 -speed/-size

14.9.3 Target-specific Macros

14.9.4 stdarg.h

14.9.5 Section Specifiers

14.9.6 Target-specific Attributes

14.9.7 Target-specific #pragmas

14.9.8 Target-specific extended Types

14.9.9 Target-specific printval

14.9.10 Debug Information

14.9.10.1 DWARF2

14.9.11 Interrupt Handlers

14.9.12 Stack checking

14.9.13 Profiling

14.9.14 Variable-length Arrays

14.9.15 Library Calls

14.10 Changes from 0.7 Interface
