JLOC.EXE version 0.6 by John S. Fine johnfine@erols.com

This program is a linker/locator for use when you need more control over the placement of sections within the image than a standard linker provides.

Use this program entirely at your own risk. I make no claim that this program is suitable or safe for any purpose.

ZERO-PRICE SHAREWARE:

If you continue to use this program after a reasonable initial test, you must register it. You do not need to send any money. Registration consists entirely of sending an EMAIL to johnfine@erols.com to tell me that you are using it, and where you got your copy. Bug reports and suggestions are also welcome, though I can't promise to do anything with them.

Usage

JLOC control_file output_file map_file sym_file

The map and symbol files are optional.

DPMI

JLOC was built using DJGPP. You don't need DJGPP to run it, but you do need DPMI. DOS sessions in Windows or OS/2 will have DPMI available. If you are running true DOS, you might not have DPMI. In that case, download a copy of csdpmi4b.zip and put CWSDPMI.EXE somewhere in your path. If DPMI is not active when you start JLOC, the stub at the beginning of JLOC will run CWSDPMI to provide DPMI services for JLOC. CWSDPMI exits when JLOC exits.

Control Files

Concepts

JLOC requires a control file to tell it how to build your program.

There are several definitions required to understand the structure of a control file:

Control file syntax

Scope line syntax

Namespaces and Sections must have names which are different from any other Namespaces and Sections. They may be the same as the names of files, groups, classes, segments or symbols.

A scope line introduces (or reintroduces) a Namespace or a Section. It has the name of the Namespace or Section followed by ":". The ":" must be followed by a blank, tab, or newline.

When a previously defined scope is reintroduced, there should be no additional information on the line.

When a scope is introduced for the first time, additional information is required on section lines and optional on namespace lines.

Namespace line syntax

This line gives the name of a Namespace and indicates that following lines (up to the next scope line) will be filename or symbol lines.

An optional "^" on the namespace line specifies that all symbols within the namespace will be converted to upper case. Remember to put a space or tab after the ":" and before the "^".

Section line syntax

section:  base  start  i_start  selector

The start is optional. If omitted, JLOC will compute a default start.

The i_start is optional. If omitted, JLOC will compute a default i_start.

The selector is optional. If omitted JLOC will use base/16 as the selector.

Commas are optional between the parameters of a Section line; however, if you omit a parameter and do not omit the parameters that follow it, the commas are required to make clear which parameters are present.

To exclude a section from the output file and from the map, use "-1" in place of i_start. Definitions within the section are still available for resolving FIXUPs in other sections. "-1" is not needed to exclude the usual BSS segment because uninitialized space which would fall at the end of the output file is automatically excluded from the output file (but not from the map). Segments of debug information (such as DEBSYM and DEBTYP produced by MASM) get no special handling by JLOC. If you wish to exclude them from the image, you must define a section for them and explicitly exclude it from the image.

An i_start value of -2 is used for a section that contains patches to be applied in the last stage of linking.

Following the section line, you should have one or more chunk selection lines to specify which chunks belong to that section.

Computing a default value for start

When the start value is omitted from a section line, JLOC will compute a default value. If there is any previous section line in the control file which is not a patch section, then JLOC will add the start and length of the section described by the last such line to compute the default for the current line. If there is no such line preceding the current line, then JLOC will use zero as the default start.

If the start expression begins with a #, then the default is computed and then aligned as specified.
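The rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not JLOC's actual code. The function name, the tuple layout, and the way the alignment is passed in are all assumptions made for the example.

```python
PATCH = -2  # i_start value that marks a patch section

def default_start(previous, alignment=None):
    """Sketch of the default-start rule (hypothetical helper, not JLOC source).

    previous  -- list of (start, length, i_start) tuples for the section
                 lines that come earlier in the control file, in order
    alignment -- the value after a leading "#" in the start expression,
                 or None if the start was simply omitted
    """
    # Patch sections are skipped; sections excluded with -1 still count.
    usable = [s for s in previous if s[2] != PATCH]
    start = usable[-1][0] + usable[-1][1] if usable else 0
    if alignment is not None:
        # Same formula as the "#" operator: round up to a multiple.
        start = (start + alignment - 1) & -alignment
    return start

# Example: previous section at 0F0000 with length 1234 (hex).
print(hex(default_start([(0xF0000, 0x1234, 0)])))        # 0xf1234
print(hex(default_start([(0xF0000, 0x1234, 0)], 0x10)))  # 0xf1240
```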

Computing a default value for i_start

When the i_start value is omitted from a section line, JLOC will compute a default value. If there is any previous section line in the control file which is not a patch section and is also not omitted from the image by an i_start value of -1, then JLOC will compute the difference between the i_start and start of the section described by the last such line and add that difference to the start value of the current line to compute the default for the current line. This mimics the behavior of an ordinary linker, in which the difference between i_start and start is the same for all sections. If there is no such line preceding the current line, then JLOC will use zero as the difference (it will set i_start equal to start).
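As a rough Python sketch of this rule (an illustration only; the function name and the tuple layout are assumptions, not part of JLOC):

```python
EXCLUDED = -1  # i_start value that omits a section from the image
PATCH = -2     # i_start value that marks a patch section

def default_i_start(start, previous):
    """Sketch of the default-i_start rule (hypothetical helper).

    start    -- start value of the current section
    previous -- list of (start, i_start) tuples for earlier section
                lines in the control file, in order
    """
    usable = [(s, i) for (s, i) in previous if i not in (EXCLUDED, PATCH)]
    # Carry forward the (i_start - start) difference of the last usable line.
    difference = usable[-1][1] - usable[-1][0] if usable else 0
    return start + difference

# A section starting at 0F2000 after one that maps 0F0000 to image offset 0:
print(hex(default_i_start(0xF2000, [(0xF0000, 0)])))  # 0x2000
```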

Filename line syntax

Filenames appear within the scope of a namespace. One filename per line, no wild-cards, no spaces within a name. Spaces or tabs are permitted before or after the (full path) name.

Symbol definition syntax

symbol = expression

A symbol definition occurs within the scope of a namespace. The symbol may be used by FIXUPs in any file within the namespace.

Symbols (both those defined in the control file, and public symbols defined in the OBJ and LIB files) may be used in any expression in the control file. When used in an expression the symbol name must always be prefixed by the namespace name (namespace.symbol), even within the scope of the namespace.

When defining a symbol, DO NOT prefix it with the name of the namespace.

Chunk selection syntax

namespace, group, class, segment, filename

Chunk selection lines occur within the scope of a section. They specify which chunks belong to that section. Each chunk belongs to exactly one section. If more than one chunk selection line fits a chunk, that chunk is selected by only the first selection line which fits.

Each of the items on a chunk selection line is optional. Any item which is either omitted or replaced by "*" will match any chunk. A chunk is selected if every item that is specified matches.

The commas are also optional; however, each set of spaces and tabs is reduced to just one space, so you need either an "*" or a "," to mark the location of any omitted item that is followed by specified item(s).

The filename (if specified) must exactly match the name as specified on a filename line in the namespace scope.
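The matching and first-fit rules can be sketched in Python as follows. This is a sketch under assumptions, not JLOC's implementation: selection lines are shown already split into items, chunks are plain dictionaries, and names are compared as exact strings. The rules used are those of the BIOS example later in this document.

```python
FIELDS = ("namespace", "group", "class", "segment", "filename")

def matches(selection, chunk):
    """True if every specified item of a selection line matches the chunk.

    selection -- tuple of up to five items; None or "*" matches anything,
                 and missing trailing items are treated as omitted
    chunk     -- dict with one entry per name in FIELDS
    """
    padded = tuple(selection) + (None,) * (len(FIELDS) - len(selection))
    return all(want in (None, "*") or chunk[field] == want
               for field, want in zip(FIELDS, padded))

def assign(chunk, rules):
    """Return the section of the first selection line that fits, else None."""
    for section, selection in rules:
        if matches(selection, chunk):
            return section
    return None

# Selection lines from the BIOS example, in control-file order.
rules = [
    ("DATA",  ("*", "DGROUP")),
    ("DATA",  (None, None, None, "FONT")),
    ("FINAL", ("*", "*", "*", "*", "last.obj")),
    ("MAIN",  ("*",)),
]
chunk = {"namespace": "ALL", "group": "DGROUP", "class": "DATA",
         "segment": "_DATA", "filename": "last.obj"}
print(assign(chunk, rules))  # DATA, because the first fitting line wins
```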

Expressions

Spaces are not permitted within expressions. An expression consists of hex numbers, decimal numbers, symbols, and section values combined using operators and grouped with parentheses.

Numbers

All numbers must start with a digit (add a leading zero if necessary). Decimal numbers must end with ".". All other numbers are hex.
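A small Python sketch of these number rules may help; the function name is made up for the example and this is not JLOC's parser.

```python
def parse_number(text):
    """Parse a number the way the rules above describe (illustrative only).

    A trailing "." marks a decimal number; anything else is hex.
    The first character must be a digit, so hex values such as F000
    have to be written with a leading zero (0F000).
    """
    if not text or text[0] not in "0123456789":
        raise ValueError("a number must start with a digit: %r" % text)
    if text.endswith("."):
        return int(text[:-1], 10)
    return int(text, 16)

print(parse_number("100"))   # 256, because plain numbers are hex
print(parse_number("100."))  # 100, because of the trailing "."
```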

Section values

Section values must be prefixed with the section name (section.value). The defined values are:
  base      Linear address, relative to which all its offsets are computed.
  start     Starting linear address.
  image     Starting address within image file.
  length    Length.
  i_length  Length excluding uninitialized data at the end.
  after     start+length.
  i_after   image+length (Note the use of length, not i_length).

Operators

| Bitwise or
^ Bitwise xor
& Bitwise and
# Align A#B = (A+B-1) & (-B)
> Right shift
< Left shift
+ Add
- Subtract
* Multiply
/ Divide (unsigned)
% Modulo (unsigned)
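The align operator is the least familiar of these. Its formula can be tried out in Python as a quick check. The formula rounds up to a multiple of B only when B is a power of two, which is what the sketch assumes.

```python
def align(a, b):
    """A#B from the table above: round a up to a multiple of b.

    Assumes b is a power of two, since the formula relies on -b
    acting as a bit mask.
    """
    return (a + b - 1) & -b

print(hex(align(0x1234, 0x10)))    # 0x1240
print(hex(align(0x1240, 0x10)))    # 0x1240 (already aligned)
print(hex(align(0x1234, 0x1000)))  # 0x2000
```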

Sequence rules

Miscellaneous Features

Patch stage

By setting the i_start value of a section to -2, that section is excluded from the body of the image and used as a sequence of patches which are applied to the body of the image in the last stage of linking.

Each patch consists of:

  1  DWORD  Must be zero
  1  DWORD  Offset of patch within the image file
  1  DWORD  Length of patch
  N  BYTES  Contents of patch

Each patch is directly followed (no alignment) by the next patch.

To generate an offset within the image file you may want to use the Empty GROUP feature.
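To make the record layout concrete, here is a Python sketch that packs patch records and applies them to an image. It is an illustration, not JLOC's code. It assumes that a DWORD is a 32-bit little-endian value (the usual x86 convention) and that the length is counted in bytes; the function names are invented for the example.

```python
import struct

HEADER = "<III"  # zero, offset, length: three little-endian DWORDs (assumed)

def pack_patch(offset, contents):
    """Build one patch record: header followed directly by the contents."""
    return struct.pack(HEADER, 0, offset, len(contents)) + bytes(contents)

def apply_patches(image, patch_section):
    """Apply a back-to-back sequence of patch records to an image."""
    image = bytearray(image)
    pos = 0
    while pos < len(patch_section):
        zero, offset, length = struct.unpack_from(HEADER, patch_section, pos)
        if zero != 0:
            raise ValueError("first DWORD of a patch must be zero")
        pos += struct.calcsize(HEADER)
        data = patch_section[pos:pos + length]
        if len(data) != length or offset + length > len(image):
            raise ValueError("patch does not fit")
        image[offset:offset + length] = data
        pos += length  # the next patch follows with no alignment
    return bytes(image)

patches = pack_patch(2, b"AB") + pack_patch(5, b"Z")
print(apply_patches(b"........", patches))  # b'..AB.Z..'
```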

Patches are useful for doing link-time initialization of things that would normally require run-time initialization. Link-time initialization is especially powerful for programs that execute in ROM. A good example is the interrupt descriptor table for a stand-alone protected mode program. You can find sample source code for that at http://www.erols.com/johnfine/#jlocpatch

Empty GROUPs

An empty GROUP is one that does not have any segments listed in ANY of its declarations. It is not empty if any module declares it with any segments.

When JLOC is asked to compute the offset of something relative to an empty group it computes instead the offset of that object within the image file.

There are many reasons you might need the offset relative to the image file, and in complex images that might be hard to compute from normal addresses. For example,

TASM Syntax:
  image GROUP
 ...  offset image:something
NASM Syntax:
  GROUP image
 ...  something WRT image

Building Descriptors

386 descriptors are stored with important fields, such as offset or address, broken into subfields. If you wanted to build a descriptor at assembly time you would be limited to values that are known at assembly time in order to break the values into subfields.

JLOC allows you to build simplified descriptors at assembly time which can contain externs and other values that are not known until link time. JLOC can then rearrange the bits to form a true descriptor. JLOC does this step after applying "fixups" which resolve externs etc. and before applying patches (which might move the descriptor to a new location).

To make JLOC rearrange the bits of a descriptor you must declare a global symbol whose name begins with ?fixD..@ at the address of the descriptor.

There is more documentation on this subject inside the sample file GDT.INC which is distributed with JLOC.
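For readers unfamiliar with the subfield problem, the following Python sketch shows how the base and limit are scattered in a standard 386 segment descriptor. It shows only the final ("true") descriptor layout as defined by the processor. It does not show the simplified layout that JLOC expects as input, which is documented in GDT.INC and not reproduced here. The function name and parameter names are chosen for the example.

```python
import struct

def segment_descriptor(base, limit, access, flags):
    """Pack an 8-byte 386 segment descriptor (illustrative helper).

    base   -- 32-bit linear base address
    limit  -- 20-bit limit
    access -- access-rights byte (present, DPL, type)
    flags  -- 4-bit flags nibble (granularity, default size, ...)
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                             # limit bits 0..15
        base & 0xFFFF,                              # base bits 0..15
        (base >> 16) & 0xFF,                        # base bits 16..23
        access & 0xFF,                              # access rights
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags + limit 16..19
        (base >> 24) & 0xFF,                        # base bits 24..31
    )

# A flat 4GB code segment: base 0, limit FFFFF, page granular, 32 bit.
print(segment_descriptor(0, 0xFFFFF, 0x9A, 0xC).hex())  # ffff0000009acf00
```

Because the base ends up in three separate pieces, an assembler cannot fill them in from a single extern, which is the problem the ?fixD..@ symbols address.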


Examples


ALL:
   bootstrap.obj
BOOT: 0 7C00 0
   *

This file is used to link a bootstrap (floppy or hard-drive boot) program. The line "ALL:" defines a namespace. The second line lists the (only) obj file. The third line defines a section. Bootstrap programs always load at 0:7C00. The section line tells JLOC that the assumed segment register points to 0, the section is actually loaded at 7C00 in memory, but at 0 in the image file output by JLOC. In a traditional linker it is hard to achieve this difference between the base and the start. The "*" is a chunk selection specifying that all remaining chunks go in this section. JLOC requires that you assign every chunk to some section.


ALL:
   file1.obj
   file2.obj
   file3.obj
   last.obj
DATA:  0F0000 MAIN.after MAIN.i_after
   *,DGROUP
   ,,,FONT
FINAL:  0F0000 100000-FINAL.length FINAL.start-0F0000
   *,*,*,*,last.obj
MAIN: 0F0000 0F0000 0
   *

This file is used to link a BIOS. All of the BIOS fits within F000:xxxx. The DATA section is defined as directly following the MAIN section within F000:xxxx. It is defined here as containing all chunks in group DGROUP as well as chunks with segment name FONT.

The FINAL section contains code which must end at exactly F000:FFFF. (BIOSes normally have sections that must sit at predefined addresses. Getting them there with traditional linkers is very hard.) With JLOC you can use expressions to compute where sections go at link time.

The MAIN section contains everything else and is loaded first.

Chunks are assigned according to the sequence in the control file. In the above example, if last.obj includes a chunk in DGROUP, that chunk would go in DATA, not in FINAL.

Sections are loaded in memory exactly where you specify. The order of the sections in the control file doesn't affect the load order. The above example loads in the order MAIN, DATA, FINAL.
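The arithmetic behind the FINAL line can be checked with a short Python sketch. The length of 10 hex bytes used below is a made-up value for illustration; the real length is whatever the chunks from last.obj add up to.

```python
def place_final(length, base=0xF0000, top=0x100000):
    """Evaluate the FINAL section expressions for a given section length.

    start   = 100000-FINAL.length
    i_start = FINAL.start-0F0000
    Returns (start, i_start, offset of the last byte relative to base).
    """
    start = top - length
    i_start = start - base
    last_offset = start + length - 1 - base
    return start, i_start, last_offset

start, i_start, last_offset = place_final(0x10)  # hypothetical length
print(hex(start), hex(i_start), hex(last_offset))  # 0xffff0 0xfff0 0xffff
```

Whatever the length, the last byte lands at offset FFFF from the base, which is F000:FFFF.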


VCPI:
   {list of OBJ's omitted}
   read_from = OS.i_start
   read_size = OS.length
   gdt = KERNEL.gdt
   entry = KERNEL.entry
KERNEL:
   {list of OBJ's omitted}
LOADER:  0 100 0
   VCPI
OS: 0 0FF800000 LOADER.i_after 8
   *

This file is used to build a test copy of an OS. Since I edit/assemble/link the OS in DOS, I want to start it from DOS for testing.

There are two programs connected together. The first is a VCPI client. Assuming QEMM or EMM386 (or similar) is loaded, the VCPI client turns them off and takes control of the system. It sets up paging and loads the OS.

JLOC is told that the VCPI client loads at 0:100. Actually it loads at nnnn:100, where nnnn isn't known until load time. I haven't yet added the ability for JLOC to handle that (and to produce EXE files rather than BIN or COM files). VCPI is simply coded carefully to work despite that restriction.

The symbol tables of the two programs are separate (so they each might have a symbol named "printf"). However, we tell the VCPI program about two symbols from the OS. Its gdt and its entry point. It can use those two symbols as if they were its own. We also define two symbols for it giving the start and length of the OS section within the image file.

When I build a 32-bit OS, I like to use the top 4MB of the linear address space for page tables and the 4MB below that for the OS. I also like to use true flat segments in the OS so offset 0 in the OS code segment equals linear address 0. This means that the code starts at offset FF800000 in the code segment. That is rather hard (not impossible) with a traditional linker. With JLOC, I just ask for it and get it.
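The FF800000 figure follows directly from that layout, as this small Python check shows (the variable names are only for the example):

```python
FOUR_MB = 4 * 1024 * 1024
TOP_OF_LINEAR_SPACE = 1 << 32          # 32-bit linear address space

page_tables = TOP_OF_LINEAR_SPACE - FOUR_MB  # top 4MB
os_start = page_tables - FOUR_MB             # the 4MB below that

print(hex(page_tables))  # 0xffc00000
print(hex(os_start))     # 0xff800000, the start value on the OS: line
```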

In the above example, I just lumped all the OS code and data together. In a real OS, I would probably have multiple sections. This OS must use two selectors for the flat segment. 8 is the flat code selector. As shown above, JLOC will use 8 as the selector for anything in OS. When coding, I was careful not to leave any data selector for the linker to resolve (they are all resolved at assembly time). Actually, I wrote the OS test before I wrote JLOC, so at that time I didn't leave any selectors for the linker to resolve. I have no idea how to get a traditional linker to resolve protected mode selectors. The 8 in the above example is only used in the "JMP entry" instruction at the end of VCPI.