Tag Archives: code-example

Preprocessor in Fifty Lines of SPITBOL

As part of the conversion of SPITBOl to generate gnu assembler (gas) instead of NASM format, I learned that the GAS assembler is less powerful than that of NASM. For example, in NASM I could
use ‘define’ to map a register name to ‘eax’ in 32-bit mode, or ‘rax’ in 64-bit mode.

No problemo … SPITBOL to the rescue.

Here is the simple preprocessor I wrote in about twenty minutes:

*	rename registers according to word size

	target = host(0)
	target break('_') . os "_" rem . ws

	prefix = (eq(ws,32) '%e', '%r')

	word = (eq(ws,32) 'dword','qword')
	defines = 'M_WORD ' word ' '
	defines = defines 'D_WORD' ' ' (eq(ws,32) '.long', '.quad') ' '

	define('a(ref)')			:(a.end)
						:(next)
a	ident(os,"osx")				:s(a.err)
	a = '[' ref ']' 			:(return)
a.end

	define('m(ref)')			:(m.end)
m	ident(os,"osx")				:s(m.err)
	m = (eq(ws,32) 'd', 'q') 'word ptr ' ref ']' :(return)
m.end

	rmap = table(20)

	s = 'XLsiXRdiXSspXTsiWAcxWBbxWCdxW0axIAbp'
rinit	s len(2) . min  len(2) . reg = 		:f(rdone)
	rmap[min] = reg				:(rinit)
rdone
	rpat =  'IA' | ('X' any('LRST')) | ('W' any('ABC0')) 

next
	line = input				:f(end)

aloop	line breakx('A') . first 'A(' bal . ref ')' rem . last = first a(ref) last	:s(mloop)
	defs = defines

dloop
	defs break(' ') . key ' ' break(' ') . val ' ' =	:f(mloop)
dloop.1	line key = val				:s(dloop.1)f(dloop)

mloop	line breakx('M') . first 'M(' bal . ref ')' rem . last = first m(ref) last	:s(mloop)

rloop 	line rpat . reg = prefix rmap[reg]	:s(rloop)
	output = line				:(next)
err	output = 'error '
end

SPITBOL OSX Port Status, and a New Use for SPITBOL in the Port

I’ve spent the last month or so trying to port SPITBOL to Apple’s OSX.

What I thought would be a simple port — since the Linux x32 and x64 port was solid — turned out to be more daunting.

OSX is now 64-bits by default, and uses a different object format (macho) than does Linux (elf).

The main problem I’ve run into is that in 64-bits the storage model is different, so that code and data must use what is called RIP-addressing, where RIP stands for Relative to Instruction Pointer.

The first problem I ran into was a crash of NASM, the assembler I’ve been using.

Once the folks at NASM fixed that, I was unable to get even the simplest program working in RIP-mode for 64-bit OSX.

I then realized — I wish I had thought of this sooner — that OSX might support 32-bit mode.

Indeed it does. So I tried to build 32-bit SPITBOL using NASM.

This also gave problems, mainly in that it generated bad refs for the first three of so globals defined in m.s. I tried to get around this by moving their declaration to C code, but even after doing that, SPITBOL crashed.

So I have decided to convert the Minimal code generator to target GAS (the GNU assembler) and not NASM. This also involves converting the approximately 1500 lines of assembler needed to link the SPITBOL compiler to the runtime code written in C from NASM to GAS.

I’m in the midst of this, and it’s going well so far, so I’m hopeful the port will get done.

As part of the conversion from NASM to GAS, I’ve learned that GAS is less powerful when it comes to macros and substitutions. For example, in NASM, in 64-bit mode, I can write “%define WA RCX” to map the Minimal register WA to the machine register RCX. For 32-bits I can write “%define WA ECX.”

But I can’t do this in GAS, and so I wrote a program that, given the word size, maps the Minimal register names in upper-case to the corresponding hardware registers:

*	rename Minimal registers to x86_64 registers according to word size for x86_64

	prefix = (eq(host(0),32) "%e", "%r")

	rmap = table(20)
	rmap['XL'] = 'si';  rmap['XR'] = 'di';  rmap['XS'] = 'sp';  rmap['XT'] = 'si'
	rmap['WA'] = 'cx';  rmap['WB'] = 'bx';  rmap['WC'] = 'dx';  rmap['W0'] = 'ax' 
	rmap['IA'] = 'bp'

	rpat =  'IA' | ('X' any('LRST')) | ('W' any('ABC0')) 
next
	line = input				:f(end)
loop
	line rpat . reg = prefix rmap[reg]	:s(loop)
	output = line				:(next)
end

Being able to write code such as this, in a short time (it took about twenty minutes to write and debug) is why it is worth the long slog of implementation and porting.

It’s just such damn fun to write code in SPITBOL.

Introduction to the Macro SPITBOL MINIMAL Reference Manual

The source code for MACRO SPITBOL contains extensive documentation. I have extracted the specification of the MINIMAL (Machine Independent Macro Assembly Language) and the specification of the OSINT (Operatint System INterface) and converted the plain text to HTML, resulting in what is now the “MINIMAL Reference Manual.

As part of this effort I wrote an introduction in order to give a sense of the flavor of the code. Here is that introduction:

Introduction

The implementation of MACRO SPITBOL is written in three languages: MINIMAL, C, and assembler.

The SPITBOL compiler and runtime is written in MINIMAL, a machine-independent portable assembly language.

The runtime is augmented by procedures written in C that collectively comprise OSINT (Operating System INTerface). These procedures provides such functions as input and output, system initialization and termination, management of UNIX pipes, the loading of external functions, the writing and reading of save files and load modules, and so forth.

The implementation also includes assembly code. This size of this code varies according to the target machine. About 1500 lines are needed for the x86 architecture running UNIX.

This code provides such functions as macros that define the translation of MINIMAL instructions that take more than a few machine-level instructions, support for calling C procedures from MINIMAL, for calling MINIMAL procedures from C, for creating save files and load modules, and for resuming execution from save files or load modules.

To give some idea of the flavor of the code, consider the following simple SPITBOL program that copies standard input to standard output.

loop output = input :s(loop)
end

By default, the variable input is input-associated to standard input, so each attempt to get its value results in reading in a line from standard input and returning the line as a string. The read fails if there are no more lines, and succeeds otherwise.

Similarly, the variable output is output-associated with standard output, so each assignment to output causes the assigned value to be written to the standard output file.

The osint procedure for writing a line is SYSOU. It is called from within SPITBOL as part of assignment, as shown in the follwing excerpt from the MINIMAL source:

*      here for output association

asg10  bze  kvoup,asg07      ignore output assoc if output off
asg1b  mov  xl,xr            copy trblk pointer
       mov  xr,trnxt(xr)     point to next trblk
       beq  (xr),=b_trt,asg1b loop back if another trblk
       mov  xr,xl            else point back to last trblk
.if    .cnbf
       mov  -(xs),trval(xr)  stack value to output
.else
       mov  xr,trval(xr)     get value to output
       beq  (xr),=b_bct,asg11 branch if buffer
       mov  -(xs),xr         stack value to output
.fi
       jsr  gtstg            convert to string
       ppm  asg12            get datatype name if unconvertible

*      merge with string or buffer to output in xr

asg11  mov  wa,trfpt(xl)     fcblk ptr
       bze  wa,asg13         jump if standard output file

*      here for output to file

asg1a  jsr  sysou            call system output routine
       err  206,output caused file overflow
       err  207,output caused non-recoverable error
       exi                   else all done, return to caller

From the OSINT C code (the C procedure name starts with ‘z’ since there is intermediate code (shown below) to call from MINIMAL to C at runtime):

zysou()
{
    REGISTER struct fcblk *fcb = WA(struct fcblk *);
    REGISTER union block *blk = XR(union block *);
    int result;

    if (blk->scb.typ == type_scl) {
	/* called with string, get length from SCBLK */
	SET_WA(blk->scb.len);
    } else {
	/* called with buffer, get length from BCBLK, and treat BSBLK
	 * like an SCBLK
	 */
	SET_WA(blk->bcb.len);
	SET_XR(blk->bcb.bcbuf);
    }

    if (fcb == (struct fcblk *) 0 || fcb == (struct fcblk *) 1) {
	if (!fcb)
	    result = zyspi();
	else
	    result = zyspr();
	if (result == EXI_0) 
	    return EXI_0;
	else 
	    return EXI_2;
    }

    /* ensure iob is open, fail if unsuccessful */
    if (!(MK_MP(fcb->iob, struct ioblk *)->flg1 & IO_OPN)) {
	 return EXI_1;
    }

    /* write the data, fail if unsuccessful */
    if (oswrite
	(fcb->mode, fcb->rsz, WA(word), MK_MP(fcb->iob, struct ioblk *),
	 XR(struct scblk *)) != 0)
	 return EXI_2;

    /* normal return */
    return EXI_0;
}

Here is the assembly code that is used to call a C procedure from MINIMAL. The code is for 32-bit X86
and is written using NASM (Netwide Assembler) syntax.

	%macro	mtoc	1
	extern	%1
	; save minimal registers to make their values available to called procedure
	mov     dword [reg_wa],ecx     
        mov     dword [reg_wb],ebx
        mov     dword [reg_wc],edx	; (also reg_ia)
        mov     dword [reg_xr],edi
        mov     dword [reg_xl],esi
        mov     dword [reg_cp],ebp	; Needed in image saved by sysxi
        call    %1			; call c interface function
;       restore minimal registers since called procedure  may have changed them
        mov     ecx, dword [reg_wa]	; restore registers
        mov     ebx, dword [reg_wb]
        mov     edx, dword [reg_wc]	; (also reg_ia)
        mov     edi, dword [reg_xr]
        mov     esi, dword [reg_xl]
        mov     ebp, dword [reg_cp]
;	restore direction flag in (the unlikely) case that it was changed
        cld
;	note that the called procedure must return exi action in eax
	ret
	%endmacro

  ...

	global	sysou			; output record
sysou:
	mtoc	zysou
  • Pages

  • March 2021
    M T W T F S S
    1234567
    891011121314
    15161718192021
    22232425262728
    293031  
  • RSS The Wayward Word Press

  • Recent Comments

    daveshields on SPITBOL for OSX is now av…
    Russ Urquhart on SPITBOL for OSX is now av…
    Sahana’s Respo… on A brief history of Sahana by S…
    Sahana’s Respo… on A brief history of Sahana by S…
    James Murray on On being the maintainer, sole…
  • Archives

  • Blog Stats

  • Top Posts

  • Top Rated

  • Recent Posts

  • Archives

  • Top Rated