
SPITBOL Status Report: NASM to GAS conversion complete

The conversion of asm.spt to generate GNU assembler (GAS) code instead of NASM format is complete.

For those who know of such matters, I now generate AT&T syntax, as it’s easier to generate from a program than Intel syntax.

A few hours ago I finally got a compile with no errors, so I could run the executable.

It’s still dying early on. The good news is that with the use of gas I can now use ‘-g’ for debugging and get useful information from the ‘ddd’ debugger.

Simply put, I have debugging resources at hand that should suffice, so it’s just a matter of slogging along until it’s done. I don’t think it’ll take that long.

If you want to follow the details, or track my work, check out the ‘gas’ branch.

I’ll keep you posted.

SPITBOL OSX Port Status, and a New Use for SPITBOL in the Port

I’ve spent the last month or so trying to port SPITBOL to Apple’s OSX.

What I thought would be a simple port, since the Linux x32 and x64 ports were solid, turned out to be more daunting.

OSX is now 64-bit by default, and uses a different object format (Mach-O) than Linux does (ELF).

The main problem I’ve run into is that in 64-bit mode the storage model is different, so that code and data must use what is called RIP-relative addressing, in which addresses are expressed relative to the instruction pointer (the RIP register).

The first problem I ran into was a crash of NASM, the assembler I’ve been using.

Once the folks at NASM fixed that, I was unable to get even the simplest program working in RIP-mode for 64-bit OSX.

I then realized — I wish I had thought of this sooner — that OSX might support 32-bit mode.

Indeed it does. So I tried to build 32-bit SPITBOL using NASM.

This also gave problems, mainly in that it generated bad refs for the first three or so globals defined in m.s. I tried to get around this by moving their declaration to C code, but even after doing that, SPITBOL crashed.

So I have decided to convert the Minimal code generator to target GAS (the GNU assembler) and not NASM. This also involves converting the approximately 1500 lines of assembler needed to link the SPITBOL compiler to the runtime code written in C from NASM to GAS.

I’m in the midst of this, and it’s going well so far, so I’m hopeful the port will get done.

As part of the conversion from NASM to GAS, I’ve learned that GAS is less powerful when it comes to macros and substitutions. For example, in NASM, in 64-bit mode, I can write “%define WA RCX” to map the Minimal register WA to the machine register RCX. For 32-bit mode I can write “%define WA ECX”.

But I can’t do this in GAS, and so I wrote a program that, given the word size, maps the Minimal register names in upper-case to the corresponding hardware registers:

*	rename Minimal registers to x86_64 registers according to word size for x86_64

	prefix = (eq(host(0),32) "%e", "%r")

	rmap = table(20)
	rmap['XL'] = 'si';  rmap['XR'] = 'di';  rmap['XS'] = 'sp';  rmap['XT'] = 'si'
	rmap['WA'] = 'cx';  rmap['WB'] = 'bx';  rmap['WC'] = 'dx';  rmap['W0'] = 'ax' 
	rmap['IA'] = 'bp'

	rpat =  'IA' | ('X' any('LRST')) | ('W' any('ABC0')) 
next
	line = input				:f(end)
loop
	line rpat . reg = prefix rmap[reg]	:s(loop)
	output = line				:(next)
end
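For comparison, here is a rough C sketch of the same table-driven substitution. It is not part of the SPITBOL sources: the function name `rename_registers`, the `word_bits` parameter (standing in for the `host(0)` test above) and the fixed-capacity output buffer are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal register names and the x86 register stems they map to
 * (the same table as in the SPITBOL program). */
static const struct { const char *minimal; const char *stem; } rmap[] = {
    {"XL", "si"}, {"XR", "di"}, {"XS", "sp"}, {"XT", "si"},
    {"WA", "cx"}, {"WB", "bx"}, {"WC", "dx"}, {"W0", "ax"},
    {"IA", "bp"},
};

/* Copy `in` to `out` (capacity `cap` bytes), replacing every Minimal
 * register name with "%e" or "%r" plus the mapped stem.
 * Returns 0 on success, -1 if `out` is too small.
 * Hypothetical helper, not taken from the SPITBOL sources. */
int rename_registers(const char *in, int word_bits, char *out, size_t cap)
{
    const char *prefix = (word_bits == 32) ? "%e" : "%r";
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0)
        return -1;

    while (*in != '\0') {
        const char *stem = NULL;

        /* Does a two-character register name start at this position? */
        for (size_t i = 0; i < sizeof rmap / sizeof rmap[0]; i++) {
            if (strncmp(in, rmap[i].minimal, 2) == 0) {
                stem = rmap[i].stem;
                break;
            }
        }

        if (stem != NULL) {
            if (n + 4 >= cap)
                return -1;
            memcpy(out + n, prefix, 2);
            memcpy(out + n + 2, stem, 2);
            n += 4;
            in += 2;
        } else {
            if (n + 1 >= cap)
                return -1;
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return 0;
}
```

With `word_bits` set to 64, the line `mov WA,XL` becomes `mov %rcx,%rsi`. With 32 it becomes `mov %ecx,%esi`. Like the SPITBOL pattern, the sketch matches register names anywhere in the line, including inside longer words.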

Being able to write code such as this in a short time (it took about twenty minutes to write and debug) is why it is worth the long slog of implementation and porting.

It’s just such damn fun to write code in SPITBOL.

Ralph Griswold’s Oral History of the SNOBOL Languages

Courtesy of the YAHOO SNOBOL4 mailing list, I have just heard of a history of the SNOBOL programming languages by Ralph Griswold, in the form of a transcription of an interview with Ralph in 1972. It is a PDF file:

Narrative Account of the Development of the SNOBOL Programming Languages by Ralph Griswold

Jerry Coleman, Yankees Infielder and Padres Broadcaster, Dies at 89

The Wayward Word Press notes with sadness the recent death of Jerry Coleman, a Hall-of-Fame baseball broadcaster.

Though I hadn’t heard his voice for almost two decades, I was immediately able to recall it, and how much pleasure I had over the years listening to it on the CBS Baseball Game of the Week.

That made me realize this is the true test of a great baseball broadcaster — if you can recall their voice, and smile when you do so.

Only a handful of announcers I have followed over the years, mainly as a New York Mets fan, pass this test:

The Times’s obituary says in part:

As a Marine pilot, he flew in the Pacific during World War II and was recalled to fly during the Korean conflict, becoming the only major league player to survive combat in both wars.

And as a broadcaster for the Padres since 1972, he was known to get lost in the clouds of the English language as he never did in the cockpit.

He once blurted: “Winfield goes back to the wall, he hits his head on the wall and it rolls off! It’s rolling all the way back to second base. This is a terrible thing for the Padres.”

And then there was this: “On the mound is Randy Jones, the left-hander with the Karl Marx hairdo.”

Coleman acknowledged that there was a “term that’s associated with me — ‘Colemanisms,’ or what you might call flubs,” he said in “An American Journey: My Life On the Field, In the Air, and On the Air,” a memoir written with Richard Goldstein and published in 2008.

“Maybe I talk too quickly, too soon,” he added. “I may have said the one on Winfield. ‘Winfield goes back. He hit his head against the wall. It’s rolling toward the infield.’ I meant the ball, of course. I just didn’t get around to saying, ‘It wasn’t his head rolling toward the infield.’ I skip a word here and there.”

But he could be entirely clear when he had something to say on an issue. After baseball began to acknowledge the enlarged physiques of some players and their ballooning home run totals, in 2005, Coleman spoke out in favor of strong penalties for abuse of steroids. “If I’m emperor, the first time, 50 games; the second time, 100 games, and the third strike, you’re out,” he said.

Major League Baseball adopted that penalty structure by the end of the year.

Though I didn’t recall his getting lost in the clouds, that passage brought a smile to my face, especially noting the miraculous recovery of Dave Winfield after losing his head.

See also Jerry Coleman Quotes and Legends Of The Err Waves: Jerry Coleman and Ralph Kiner give their listeners tongues of fun by William Taaffe

SPITBOL Update: SPITBOL X86-64 can compile “hello world”

The SPITBOL project is pleased to announce some progress in porting MACRO SPITBOL to x86-64 Linux. (I use MINT, so this should also work on straight Ubuntu.)

This version is able to compile such simple programs as “hello world” but is not yet able to compile itself.

It can be found at http://github.com/hardbol/spitbol

It has the git tag “x86-64-hello-world” and there is a file with this tag in the Downloads section.

Reaching this milestone has been a long slog, albeit an interesting project in relearning X86 assembly language and coding in SPITBOL.

The translator consists of about 3000 lines of SPITBOL code. LEX.SPT consists of about 1000 lines. It produces a file of lexemes that is fed to ASM.SPT, which consists of about 2000 lines of code. ASM generates assembly code suitable for input to the NASM assembler.

The hard part was to configure ASM.SPT so it can generate code for X86-32 or X86-64.

To try out the system, do

$ make clean;make
$ ./spitbol test/hello.spt

To see the program in action, set Z_TRACE to 1 at the start of ASM.SPT. Then try

$ ./spitbol test/hello.spt >& ad
$ make z

This will produce a *large* file “ae” with an instruction-by-instruction trace of the MINIMAL code, showing the hardware instructions executed and a report of differential changes at the machine register level.
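The idea of reporting only differential changes can be sketched in C as follows. This is not the actual trace tool, which the post does not show. The register list, the name `report_changes` and the output format are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical register set; the real trace tool's list is not shown. */
enum { NREGS = 8 };
static const char *const reg_names[NREGS] = {
    "XL", "XR", "XS", "WA", "WB", "WC", "W0", "IA"
};

/* Print one line for each register whose value differs between the two
 * snapshots, and return the number of registers that changed. */
int report_changes(const unsigned long before[NREGS],
                   const unsigned long after[NREGS], FILE *out)
{
    int changed = 0;

    assert(before != NULL && after != NULL && out != NULL);
    for (int i = 0; i < NREGS; i++) {
        if (before[i] != after[i]) {
            fprintf(out, "%s: %#lx -> %#lx\n",
                    reg_names[i], before[i], after[i]);
            changed++;
        }
    }
    return changed;
}
```

Taking a snapshot before and after each MINIMAL instruction and printing only the differences keeps even a very long trace readable, since most instructions touch one or two registers.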

One of the more challenging — and fun — parts of this exercise in porting has been to produce that trace. I found available debuggers, such as GDB and its graphical front-end DDD, of little use in dealing with assembly language, and so had to write my own debugging trace tool.

I’ll keep you posted on further developments.

thanks, dave

SPITBOL Status Report

Early this summer I started work on what I thought would be a modest change: convert SPITBOL from using the GNU GAS assembler to my favorite X86 assembler, NASM. I also wanted to do some code cleanup as part of this.

Alas! What I thought would take a few weeks took a few months.

I make no excuses. The fault was mine, and I knew it was my job to fix it.

I also soon realized that while it would take longer, the work was needed, as I hadn’t worked on SPITBOL in almost three decades, and my programming skills, especially in X86 assembler, were, to say the least, quite rusty.

I had forgotten almost all — which really wasn’t that much — of SPITBOL structures and internals, so it was necessary to reacquire that knowledge — even if I had to do it the hard way.

The process was complicated by the poor support for assembly language programming provided by Linux. I knew about GDB and its visual front-end, DDD. However, I found them sorely lacking, probably because SPITBOL as I found it intermixed data and code in the code section, and this was enough to cause problems using these tools.

As a result, I implemented a variety of instruction-level traces to try to find out what was happening. That itself was an interesting experience, one that made me appreciate even more the power of SPITBOL when it comes to doing this sort of thing.

I plan to write more about this in a future post, but the immediate porting concerns must be addressed first.

The current status is as follows.

The code now available at Hardbol SPITBOL contains a directory b32 with a bootstrap compiler. This is a 32-bit word, 8-bit character SPITBOL using only TCC as the C compiler, MUSL as the library, and NASM as the assembler. The system is thus self-contained in that it does not rely on gcc/gnu code.

It doesn’t support floating point.

The OSINT procedures have been cleaned up in that all mention of obsolete systems such as Windows NT, SOLARIS, and MAC (pre OSX) has been removed.

Going forward, SPITBOL will support only one operating system: Unix.

I have started work on the port for Linux 64-bit word, 8-bit characters. I expect that won’t take too long, but given my track record, we will see…

Once I have that, I’ll try the port to OSX. TCC and MUSL support OSX. If that goes well, I’ll put it out. If I run into too many problems, I’ll back off and just do the next, and key, port, for 64-bit words and 32-bit characters, as that is needed for full UNICODE support.

I’ll keep you posted.

Introduction to the Macro SPITBOL MINIMAL Reference Manual

The source code for MACRO SPITBOL contains extensive documentation. I have extracted the specification of the MINIMAL (Machine Independent Macro Assembly Language) and the specification of the OSINT (Operating System INTerface) and converted the plain text to HTML, resulting in what is now the “MINIMAL Reference Manual.”

As part of this effort I wrote an introduction in order to give a sense of the flavor of the code. Here is that introduction:

Introduction

The implementation of MACRO SPITBOL is written in three languages: MINIMAL, C, and assembler.

The SPITBOL compiler and runtime are written in MINIMAL, a machine-independent portable assembly language.

The runtime is augmented by procedures written in C that collectively comprise OSINT (Operating System INTerface). These procedures provide such functions as input and output, system initialization and termination, management of UNIX pipes, the loading of external functions, the writing and reading of save files and load modules, and so forth.

The implementation also includes assembly code. The size of this code varies according to the target machine. About 1500 lines are needed for the x86 architecture running UNIX.

This code provides such functions as macros that define the translation of MINIMAL instructions that take more than a few machine-level instructions, support for calling C procedures from MINIMAL, for calling MINIMAL procedures from C, for creating save files and load modules, and for resuming execution from save files or load modules.

To give some idea of the flavor of the code, consider the following simple SPITBOL program that copies standard input to standard output.

loop output = input :s(loop)
end

By default, the variable input is input-associated to standard input, so each attempt to get its value results in reading in a line from standard input and returning the line as a string. The read fails if there are no more lines, and succeeds otherwise.

Similarly, the variable output is output-associated with standard output, so each assignment to output causes the assigned value to be written to the standard output file.
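For readers more at home in C, the behavior described above corresponds roughly to the loop below. It is only an analogy, not code from the SPITBOL runtime: the function name `copy_lines` and the fixed buffer size are assumptions. `fgets` returning NULL plays the role of the failed read of `input`, which ends the loop.

```c
#include <assert.h>
#include <stdio.h>

/* Copy `in` to `out` a line (or buffer-sized chunk of a line) at a time.
 * Returns 0 when end of input is reached, -1 on a read or write error.
 * Hypothetical helper, written only to illustrate the SPITBOL loop. */
int copy_lines(FILE *in, FILE *out)
{
    char buf[4096];

    assert(in != NULL && out != NULL);

    /* A NULL result means "no more lines", like the :s(loop) test failing. */
    while (fgets(buf, sizeof buf, in) != NULL) {
        if (fputs(buf, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Calling `copy_lines(stdin, stdout)` gives the same effect as the two-line SPITBOL program, with the input and output associations spelled out as explicit file arguments.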

The OSINT procedure for writing a line is SYSOU. It is called from within SPITBOL as part of assignment, as shown in the following excerpt from the MINIMAL source:

*      here for output association

asg10  bze  kvoup,asg07      ignore output assoc if output off
asg1b  mov  xl,xr            copy trblk pointer
       mov  xr,trnxt(xr)     point to next trblk
       beq  (xr),=b_trt,asg1b loop back if another trblk
       mov  xr,xl            else point back to last trblk
.if    .cnbf
       mov  -(xs),trval(xr)  stack value to output
.else
       mov  xr,trval(xr)     get value to output
       beq  (xr),=b_bct,asg11 branch if buffer
       mov  -(xs),xr         stack value to output
.fi
       jsr  gtstg            convert to string
       ppm  asg12            get datatype name if unconvertible

*      merge with string or buffer to output in xr

asg11  mov  wa,trfpt(xl)     fcblk ptr
       bze  wa,asg13         jump if standard output file

*      here for output to file

asg1a  jsr  sysou            call system output routine
       err  206,output caused file overflow
       err  207,output caused non-recoverable error
       exi                   else all done, return to caller

From the OSINT C code (the C procedure name starts with ‘z’ since there is intermediate code (shown below) to call from MINIMAL to C at runtime):

zysou()
{
    REGISTER struct fcblk *fcb = WA(struct fcblk *);
    REGISTER union block *blk = XR(union block *);
    int result;

    if (blk->scb.typ == type_scl) {
	/* called with string, get length from SCBLK */
	SET_WA(blk->scb.len);
    } else {
	/* called with buffer, get length from BCBLK, and treat BSBLK
	 * like an SCBLK
	 */
	SET_WA(blk->bcb.len);
	SET_XR(blk->bcb.bcbuf);
    }

    if (fcb == (struct fcblk *) 0 || fcb == (struct fcblk *) 1) {
	if (!fcb)
	    result = zyspi();
	else
	    result = zyspr();
	if (result == EXI_0) 
	    return EXI_0;
	else 
	    return EXI_2;
    }

    /* ensure iob is open, fail if unsuccessful */
    if (!(MK_MP(fcb->iob, struct ioblk *)->flg1 & IO_OPN)) {
	 return EXI_1;
    }

    /* write the data, fail if unsuccessful */
    if (oswrite
	(fcb->mode, fcb->rsz, WA(word), MK_MP(fcb->iob, struct ioblk *),
	 XR(struct scblk *)) != 0)
	 return EXI_2;

    /* normal return */
    return EXI_0;
}

Here is the assembly code that is used to call a C procedure from MINIMAL. The code is for 32-bit X86 and is written using NASM (Netwide Assembler) syntax.

	%macro	mtoc	1
	extern	%1
	; save minimal registers to make their values available to called procedure
	mov     dword [reg_wa],ecx     
        mov     dword [reg_wb],ebx
        mov     dword [reg_wc],edx	; (also reg_ia)
        mov     dword [reg_xr],edi
        mov     dword [reg_xl],esi
        mov     dword [reg_cp],ebp	; Needed in image saved by sysxi
        call    %1			; call c interface function
;       restore minimal registers since called procedure  may have changed them
        mov     ecx, dword [reg_wa]	; restore registers
        mov     ebx, dword [reg_wb]
        mov     edx, dword [reg_wc]	; (also reg_ia)
        mov     edi, dword [reg_xr]
        mov     esi, dword [reg_xl]
        mov     ebp, dword [reg_cp]
;	restore direction flag in (the unlikely) case that it was changed
        cld
;	note that the called procedure must return exi action in eax
	ret
	%endmacro

  ...

	global	sysou			; output record
sysou:
	mtoc	zysou

On being the maintainer, sole developer, and probably the sole active user of the programming language SPITBOL

As best I can tell, I am in what I believe to be a unique situation:

I am the maintainer, sole developer, and probably the only active user of the programming language SPITBOL.

(If you know of anyone else who is maintaining an open-source implementation of a programming language that has only one user, please let me know, via comments to this post.)

Let me explain.

In June of this year I took over maintainership of Macro SPITBOL. Mark Emmer had labored for almost a quarter century in this task. He kept Macro SPITBOL alive, and that is the task I face going forward.

Macro SPITBOL, the work of Robert B. K. Dewar and the late A. P. “Tony” McCann, is the best implementation of the SNOBOL4 language yet created. I believe it to be a remarkable accomplishment.

SNOBOL was created at Bell Labs in the early 60’s. The basic reference is The SNOBOL4 Programming Language, by R. E. “Ralph” Griswold, J. F. Pogue, and I. P. Polonsky. It is known to SNOBOL4/SPITBOL folks as just “The Green Book,” due to the color of its cover. (Mark Emmer secured the rights to make it available, and the PDF of the book is freely available at the cited URL.)

SNOBOL was the first language to address processing strings and text in a serious way. Its most notable innovation was the notion of “pattern.” As part of this, it introduced many primitives such as BREAK and SPAN. Those terms are now commonly used in many libraries and programs that process text.

SNOBOL was one of several languages widely taught in “Introduction To Programming” courses in the 70’s and early 80’s. Then, I think it fair to say, SNOBOL sank into obscurity.

Macro SPITBOL was created in the mid 70’s. Dewar and McCann finished their work by the late 70’s. I did the port to CDC 6600 in the late 70’s, and worked with Robert to do the port to the IBM/PC 8086 in 1983. Mark Emmer then took over the project, and did ports to Mac, Solaris, and various versions of Unix.

At most ten people worked on refining the SPITBOL implementation. I’m the only one still working on it. Mark lends a hand now and then, but more as an advisor than programmer.

Since taking over maintainership I have finished the Linux port (Mark had already done most of the heavy lifting). I started writing about SPITBOL on this blog to drum up interest, and I am actively working on the port to Apple’s OSX, to be followed by a port to the ARM architecture, so that SPITBOL can be made available on Apple’s iOS and Google’s Android operating environments (both use ARM chips).

Here is where the work stands:

  • There have been just over 3,000 downloads of the Windows version since it was released in open-source form in July 2009. I’ve received fewer than ten, if any, emails and blog comments about this version since its release.
  • There have been 30 (thirty) downloads of the Linux version since it was released two months ago. I’ve heard back from fewer than five people that they have downloaded it and also tried it.
  • While a few (under 10) folks have expressed interest in the work, no one has stepped forward to work on the code as a developer.
  • I’ve seen no sign anyone is actively using SNOBOL4/SPITBOL in a serious way. SPITBOL was once used by folks in the linguistics community, but I’ve seen no sign of recent work.
  • I did a search on Google for “SPITBOL” in June and got about 17,000 matches. I also did a search on “SNOBOL4” and got about 47,000 matches. I just repeated the searches and got the same results.
  • Despite making many posts about SPITBOL on my blog, I have yet to see any one of them achieve more than 20 or so views.

In summary, though I know I have advanced the code, I have been a failure trying to drum up interest in the project.

What Next?

I am not giving up. I have several reasons to push on:

  • SPITBOL is unique in the power it provides to manipulate strings and text. I’ve yet to see anything else come close.
  • SPITBOL is both amazingly fast and compact. It can compile hundreds of thousands of lines of code per second, in a program consisting of under 20,0000 executable instructions. It runs at the hardware level. For example, it includes a compacting mark-sweep garbage collector, while I know of at least one widely-used programming language that still uses reference counts. [1]
  • The ARM implementation should be *very* fast.
  • I have packaged the system so it is self-hosting. SPITBOL uses tcc for its C compiler, NASM for its assembler, musl libc for its runtime libraries, and Rob Landley’s toybox to provide basic Unix commands. SPITBOL is built with these open-source packages, and the source for them is included in the SPITBOL distribution, so you get not only Macro SPITBOL, but a complete development environment to boot.
  • I believe there is a great opportunity due to the poor support for Unicode in other programming languages and libraries. If I can add full support for Unicode, I hope to realize “if you build it, they will come.”
  • Perhaps most important: SPITBOL is fun! I’ve been programming for over fifty years, and I have never had so much fun save when I was working on SPITBOL, as either user of the language or as an implementer.
  • The quality of the system will speak for itself: Code Talks.
  • Knowing that I am starting from zero, any progress will be great news, and it will be fun to see how far I can get. I had a similar experience working with Philippe Charles on Jikes. We started from nothing, and were able to build something that achieved some success. I hope to repeat that, and I know I will have fun trying to do so, even if I don’t succeed.

It is also evident that talking about SPITBOL is a thankless task for now, so I’m going to put off blogging, twittering and such, so I can concentrate on the coding, which after all is all that matters.

Most importantly, I believe that polishing the implementation and porting it to new environments is not enough.

The *only* way to move the project forward is to finish needed refinement and ports as soon as possible, and then to use the result to write applications in SPITBOL that demonstrate what can be done when programming in SPITBOL.

I have a few ideas about useful applications, and hope to get to work on them as soon as possible.

We shall see…

Notes:

1. Can you name a widely-used contemporary programming language that still uses the 60’s software technology of reference counts to manage storage?
