I have started a discussion about The Lost Art of Programming in Assembly Language with my fellow members of the LinkedIn Compiler Experts Group.
The discussion has drawn many comments, and I am taking the liberty of reproducing them here. I’ll update this post from time to time as more comments are added:
The Lost Art of Programming in Assembly Language
I believe that programming in assembly language is becoming a lost art. I expect everyone in this group is an expert in programming in assembly language for at least one architecture, and most know several architectures very deeply, down to knowing how many cycles the most common instructions take.
But how many programmers not expert in compilers know *any* assembly language?
I’m giving a talk on this topic next week at MHVLUG, the Mid-Hudson Valley Linux Users Group. As part of that I would like to poll fellow members both on their knowledge of assembly language, and their estimates of the knowledge of the average skilled programmer.
Towards this end, I have established a new community at Ning since, as I recall, it provides a polling feature.
dave shields
Comments (23)
David Corbin
President / Chief Architect at Dynamic Concepts Development Corp.
As a person who has programmed using assemblers for over 65 different processors, I truly understand where you are coming from.
Alas, from a purely pragmatic standpoint, how much VALUE is there in knowing the low level for the typical programmer these days?
I well remember many weeks/months of pulling very late nights (and frequent all-nighters) to save a few clock cycles or bytes/words of memory.
I also remember:
1) Toggling in boot loaders via front panel toggles
2) Hand patching (altering) paper tape to make modifications to programs
3) “Programming” (cut or solder) diode based memory boards
(I also remember rebuilding Rochester carburetors using “jiffy kits”, or adjusting spark gaps using a matchbook… but that is another story).
Times change, priorities change, “Value” changes.
ps: To help avoid the “lost” part (as contrasted to merely arcane) I am a member of a variety of Classic Computer Societies and museums. If anyone knows of Pre-1980 computer technology which is in danger of being destroyed, please let me know.
Personally, the pride of my collection is a PDP-8/E, introduced in the 1960’s; mine was built in Feb 1972, with:
1) True Core memory – try to find someone who can rethread a damaged module
2) 14″ 2.5MB removable disk drives [RK-05]
3) 19″ Rack mount 32KW DRUM disk – try to find someone who can replate one of these
4) Three ASR-33 Teletypes – I am seriously looking for someone who was a TTY repairman/maintainer. There are a few issues with these that are beyond my ability to repair (I CAN get the parts!), and if something is not done, they are likely to become non-operational in the next 5 years.
Robert Gezelter
Principal, Robert Gezelter Software Consultant
I have programmed many architectures at the assembly level, and I heartily believe the comment made at my graduate orientation (NYU/Courant Institute) by the then Director of Graduate Studies for the Computer Science Department, the late Professor Max Goldstein, to wit: “To get here you know one high level language and one low level language. We assume that you can read another in 24 hours and write in it in 48 hours.”
It is not so much a question of writing significant software in assembler, which I admit to doing for a variety of processors from embedded controllers (PIC), minicomputers (PDP-11), super-minicomputers (VAX/ALPHA/Itanium), and mainframes (System/360/370).
It is more a question of understanding the relative complexity of operations and their implications.
Stephen Heffner
Software inventor/architect/designer/implementor, entrepreneur, educator
As someone who has over a dozen assemblers under my belt, and who has authored a software engineering automation meta-tool that translates about 9 different assemblers to higher-level languages (among many other languages), I must reluctantly admit that a knowledge of assembler is not, per se, that useful these days, unless you’re programming arcane device drivers very close to the metal, or in an extremely memory-limited environment. HOWEVER, I do think that a proficiency with assembler is excellent discipline.
An inexact analogy: I took 4 years of Latin in high school. I never directly applied it, but I learned more about English in Latin class than I ever learned in English class. And when it came time to pick up some proficiency in Romance languages (Italian, French, and Spanish), it was extremely useful. Am I glad I took those Latin classes? You bet!
Another thing: It’s very hard to survive as a BAD assembly language programmer — it tends to separate the sheep from the goats. But it’s unfortunately easy to survive as a bad 3GL programmer, as attested to by the sorry state of most of the world’s software. So if I know someone has gained real proficiency in at least one assembler, I know he/she is the real deal.
Finally: Being proficient in an assembler gives one the comfortable feeling of knowing what’s going on underneath all of those 3GLs and 4GLs. That feeling engenders self-confidence and an appreciation for the craft of programming, more than just learning 3GLs and 4GLs.
My (about to be inflated) 2c worth…
BTW, LinkedIn has the ability to host polls too.
David Corbin
President / Chief Architect at Dynamic Concepts Development Corp.
Robert, Stephen,
I agree (for the most part) with both of your posts, and want to clarify a bit what I meant specifically with “Value”. I was using the Employer / Employee relationship where value translates directly to revenue. Since the majority of jobs are in companies where computer programs are a “tool” rather than a “product”, the optimal position is where the “business savings” to “development cost” ratio is high. In many cases this means that an “inferior” (from a computer science point of view) solution that meets the requirements and was developed (and maintained) at a very low cost is the goal.
I do take exception with the quote from Professor Max Goldstein. While it may be true that one can grasp the general meaning of another language and even write syntactically valid code in another language quite quickly, I have repeatedly found that unless the person is capable of “divorcing” themselves from the first language they will invariably bring habits (which were good) into the new language which are quite bad.
When C++ first came out, it was quickly seen that the WORST object oriented programmers were those with a “C” background. With Microsoft .NET, the worst C# programmers are often C++ programmers [and the worst VB.NET programmers are often people with VB6.0 expertise].
Robert Gezelter
David,
With all due respect, I must clarify what I believe Professor Goldstein meant with that remark. Having worked doing research for the Computer Science Department since I entered NYU as an undergraduate, I can assure you that C with a FORTRAN accent was not the intent.
I too have encountered numerous incidents of C (FORTRAN), C++ (really C), and a wide range of others; the phenomenon goes far beyond the sphere of programming languages. However, I believe that the standard presumed was the higher standard of being able to assimilate the idiom and gestalt of the additional architectures and languages.
Since my background was systems programming, I found that many of the concepts in “object oriented programming” were formalizations of good engineering practices that had been used within many operating systems for a long time. Admittedly, the terminology was different. As Stephen noted, understanding what the implementation actually is makes this a far more transparent phenomenon than when there is no understanding of the underlying implementation.
Stephen’s comment about Latin as a parallel is also well taken. Even allowing for the differences in idioms, a knowledge of Latin is often an extremely useful tool if one is trying to master multiple Romance languages, which are similar to one another and descended from Latin.
dave shields
I just love Stephen’s comment “It’s very hard to survive as a BAD assembly language programmer — it tends to separate the sheep from the goats.”
I had four years of Latin in high school. I did the fourth year only because of the teacher’s love for Latin. The Latin proved useful in my three trips to Italy, as I could read the text from many of the monuments.
I also had two years of Russian, thanks to the NDEA act of 1958, which was passed because of Sputnik. (I’ve written of this in my blog.) Russian was only taught for three years in New Mexico. It came in handy during my four trips to Russia.
dave shields
I spent over an hour on the phone a couple of weeks ago with Monty Denneau, IBM’s best hardware architect since John Cocke. Monty designed GF11 and BlueGene, currently the world’s largest supercomputer. He won the Cray award in 2002. Monty is now working on what will be the world’s largest around 2013 or so. The power budget is in the megawatts.
An architecture tells the programmer how to access memory, the set of registers, and the set of instructions. These define the assembly language.
If you don’t know at least one assembly language, then you are in my view fundamentally handicapped as a programmer, since you don’t know what a computer is at the most basic level.
C++ has its uses, though I am not a great fan. My first, and hopefully last, program in C++ was the Jikes compiler. My co-author Philippe taught me just enough C++ so I could make best use of my expertise in C.
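To make the remark about architecture concrete, here is a minimal sketch of a toy machine in Python. Everything in it is an assumption invented for illustration: the two registers A and B, the four opcodes, and the `run` helper do not correspond to any real processor. It only shows the three ingredients named above (memory access, a register set, and an instruction set) working together.

```python
# A toy machine, invented purely for illustration (not a real architecture).
# It has the three things an architecture defines: memory, registers,
# and a set of instructions.

def run(program, memory, max_steps=1000):
    """Execute a list of (opcode, operand...) tuples and return the registers."""
    regs = {"A": 0, "B": 0}    # the register set
    pc = 0                     # program counter
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "LOAD":       # LOAD reg, addr  : reg <- memory[addr]
            regs[args[0]] = memory[args[1]]
        elif op == "STORE":    # STORE reg, addr : memory[addr] <- reg
            memory[args[1]] = regs[args[0]]
        elif op == "ADD":      # ADD dst, src    : dst <- dst + src
            regs[args[0]] += regs[args[1]]
        elif op == "HALT":     # stop and hand back the register contents
            return regs
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")


# "Assembly" for the toy machine: add memory[0] and memory[1] into memory[2].
memory = [2, 3, 0]
program = [
    ("LOAD", "A", 0),
    ("LOAD", "B", 1),
    ("ADD", "A", "B"),
    ("STORE", "A", 2),
    ("HALT",),
]
regs = run(program, memory)
```

After the run, `memory[2]` holds the sum of the first two cells. A real assembly language differs in scale and detail, not in kind.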
dave shields
I think Max Goldstein was right in that once you have learned one low level language and another of a higher level then it is much easier to learn more. But you need to *master* those languages. Then indeed the rest is detail.
It’s also important to master different kinds of higher-level languages. For example, SNOBOL4/SPITBOL gives you unique insight into working with strings. Writing code in SPITBOL gives more fun, line for line, than any other PL I have used. Every line is a challenge and a delight, even though, as Robert Dewar, the author of SPITBOL once remarked, no one has ever mastered the semantics of SNOBOL4.
The one language whose semantics I completely mastered was FORTRAN, as defined in the ANSI 1974 (?) standard. I knew the meaning and consequences of every character in that document.
I worked for almost a year as part of a team of five that wrote a program to translate Sybase SQL stored procedures to DB2 UDB stored procedures. The code was written in OCAML. OCAML is unique in that not once did I need to use a debugger. If I could get the program to compile then it worked, or if it didn’t work then it was a trivial task to find the problem.
dave shields
Writing of Max Goldstein reminded me of my favorite Max stories.
Max started his career at Los Alamos. During the 50’s he was in charge of preparing a book of random numbers. (I can recall seeing a copy of the book on a library shelf years before I met Max.)
One day, while preparing the book, one of the clerks who was transcribing the list of random numbers came to Max and confessed that they had made an error copying one of the numbers.
Max said, “Don’t worry. It doesn’t matter. It was a random error.”
The clerk went away dumbfounded.
dave shields
Another Max Goldstein story…
Courant’s (CIMS) CDC 6600 had one megabyte of memory. More precisely, it had 128K words, with sixty bits to a word, six bits to a character.
Memory was a scarce resource in those days. For example, jobs taking more than half the memory, just 512KB, were only run overnight.
The resource was scarce because memory was expensive. Max had originally ordered 512K from CDC, but when it came time to install the machine, CDC said they only had 1MB of memory available. Max told them to go ahead and install it. He felt they would never get around to downsizing it.
They didn’t, so Max saved CIMS hundreds of thousands of dollars.
Max, along with almost everyone else I met during my years at CIMS, was a wonderful person.
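The CDC 6600 memory figures in this story can be checked with a few lines of arithmetic. This is a rough sketch, assuming "128K" means 128 × 1024 words; the variable names are mine. It shows that "one megabyte" is a round figure: the memory holds about 1.3 million six-bit characters, or a little under one mebibyte when counted in modern eight-bit bytes.

```python
# CDC 6600 memory as described above: 128K words of 60 bits, 6 bits per character.
# Assumption: "128K" means 128 * 1024 words.
WORDS = 128 * 1024
BITS_PER_WORD = 60
BITS_PER_CHAR = 6

total_bits = WORDS * BITS_PER_WORD                   # 7,864,320 bits
six_bit_chars = total_bits // BITS_PER_CHAR          # 1,310,720 characters
eight_bit_bytes = total_bits // 8                    # 983,040 modern bytes
fraction_of_mib = eight_bit_bytes / 2**20            # 0.9375 of a mebibyte

print(six_bit_chars, eight_bit_bytes, fraction_of_mib)
```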
dave shields
I just published blog posts inspired by this discussion on two of my blogs. The content is the same in both.
http://jackstalesandstories.wordpress.com/2009/05/03/two-stories-about-max-goldstein-director-of-the-cims-computing-center-in-the-1960s-and-1970s/
https://daveshields.wordpress.com/2009/05/03/on-programming-machine-language-supercomputers-and-moores-law/
Robert Gezelter
Indeed, Dave, “mastery” was the underlying presumption in what Max said. At the time, I had already mastered assembler for the IBM 1620, IBM 1130, IBM System/360/370/303x, the Digital PDP-11, and the Digital VAX. Ditto for a plethora of high level languages including FORTRAN, PASCAL, PL/I, LITTLE, and some COBOL (C was not popular yet), and I had been reading a bit of BLISS.
New languages never seemed an obstacle. Syntactically strange at first, but nothing insurmountable.
David Corbin
I never met Max, though many of my early mentors in the 1970’s talked about him frequently, and I have definitely enjoyed the information posted on this thread.
Robert’s last comment: “New languages never seemed an obstacle. Syntactically strange at first, but nothing insurmountable.” does bother me.
I believe that mastering a language has very little to do with syntax (which can be learned in hours or days), and much more to do with having a deep understanding of the concepts and the paradigm of the language (which can take months or years). I have seen “functional” (they compile, execute, and produce the desired results) programs in MANY languages where the implementation did NOT embody the very concepts the language was designed (or evolved) to address.
I am not saying that anyone on this thread has such a position or attitude, but it definitely gets my ire up when a person says “I don’t like/need/want feature X of language Y” when they cannot even coherently explain all of the aspects of the language and the rationale behind their inclusion.
IMHO: The “commoditizing” of programming is largely to “blame” for this (though there are many other reasons).
Robert Gezelter
When I said “Syntactically strange at first”, I was referring to some of the syntactic conventions. I was definitely NOT using the “Hello world” program as a standard. From my perspective, “Hello world” is more a sanity check of the installation of the compiler and run time environment than anything else.
My standard for working with a new language has been to deal with the underlying concepts and architecture as the basis. Once this is understood, at least for me personally, I have not found things all that difficult.
I do not claim that this is the prevalent mode or experience.
In the context of this thread, I certainly have not found any machine architecture particularly strange or difficult to understand. Each has its quirks, some more than others, but in working from PICs and custom firmware to mainframes, I find the similarities are overwhelming. There are differences for sure, but I even found the FPS-164 and Itanium not particularly difficult to understand.
Some of this may be the result of having worked in code generators (where I met Dave), and systems programming on many different architectures.
I will not disagree with you that there are many “programmers” in the industry for which the above does not hold. I well remember “programmers” that “knew” COBOL, but only on an IBM System/370 Model 148 under OS MVT. In actuality, such a statement often says a lot.
David Corbin
Robert, unfortunately (of the “low-level” programmers that I have been involved with over the past 30 years) it seems that the vast majority do NOT have a good understanding of things much past the basic instruction set.
About a decade ago I was involved in a HUGE DSP (cube) project. Only one of the people on the team (when I arrived) was able to look at code (it was all written in assembler) and isolate the points where flow was blocked waiting for a previous calculation, or where there was an execution path sitting idle.
More recently I have been involved in highly parallel multi-processor systems (usually but not always Intel based). When the team was asked to identify which parts of the code would perform better on a 2-CPU (each Dual Core) and which parts would perform better on a 1-CPU (Quad Core), they were completely clueless. Most of them could not even identify locations in the code where pipeline flushes were occurring that could be avoided.
As we start moving into the “next generation” (e.g. Kiefer with 32 cores per processor) and beyond, it is [IMHO] completely unrealistic for the general “application programmer” to be able to efficiently make use of the hardware. The “work” will have to (again IMHO) be done by the language architectures and the associated tooling (e.g. compilers).
These are all common challenges which (for the most part) did not exist 30 years ago – at least outside of specialized areas, but which are going to have a significant impact on the functioning of systems tomorrow and beyond.
Stephen Heffner
If a programmer responsible for code doesn’t understand that code COMPLETELY, either a) he/she is NOT FIT to do the job or b) the code is so BAD it should be thrown out. This is true regardless of the language involved.
I have NO patience with “programmers” who can’t program! They cause untold damage — witness the sorry state of most programs in the world today. Part of the problem is that it’s politically incorrect to criticize someone — it might hurt his/her self-esteem. The result is rampant incompetence at every level. Such so-called programmers should find jobs digging ditches; the worst case is that the ditch won’t get dug.
One big thing that’s different about assembler from 3GLs etc. is that in a 3GL, an incompetent programmer can “sort of” get by; in assembler it’s usually fatal (to the system).
One of my goals with XTRAN, our software engineering meta-tool, is to automate a large portion of program maintenance and enhancement work, so incompetent programmers don’t get the chance to muck it up.
Robert Gezelter
David,
I have had similar experiences. I too have seen numerous cases where there was insufficient knowledge of many aspects of a design, technology, or algorithms.
I have seen applications programmers fail to understand simple algorithms since I first entered computing three decades ago. The first time it happened, I was amazed that it was considered an “advanced” topic that an n**2 (where n=10,000) algorithm was a performance issue (e.g., bubble sort in a business application).
Efficiently using multi-core implementations is a definite challenge.
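To put rough numbers on the n**2 remark above, here is a small sketch. The function name and the n log n comparison are my own choices, not anything from the original business application. It counts the comparisons a plain bubble sort makes and sets the figure for n = 10,000 beside a rough n·log2(n) estimate of the kind typical for a good general-purpose sort.

```python
import math


def bubble_sort_comparisons(items):
    """Sort a copy with plain bubble sort; return (sorted_list, comparisons)."""
    a = list(items)
    comparisons = 0
    # Each pass bubbles the largest remaining element to position `end`.
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a, comparisons


n = 10_000
quadratic = n * (n - 1) // 2              # comparisons made by the loops above
linearithmic = round(n * math.log2(n))    # rough n log n figure for contrast

print(quadratic, linearithmic, quadratic // linearithmic)
```

The unoptimized loop always makes n(n-1)/2 comparisons, which is 49,995,000 for n = 10,000, several hundred times the n log n estimate.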
David Corbin
Stephen,
1) Any job/career/profession is going to have some type of “bell curve” of expertise.
2) I defy anyone to pick a hardware platform, and programming language; then provide a small (one page) sample of code, and be able to COMPLETELY describe what happens when the code executes. [I have given this challenge many times since the time I was humbled by the same challenge back in 1978; never once has a person completed it – nor should anyone be expected to.]
Stephen Heffner
David,
1) Of course. The problem arises when a) the curve’s median is at “mediocre to bad”, and b) the bottom tail causes huge problems for the rest of the curve. What do you suppose the bell curve looks like for brain surgeons? Would you want someone in the bottom tail working on your brain? Are we to accept mediocrity or outright incompetence because “every job/career/profession has a bell curve”? IMHO the objective is to chop off the bottom tail and raise the median of the remainder through education and rewarding excellence.
2) Who said anything about “one page” of code? This is a “straw man” argument that obscures the problem: Too many “programmers” don’t even understand the basics of the code for which they’re responsible, much less master the code. How many maintenance programmers are afraid to make a change, for fear of breaking the system somewhere else? This is because a) the system is much too fragile (due to the ministrations of previous bad programmers) and/or b) they probably don’t really understand the system anyway.
Sorry to be so harsh, but I’ve seen the sad results of the problems I described over and over in the real world, during the 37 years I’ve been an independent consultant — mostly in application programming, and much less in system programming and real-time where the consequences of incompetence are dire.
dave shields
I accept David Corbin’s challenge: ” I defy anyone to pick a hardware platform, and programming language; then provide a small (one page) sample of code, and be able to COMPLETELY describe what happens when the code executes.”
In the late 60’s I was a sysadmin for the Courant CDC 6600. One day early in June I was told that the fiscal year would end a few weeks later, and that I should make sure to use up all the time that had been allocated to me, to show our sponsors that we made full use of our funding.
I thus wrote the following FORTRAN program, and then ran it several times with different time limits until the account went dry:
      PROGRAM MAIN
   10 GOTO 10
      END
Here is my complete description:
The program wastes time in an efficient fashion.
One page is more than enough. Three lines suffice.
The program ran correctly the first time.
I thought of rewriting it in assembly language, to make it even more efficient using an in-stack loop.
However, that would have been a waste of my time …
dave shields
Founder at Delites US
In my previous comment I said of my time-wasting FORTRAN program:
      PROGRAM MAIN
   10 GOTO 10
      END
that it ran right the first time.
This was the second time I did this.
Earlier, during a discussion about great programming feats during an afternoon tea on the 13th floor of NYU’s Courant Institute of Mathematical Sciences (CIMS), someone mentioned the best they had ever seen was the work of a graduate student at Cornell. Working solely from the hardware reference manual, the student wrote a program of about 5000 lines of assembly language that ran correctly the first time.
Realizing I had never accomplished this feat, I then went down to the computer center and entered the following program using just three punch cards:
      PROGRAM MAIN
      STOP
      END
It ran correctly the first time.
I boasted of my prowess at tea the next day.
I didn’t disclose the code. I just said it wasn’t that difficult a problem.
I’ve just written two blog posts based on Corbin’s Challenge:
On Programming and David Corbin’s Challenge: “I defy anyone to pick a hardware platform, and programming language; then provide a small (one page) sample of code, and be able to COMPLETELY describe what happens when the code executes.”
On Programming: Getting It Right The First Time
By the way, I wrote the post that includes the comments posted to date in part because my blog is included in Sam Ruby’s Planet Intertwingly, http://planet.intertwingly.net . It is very interesting, and hence widely followed.