In common commodity microprocessors, the instruction set is either hardwired or implemented in a
MicroCode ROM.
However, a few CPUs include a user-modifiable microcode RAM.
One such processor is WISC Technologies' CPU/32 (ca. 1987), mentioned in the JFAR
(http://www.jfar.org/volume5.html). It was built from discrete TTL components,
allowing considerable flexibility in system configuration.
Another variation, the RTX 32P, optimized for ForthLanguage
(see http://www.ece.cmu.edu/~koopman/stack_computers/sec5_3.html), was designed
to execute two microinstructions for each main memory access, including
instruction fetches.
This was a single-chip CMOS microprocessor, with
on-chip micro-program RAM (and 2 stacks).
It ran at 8 MHz, which was relatively fast for its time.
Another processor, the Cjip (Imsys Technologies), uses writable 72-bit-wide
microcode instructions, optimized for any of the following instruction sets:
- ISAJ: stack based instruction set, best performance in JavaLanguage ByteCodes
- ISAC: slightly better optimized for other high level languages like C
- ISA3: designed for the Forth language
- ISA1: a register machine instruction set, with basic instructions similar to the Zilog Z80
There are others. The attractive thing about writable instruction sets
is that they allow profiling techniques to optimize not only the software
for a given task but also the hardware: if an application is weighted more
heavily toward math, you would tweak the CPU in that direction;
if comms or data collection were required, that would be a different tweak.
I've worked for some companies that could have benefitted hugely from this
kind of technology, but I could never sell the idea.
AssemblyLanguage is cool, but MicroCode is where the RealEngineers hang out.
-- GarryHamilton
How would that work in a multi-threading OS? I imagine swapping out MicroCode is something that one wouldn't do every context-switch...
One approach is to just change an internal switch to point to a different range of microcode.
(For the same reason, many processors change an internal switch to point to a different register bank during interrupts -- such as the PDP-11, the Z80, the ARM, ___, and ___. The Sun SPARC has "sliding register windows" that are ever so much more clever. See "Interrupts on Pipelined Systems"
http://www.cs.uiowa.edu/~jones/arch/notes/34int.html ).
However, since context-switch time is often terrible for other reasons,
the slight improvement in overall performance usually
isn't worth the cost of extra bank(s) of microstore
(or the extra banks of registers).
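The "internal switch" idea can be illustrated with a toy model. This is only a sketch under assumptions: the class name `Microstore`, the bank count, and the word values are all hypothetical and do not describe any of the processors mentioned on this page. It shows that a context switch only changes a bank index, while the microcode itself stays where it was loaded.

```python
class Microstore:
    """Toy writable control store split into equal-sized banks (hypothetical)."""

    def __init__(self, banks, words_per_bank):
        self.banks = banks
        self.words_per_bank = words_per_bank
        self.ram = [0] * (banks * words_per_bank)
        self.bank = 0  # the "internal switch": which bank fetches come from

    def load(self, bank, microcode):
        """Slow path: copy a microprogram into one bank of the store."""
        if not 0 <= bank < self.banks:
            raise ValueError("no such bank")
        if len(microcode) > self.words_per_bank:
            raise ValueError("microcode does not fit in one bank")
        base = bank * self.words_per_bank
        self.ram[base:base + len(microcode)] = microcode

    def select(self, bank):
        """Fast path: flip the switch; nothing is copied."""
        if not 0 <= bank < self.banks:
            raise ValueError("no such bank")
        self.bank = bank

    def fetch(self, address):
        """Fetch a microinstruction word relative to the selected bank."""
        if not 0 <= address < self.words_per_bank:
            raise IndexError("microaddress out of range")
        return self.ram[self.bank * self.words_per_bank + address]


store = Microstore(banks=2, words_per_bank=4)
store.load(0, [0x10, 0x11, 0x12, 0x13])  # microcode for task A (made-up words)
store.load(1, [0x20, 0x21])              # microcode for task B (made-up words)

store.select(1)            # "context switch" to task B
word_b = store.fetch(0)
store.select(0)            # back to task A
word_a = store.fetch(0)
```

The cost mentioned above shows up in the constructor: every additional task that wants its own instruction set needs another `words_per_bank` words of fast RAM.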
Mountain View Press: The Forth Source
http://www.TheForthSource.com/
still lists the "WISC CPU/16" in their catalog:
- a 16-bit stack machine with a user writable instruction set.
- Design uses a PC/AT as an I/O server.
"Writable instruction set, stack oriented computers:
The WISC Concept"
article by Philip Koopman Jr. 1987
http://www.ece.cmu.edu/~koopman/forth/rochester_87.pdf
has nearly enough details to build one.
"The CPU/16 Writable Instruction Set Machine (WISC)"
http://www.ece.cmu.edu/~koopman/stack_computers/sec4_2.html
"MVP Microcoded CPU/16 Architecture"
http://www.ece.cmu.edu/~koopman/forth/rochester_86b.pdf
describe a machine that was built (apparently in 1986, perhaps earlier) out of
fewer than 100 chips:
4 static RAMs (data stack, return stack, program and data RAM, and micro-program memory),
with the rest being 74LS00-series MSI components.
DavidCary thinks it looks very similar to the TTL design in
ComputationStructures ISBN 0262231395
(identical ALU design, similar microcode),
with this major added feature:
the CPU/16 could be halted by the host,
and all the RAMs (including the micro-program memory!)
and most of the registers
examined and modified by the host,
then single-stepped or run freely (~3 MHz, which is pretty fast for 74LS00 discrete-logic CPUs).
I used to think writable instruction sets were really, really cool.
However, now I see other solutions to the problems they solve:
- I can't do exactly what I want to do in the time of a single machine-language instruction -->
  - ... and I process lots of data. Solutions: WISC or instruction cache or ???. With an instruction cache, the bottleneck is typically getting data in and out of the CPU. Even if you could add a specialized instruction to do something in a single CPU cycle (rather than a few cycles of subroutine), it wouldn't run any faster than an in-cache subroutine -- both are still waiting on RAM for the data. Both WISC and instruction caches slow down context-switches.
  - ... and I do a lot of processing per byte of data. Solutions: WISC or DSP or FieldProgrammableGateArray or ???.
- I can't do exactly what I want to do in the space of a single machine-language instruction -->
  - Solutions: WISC or "call subroutine" instruction or ???. Once the subroutine has been written, the space required to call it from as many places as you want is a single "call subroutine" instruction per call site.
- The instruction set is ugly -- I hate writing so many instructions to do a simple "call".
  - Solutions: WISC or an interpreted language like ForthLanguage (that interprets a short "call" instruction into the appropriate sequence at run time) or higher-level languages (that expand a short "call" instruction into the appropriate sequence at compile time) or ???.
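The space argument for "call subroutine" can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the one-word call, and the one-word return are assumptions, not figures from any particular machine.

```python
def inline_size(body_words, call_sites):
    """Code size when the sequence is repeated at every place it is used."""
    return body_words * call_sites


def subroutine_size(body_words, call_sites, call_words=1, return_words=1):
    """Code size with one shared copy plus a call at every place it is used.

    call_words and return_words are assumed costs; some stack machines
    fold the return into another instruction, making return_words zero.
    """
    return body_words + return_words + call_sites * call_words


# A 5-word sequence used in 10 places (made-up numbers).
inlined = inline_size(5, 10)      # 50 words
shared = subroutine_size(5, 10)   # 5 + 1 + 10 = 16 words
```

Each additional call site costs `call_words` in the shared version, the same space a custom WISC instruction would occupy at that site under these assumptions.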
If you start with WISC then TurnAllTheKnobsToSeven, you get FpgaCpus.
Maybe "turn all the knobs to 10" to get MolecularNanoTechnology.
-- DavidCary
(page created 2004-12-07)