Chump 0.0.10 review



License: LGPL (GNU Lesser General Public License)
File size: 156K
Developer: Charlie Brej

Chump is a language I created to describe line assemblers and disassemblers in one description.

It is used in KMD to describe different processors. The reason for using it rather than GNU binutils is that implementing a new architecture is fast. It is basically made for people who want to design instruction sets.

Requirements:
GLib - Provides many useful data types, macros, type conversions, string utilities and a lexical scanner.
GDK - A wrapper for low-level windowing functions.
GTK - An advanced widget set.
BFD - the Binary File Descriptor Library. (BFD comes with GCC)

What's New in This Release:
full expression parsing including hex/dec/oct/ascii/symbols
fix of recursive calls bug
Long int allows assembling for architectures with a greater int size than the machine it is running on.
ARM16 (Thumb) architecture added

Example code:

The system comes with a sample file, sample.chump, which has descriptions of the ARM32, MIPS32 and STUMP16 architectures. I am working on 6809 as well.
It took me about 3 days to write the ARM one, about 1 day for MIPS and 1 hour for STUMP.

Below is a description of STUMP (a small 16-bit RISC) written in Chump.

(isa "STUMP16" ; STUMP is a simple 16bit processor (C) Andrew Bardsley
; Use this description to learn chump
; This is not a good tutorial but you can get the basics
; Firstly the basics:
; The following is the correct syntax to describe a translation
; (("Disassembled description")(Assembled description))
; The disassembled description is simply a string or a set of strings
; The assembled description is a set of bits: I (always on), O (always off),
; X (don't care, but set as on), Z (don't care, but set as off)
; Be careful, I and O are LETTERS. The parser will complain if it doesn't understand.
; e.g. 1 : (("R3")(OII))           - matches 011 to "R3"
;                                    and "R3" to 011
; e.g. 2 : (("BR")(OZX))           - matches 000, 001, 010 or 011 to "BR"
;                                    and "BR" to 001
; e.g. 3 : (("PC")(III))
;          (("R7")(III))           - matches 111 to "PC", as it's first in the list,
;                                    and "PC" or "R7" to 111
; e.g. 4 : (define "set" (("S")(I))    defines a rule called "set"
;                        (("") (O)) )  this rule can now be used in all rules below
;          (("ADD" set) (OI set))      we can now use the predefined rule in another rule
;                                      remember to place the rule in both the binary and ascii sections
; e.g. 5 : (define "imm" (int 4 + 4))    "imm" is defined to be a 4-bit hex number. When DISASSEMBLING, 4 is added
; e.g. 6 : (define "imm" (relative 4))   "imm" is defined to be a 4-bit relative number, offset from the current position
; e.g. 7 : (("#" ("imm" (int 4))) (imm)) the imm rule is defined in the rule. It's only valid in this rule
;                                        and the previous definition is ignored in this rule
; Take a look at the STUMP instruction set
; Instruction types
;         1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
;         5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
;         --------------------------------
; Type 1    OP |0|S| DST |SRCA |SRCB |SHIFT
; Cond Br 1|1|1|1| COND  |    OFFSET

(define "reg" (("R0")(OOO)) (("R1")(OOI)) ; Firstly we 'define' a description of all the registers
              (("R2")(OIO)) (("R3")(OII)) ; R0 - 000, R1 - 001 ... PC - 111, R7 - 111
              (("R4")(IOO)) (("R5")(IOI)) ; 111 is overloaded. When disassembling it will choose the first
              (("R6")(IIO)) (("PC")(III)) ; one ("PC") but when assembling either is acceptable
              (("R7")(III)))
(define "dst" reg) ; DST is a register
(define "srca" reg) ; so are SrcA and SrcB
(define "srcb" reg)
(define "op" (("ADD")(OOO)) (("ADC")(OOI)) ; These are the 6 OP codes
             (("SUB")(OIO)) (("SBC")(OII))
             (("AND")(IOO)) (("OR") (IOI)))
(define "set" (("S")(I)) ; If set bit is set then add an S onto the opcode
              (("") (O))) ; e.g. ADD -> ADDS

(define "shift" (("") (OO)) ; Shift types
                ((", ASR") (OI))
                ((", ROR") (IO))
                ((", RRC") (II)))

(define "cond" (("") (OOOO)) ; Branch conditions
               (("AL") (OOOO))
               (("NV") (OOOI))
               (("HI") (OOIO))
               (("LS") (OOII))
               (("CC") (OIOO))
               (("CS") (OIOI))
               (("NE") (OIIO))
               (("EQ") (OIII))
               (("VC") (IOOO))
               (("VS") (IOOI))
               (("PL") (IOIO))
               (("MI") (IOII))
               (("GE") (IIOO))
               (("LT") (IIOI))
               (("GT") (IIIO))
               (("LE") (IIII)))

(define "dir" (("LD")(O)) ; The difference between an ST and an LD is in the S bit
              (("ST")(I)))

(("NOP") (OZZ Z O OOO ZZZ ZZZ ZZ)) ; These are the descriptions of the instructions
(("NOP") (IOZ Z O OOO ZZZ ZZZ ZZ)) ; These two NOP descriptions overlap other instructions

(("CMP" "tf10" srca ", " srcb shift ) ; e.g. CMP R3, R6, ASR
 (OIO O I OOO srca srcb shift))

(("CMP" "tf10" srca ", " ("imm" (int 5)) ) ; e.g. CMP R4, 12
 (OIO I I OOO srca imm)) ; notice the inline definition of "imm"

(("MOV" set "tf10" dst ", " ("imm" (int 5))) ; e.g. MOVS R3, 12
 (OOOO set dst OOO imm))

(("MOV" set "tf10" dst ", " ( "src" ((reg)(OOO reg)) ; e.g. MOV R3, R5
                                    ((reg)(reg OOO))) shift) ; note inline definition can also be translations
 (OOOO set dst src shift))

((op set "tf10" dst ", " srca ", " srcb shift) ; e.g. ADD R4, R7, R2
 (op O set dst srca srcb shift))

((op set "tf10" dst ", " srca ", " ("imm" (int 5)) ) ; e.g. SUBS R6, R2, C
 (op I set dst srca imm))

(("B" cond "tf10" ("offset" (relative 8 ))) ; e.g. BNE 100
 (IIII cond offset))

((dir "tf10" dst ", [" srca ", " srcb shift "]") ; e.g. LD r4, [r3,r0]
 (IIO O dir dst srca srcb shift))

((dir "tf10" dst ", [" srca ", " ("imm" (int 5)) "]") ; e.g. LD r4, [r3,12]
 (IIO I dir dst srca imm))
) ; end of the STUMP16 isa description
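
To make the I/O/X/Z rules from the comments in the description more concrete, here is a minimal Python sketch of how one translation table could behave in both directions. This is only an illustration and not how Chump itself is implemented: the function names (matches, assemble_bits, disassemble, assemble_text) and the list-of-pairs table layout are made up for this example, and it handles only fixed bit patterns, not nested rules, int or relative fields.

```python
# Illustrative sketch only: hypothetical helpers, not part of Chump.
# A pattern is a string of letters: I (always 1), O (always 0),
# X (don't care, assembled as 1), Z (don't care, assembled as 0).

def matches(pattern, bits):
    """Return True if the bit string fits the pattern when disassembling."""
    if len(pattern) != len(bits):
        return False
    for p, b in zip(pattern, bits):
        if p not in "IOXZ":
            raise ValueError("unknown pattern letter: " + p)
        if p == "I" and b != "1":
            return False
        if p == "O" and b != "0":
            return False
        # X and Z accept either bit value
    return True

def assemble_bits(pattern):
    """Bits emitted when assembling: I and X give 1, O and Z give 0."""
    return "".join("1" if p in "IX" else "0" for p in pattern)

def disassemble(rules, bits):
    """Return the text of the FIRST rule whose pattern matches, else None."""
    for text, pattern in rules:
        if matches(pattern, bits):
            return text
    return None

def assemble_text(rules, text):
    """Return the bits for the first rule with this text, else None."""
    for name, pattern in rules:
        if name == text:
            return assemble_bits(pattern)
    return None

# Tables taken from e.g. 1 to e.g. 3 in the comments of the description.
REGS = [("R3", "OII"), ("PC", "III"), ("R7", "III")]
BRANCH = [("BR", "OZX")]

if __name__ == "__main__":
    print(disassemble(REGS, "111"))     # first match wins
    print(assemble_text(REGS, "R7"))    # either name assembles
    print(assemble_text(BRANCH, "BR"))  # don't-care bits get their default
```

Because disassemble returns the first matching rule, the order of entries in a table decides which name is shown for an overloaded encoding, which is the behaviour the description relies on for PC and R7.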
