Using The Assembler Language 03

By Roy Warner

Originally published in EUG #05

In the first article, a safe area for the code was found by lowering HIMEM, the Byte Array provided by Acorn being deemed unsuitable. I believe Acorn use the Byte Array because of difficulty in explaining the pros and cons of alternative methods.

There is only one way of providing a totally safe area within the Acorn protocol, and that is by a Paged Service call. Directly after BREAK the Operating System checks each ROM in turn (pages them); code within the ROMs responds to the paged service call and claims the amount of memory space required. A ROM may not need additional workspace, may share workspace, or may claim private workspace and not use it. That space can then be "illegally" used by the programmer for machine code. This is impossible unless Sideways RAM is present and the programmer can write a ROM image containing the correct code. DFS and ADFS both claim private workspace; that is why PAGE is raised, and it is also the reason that the filing system is corrupted when ordinary code is inadvertently placed in the filing system's private workspace.

Other methods have risks: Memory allocated to users for programs is bounded by two variables namely PAGE and HIMEM, PAGE being the start, HIMEM the end. PAGE is set by the Operating System which pages the ROMs, learns the memory requirement and sets PAGE to suit.

HIMEM is also set by the Operating System and is based on the amount of memory needed to produce the screen presentation. In between PAGE and HIMEM sit TOP and LOMEM, both of which are variables. TOP is the end of a BASIC program, LOMEM is the start of the area where BASIC stores records of variables used in a program. Normally TOP and LOMEM reside at the same address but they can be separated. The BASIC stack, where BASIC stores the variables used in programs, begins under HIMEM and works down towards the upper end of the table of records at LOMEM. When the clear space between the two is used up, the Operating System reports BAD MEMORY.

On each press of BREAK or at start-up, the machine "resets", that is, the Operating System pages the ROMs, declares MODE 6 as the screen mode and determines PAGE, TOP, LOMEM and HIMEM. Change Mode and the Operating System moves the variable values out of the way and relocates the screen memory. Add a line to a BASIC program and TOP changes. Press BREAK and PAGE goes back to the system defined address. Hence the Byte Array.

Byte Arrays are not without their problems. If too big, a lot of memory is wasted; if too small, then "Undefined Results" are back with a vengeance. The Operating System decides where the Byte Array will reside and often changes its mind!

Machine code will not accept being moved around: the addresses of the labels cannot alter, so the source code must be reassembled each time. This destroys much of the memory-saving advantage of Assembler.

Lowering HIMEM is the safest option. It should be remembered that if Mode changes are intended, the machine code must finish at an address lower than the lowest required for screen memory and that a Mode change must be carried out from within machine code without alteration of HIMEM. For the time being, lowering HIMEM will be used.

Type in and save the following program:

      190FOR pass=0 TO 3 STEP 3
      200P%=base:[OPT pass
      330EQUS"my string"
      650CALLgo                                      Fig 1

Run it and it should produce the following assembler listing:

      5001          OPT pass
      5001          .go
      5001 A2 00    LDX#0
      5003          .loop
      5003 BD 0F 50 LDAdata,X
      5006 20 EE FF JSR&FFEE
      5009 E8       INX
      500A C9 00    CMP#0
      500C D0 F5    BNEloop
      500E 60       RTS
      500F          .data
      500F 6D 79 20
           73 74 72
           69 6E 67 EQUS"my string"
      5018 0D       EQUB13
      5019 0A       EQUB10
      501A 00       EQUB0

This shows the addresses of the machine code instructions and appended data. In the left hand column are the addresses in memory where the code is stored. The machine code instructions are in the second column. The third and fourth columns are data. This pattern changes at the label "data" at address 500F. The program stops at 500E and returns to BASIC. The remaining digits are all data in hex ASCII code. Machine code instructions are called "opcodes" and the data "operands". The instructions are changed to opcodes by a routine known as HASHING, which is simple in theory but very complicated in practice.
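The data bytes can be checked away from the Electron. The following is a small sketch in Python (not part of the original program) which assumes only what the listing shows: the string "my string" followed by the bytes 13, 10 and 0, starting at the label "data" at 500F. The names "text", "data" and "base" are chosen here for illustration.

```python
# Build the bytes that follow the label "data" in the assembler listing.
text = "my string"
# EQUS pokes the ASCII codes; EQUB13, EQUB10 and EQUB0 add three more bytes.
data = text.encode("ascii") + bytes([13, 10, 0])

base = 0x500F  # address of the label "data", taken from the listing

# Print one line per byte: address, hex value, and the character if printable.
for offset, value in enumerate(data):
    char = chr(value) if 32 <= value < 127 else "."
    print(f"{base + offset:04X} {value:02X} {char}")
```

The last address printed is 501A, which matches the final EQUB0 in the listing.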

Hashing is the means of converting a string of ASCII codes to an index. Take LDA, the ASCII code for "L" is 76, double it and add it to itself, then EOR 0 with the product. Add the ASCII code for "D", double the result and EOR the previous result with the product and finally do the same thing with the "A". If this were done in BASIC, any number might result, but in machine code only a number between 0 and 255 can result, due to the 'wrap round' effect (a byte can hold only 256 different values). An index has been created, but it may not be unique so a "collision" is looked for, by checking the index to see if that number has been used before. If it has, then the index is decremented by one and hashed again.

A string and a value may be stored at TABLE+(INDEX x RECORD length in bytes) where TABLE is the address at the beginning of the storage area and RECORD is the Assembler instruction and its opcode. To find the value use the same hash routine to produce the index, hash it again, go to the index and check the string. Normally only two tries are required.

The index is restricted to a maximum of 256 entries, but for peak efficiency of indexing and searching it should only be 80% used. 6502 machine code contains fewer than 256 opcodes. The assembler reads the mnemonic, hashes it and, when it finds a match, pokes the value stored at the match into the correct program address.
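The idea of hashing a mnemonic to a one-byte index and stepping past collisions can be sketched as follows. This is an illustration in Python only and makes no claim to be the routine Acorn's assembler uses: the function names "hash8", "insert" and "lookup" are invented here, the hash is one loose reading of the description above, and a collision is handled simply by stepping the index down by one. The opcodes are the ones shown in the listing.

```python
TABLE_SIZE = 256  # one slot for every value a single byte can hold


def hash8(text):
    """Fold the ASCII codes of text into one byte (0 to 255)."""
    h = 0
    for ch in text:
        # Add the character code, double the sum, then EOR with the
        # previous value. "& 0xFF" mimics the wrap round of a byte.
        doubled = ((h + ord(ch)) * 2) & 0xFF
        h = h ^ doubled
    return h


def insert(table, mnemonic, opcode):
    """Store (mnemonic, opcode), stepping down one slot on a collision."""
    index = hash8(mnemonic)
    for _ in range(TABLE_SIZE):
        slot = table[index]
        if slot is None or slot[0] == mnemonic:
            table[index] = (mnemonic, opcode)
            return index
        index = (index - 1) & 0xFF  # collision: try the slot below
    raise MemoryError("table full")


def lookup(table, mnemonic):
    """Return the stored opcode, or None if the mnemonic is not present."""
    index = hash8(mnemonic)
    for _ in range(TABLE_SIZE):
        slot = table[index]
        if slot is None:
            return None  # an empty slot ends the search
        if slot[0] == mnemonic:
            return slot[1]
        index = (index - 1) & 0xFF
    return None


table = [None] * TABLE_SIZE
# Opcodes as they appear in the listing (LDX here is the immediate form).
for name, code in [("LDX", 0xA2), ("INX", 0xE8), ("BNE", 0xD0), ("RTS", 0x60)]:
    insert(table, name, code)

print(hex(lookup(table, "BNE")))  # 0xd0
```

With only a few entries in 256 slots a search rarely needs more than one or two tries, which is the point of keeping the table well below full.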

The screen display shows that "OPT pass" and the label "go" are at 5001. Stored in the address 5001 is the opcode for "LDX", which is "A2", followed by 0. The microprocessor has been told to put zero in register "X". At 5003 is an opcode for LDA with "data" indexed to "X", "BD" then 0F 50. The 6502 stores the high byte last, so turn it round to 500F, and at 500F is the label "data".
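Turning the two operand bytes round can be shown in a few lines. This is a sketch in Python using the three bytes at 5003 from the listing; the variable names are chosen here for illustration.

```python
# The three bytes stored at 5003 in the listing: LDA data,X
instruction = bytes([0xBD, 0x0F, 0x50])

opcode = instruction[0]     # BD, the opcode for LDA absolute,X
low_byte = instruction[1]   # 0F, stored first
high_byte = instruction[2]  # 50, stored last

# The 6502 keeps the low byte first, so the address is high*256 + low.
address = (high_byte << 8) | low_byte
print(hex(address))  # 0x500f
```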

At 500C is "D0" the opcode for BNE but look at the operand "F5". The label loop is at 5003!

The decimal value of hex F5 is 245. The reason is that the computer works in two's complement binary. Up to 127 is positive, 128 to 255 is negative. This is controlled by bit seven. If it is clear (that is zero), the number is positive; if set (that is 1), the number is negative. F5 hex is negative. To find the decimal minus quantity represented by F5 or any other negative hex byte, deduct it from hex 100. Type PRINT &100-&F5 and the number should be 11. "F5" is minus 11. Still looks wrong! The difference between 5003 and 500C is 9, but by the time the "F5" stored in 500D is in the micro-processor, the data in 500E is on the address bus. "E" hex is 14 decimal. 14 minus 11 equals 3. This is known as "relative addressing".
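The same arithmetic can be sketched in Python. The function names "to_signed" and "branch_operand" are invented for this illustration; the addresses are taken from the listing. The sketch assumes, as described above, that the offset is counted from the address following the two-byte branch instruction.

```python
def to_signed(byte):
    """Read a byte as two's complement: bit seven set means negative."""
    return byte - 0x100 if byte & 0x80 else byte


def branch_operand(branch_addr, target):
    """Work out the operand byte for a relative branch stored at branch_addr."""
    next_instr = branch_addr + 2  # the branch occupies two bytes
    offset = target - next_instr
    if not -128 <= offset <= 127:
        raise ValueError("branch out of range")
    return offset & 0xFF  # wrap the signed offset into one byte


print(to_signed(0xF5))                      # -11
print(hex(branch_operand(0x500C, 0x5003)))  # 0xf5
print(to_signed(0xFF))                      # -1
```

Counting back 11 from 500E gives 5003, the address of the label "loop".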

Fortunately, the Electron does all the calculations, and gets it right every time. Some calls to the Operating System require negative numbers, so remember &FF is -1. Make a note that &100-&FF is 1, so the hex for -1 is &FF.

Generally in the assembler listing the first hex number is the opcode, followed by up to two bytes of data. At 5009 there isn't any data: INX uses "implicit addressing", so data is not required. The bytes from 500F, being data poked by EQUS, do not have opcodes, but we have found that the micro-processor will accept the first column as opcodes if allowed to. The EQUB statement pokes numbers less than 256 into the addresses.

Next time, the BASIC listing will be put into procedures. Please don't give up! If you are totally lost, please write in with your questions and we'll try to answer them.

Roy Warner, EUG #5