HiSoft BASIC

For ZX Spectrum, ZX Spectrum +, ZX Spectrum 128 & ZX Spectrum Plus 2

 

A Fast Floating-Point ZX BASIC Compiler

 

 

 

First Edition October 1986                           © Copyright HiSoft 1986

 

Please buy, don’t steal

 

HiSoft The Old School  Greenfield  Bedford  MK45 5DE Tel (0525) 718181

 

HiSoft BASIC was written by Cameron Hayne

 

 

All Rights Reserved Worldwide. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

 

It is an infringement of the copyright pertaining to HiSoft BASIC and its associated documentation to copy, by any means whatsoever, any part of HiSoft BASIC for any reason other than for the purpose of making one security back-up copy of the object code.

 

Contents

 

Introduction

How to Use HiSoft BASIC

HiSoft BASIC Commands

Summary of differences from Spectrum BASIC

Variables

Numerical Constants

Conversion between Types

Compiler Directives

Notes on compiled BASIC

Including other machine code

Compiling large programs

Tips on Efficiency

What if it doesn't work?

Error Messages

The meaning of the dots and colours

Making a back-up copy

Memory Maps

Runtimes

Appendix 1 - Spectrum 128 version


Introduction

 

HiSoft BASIC is a BASIC compiler that surpasses all others for the Spectrum. There are integer compilers that can make BASIC programs run more than 100 times faster but they only handle integers (no decimals, only whole numbers from -32768 to 32767 or from 0 to 65535) and often have other restrictions. There are floating-point compilers that handle the full range of decimal numbers and all of the Spectrum's functions but (in spite of advertised claims) they speed up programs by only a factor of 3 to 5.

 

HiSoft BASIC combines the advantages of these two types of compilers without any of the disadvantages. It is a floating-point compiler that can obtain the speed of an integer compiler when doing operations that don't require the complexities of floating-point arithmetic. In fact, HiSoft BASIC is simultaneously the fastest integer compiler and the fastest floating-point compiler available for the Spectrums.

 

HiSoft BASIC can compile almost all of the Spectrum's BASIC into fast machine code. Unlike some floating-point compilers, it can handle user-defined functions and two-dimensional numeric and string arrays. Most other compilers have a block of routines about 5K in length (called runtimes) that must be present for the compiled code to work. This means that even the shortest BASIC program compiles to more than 5K. HiSoft BASIC includes only the runtime routines that are actually necessary for your code so that a short BASIC program may compile into only a few hundred bytes. Also, unlike other compilers, HiSoft BASIC allows you to put the compiled code anywhere in RAM you want, even in locations normally occupied by the compiler itself!

 

Spectrum 128 & Plus 2 Owners - Please Note

Spectrum 128 and Spectrum Plus 2 owners should read Appendix 1 before using HiSoft BASIC; this describes the extra features available for these machines.

 

 

HiSoft BASIC is only about 11K in length so it loads quickly. It can compile BASIC programs up to about 30K in length without requiring microdrives or cumbersome tape swapping. Another distinguishing feature of HiSoft BASIC is that it provides full information on the code that it produces so that it is easy to interface the compiled code with a co-resident BASIC program. Or, if you're interested in machine code, you could use this information to learn how to use the ROM routines.

 

Finally, HiSoft BASIC, unlike many compilers, does not blindly follow a recipe in converting your BASIC to machine code. Instead, it watches for simple cases (e.g. operations with powers of 2, constant array indices, etc.) which it can compile into especially efficient code.

 

HiSoft BASIC is very easy to use but we recommend you read through this manual before starting any serious compiling.

 

 

Try this First

 

The instructions for using HiSoft BASIC follow this introductory section but instead of leaving you to read them and figure out things for yourself, we'll show you the ropes with a few example programs.

 

First load HiSoft BASIC by putting the tape with the words HiSoft BASIC uppermost in your tape recorder, typing:

 

LOAD "" [ENTER]

 

and pressing PLAY on your tape player.

 

When it is finished loading you'll see the copyright notice at the top of the screen. Then load in the first example program by typing:

 

 LOAD "EXAMPLE 1" [ENTER]

 

LIST it and then RUN it to test it out and make sure that it works. This is a vital step before attempting to compile any program! As you might not always want to compile all parts of your BASIC program, it is necessary to tell HiSoft BASIC where to start and where to stop compiling. As with all instructions to HiSoft BASIC (called compiler directives) this is done via a REM statement. The start-compiling instruction is:

 REM : OPEN #     (do this now by making this instruction line 1 of the example)

 

The stop-compiling instruction is:

 REM : CLOSE #

 

but this is optional here since we want to compile right to the end of the BASIC.
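As a rough sketch of how the two directives bracket a program, a listing might look like the one below. This is a made-up stand-in and not the listing of EXAMPLE 1 on the tape; only the two REM directives and the variable names N1 and N2 are taken from this manual.

    1 REM : OPEN #
   10 INPUT "First number? ";N1
   20 INPUT "Second number? ";N2
   30 PRINT N1;" + ";N2;" = ";N1+N2
   40 REM : CLOSE #

Everything between line 1 and line 40 would be compiled. Leaving out line 40 should make no difference in a listing like this one, because it is the last line of the program.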

Now type

 

*C

 

and compiling will start.

 

 

Users of Spectrum 128 and Spectrum Plus 2 computers should note that HiSoft BASIC commands are invoked in a totally different manner. Rather than typing * followed by a command letter, you should press the [TRUE VIDEO] and [INV VIDEO] keys simultaneously. This will produce a menu of command options on the screen from which any of the compiler commands may be selected.

 

During compilation, HiSoft BASIC will pause twice, showing some information at the bottom of the screen. You'll have to press a key to continue (don't worry about the information - you'll not need it now). The border will change colour (magenta-cyan-white) and strange dots and colours will appear on the screen. We'll explain later what all this is; for now we just need the information that will appear after the second key press. For the first example program, this should indicate that the compiled code (machine code) is 357 bytes long and that 10 bytes must be reserved for machine-code variables (for the sake of comparison the number of bytes taken up by the BASIC program without variables is also given). The most vital information is in the two lines that tell you how to save and load the compiled code. The address in the LOAD line is the address to be used after RANDOMIZE USR when you want to execute the compiled code. E.g. if the code is to be loaded to address 65001 then RANDOMIZE USR 65001 will execute the compiled code. But while HiSoft BASIC is resident there's an easier way: *R will execute the compiled code. Spectrum 128 and Spectrum Plus 2 owners: remember that commands are invoked using [TRUE VIDEO] and [INV VIDEO] on your machines!

 

You can test out the machine code now if you like. By the way, don't be alarmed at the fact that this very small program seems to require so many bytes  in machine code. Most of the bytes are taken up by the runtimes - subroutines that are included as needed but that will be re-used by other parts of a larger program. Thus the ratio of bytes used for machine code to those in the BASIC will decrease as the size of program increases.

 

Your BASIC program is still there after compilation and can be modified and re-compiled. Without changing anything, try compiling it a second time (with *C) just to see what happens. All the information on the final screen will be the same except for the address where the compiled code is located. Each time you compile a BASIC program, the compiled code is placed at what the Spectrum considers to be the top of your memory space (i.e. just below RAMTOP) and the RAMTOP is changed to be just before the newly-compiled code. To reclaim that memory (by resetting RAMTOP to its original value) type *X.

 

We want to use the first example program to illustrate that the variables used by BASIC and the variables used by the compiled code are totally distinct.

 

Re-compile the program (with *C). Now RUN the BASIC version and then, as a direct command, execute PRINT N1,N2. Now execute the machine code version (with *R), this time responding with different numbers than those you used for the BASIC version. Now re-execute the direct command PRINT N1,N2. The BASIC variables are still as they were. The machine code variables are local to the compiled code.

 

Before we leave this example, we must point out that the INPUT command is one that behaves slightly differently in the compiled code than it does in BASIC. The difference is in its response to errors. In BASIC, an error in input returns control to the editor with an error message. This would be inconvenient in machine code, so in the compiled code INPUT commands are error-trapped so that any error results in a restart of the INPUT. Test this out for yourself with the compiled code.

 

Now type:

 

LOAD "EXAMPLE 2" [ENTER]

 

and LIST it once it has loaded. Put in a new line:

 

 9 REM : OPEN #

 

and RUN it to make sure it works, then compile (with *C). The thing we want to bring to your attention now is the number of bytes taken up by machine code variables. The total is 277; this is 15 bytes for the FOR/NEXT variable I, 5 bytes for L, and 257 bytes for N$. The 257 is made up of 2 bytes for the length of N$ and 255 bytes to hold the actual characters. Since no name is ever going to be that long, it seems wasteful to reserve that much space for it. By using the REM : LEN directive we can tell HiSoft BASIC how much space to reserve for a string variable. In this case suppose we decide that we are safe in assuming that no name could possibly ever be longer than 50 characters, then we can tell HiSoft BASIC this by inserting a new line:

 

 8 REM : LEN N$ <=50

 

(on the 48K Spectrum the <= is the single character obtained by [SYMBOL SHIFT]-Q). Do this now and re-compile. You will see the number of bytes for machine code variables is now only 72 (52 bytes reserved for N$). It may seem that 50 is still too long, but it's better to err on the long side - too little space can be fatal. Incidentally, all this is necessary because we've chosen to opt for efficiency over convenience. In BASIC, when you assign to a string variable, the old copy is destroyed, all the other variables are shuffled down, and the new string is inserted at the end of the variables list. But this takes time! In the compiled code from HiSoft BASIC, all variables including string variables are at fixed locations, which gives a great improvement in speed. Note that for DiMensioned string variables, HiSoft BASIC can tell from the DIM statement how much space to reserve and so the REM : LEN directive is not necessary.
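The two totals quoted for EXAMPLE 2 can be checked with a few lines of BASIC. This is only a worked sum using the sizes given in the text (15 bytes for the FOR/NEXT variable, 5 for L, 2 length bytes plus the reserved characters for N$); the variable names are invented for the illustration.

   10 LET loopvar=15: LET number=5: LET lenbytes=2
   20 PRINT "Default:  ";loopvar+number+lenbytes+255
   30 PRINT "LEN <=50: ";loopvar+number+lenbytes+50

Line 20 gives 277 and line 30 gives 72, so each character trimmed from the reserved length of a string saves one byte of variable space.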

 

The first two example programs served to illustrate some essential points about using HiSoft BASIC but they weren't very interesting as programs and they certainly didn't show any perceptible increase in speed; and speed after all is what you're here for! So type:

 

 LOAD "EXAMPLE 3" [ENTER]

 

and we'll start to explore the true capabilities of HiSoft BASIC. RUN the program as usual, to make sure it works (we emphasise that this is an essential step before attempting to compile any program). If you LIST the program you will find that we've already included the REM : OPEN # directive at the beginning so we're ready to compile. Type *C and watch. You will find that you get line 290 at the top of your screen with a flashing ? and the message Not supported at the bottom. What is not supported is the tape command SAVE. None of the operating-system commands are supported by HiSoft BASIC because they are usually more appropriately left in BASIC. This is where the directive REM : CLOSE # is useful. Insert a new line:

 

 271 REM : CLOSE #

 

and then re-compile. It will work this time. Try out the newly-compiled machine code and you will see the spiral drawn more than 3 times faster. A few asides on the program: note, in lines 20-50, that the values of the SIN and COS functions are computed only once and then assigned to variables for future use. As these functions (along with TAN, ASN, ACS, ATN, EXP, LN, SQR) are very slow, this is a smart thing to do whenever possible. Note also the CLS in line 15. This is redundant in BASIC since a CLS is done automatically when we RUN the program, but it is needed for the compiled code.

 

But what about the line 290 that was left out of the compilation? Since we now have a machine code version of the program, what we want is a BASIC loader program that looks like this:

10 CLEAR wwwww

20 LOAD "spiral" CODE xxxxx : RANDOMIZE USR xxxxx

30 STOP

40 SAVE "spiral" CODE xxxxx, yyy

 

where the xxxxx and yyy are the numbers given by HiSoft BASIC and wwwww is below xxxxx.
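For illustration only, here is the same loader with the placeholders filled in. The figures are invented: they assume that HiSoft BASIC reported an address of 65001 and a length of 357 bytes, which will not be the real figures for the spiral program. Always use the numbers from your own final compilation screen.

    5 REM example figures only
   10 CLEAR 65000
   20 LOAD "spiral" CODE 65001 : RANDOMIZE USR 65001
   30 STOP
   40 SAVE "spiral" CODE 65001, 357

Here the CLEAR address is one below the code address, which satisfies the rule that wwwww must be below xxxxx.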

Now type:

 LOAD "EXAMPLE 4" [ENTER]

 

and LIST it. You will see that it is the same as EXAMPLE 3 but with additional lines at the end. We've already put in the line 271 REM : CLOSE #. If you compile it as it is now, the compiled code would be precisely the same as that from EXAMPLE 3. The BASIC lines after 271 would simply be ignored (although they do figure in the number given by HiSoft BASIC for the bytes taken up by BASIC). What line 1000 does is to POKE the picture on the screen into storage at memory address 45000; line 2000 recalls it from memory onto the screen.

 

Type *X and RUN the program. When the drawing is completed, execute GOTO 1000. After the STOP message (it will take a few minutes) execute CLS and then GOTO 2000. After another few minutes, the spiral will have reappeared on the screen, but since it takes several times longer to recall the spiral from memory than it would take simply to re-draw it, this seems pointless! But the compiled version will be faster! What we want in the compiled version is to have three separate entry points to the machine code: the first to draw the spiral and the second and third to store and recall it from memory. We already have the first entry point at REM : OPEN # and the return to BASIC occurs at the REM : CLOSE #.

 

We want additional entry points at lines 1000 and 2000, so we insert new lines:

 

 999 REM : OPEN #                                 and

1999 REM : OPEN #                                (do this now)

 

The returns to BASIC from these sections of code will be from their STOP statements. There is no need for any additional REM : CLOSE # because we want to compile right to the end of the BASIC. Now we're ready to compile so type *C. This time you will be required repeatedly to press a key as information about the various entry points comes up on the bottom of the screen.

 

We will now explain what these numbers mean. During the first pass (with a cyan border) you will be told the relative addresses of the various entry points - i.e. relative to the start of the compiled code. During the second pass (with a white border) you will be told the execution addresses (in both decimal and hexadecimal) of the various entry points. Make a note of these for use later. Note also (from the final screen) the number of bytes taken up by the compiled code. Remember that if you miss some information during compilation, you can always re-compile (after *X if desired). Now try out the compiled code by executing the first part to draw the spiral, then the second part to store it in memory, then CLS and finally execute the third part of the compiled code to recall the spiral screen. Note that *R works only for the first entry point. You should find that now it takes about the same time to recall the spiral from memory as it would to re-draw it.

 

So our store and recall routine still doesn't seem very useful. But there's a further improvement we can make that will dramatically increase the speed. The key fact to notice is that the variables I, SOURCE, and DESTINATION of lines 1000-5040 take on only values that are positive integers (they range from 16384 to 51911). If HiSoft BASIC is informed of this fact (it's not quite smart enough to notice it for itself!), it will generate much more efficient code because it can then use the native abilities of the Z80 processor rather than relying on the ROM routines for floating-point arithmetic. The way to inform HiSoft BASIC is to use the directive REM : INT +. This directive must come before the first REM : OPEN #, so we insert a new line (do this now!):

 

 9 REM : INT + I, SOURCE, DESTINATION

 

This tells HiSoft BASIC that these variables will take on only values that are positive integers in the range from 0 to 65535. We know this to be true for lines 1000 to 5040 but before we can re-compile we must check that it is true for the whole program. There is a variable i (which to the Spectrum is the same as variable I) in lines 130, 160, 230 but we can easily see that it too takes positive integer values in the right range. So go ahead and re-compile. You will notice that the new compiled code takes fewer bytes, but the real difference is in the speed.

 

Now it takes less than 0.7 seconds to recall a screen from memory! If you time it with a stopwatch you will also find a small decrease in the time taken to draw the spiral. There is no dramatic increase in the speed of drawing the spiral because most of the time is spent in the ROM DRAW routine.
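To make the idea concrete, here is a minimal sketch of a screen-storing routine of the kind described above. It is an illustration only and is not the listing of EXAMPLE 4 on the tape: the line numbers and the loop are assumptions, and only the directives, the variable names and the address 45000 come from this manual. The Spectrum's screen occupies the 6912 bytes from 16384 to 23295.

    9 REM : INT + I, SOURCE, DESTINATION
  999 REM : OPEN #
 1000 LET SOURCE=16384: LET DESTINATION=45000
 1010 FOR I=0 TO 6911
 1020 POKE DESTINATION+I,PEEK (SOURCE+I)
 1030 NEXT I
 1040 STOP

Every value that I, SOURCE and DESTINATION can take in this sketch is a whole number between 0 and 65535, and the highest address POKEd is 45000+6911 = 51911, so the REM : INT + declaration in line 9 is safe here.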

 

We have just seen how much more efficient it is to use integer variables wherever possible. However, in this program it was relatively easy to convince ourselves that the variables I, SOURCE and DESTINATION take on only integer values. In other, more complex programs it may be more difficult to pick out the integer-valued variables. But help is at hand! Type *T. Nothing will happen right away. But now RUN the BASIC program. You should see the lower screen go BRIGHT and the spiral being drawn more slowly than usual.

 

When it's finished, type *T again and this time you will be rewarded with a list of variables. Beside each variable is the type of that variable: REAL, INTEG or POSINT (or POSINTEG - a combination of POSINT and INTEG). See below for an explanation of variable types but for now just note that variable i is listed as POSINTEG which means that its value never went out of the range of positive integers between 0 and 32767. The program STOPped at line 280 so what this is really telling us is that the variable i never goes out of that range in lines 130-230. The program has not yet explored the region of lines 1000-5040 so variables SOURCE and DESTINATION are not even listed yet. To get a full indication of the variable types we would have to execute the other two sections of the program separately, e.g. by doing the sequence:

 

*T

RUN

GOTO 1000

GOTO 2000

*T

 

If you don't find the long waiting times too irksome, you could try this now but otherwise you can just take our word for it that the variables I, SOURCE and DESTINATION would all come out listed as POSINT. All this is just confirmation of something that we realised earlier but you can see how it could be useful when applied to more complex programs. What *T does is to turn on another program that keeps watch over the values of the variables during the BASIC program's execution. The second *T turns this off and prints the results. Anything you do between the *Ts that affects the variables will be taken account of in the results. Thus, if you were to do the sequence: *T, RUN, LET i=1.3, *T, the variable i could be shown as REAL instead of POSINTEG.

 

We must caution you that *T is not an infallible guide to the types of the variables. To illustrate this, first get rid of the existing BASIC by typing *ERASE (to get ERASE, go into extended mode then press [SYMBOL SHIFT]-7). Do not use NEW as that would wipe out HiSoft BASIC as well! Now type in the following program:

 

                  10 LET A=0

                  20 IF RND > .5 THEN LET A =.5

 

Do the sequence: *T, RUN, *T and then repeat this sequence a few times. You will notice that the variable A is listed sometimes as REAL, sometimes as POSINTEG. The reason for this is clear - there is a branch in the program depending on the value of RND and if RND < .5 the watching program never sees A take a non-integral value. The lesson you should draw from this is that, in using *T, you should repeat the program enough times with different inputs to make sure that the whole of the program is explored. In this example it would probably suffice to do: *T, RUN, RUN, RUN, RUN, *T.

Now type:

 LOAD "EXAMPLE 5" [ENTER]

 

This program is ready to compile, but first LIST it and note the use of INT after the DATA command. The variables X and Y are READ from this DATA list and since they are declared to be of type POSINT by the REM : INT + directive, it is necessary that the data be stored in integer format. This is accomplished by putting an INT after the DATA. Compile and RUN this program at your leisure.

 

Now type:

 LOAD "EXAMPLE 6" [ENTER]

 

and RUN it. You will see that it is a typical example of menu programming. Try to compile it and you will get line 50 at the top of your screen and the dreaded Not supported message at the bottom. Line 50 is what is called a computed GOSUB statement. The line number is not given explicitly but must be computed at run-time, so in order to compile such statements the compiler must make a list of all the line numbers and the corresponding addresses in the compiled code. The compiled code for the GOSUB will then search through this list at run time to find the address for the line number that is needed. HiSoft BASIC has the capability to do all this, but since it results in slower and longer code, we have made it a non-standard feature that you must select by means of a compiler directive.

 

Insert a new line:

                  7 REM : GOSUB :

 

 and it will now compile correctly.

 

Note that the compiled code is 752 bytes long. If you look at the program you'll see that the variable N can be 1, 2 or 3, so the only line numbers we really need for use in line 50 are lines 100, 200 and 300. You can tell HiSoft BASIC this by changing line 7 to read

 

                7 REM : GOSUB 100, 200, 300

 

and now if you recompile you'll find the code is 724 bytes long because we're now keeping only the info for those 3 lines rather than for all the lines of the program. For this short program it's not much of a saving, but for longer programs the savings in bytes can be enormous, and having a shorter list to search through can significantly increase the execution speed. To see what happens if you omit a relevant line number, delete the 300 from line 7, recompile, and try out the compiled code, selecting option 3. Note also that if you change line 50 to GOSUB 100*N-1, although it would still work in BASIC, the compiled code wouldn't work because lines 99, 199 and 299 don't exist.
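The shape of such a menu program can be sketched as follows. This is a hypothetical listing and not EXAMPLE 6 itself: the prompts, the subroutines and the position of REM : OPEN # at line 8 are all assumptions; only the directive in line 7 and the computed GOSUB in line 50 follow the text.

    7 REM : GOSUB 100, 200, 300
    8 REM : OPEN #
   10 PRINT "Option 1, 2 or 3?"
   20 INPUT N
   30 IF N<>1 AND N<>2 AND N<>3 THEN GOTO 20
   50 GOSUB 100*N
   60 GOTO 10
  100 PRINT "First routine": RETURN
  200 PRINT "Second routine": RETURN
  300 PRINT "Third routine": RETURN

Line 30 ensures that 100*N can only ever be 100, 200 or 300, which is what makes it safe to list just those three line numbers in the directive.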

 

The rest of the programs on the tape are ready to compile. You can LOAD and LIST them to see further examples of the use of compiler directives. The next program is called SIEVE and is a standard benchmark program. It finds all prime numbers less than twice the number used in line 20. As it stands, the program will not work in BASIC because there isn't room for an array of 8192 elements of 5 bytes each. However, the compiled version with the array f() declared as POSINT (so that each element only takes 2 bytes) works fine and takes only 2.9 seconds. Add a line:

 

 165 PRINT prime

 

if you want to see the prime numbers (but this will slow it down a lot). To get the program to work in BASIC you will have to change both of the 8192s in line 20 to something like 7000 or smaller and RUN it on an otherwise empty Spectrum. With 7000 in line 20, the program takes 418 seconds to finish, compared with only 2.6 seconds for the compiled code - a speed increase of 161 times! If you try to compile this program twice in succession without resetting RAMTOP (by *X or CLEAR) in between, you will find that the addresses on the final screen after SAVE and LOAD differ from each other and you get a message DO NOT TEST on the bottom of the screen. This means that the code is not in its proper position and would have to be SAVEd and re-LOADed to its proper position before executing it. But beware of over-writing HiSoft BASIC - see Memory Maps.

 

The last two programs on the tape are SHELLSORT and QUICKSORT. Lines 9000 and higher of these two programs contain subroutines that sort an array X() of numbers into ascending order using two different algorithms. The rest of these programs are for testing the speed of these two algorithms in sorting data that is randomly arranged and data that is already in order. You will find that QUICKSORT is faster for randomly-arranged data but SHELLSORT is faster for data that is already almost all in order. The subroutines can easily be modified to sort into descending order or to sort strings instead of numbers. If you compile them, you will find that the compiled versions are up to 19 times faster!


How to Use HiSoft BASIC

 

1. LOAD "HiBasic" [ENTER]

(there are 3 parts which will load in sequence).

 

2. Either type in your BASIC program or LOAD it in from tape or microdrive (there is space for BASIC programs up to about 30K in length). Note: You must arrange your BASIC program so that it is possible to execute it by simply entering RUN (that is, it must start at the lowest line and all variables must be defined within the program). For example, if you have a BASIC program which you execute by entering RUN 9000 then insert a new line at the beginning that says GOTO 9000.

 

3. Make sure your BASIC doesn't include any of the commands or functions  that aren't supported by HiSoft BASIC (see Summary of differences from Spectrum BASIC).

 

4. Insert a new line with the compiler directive REM : OPEN # at the beginning of your program. Other compiler directives are optional.

 

5. RUN your program to make sure that it works. Try it with different inputs to cover all the possibilities and test out all the branches of the program. The compiled code will be designed to reproduce the effect of the BASIC (except faster!), so if it doesn't work in BASIC, the compiled version won't work either and may even crash. Conversely, if your program works in BASIC, you can expect the compiled code to do the same. It is a good idea to SAVE your BASIC program before proceeding.

 

6. Compile by typing *C (see HiSoft BASIC Commands). Refer to Error messages if compilation stops with a message at the bottom of the screen.

 

7. SAVE the compiled code.

 

The compiled code is like any other machine code program. To execute it requires the command RANDOMIZE USR xxxxx where xxxxx is its address. The compiled code will return to BASIC at points where there was a STOP command or a REM : CLOSE # directive or if it reaches the end of the program. Note that you must CLEAR wwwww before LOADing in the code, where wwwww is any address less than xxxxx.


 

HiSoft BASIC Commands

 

Commands may be typed in upper or lower case. Execution of the command is immediate upon receipt of the final character (no [ENTER] is needed). If at any time these commands should stop being accepted, re-initialise the command interpreter by RANDOMIZE USR 23792. Spectrum 128 and Spectrum Plus 2 owners should read Appendix 1 first.

 

*C            Starts compilation of the BASIC program. Compiles those portions of the BASIC between compiler directives REM : OPEN # and REM : CLOSE #. The compiled code is placed just below RAMTOP and RAMTOP is revised. The following information is given during the first pass: the relative addresses of the entry points to the code. During the second pass: the execution addresses of the entry points to the code (both in decimal and hexadecimal). Compilation pauses after these until you press a key. At completion:

 - the number of bytes taken up by the compiled code

 - the number of bytes needed for machine code variables

 - the number of bytes occupied by the BASIC program

 - the commands to be used to SAVE the compiled code and to LOAD it back in afterwards.