HiSoft BASIC
For ZX Spectrum, ZX Spectrum +, ZX Spectrum 128 & ZX Spectrum Plus 2
A Fast Floating-Point ZX BASIC Compiler
First Edition October 1986
© Copyright HiSoft 1986
Please buy, don't steal
HiSoft, The Old School, Greenfield, Bedford MK45 5DE. Tel (0525) 718181
HiSoft BASIC was written by Cameron Hayne
All Rights Reserved Worldwide. No part of this publication may be reproduced or
transmitted in any form or by any means, including photocopying and recording,
without the written permission of the copyright holder. Such written permission
must also be obtained before any part of this publication is stored in a
retrieval system of any nature.
It is an infringement of the copyright pertaining to HiSoft BASIC
and its associated documentation to copy, by any means whatsoever, any part of HiSoft
BASIC for any reason other than for the purpose of making
one security back-up copy of the object code.
Contents
Introduction
How to Use HiSoft BASIC
HiSoft BASIC Commands
Summary of differences from Spectrum BASIC
Variables
Numerical Constants
Conversion between Types
Compiler Directives
Notes on compiled BASIC
Including other machine code
Compiling large programs
Tips on Efficiency
What if it doesn't work?
Error Messages
The meaning of the dots and colours
Making a back-up copy
Memory Maps
Runtimes
Appendix 1 - Spectrum 128 version
Introduction
HiSoft BASIC is a BASIC compiler that surpasses all others for the
Spectrum. There are integer compilers that can make BASIC programs run more
than 100 times faster but they only handle integers (no decimals, only whole
numbers from -32768 to 32767 or from 0 to 65535) and often have other
restrictions. There are floating-point compilers that handle the full range of
decimal numbers and all of the Spectrum's functions but (in spite of advertised
claims) they speed up programs by only a factor of 3 to 5.
HiSoft BASIC combines the advantages of these two types of
compiler without any of the disadvantages. It is a floating-point compiler
that can obtain the speed of an integer compiler when doing operations that
don't require the complexities of floating-point arithmetic. In fact, HiSoft
BASIC is simultaneously the fastest integer compiler and
the fastest floating-point compiler available for the Spectrum.
HiSoft BASIC can compile almost all of the Spectrum's BASIC into
fast machine code. Unlike some floating-point compilers, it can handle
user-defined functions and two-dimensional numeric and string arrays. Most
other compilers have a block of routines about 5K in length (called runtimes)
that must be present for the compiled code to work. This means that even the
shortest BASIC program compiles to more than 5K. HiSoft BASIC
includes only the runtime routines that are actually necessary for your code so
that a short BASIC program may compile into only a few hundred bytes. Also,
unlike other compilers, HiSoft BASIC allows you to
put the compiled code anywhere in RAM you want, even in locations normally
occupied by the compiler itself!
Spectrum 128 & Plus 2 Owners - Please Note
Spectrum 128 and Spectrum Plus 2 owners should read Appendix 1
before using HiSoft BASIC; this describes the extra features
available for these machines.
HiSoft BASIC is only
about 11K in length so it loads quickly. It can compile BASIC programs up to
about 30K in length without requiring microdrives or cumbersome tape swapping.
Another distinguishing feature of HiSoft BASIC is that
it provides full information on the code that it produces so that it is easy to
interface the compiled code with a co-resident BASIC program. Or, if you're
interested in machine code, you could use this information to learn how to use
the ROM routines.
Finally, HiSoft BASIC, unlike
many compilers, does not blindly follow a recipe in converting your BASIC to
machine code. Instead, it watches for simple cases (e.g. operations with
powers of 2, constant array indices, etc.) which it can compile into
especially efficient code.
HiSoft BASIC is very
easy to use but we recommend you read through this manual before starting any
serious compiling.
Try this First
The instructions for using HiSoft BASIC
follow this introductory section but instead of leaving you to read them and
figure out things for yourself, we'll show you the ropes with a few example
programs.
First load HiSoft BASIC by
putting the tape with the words HiSoft BASIC uppermost in your tape recorder,
typing:
LOAD "" [ENTER]
and pressing PLAY on your tape player.
When it is finished loading you'll see the copyright
notice at the top of the screen. Then load in the first example program by
typing:
LOAD "EXAMPLE 1" [ENTER]
LIST it and then RUN it to test it out and make sure that it works. This is a vital
step before attempting to compile any program! As you might not always want to
compile all parts of your BASIC program, it is necessary to tell HiSoft BASIC
where to start and where to stop compiling. As with all instructions to HiSoft
BASIC (called compiler directives), this is done via a REM statement. The
start-compiling instruction is:
REM : OPEN #
(do this now by making this instruction line 1 of the example). The
stop-compiling instruction is:
REM : CLOSE #
but this is optional here since we want to compile right to the end of the BASIC.
Now type
*C
and compiling will start.
Users of Spectrum 128 and Spectrum Plus 2 computers should note that HiSoft BASIC
commands are invoked in a totally different manner. Rather than typing *
followed by a command letter, you should press the [TRUE VIDEO] and [INV VIDEO]
keys simultaneously. This will produce a menu of command options on the screen
from which any of the compiler commands may be selected.
During compilation, HiSoft BASIC will pause twice, showing some
information at the bottom of the screen. You'll have to press a key to continue
(don't worry about the information - you'll not need it now). The border will
change colour (magenta-cyan-white) and strange dots and colours will appear on
the screen. We'll explain later what all this is; for now we just need the
information that will appear after the second key press. For the first example
program, this should indicate that the compiled code (machine code) is 357
bytes long and that 10 bytes must be reserved for machine-code variables (for
the sake of comparison the number of bytes taken up by the BASIC program
without variables is also given). The most vital information is in the two
lines that tell you how to save and load the compiled code. The address in the
LOAD line is the address to be used after RANDOMIZE USR when you want to
execute the compiled code. E.g. if the code is to be loaded to address 65001
then RANDOMIZE USR 65001 will execute the compiled code. But while HiSoft BASIC
is resident there's an easier way: *R will execute the compiled code. Spectrum
128 and Spectrum Plus 2 owners: remember that commands are invoked using
[TRUE VIDEO] and [INV VIDEO] on your machines!
You can test out the machine code now if you like. By the way, don't be alarmed at
the fact that this very small program seems to require so many bytes in machine
code. Most of the bytes are
taken up by the runtimes - subroutines that are included as needed but that
will be re-used by other parts of a larger program. Thus the ratio of bytes
used for machine code to those in the BASIC will decrease as the size of the
program increases.
Your BASIC program is still there after compilation and can be modified and
re-compiled. Without changing anything, try compiling it a second time (with
*C) just to see what happens. All the information on the final screen will be
the same except for the address where the compiled code is located. Each time
you compile a BASIC program, the compiled code is placed at what the Spectrum
considers to be the top of your memory space (i.e. just below RAMTOP) and the
RAMTOP is changed to be just before the newly-compiled code. To reclaim that
memory (by resetting RAMTOP to its original value) type *X.
We want to use the first example program to illustrate that the variables used by
BASIC and the variables used by the compiled code are totally distinct.
Re-compile the program (with *C). Now RUN the BASIC version and then, as a direct
command, execute PRINT N1,N2. Now execute
the machine code version (with *R), this time responding with different numbers
from those you used for the BASIC version. Now re-execute the direct command
PRINT N1,N2. The BASIC variables are still as they were. The machine code
variables are local to the compiled code.
Before we leave this example, we must point out that the INPUT command is one that
behaves slightly differently in the compiled code than it does in BASIC. The
difference is in its response to errors. In BASIC, an error in input returns
control to the editor with an error message. This would be inconvenient in
machine code, so in the compiled code INPUT commands are error-trapped so that
any error results in a restart of the INPUT. Test this out for yourself with
the compiled code.
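The listing of EXAMPLE 1 is on the tape and is not reproduced here. Purely as an illustration, a program of the same kind might look like the sketch below. The prompts and the addition are our own invention; only the variable names N1 and N2 and the REM : OPEN # directive come from the text above.

```basic
1 REM sketch only - not the EXAMPLE 1 listing
5 REM : OPEN #
10 INPUT "First number? ";N1
20 INPUT "Second number? ";N2
30 PRINT N1;" + ";N2;" = ";N1+N2
```

RUN it, then execute PRINT N1,N2 as a direct command. After compiling with *C and executing with *R using different numbers, the same PRINT should still show the values from the BASIC run. Entering something that is not a valid number is also a convenient way to compare how INPUT responds to errors in BASIC and in the compiled code.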
Now type:
LOAD "EXAMPLE 2" [ENTER]
and LIST it once it has loaded. Put in a new line:
9 REM : OPEN #
and RUN it to make sure it works, then compile (with *C). The thing we want to
bring to your attention now is the number of bytes taken up by machine code
variables. The total is 277; this is 15 bytes for the FOR/NEXT variable I, 5
bytes for L, and 257 bytes for N$. The 257 is made up of 2 bytes for the length
of N$ and 255 bytes to hold the actual characters. Since no name is ever going
to be that long, it seems wasteful to reserve that much space for it. By using
the REM : LEN directive we can tell HiSoft BASIC how much
space to reserve for a string variable. In this case suppose we decide that we
are safe in assuming that no name could possibly ever be longer than 50
characters, then we can tell HiSoft BASIC this by
inserting a new line:
8 REM : LEN N$ <=50
(on the 48K Spectrum the <= is the single character obtained by [SYMBOL SHIFT]-Q).
Do this now and re-compile. You will see the number of bytes for machine
code variables is now only 72 (52 bytes reserved for N$). It may seem that 50
is still too long, but it's better to err on the long side - too little space
can be fatal. Incidentally, all this is necessary because we've chosen to opt
for efficiency over convenience. In BASIC, when you assign to a string
variable, the old copy is destroyed, all the other variables are shuffled down,
and the new string is inserted at the end of the variables list. But this takes
time! In the compiled code from HiSoft BASIC, all
variables including string variables are at fixed locations, which gives a
great improvement in speed. Note that for DiMensioned string variables, HiSoft
BASIC can tell from the DIM statement how much space to
reserve and so the REM : LEN directive is not necessary.
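To make the byte accounting concrete, here is a minimal sketch of a program using the same three variables (I, L and N$). It is not the EXAMPLE 2 listing from the tape, and the prompt and the underlining loop are assumptions made for illustration.

```basic
5 REM sketch only - not the EXAMPLE 2 listing
8 REM : LEN N$ <=50
9 REM : OPEN #
10 INPUT "Name? ";N$
20 LET L=LEN N$
30 PRINT N$
40 FOR I=1 TO L: PRINT "-";: NEXT I
```

With line 8 present, the figures quoted above suggest 15 bytes for I, 5 for L and 50+2=52 for N$, giving 72 in all; without line 8 the string would take 255+2=257 bytes and the total would be 277.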
The first two example programs served to illustrate some essential points about
using HiSoft BASIC but they weren't very interesting as programs and they
certainly didn't show any perceptible increase in speed; and speed after all is
what you're here for! So type:
LOAD "EXAMPLE 3" [ENTER]
and we'll start to explore the true capabilities of HiSoft BASIC.
RUN the program as usual, to make sure it works (we emphasise that this is an
essential step before attempting to compile any program). If you LIST the
program you will find that we've already included the REM : OPEN # directive at
the beginning so we're ready to compile. Type *C and watch. You will find that
you get line 290 at the top of your screen with a flashing ? and the message
Not supported at the bottom. What is not supported is the tape command SAVE.
None of the operating-system commands are supported by HiSoft BASIC
because they are usually more appropriately left in BASIC. This is where the
directive REM : CLOSE # is useful. Insert a new line:
271 REM : CLOSE #
and then re-compile. It will work this time. Try out the newly-compiled machine code
and you will see the spiral drawn more than 3 times faster. A few asides on the
program: note, in lines 20-50, that the values of the SIN and COS functions are
computed only once and then assigned to variables for future use. As these
functions (along with TAN, ASN, ACS, ATN, EXP, LN, SQR) are very slow, this is
a smart thing to do whenever possible. Note also the CLS in line 15. This is
redundant in BASIC since a CLS is done automatically when we RUN the program,
but it is needed for the compiled code.
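The idea of computing SIN and COS once and re-using the results can be sketched as follows. This is not the EXAMPLE 3 listing; it is a small spiral of our own devising that rotates a point by a fixed angle on each step, and it assumes that PLOT compiles in the same way as the other drawing commands.

```basic
5 REM sketch only - not the EXAMPLE 3 listing
9 REM : OPEN #
10 CLS
20 LET A=PI/12
30 LET S=SIN A: LET C=COS A
40 LET X=2: LET Y=0
50 FOR N=1 TO 60
60 LET T=1.05*(X*C-Y*S)
70 LET Y=1.05*(X*S+Y*C)
80 LET X=T
90 PLOT 128+X,88+Y
100 NEXT N
```

SIN and COS are each evaluated once, in line 30; the loop itself uses only multiplication, addition and subtraction. The CLS in line 10 is there for the reason given above.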
But what about the line 290 that was left out of the compilation? Since we now have
a machine code version of the program, what we want is a BASIC loader program
that looks like this:
10 CLEAR wwwww
20 LOAD "spiral" CODE xxxxx : RANDOMIZE USR xxxxx
30 STOP
40 SAVE "spiral" CODE xxxxx, yyy
where the xxxxx and yyy are the numbers given by HiSoft BASIC
and wwwww is any address below xxxxx.
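As a worked illustration with made-up numbers, suppose the final screen had reported a load address of 65001 and a length of 357 bytes. These values are assumptions chosen only to show where each number goes; use the figures that HiSoft BASIC actually gives you.

```basic
5 REM sketch only - 65001 and 357 are assumed values
10 CLEAR 65000
20 LOAD "spiral" CODE 65001: RANDOMIZE USR 65001
30 STOP
40 SAVE "spiral" CODE 65001,357
```

RUN loads and executes the code; GOTO 40 saves a copy of it. The CLEAR address is one below the load address so that the code sits just above RAMTOP.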
Now type:
LOAD "EXAMPLE 4" [ENTER]
and LIST it. You will see that it is the same as EXAMPLE 3 but with additional
lines at the end. We've already put in the line 271 REM : CLOSE #. If you
compiled it as it is now, the compiled code would be precisely the same as that
from EXAMPLE 3. The BASIC lines after 271 would simply be ignored (although
they do figure in the number given by HiSoft BASIC
for the bytes taken up by BASIC). What line 1000 does is to POKE the picture on
the screen into storage at memory address 45000; line 2000 recalls it from
memory onto the screen.
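The EXAMPLE 4 listing is not reproduced here. As a rough sketch of how such a store-and-recall routine could be written, the lines below copy the 6912 bytes of the Spectrum's screen memory (addresses 16384 to 23295) to and from address 45000. The line numbers and the layout of the copying subroutine are our assumptions; only the addresses and the variable names I, SOURCE and DESTINATION follow the text.

```basic
990 REM sketch only - not the EXAMPLE 4 listing
1000 LET SOURCE=16384: LET DESTINATION=45000
1010 GOSUB 5000
1020 STOP
2000 LET SOURCE=45000: LET DESTINATION=16384
2010 GOSUB 5000
2020 STOP
5000 FOR I=SOURCE TO SOURCE+6911
5010 POKE DESTINATION,PEEK I
5020 LET DESTINATION=DESTINATION+1
5030 NEXT I
5040 RETURN
```

GOTO 1000 stores the picture and GOTO 2000 brings it back. Every value taken by the three variables is a whole number between 0 and 65535.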
Type *X and RUN the program. When the drawing is completed, execute GOTO 1000. After
the STOP message (it will take a few minutes) execute CLS and then GOTO 2000.
After another few minutes, the spiral will have reappeared on the screen, but
since it takes several times longer to recall the spiral from memory than it would
take simply to re-draw it, this seems pointless! But the compiled version will
be faster! What we want in the compiled version is to have three separate entry
points to the machine code: the first to draw the spiral and the second and
third to store and recall it from memory. We already have the first entry point
at REM : OPEN # and the return to BASIC occurs at the REM : CLOSE #.
We want additional entry points at lines 1000 and 2000, so we insert new lines:
999 REM : OPEN #
and
1999 REM : OPEN #
(do this now).
The returns to BASIC from these sections of code will be from their STOP
statements. There is no need for any additional REM : CLOSE # because we want
to compile right to the end of the BASIC. Now we're ready to compile so type
*C. This time you will be required repeatedly to press a key as information
about the various entry points comes up on the bottom of the screen.
We will now explain what these numbers mean. During the first pass (with a cyan
border) you will be told the relative addresses of the various entry points -
i.e. relative to the start of the compiled code. During the second pass (with a
white border) you will be told the execution addresses (in both decimal and
hexadecimal) of the various entry points. Make a note of these for use later.
Note also (from the final screen) the number of bytes taken up by the compiled
code. Remember that if you miss some information during compilation, you can
always re-compile (after *X if desired). Now try out the compiled code by
executing the first part to draw the spiral, then the second part to store it
in memory, then CLS and finally execute the third part of the compiled code to
recall the spiral screen. Note that *R works only for the first entry point.
You should find that now it takes about the same time to recall the spiral from
memory as it would to re-draw it.
So our store and recall routine still doesn't seem very useful. But there's a
further improvement we can make that will dramatically increase the speed. The
key fact to notice is that the variables I, SOURCE, and DESTINATION of lines
1000-5040 take on only values that are positive integers (they range from 16384
to 51911). If HiSoft BASIC is informed of this fact (it's not
quite smart enough to notice it for itself!), it will generate much more
efficient code because it can then use the native abilities of the Z80
processor rather than relying on the ROM routines for floating-point
arithmetic. The way to inform HiSoft BASIC is to use the
directive REM : INT +. This directive must come before
the first REM : OPEN #, so we insert a new line (do this now!):
9 REM : INT + I, SOURCE, DESTINATION
This tells HiSoft BASIC that these variables will take on
only values that are positive integers in the range from 0 to 65535. We know this
to be true for lines 1000 to 5040 but before we can re-compile we must check
that it is true for the whole program. There is a variable i (which to the
Spectrum is the same as variable I) in lines 130, 160 and 230 but we can easily
see that it too takes positive integer values in the right range. So go ahead
and re-compile. You will notice that the new compiled code takes fewer bytes,
but the real difference is in the speed.
Now it takes less than 0.7 seconds to recall a screen from memory! If you time it
with a stopwatch you will also find a small decrease in the time taken to draw
the spiral. There is no dramatic increase in the speed of drawing the spiral
because most of the time is spent in the ROM DRAW routine.
We have just seen how much more efficient it is to use integer variables wherever
possible. However, in this program it was relatively easy to convince ourselves
that the variables I, SOURCE and DESTINATION take on only integer values. In
other, more complex programs it may be more difficult to pick out the
integer-valued variables. But help is at hand! Type *T. Nothing will happen
right away. But now RUN the BASIC program. You should see the lower screen go
BRIGHT and the spiral being drawn more slowly than usual.
When it's finished, type *T again and this time you will be rewarded with a list of
variables. Beside each variable is the type of that
variable: REAL, INTEG or POSINT (or POSINTEG - a combination of POSINT and
INTEG). See below for an explanation of variable types but for now just note
that variable i is listed as POSINTEG which means that its value never went
out of the range of positive integers between 0 and 32767. The program STOPped
at line 280 so what this is really telling us is that the variable i never goes
out of that range in lines 130-230. The program has not yet explored the region
of lines 1000-5040 so variables SOURCE and DESTINATION are not even listed yet.
To get a full indication of the variable types we would have to execute the
other two sections of the program separately, e.g. by doing the sequence:
*T
RUN
GOTO 1000
GOTO 2000
*T
If you don't find the long waiting times too irksome, you could try this now but
otherwise you can just take our word for it that the variables I, SOURCE and
DESTINATION would all come out listed as POSINT. All this is just confirmation
of something that we realised earlier but you can see how it could be useful
when applied to more complex programs. What *T does is to turn on another
program that keeps watch over the values of the variables during the BASIC
program's execution. The second *T turns this off and prints the results.
Anything you do between the two *Ts that affects the variables will be taken
account of in the results. Thus, if you were to do the sequence: *T, RUN, LET
i = 1.3, *T, the variable i would be shown as REAL instead of POSINTEG.
We must caution you that *T is not an infallible guide to the types of the
variables. To illustrate this, first get rid of the existing BASIC by typing
*ERASE (to get ERASE, go into extended mode then press [SYMBOL SHIFT]-7). Do
not use NEW as that would wipe out HiSoft BASIC as well! Now
type in the following program:
10 LET A=0
20 IF RND > .5 THEN LET A=.5
Do the sequence: *T, RUN, *T and then repeat this sequence a few times. You will
notice that the variable A is listed sometimes as REAL, sometimes as POSINTEG.
The reason for this is clear - there is a branch in the program depending on
the value of RND and if RND <= .5 the watching program never sees A take a
non-integral value. The lesson you should draw from this is that, in using *T, you
should repeat the program enough times with different inputs to make sure that
the whole of the program is explored. In this example it would probably suffice
to do: *T, RUN, RUN, RUN, RUN, *T.
Now type:
LOAD "EXAMPLE 5" [ENTER]
This program is ready to compile, but first LIST it and note the use of INT after
the DATA command. The variables X and Y are READ from this DATA list and since
they are declared to be of type POSINT by the REM : INT + directive, it is
necessary that the data be stored in integer format. This is accomplished by
putting an INT after the DATA. Compile and RUN this program at your leisure.
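The sketch below shows one way such a program might be laid out. It is not the EXAMPLE 5 listing, and the exact placement of INT (immediately after the DATA keyword, once per DATA statement) is our reading of the description above rather than something confirmed here; the co-ordinates are arbitrary.

```basic
5 REM sketch only - not the EXAMPLE 5 listing
8 REM : INT + X,Y
9 REM : OPEN #
10 CLS
20 FOR N=1 TO 4
30 READ X,Y
40 PLOT X,Y
50 NEXT N
60 DATA INT 10,10,245,10,245,165,10,165
```

In ordinary BASIC line 60 is still valid, since INT 10 is simply 10, so the program can be tested with RUN before compiling.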
Now type:
LOAD "EXAMPLE 6" [ENTER]
and RUN it. You will see that it is a typical example of menu programming.
Try to compile it and you will get line 50 at the top of your screen and the
dreaded Not supported message at the bottom. Line 50 is what is called a computed
GOSUB statement. The line number is not given explicitly
but must be computed at run-time, so in order to compile such statements the
compiler must make a list of all the line numbers and the corresponding
addresses in the compiled code. The compiled code for the GOSUB will then
search through this list at run time to find the address for the line number
that is needed. HiSoft BASIC has the
capability to do all this, but since it results in slower and longer code, we
have made it a non-standard feature that you must select by means of a compiler
directive.
Insert a new line:
7 REM : GOSUB :
and it will now compile correctly.
Note that the compiled code is 752 bytes long. If you look at the program you'll see
that the variable N can be 1, 2 or 3, so the only line numbers we really need
for use in line 50 are lines 100, 200 and 300. You can tell HiSoft BASIC
this by changing line 7 to read
7 REM : GOSUB 100, 200, 300
and now if you recompile you'll find the code is 724 bytes long because we're now
keeping only the info for those 3 lines rather than for all the lines of the
program. For this short program it's not much of a saving, but for longer
programs the savings in bytes can be enormous, and having a shorter list to
search through can significantly increase the execution speed. To see what
happens if you omit a relevant line number, delete the 300 from line 7,
recompile, and try out the compiled code, selecting option 3. Note also that if
you change line 50 to GOSUB 100*N-1, although it would still work in BASIC,
the compiled code wouldn't work because lines 99, 199 and 299 don't exist.
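For readers without the tape to hand, a menu program with a computed GOSUB might be sketched like this. It is not the EXAMPLE 6 listing; the menu text, the option 0 to stop, and the position of the REM : OPEN # line are assumptions, while line 50 and the directive in line 7 follow the description above.

```basic
5 REM sketch only - not the EXAMPLE 6 listing
7 REM : GOSUB 100, 200, 300
9 REM : OPEN #
10 PRINT "1 First"'"2 Second"'"3 Third"
20 INPUT "Option (1-3, 0 to stop)? ";N
25 IF N=0 THEN STOP
30 IF N<>1 AND N<>2 AND N<>3 THEN GOTO 20
50 GOSUB 100*N
60 GOTO 20
100 PRINT "First chosen": RETURN
200 PRINT "Second chosen": RETURN
300 PRINT "Third chosen": RETURN
```

Line 30 ensures that 100*N can only ever be 100, 200 or 300, which is what makes it safe to list just those three line numbers in line 7.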
The rest of the programs on the tape are ready to compile. You can LOAD and LIST
them to see further examples of the use of compiler directives. The next
program is called SIEVE and is a standard benchmark program. It finds all prime
numbers less than twice the number used in line 20. As it stands, the program
will not work in BASIC because there isn't room for an array of 8192 elements
of 5 bytes each. However, the compiled version with the array f() declared as
POSINT (so that each element only takes 2 bytes) works fine and takes only 2.9
seconds. Add a line:
165 PRINT prime
if you want to see the prime numbers (but this will slow it down a lot). To get
the program to work in BASIC you will have to change both of the 8192s in line
20 to something like 7000 or smaller and RUN it on an otherwise empty Spectrum.
With 7000 in line 20, the program takes 418 seconds to finish, compared with
only 2.6 seconds for the compiled code - a speed increase of 161 times! If you
try to compile this program twice in succession without resetting RAMTOP (by *X
or CLEAR) in between, you will find that the addresses on the final screen
after SAVE and LOAD differ from each other and you get a message DO NOT TEST on
the bottom of the screen. This means that the code is not in its proper
position and would have to be SAVEd and re-LOADed to its proper position before
executing it. But beware of over-writing HiSoft BASIC -
see Memory Maps.
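The memory arithmetic behind this is simple: 8192 elements at 5 bytes each need 40960 bytes, whereas at 2 bytes each they need only 16384. For reference, a sieve of this general kind can be sketched as below. This is not the SIEVE listing from the tape: the line numbers are our own, the size is cut to 1000 so that it will RUN in BASIC alongside the compiler, and the directive that declares the array as POSINT is left out because its exact form is not shown in this section.

```basic
10 REM sketch only - not the SIEVE listing
20 LET size=1000: DIM f(1000)
30 LET count=0
40 FOR i=1 TO size
50 IF f(i)=1 THEN GOTO 120
60 LET prime=i+i+1
70 LET k=i+prime
80 IF k>size THEN GOTO 110
90 LET f(k)=1: LET k=k+prime
100 GOTO 80
110 LET count=count+1
120 NEXT i
130 PRINT count;" odd primes found"
```

Element f(i) stands for the odd number 2*i+1. DIM sets every element to 0, so a 0 means "not yet crossed out" and line 90 marks the multiples of each prime with a 1.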
The last two programs on the tape are SHELLSORT and QUICKSORT. Lines 9000 and
higher of these two programs contain subroutines that sort an array X() of
numbers into ascending order using two different algorithms. The rest of these
programs are for testing the speed of these two algorithms in sorting data that
are randomly arranged and data that are already in order. You will find that
QUICKSORT is faster for randomly-arranged data but SHELLSORT is faster for data
that are already almost all in order. The subroutines can easily be modified to
sort into descending order or to sort strings instead of numbers. If you
compile them, you will find that the compiled versions are up to 19 times
faster!
How to Use HiSoft BASIC
1. LOAD "HiBasic" [ENTER] (there are 3 parts which will load in sequence).
2. Either type in your BASIC program or LOAD it in from tape or microdrive
(there is space for BASIC programs up to about 30K in length). Note: You
must arrange your BASIC program so that it is possible to execute it
by simply entering RUN (that is, it must start at the lowest line and all
variables must be defined within the program). For example, if you
have a BASIC program which you execute by entering RUN 9000 then insert a
new line at the beginning that says GOTO 9000.
3. Make sure your BASIC doesn't include any of the commands or functions that
aren't supported by HiSoft BASIC (see Summary of differences from
Spectrum BASIC).
4. Insert a new line with the compiler directive REM : OPEN # at the beginning
of your program. Other compiler directives are optional.
5. RUN your program to make sure that it works. Try it with different inputs to
cover all the possibilities and test out all the branches of the program. The
compiled code will be designed to reproduce the effect of the BASIC (except
faster!), so if it doesn't work in BASIC, the compiled version won't work
either and may even crash. Conversely, if your program works in BASIC, you can
expect the compiled code to do the same. It is a good idea to SAVE your BASIC
program before proceeding.
6. Compile by typing *C (see HiSoft BASIC Commands).
Refer to Error Messages if compilation stops with a message at the bottom of
the screen.
7. SAVE the compiled code.
The compiled code is like any other machine code program. To execute it requires
the command RANDOMIZE USR xxxxx where xxxxx is its address. The compiled code
will return to BASIC at points where there was a STOP command or a REM : CLOSE
# directive or if it reaches the end of the program. Note that you must CLEAR
wwwww before LOADing in the code, where wwwww is any address less than xxxxx.
HiSoft BASIC Commands
Commands may be typed in upper or lower case. Execution of the command is
immediate upon
receipt of the final character (no [ENTER] is needed). If at any time these
commands should stop being accepted, re-initialise the command interpreter by
RANDOMIZE USR 23792. Spectrum 128 and Spectrum Plus 2 owners should read
Appendix 1 first.
*C Starts compilation of the BASIC program. Compiles those portions of the
BASIC between
compiler directives REM : OPEN # and REM : CLOSE #. The compiled code is placed
just below RAMTOP and RAMTOP is revised. The following information is given
during the first pass: the relative addresses of the entry points to the code.
During the second pass: the execution addresses of the entry points to the code
(both in decimal and hexadecimal). Compilation pauses after these until you
press a key.
At completion:
- the number of bytes taken up by the compiled code
- the number of bytes needed for machine code variables
- the number of bytes occupied by the BASIC program
- the commands to be used to SAVE the compiled code and to LOAD it back in afterwards.