LB Booster
http://lbb.conforums.com/index.cgi?board=ide&action=display&num=1461707954

Bug in loading a tkn file
Post by BrianM on Apr 26th, 2016, 9:59pm

LB allows a line number on a line by itself. LBB does not decode these lines from a tkn file, so they are missing from the loaded BASIC file. Admittedly this is not in common use.

e.g.

10
print "infinite loop"
goto 10

The line "10" will not be loaded into the editor from the tkn file.

Brian Matthews


Re: Bug in loading a tkn file
Post by Richard Russell on Apr 27th, 2016, 08:34am

on Apr 26th, 2016, 9:59pm, BrianM wrote:
LB allows a line number on a line by itself. LBB does not decode these lines from a tkn file and they are missing in the loaded basic file. Admittedly this is not in common use.

Thanks for the report. As you say, having a line containing a line number, but nothing else, is unusual (I don't think I've ever seen it done 'in the wild'). In a 'modern' program one would normally choose to use a label rather than a line number anyway.

You will appreciate that decoding TKN files to recover an approximation of the original BASIC program is something of a 'black art', since (understandably) the structure of TKN files is not published. I don't know how difficult it would be to put right the omission and it's not a high priority for me.

While we're on the subject I think I'm right in saying that programs decoded from TKN files have a 'hard' TAB character at the beginning of each line, which breaks LBB's 'automatic indentation'. Sorry about that.

Richard.

Re: Bug in loading a tkn file
Post by SarmedNafi on May 1st, 2016, 03:28am

Hi Richard,
Hope you are well,

Well, I read your reply and I am not sure whether you are going to make any changes to LBB. But I am sure you will do the best for all of us.

However, for some days now I have needed to do some complex calculations on numbers that represent money. With such long calculations it is easy to make mistakes, so writing a program was the best solution.
The simplest option for me was LB rather than LBB, for two reasons.
First, it prints results to the mainwin without the need to create a text editor.
Second, it lets me write variable names in Arabic (without spaces, of course).
After I finished, I asked myself whether LBB could overcome these two points:
- A menu item could either generate the code for a text editor, or let LBB send the results to a text editor that is created automatically.
- A little helpful tolerance from Richard, to let LBB accept Arabic letters in variable names.

With all thanks to Richard
Sarmed

Re: Bug in loading a tkn file
Post by Richard Russell on May 1st, 2016, 09:49am

on May 1st, 2016, 03:28am, SarmedNafi wrote:
I am not sure if you are going to make some changes in LBB

When I said in my reply that it was "not a high priority", I meant that I have no intention of fixing it! It's such an obscure edge-case, and loading TKN files is hardly a mainstream aspect of LBB. Its only legitimate use is to allow somebody to recover a program that they have lost the source code for.

Quote:
print result on MainWin without the necessity to create Text Editor.

I'm afraid I don't understand that comment. You can, of course, print results to the LBB mainwin (even in Arabic if you really want to, although that involves a bit more work). Indeed the LBB mainwin supports Unicode, which the LB 4 mainwin does not, so it is better suited to multilingual output. The only significant disadvantage of the LBB mainwin is that it has a limited size (about 84 lines with the default font) after which lines 'scroll off the top' and are lost forever.

The other point I would want to make is that when you output Arabic (or any other language not supported by the ANSI character set) in LB 4 you must use the old-fashioned Code Page technique. The use of Code Pages is deprecated, and has been for many years. Nowadays one is expected to use Unicode for multilingual output, which is why LBB is designed to have (limited) support for Unicode. LB 4 does not support Unicode at all.

Quote:
Second it let me write variables in Arabic (of course without spaces).

This would be impossible in LBB, because Arabic characters (whether using a Code Page or Unicode) have the high-bit set (i.e. they are in the code range 128-255). Characters in this range are keyword tokens in LBB! Personally I do not think it is desirable to use foreign-language characters in variable names, because it makes the code non-portable.
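
The high-bit point can be checked independently of either BASIC. A minimal Python sketch, assuming Windows code page 1256 as the Arabic Code Page and UTF-8 as the Unicode encoding (neither is named in this thread); the variable name is made up for illustration:

```python
# Hypothetical Arabic variable name, for illustration only.
name = "مجموع"

def high_bit_bytes(text, encoding):
    """Return True if every encoded byte has the high bit set (value 128-255)."""
    return all(b >= 128 for b in text.encode(encoding))

# Assumed encodings: cp1256 (Windows Arabic code page) and UTF-8.
arabic_cp1256 = high_bit_bytes(name, "cp1256")
arabic_utf8 = high_bit_bytes(name, "utf-8")
# A plain ASCII name stays below 128 in both encodings.
ascii_cp1256 = high_bit_bytes("total", "cp1256")

print(arabic_cp1256, arabic_utf8, ascii_cp1256)
```

Under those assumptions every byte of the Arabic name falls in the 128-255 range, which is the range described above as reserved for keyword tokens.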

It is ironic that LB 4 refers to its .TKN files as being 'tokenized' (that is presumably what the extension TKN refers to). If they really were tokenized you would almost certainly not be able to use Arabic variable names, but of course as we know TKN files are not tokenized at all!

Richard.

Re: Bug in loading a tkn file
Post by SarmedNafi on May 1st, 2016, 11:14am

Thank you, Richard, for your detailed reply.

The results that needed to be sent to the mainwin came to 300 lines.

The Unicode feature in LBB does not help with variable names. However, it is not a big problem; thanks a lot for LBB.

Regards

Re: Bug in loading a tkn file
Post by BrianM on May 1st, 2016, 1:18pm

Richard,
Getting back to decoding tkn files: a line containing just a line number in a tkn file is created from a quote (') comment,
e.g.

goto 100
...
100 ' a comment

results in the line
100
in the tkn file (actually no code is generated, just an entry in the labels table). It is not uncommon in legacy code to goto or gosub to a comment statement.

REM comments don't suffer from this as the REM keyword survives in the tkn file and hence generates code but the comment text is stripped.

For info: When decoding a tkn file, the processing for line numbers is the same as for character labels.

Note that line numbers in LB are really strings, not numbers. Line numbers 10 and 010 can both be used, and they are different.

Brian


Re: Bug in loading a tkn file
Post by Richard Russell on May 1st, 2016, 4:05pm

on May 1st, 2016, 1:18pm, BrianM wrote:
When decoding a tkn file, the processing for line numbers is the same as for character labels.

Clearly that's not the case in LBB, because otherwise the bug you reported (lines having only a line number being omitted, but lines having only a label not being omitted) would not occur! Do you have some 'inside knowledge' of how TKN files are encoded?

Quote:
Note that line numbers in LB are really strings and not numeric. Line numbers 10 and 010 can be both used and are different.

In LBB line numbers are numeric (so 10 and 010 are the same) but labels are strings (so [10] and [010] are different). Internally they are handled quite independently, and indeed GOTO [100] is likely to execute much more quickly than GOTO 100! This arises because in BBC BASIC line numbers and labels are entirely different too.
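
The difference between the two models can be sketched outside BASIC. The Python below is only an analogy, not LB or LBB code: a string-keyed table stands in for the string behaviour described for LB earlier in the thread, and integer conversion stands in for numeric line numbers. The routine names are placeholders.

```python
# String-keyed model: "10" and "010" are distinct targets.
string_targets = {"10": "first routine", "010": "second routine"}

# Numeric model: both spellings convert to the same integer,
# so the two definitions end up sharing a single key.
numeric_targets = {}
for text, routine in string_targets.items():
    numeric_targets.setdefault(int(text), []).append(routine)

print(len(string_targets))   # number of distinct string targets
print(numeric_targets)       # numeric keys and the routines that map to them
```

In the string model the two spellings stay separate; in the numeric model they collapse to one line number.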

As I alluded to before, I would have no idea how to modify the TKN decoder routine in LBB to fix the issue you reported. Specifically I don't know how lines containing only a line number are encoded in a TKN file. I would still contend that the circumstances in which the bug manifests itself are so rare - particularly in a program that has been 'compiled' to a TKN file - that it's of little importance.

Richard.

Edit: According to the LB Help documentation "Spaces and numbers are not allowed as part of branch label names" and the example is given of [1moreTime] as an invalid label. In practice it seems that digits are allowed, but perhaps this indicates that line numbers were once handled differently.


Re: Bug in loading a tkn file
Post by BrianM on May 1st, 2016, 9:19pm

Richard

I have no inside information. Many years ago (before LBB) I stumbled across a program on the internet which decoded tkn files into bas files. It is rather convoluted and far from correct. I converted it from QuickBASIC to Liberty BASIC, and I occasionally try to improve it, as I am not entirely confident that my program is 100% correct. Out of interest I compared the output from my program with that of LBB, and it revealed deficiencies in both programs. I had thought that the LBB tkn decoder was based on the same program.

Line numbers in LB are strings, but it seems they are numeric in LBB. Try running this program in both LB and LBB.

gosub 10
gosub 010
end
10 print "10"
return
010 print "010"
return

I am not expecting you to modify the LBB IDE and as you say it is unlikely to be a problem with recent code. I am just reporting my findings. I am also happy to share with you my understanding of the tkn file format but it is not appropriate to do so in public.

Brian

Re: Bug in loading a tkn file
Post by Richard Russell on May 1st, 2016, 9:43pm

on May 1st, 2016, 9:19pm, BrianM wrote:
I had thought that the LBB tkn decoder was based on the same program.

No, I 'reverse engineered' the TKN format myself; the decoder is not based on anybody else's work. I think I know what program you are referring to, but I only came across it later (the User Name and Password decoding features that I subsequently added may have come from that program).

As far as I can see from my code (which was written several years ago) the presence of square brackets around branch labels is the only way they are recognised as such, so line numbers fundamentally won't be seen. I do not know how they are encoded.
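
A purely illustrative Python sketch of why a bracket-only test misses line numbers. The list of decoded names and the function are hypothetical; the real TKN layout is not published and is not reproduced here.

```python
def is_branch_label(name):
    """Bracket test: treat a name as a label only if it is wrapped in [ ]."""
    return len(name) > 2 and name.startswith("[") and name.endswith("]")

# Hypothetical names recovered from a file; "10" and "100" are line numbers.
decoded_names = ["[loop]", "10", "[quit]", "100"]

recognised = [n for n in decoded_names if is_branch_label(n)]
missed = [n for n in decoded_names if not is_branch_label(n)]

print(recognised)
print(missed)
```

With a test of this kind, bare line numbers never qualify as labels, so a line consisting only of a line number would be dropped.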

Quote:
Line numbers in LB are strings, but it seems they are numeric in LBB

I explained in my recent post how labels and line numbers are handled entirely differently in LBB. It cannot be adapted to be compatible with LB in this respect, because it is constrained by the way BBC BASIC works (line numbers are encoded as 16-bit binary integers, labels are represented as strings).

Richard.