    Well, log05.txt ended with some great excitement. I
    double-checked and all of the open TODOs are now closed.

    So I think I'll dip into design-notes.txt and pick the
    next thing to do.

    I just remembered one thing: I need to _remove_ a
    feature. The return stack doesn't need to be a stack at
    all because my "inline all the things!" language can't
    have nested word calls anyway:

        [ ] Replace return stack with single addr

    So, that's not super rewarding, but I do enjoy deleting
    unneeded code.
    
    Oh, I know which feature I'm doing after that! Time to
    reward myself for staying on track with something fun
    and visual:

        [ ] Pretty-print meta info about word!
        [ ] Loop through dictionary, list all names
        [ ] Loop through dictionary, pretty-print all

    Next night: De-evolving the return mechanism for
    immediate word calling was easy, so that one's done.

    Now for the fun ones.
    
    I'm more of a strings programmer than a numbers
    programmer. So the ultra-primitive state of my string
    printing is a bit of a bummer. Before I start storing a
    billion little pieces of strings in the DATA segment,
    I'd like to consider adding some convenience words for
    string handling.

    It would be nice to have, at the very least, string
    literals in the language.

        [ ] Add string literals.
        [ ] Re-define 'meow' using a string literal.

    I like the idea of just writing "anonymous" strings to
    be printed into the dictionary space where all the words
    are. And I think my choice to null-terminate my strings
    will pay off here (I hope).

    Adding immediate mode strings that are just references
    to the input buffer turned out to be super easy:
    
        ; IMMEDIATE version of " scans forward until it finds end
        ; quote '"' character in input_buffer and replaces it with
        ; the null terminator. Leaves start addr of string on the
        ; stack. Use it right away!
        DEFWORD quote
            mov ebp, [input_buffer_pos]
            inc ebp ; skip initial space
            push ebp ; we leave this start addr on the stack
        .look_for_endquote:
            inc ebp
            cmp byte [ebp], '"' ; endquote?
            jne .look_for_endquote ; nope, loop
            mov byte [ebp], 0   ; replace endquote with null
            inc ebp ; move past the new null terminator
            mov [input_buffer_pos], ebp ; save position
        ENDWORD quote, '"', (IMMEDIATE)
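
    In case the register shuffling hides what's going on,
    here's the same scan as a small Python model. It's only
    a sketch for illustration: Python isn't part of Meow5,
    and the names quote_immediate and c_string (plus the
    bytearray standing in for input_buffer) are made up.
    Like the assembly, it assumes the closing quote really
    is in the buffer.

```python
def quote_immediate(buf: bytearray, pos: int) -> tuple:
    """Model of the IMMEDIATE quote word.

    pos plays the role of input_buffer_pos: it sits on the space right
    after the quote token. Returns (start_addr, new_pos).
    """
    pos += 1                      # inc ebp: skip initial space
    start = pos                   # push ebp: start addr left on the stack
    while True:
        pos += 1                  # inc ebp
        if buf[pos] == ord('"'):  # endquote?
            break
    buf[pos] = 0                  # replace endquote with null
    pos += 1                      # move past the new null terminator
    return start, pos


def c_string(buf: bytearray, addr: int) -> str:
    """Read the null-terminated string that starts at addr."""
    return buf[addr:buf.index(0, addr)].decode("ascii")


buf = bytearray(b' " Hello world!" print newline exit \x00')
start, pos = quote_immediate(buf, 2)
print(c_string(buf, start))       # Hello world!
```

    Note that the model, like the loop it copies, bumps the
    position before the first compare, so the first
    character of the string is never tested for a quote.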
    
    And now I can do my first legit Hello World:

        db ' " Hello world!" print newline exit '

    Which works just fine:

$ mr
Hello world!

    But since it just saves a reference to the input buffer,
    real world usage won't really be safe. Unless the input
    buffer is limitless, I have no idea if the string address
    will still be valid by the time I try to use it.

    For that reason, I'm gonna have to copy any strings from
    the input buffer to somewhere.
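
    To make that worry concrete, here's a tiny Python model
    of a stale reference versus a copy. It's a sketch, not
    Meow5 code: read_line, copy_string and string_area are
    hypothetical names, and it assumes new input simply
    overwrites the buffer from the start.

```python
def c_string(mem: bytearray, addr: int) -> str:
    """Read the null-terminated string that starts at addr."""
    return mem[addr:mem.index(0, addr)].decode("ascii")


def read_line(input_buffer: bytearray, line: bytes) -> None:
    """Pretend more input arrived: it lands at the start of the buffer."""
    input_buffer[:len(line)] = line


def copy_string(dest: bytearray, free: int, src: bytearray, addr: int) -> int:
    """Copy a null-terminated string to dest at 'free'; return the new 'free'."""
    end = src.index(0, addr) + 1              # include the null
    dest[free:free + end - addr] = src[addr:end]
    return free + end - addr


input_buffer = bytearray(32)
string_area = bytearray(32)                   # hypothetical safe place

read_line(input_buffer, b"Hello\x00")
ref = 0                                       # just an address into input_buffer
free = copy_string(string_area, 0, input_buffer, ref)

read_line(input_buffer, b"Goodbye\x00")       # the buffer gets reused
print(c_string(input_buffer, ref))            # Goodbye (the reference went stale)
print(c_string(string_area, 0))               # Hello   (the copy survived)
```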
    
    I could either have a special-purpose buffer just for
    storing strings, or I could write to the stack, or I
    could write to the compile area.

    The other thing that's really messing with my mind is
    trying to think ahead (probably way too much) towards
    how I might handle this stuff in a stand-alone
    executable program produced by Meow5...which, now that
    I've written it out, is DEFINITELY thinking too far
    ahead.

    Next night: Moving on, I've also decided that I should
    extract the part of 'get_token' that eats any initial
    space characters (or other whitespace) out into its own
    word.

        [ ] New word: 'eat_spaces'

    That will allow me to use it to "peek ahead" if I
    want to in the outer interpreter and possibly switch
    into a "string mode" (which is something I'm
    contemplating). But all these paragraphs are me getting
    way ahead of myself. Back to the assembly!
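
    Here's a rough Python sketch of eating spaces and then
    peeking ahead. The function names and the exact set of
    whitespace bytes are assumptions for illustration; the
    real word works on input_buffer in assembly.

```python
WHITESPACE = b" \t\r\n"   # assumption: which bytes count as whitespace


def eat_spaces(buf: bytes, pos: int) -> int:
    """Return the position of the first non-whitespace byte at or after pos."""
    while pos < len(buf) and buf[pos] in WHITESPACE:
        pos += 1
    return pos


def peek_is_quote(buf: bytes, pos: int) -> bool:
    """Skip whitespace, then check (without consuming) for a double quote."""
    pos = eat_spaces(buf, pos)
    return pos < len(buf) and buf[pos] == ord('"')


line = b'   "Hi" print'
print(eat_spaces(line, 0))      # 3
print(peek_is_quote(line, 0))   # True
```

    Because eat_spaces only returns a position, the caller
    is free to look at the next byte and then decide which
    mode to continue in.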
   
    Okay, done. I had just one mistake, but GDB was a clumsy
    way to debug it. So I added some more print debugging,
    leading to this extremely verbose output once it worked:

$ mr
Running ":"
Inlining "meow"
Inlining "meow"
Inlining "meow"
Inlining "meow"
Inlining "meow"
Running ";"
Running "meow5"
Meow. Meow. Meow. Meow. Meow. Running "newline"

Running "exit"

    I'll comment those out for now, but I'm betting I'll be
    using them again soon.

$ mr
Meow. Meow. Meow. Meow. Meow.

    There we are, good as new.
   
    Next night: So while it's true that I could save strings
    (and other data) in a variety of clever places, my
    understanding is that modern CPUs do much better with
    separate instruction and data memory.

    So I'm gonna say for now that there will be three types
    of memory in Meow5:

        1. The stack for all register-sized parameters
        2. The "compile area" where all inlined words go
        3. The "data area" where all variables and other
           data (such as "anonymous" strings) will go.

    In fact, I'm gonna name #2 and #3 exactly like that:

        section .bss

        ...

        compile_area: resb 1024
        data_area:    resb 1024

        here: resb 4
        free: resb 4

    Where 'here' points to the next free spot in the
    compile_area (the 'here' name comes from Forth).

    And 'free' points to the next free spot in the
    data_area.
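
    As a mental model, the two areas behave like a pair of
    bump allocators. A minimal Python sketch, assuming
    offsets instead of real addresses; the Memory class and
    its method names are invented for illustration and
    aren't Meow5 code:

```python
class Memory:
    """Two separate areas, each with its own 'next free spot' pointer."""

    def __init__(self, compile_size: int = 1024, data_size: int = 1024):
        self.compile_area = bytearray(compile_size)
        self.data_area = bytearray(data_size)
        self.here = 0   # next free spot in compile_area
        self.free = 0   # next free spot in data_area

    def compile_bytes(self, code: bytes) -> int:
        """Append machine code at 'here'; return where it was written."""
        start = self.here
        if start + len(code) > len(self.compile_area):
            raise MemoryError("compile_area is full")
        self.compile_area[start:start + len(code)] = code
        self.here = start + len(code)
        return start

    def store_string(self, text: bytes) -> int:
        """Append text plus a null terminator at 'free'; return its address."""
        data = text + b"\x00"
        start = self.free
        if start + len(data) > len(self.data_area):
            raise MemoryError("data_area is full")
        self.data_area[start:start + len(data)] = data
        self.free = start + len(data)
        return start


m = Memory()
print(m.store_string(b"Meow."), m.free, m.here)   # 0 6 0
```

    Storing a string moves only 'free', and compiling code
    moves only 'here', so the two kinds of bytes never
    interleave.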
   
    And I'm gonna go against the Forth grain and add a
    special handler for quote syntax. I'll go ahead and peek
    at the next character of input. If it's a quote, I'll
    handle the rest as a string. Otherwise, keep processing
    tokens as usual.

    The word is called 'quote' instead of '"' and I'm going
    to call it explicitly in my outer interpreter.

    The point of this is to allow "normal looking" strings
    like this:

        "Hello world"

    Rather than requiring a token delimiter after the '"'
    word as in traditional Forth:

        " Hello world"

    Between that and copying the string from the input
    buffer to a new variable space, the change in my
    immediate mode hello world is just the missing space,
    but it's a world of difference:

        db ' "Hello World!" print newline exit '

    Does it work?

$ mr
Hello World!
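
    The peek-then-branch idea, sketched in Python. This is
    an illustration under assumptions, not the actual outer
    interpreter: tokenize is a made-up name, and it assumes
    every opening quote has a closing one.

```python
def tokenize(source: str):
    """Yield ('str', text) for quoted strings and ('word', name) otherwise."""
    pos = 0
    n = len(source)
    while True:
        while pos < n and source[pos].isspace():   # eat spaces
            pos += 1
        if pos >= n:
            return
        if source[pos] == '"':                     # peek: it's a string
            end = source.index('"', pos + 1)       # assumes a closing quote
            yield ("str", source[pos + 1:end])
            pos = end + 1
        else:                                      # ordinary token
            start = pos
            while pos < n and not source[pos].isspace():
                pos += 1
            yield ("word", source[start:pos])


for token in tokenize(' "Hello World!" print newline exit '):
    print(token)
```

    The string case never goes through the normal token
    path, which is why no space is needed after the opening
    quote.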
   
    Compile mode is exactly the same (I'll put the string in
    the data_area at compile time), but instead of pushing
    the address of the string to the stack right at that
    moment, I need to inline (or "compile") the machine
    code to push the address *when the word being compiled
    runs*!

    To do that, I need to actually "assemble" the i386
    opcode to push the 32-bit address onto the stack.

    So that'll be the "PUSH imm32" instruction in Intel
    documentation parlance.

    Handy reference: https://www.felixcloutier.com/x86/push
   
        6A <ib> PUSH imm8
        66 68 <iw> PUSH imm16
        68 <id> PUSH imm32

    And I'm gonna test that out with NASM and GDB:

        push byte  0x99
        push word  0x8888
        push dword 0x77777777

    disassembles as:

0x0804942d <+0>:	6a 99	push   $0xffffff99
0x0804942f <+2>:	66 68 88 88	pushw  $0x8888
0x08049433 <+6>:	68 77 77 77 77	push   $0x77777777
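
    The three encodings can be double-checked by decoding
    the bytes GDB printed. A small Python sketch (decode_push
    is a made-up helper, and it only knows these three forms
    as they appear in 32-bit mode):

```python
import struct


def decode_push(code: bytes) -> tuple:
    """Decode one of the three PUSH-immediate forms shown above."""
    if code[0] == 0x6A:                        # PUSH imm8, sign-extended
        (value,) = struct.unpack("<b", code[1:2])
        return ("push", value & 0xFFFFFFFF)
    if code[0] == 0x66 and code[1] == 0x68:    # operand-size prefix + PUSH imm16
        (value,) = struct.unpack("<H", code[2:4])
        return ("pushw", value)
    if code[0] == 0x68:                        # PUSH imm32
        (value,) = struct.unpack("<I", code[1:5])
        return ("push", value)
    raise ValueError("not a PUSH immediate")


for raw in ("6a99", "66688888", "6877777777"):
    name, value = decode_push(bytes.fromhex(raw))
    print(name, hex(value))
```

    The first case is why GDB shows $0xffffff99 for
    'push byte 0x99': the 8-bit immediate is sign-extended
    to 32 bits before it's pushed.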
   
    Bingo! So I'm going to want opcode 0x68 followed by
    the address value.

        mov edx, [here]
        mov byte [edx], 0x68     ; i386 opcode for PUSH imm32
        mov dword [edx + 1], ebx ; address of string
        add edx, 5               ; update here
        mov [here], edx          ; save it
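
    Here's the same five-byte write modeled in Python,
    mostly to show the byte order. It's a sketch:
    compile_push_imm32 is a hypothetical name, 'here' is an
    offset into a bytearray rather than a real address, and
    the address being pushed is just an example value.

```python
import struct


def compile_push_imm32(compile_area: bytearray, here: int, addr: int) -> int:
    """Write PUSH imm32 (0x68 + 4-byte little-endian addr) at 'here'.

    Returns the updated 'here' (5 bytes further along).
    """
    compile_area[here] = 0x68                                  # opcode
    compile_area[here + 1:here + 5] = struct.pack("<I", addr)  # imm32
    return here + 5


area = bytearray(16)
here = compile_push_imm32(area, 0, 0x0804A114)   # example address only
print(area[:here].hex(" "))                      # 68 14 a1 04 08
```

    The address comes out "backwards" in memory because x86
    stores 32-bit values little-endian, which is exactly
    what the mov dword does for free.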
   
    Well, here goes nothing...

        db ': meow "Meow." print ; meow newline exit '

    There's no way that's gonna work...

$ mr
Running ":"
Inlining "print"
Running ";"
Running "meow"
Meow.Running "newline"

Running "exit"

    What?! It worked!

    As you can see, I had also turned my debugging
    statements back on 'cause I was expecting trouble. They
    help assure me that this is, in fact, compiling a word
    called 'meow' that prints a string stored in memory at
    compile time. I'll turn the debugging off again.

    And while I'm at it, I'll remove the old assembly test
    'meow' word and define it like this in order to create
    the 'meow5' word.
   
        input_buffer:
            db ': meow "Meow." print ; '
            db ': meow5 meow meow meow meow meow ; '
            db 'meow5 '
            db 'newline '
            db 'exit',0

./build.sh: line 33:  2650 Segmentation fault      ./$F

    Aw man.

    Okay, where are we crashing?

(gdb) r
Starting program: /home/dave/meow5/meow5
Running ":"
Inlining "print"
Running ";"
Running ":"
Inlining "meow"
Inlining "meow"
Inlining "meow"

Program received signal SIGSEGV, Segmentation fault.
find.test_word () at meow5.asm:165
165	    and eax, [edx + T_FLAGS] ; see if mode bit is set...

    Hmmm. Weird that it dies while trying to find the fourth
    'meow' to inline. I bumped up the compile area memory to
    4kb and it wasn't that. So I guess I'll be stepping
    through this.
   
    Three nights later (I think): I did step through it
    quite a bit with GDB, but this thing is getting to the
    point where it feels like there's a pretty big mismatch
    between GDB's strengths (stepping through C) and this
    crazy machine code concatenation I'm doing.

    I've always preferred "print debugging" anyway. So I've
    made what I think is a neat little DEBUG print macro. It
    takes a string and an expression to print as a 32-bit hex
    number. The expression is anything that would be valid
    as the source for a MOV to a register: mov eax, <expr>.

    Examples:

        DEBUG "Value in eax: ", eax
        DEBUG "My memory: ", [mem_label]
        DEBUG "32 bits of glory: ", 0xDEADBEEF

    Since the segfault is happening after a fourth iteration
    of inline, I feel almost certain that this is a memory
    clobbering problem. But all my data areas seem more than
    big enough, so there must be a bug.

    I've peppered 'inline' and 'find' (where the actual
    crash takes place) with DEBUG statements. Here's a
    sampling:
   
Start [here]: 0804a280
Start [last]: 08049ad8
find [last]: 08049ad8
find edx: 08049ad8
find [edx]: 08049a3b
find [edx+T_FLAGS]: 00000003
...
Running ":"
...
Inlining "print"
...
...
Running ";"
semicolon end of machine code [here]: 0804a2a5
  inline to [here]: 0804a2a5
  inline len: 00000007
  inline from: 0804961a
  inline done, [here]: 0804a2ac
semicolon tail [here]: 0804a2ac
semicolon linking to [last]: 08049ad8
semicolon done with [last]: 0804a2ac
                    [here]: 0804a264
find [last]: 0804a2ac
...
Running ":"
...
Inlining "meow"
  inline to [here]: 0804a264
  inline len: 00000025
  inline from: 0804a280
  inline done, [here]: 0804a289
...
Inlining "meow"
  ...
  inline done, [here]: 0804a2ae
Inlining "meow"
  ...
  inline done, [here]: 0804a2d3
find [last]: 0804a2ac
find edx: 0804a2ac
find [edx]: 595a5a80
find [edx+T_FLAGS]: 0004b859
find [last]: 0804a2ac
find edx: 595a5a80
   
    Even viewing exactly what I want to see, all of these
    addresses are still enough to make me go cross-eyed.

    So immediately after compiling a new word, I should have
    this:

        (word's machine code)
        tail:
            link: 0x0804____     <-- [last] points here
            (offsets and flags)
        end of tail              <-- [here] points here

    The [last] address should point to the tail of the last
    compiled word and [here] should point to the next
    available free space in the compile_area.
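
    That layout makes the dictionary a linked list threaded
    through the tails. Here's a Python sketch of walking it,
    under a few assumptions that aren't from this log: a
    link of 0 ends the list, addresses are offsets into a
    bytearray, and walk_links and read_u32 are made-up
    names.

```python
import struct


def read_u32(mem: bytearray, addr: int) -> int:
    """Read a little-endian 32-bit value at addr."""
    return struct.unpack_from("<I", mem, addr)[0]


def walk_links(mem: bytearray, last: int, limit: int = 100) -> list:
    """Follow link fields starting at 'last'; return the tails visited."""
    tails = []
    addr = last
    while addr != 0:                       # assumption: a 0 link ends the list
        if len(tails) >= limit or addr + 4 > len(mem):
            raise ValueError(f"bad link: {addr:08x}")
        tails.append(addr)
        addr = read_u32(mem, addr)
    return tails


mem = bytearray(64)
struct.pack_into("<I", mem, 16, 0)         # oldest word: link = 0
struct.pack_into("<I", mem, 40, 16)        # newest word links back to it
print(walk_links(mem, 40))                 # [40, 16]
```

    Overwrite one link with garbage (say 595a5a80) and the
    walk heads off to an address that was never a tail,
    which is roughly what 'find' was doing when it crashed.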
   
    Time to examine the output.

    When Meow5 begins, [last] is pointing to the last word
    created in assembly and [here] is pointing to the very
    beginning of the compile_area:

Start [here]: 0804a280
Start [last]: 08049ad8

    After a run-time word is compiled (such as 'meow'),
    [here] should always be a little larger than [last].

Running ";"
semicolon end of machine code [here]: 0804a2a5
  inline to [here]: 0804a2a5
  inline len: 00000007
  inline from: 0804961a
  inline done, [here]: 0804a2ac
semicolon tail [here]: 0804a2ac
semicolon linking to [last]: 08049ad8
semicolon done with [last]: 0804a2ac
                    [here]: 0804a264

    Which is indeed the case - [here] is a tail's worth of
    bytes after [last]. So far so good.
   
    Then we crash while finding and inlining the 'meow'
    machine code into a new 'meow5' word. Here's the first:

Inlining "meow"
  inline to [here]: 0804a264
  inline len: 00000025
  inline from: 0804a280
  inline done, [here]: 0804a289

    To double-check, I put in even more DEBUG statements in
    'inline':

Inlining "meow"
    word tail: 0804b30c
          len: 00000025
  code offset: 0000002c
       source: 0804b2e0
  dest [here]: 0804b30e
  dest    edi: 0804b30e
   end    edi: 0804b333
   end [here]: 0804b333

    No, that all seems fine. It doesn't look like 'inline'
    is at fault here. But _something_ is making the linked
    list incorrect in the tail:
   
find [last]: 0804b30c
find edx: 0804b30c
find [edx]: 595a5a80  <--- not a valid address

    Sure, semicolon could have a bug...but that should be
    causing the problem immediately, not between inlining
    'meow' the third and fourth times.

    Okay, actually, I think GDB can help me here. I need to
    know when this value in memory is getting clobbered.
    Here's the syntax for watching a specific address. Have
    to cast it - "int" is 32 bits for 32-bit elf and '*'
    tells GDB that our value is a pointer. It's all very
    'C'.

(gdb) watch *(int)0x0804b30c
Hardware watchpoint 1: *(int)0x0804b30c
   
    And let's see what happens:

...
semicolon linking to [last]: 08049dc8
Hardware watchpoint 1: *(int)0x0804b30c

Old value = 0
New value = 134520264
@124.continue () at meow5.asm:454
454	    mov [last], eax ; and store this tail as new 'last'
(gdb) x/x *(int)0x0804b30c
0x8049dc8 <tail_quote>:	0x08049d2b

    Okay, that's a good address. So semicolon is doing the
    right thing so far. Let's continue...

...
Inlining "meow"
    word tail: 0804b30c
          len: 00000025
  code offset: 0000002c
       source: 0804b2e0
  dest [here]: 0804b2e9
  dest    edi: 0804b2e9

Hardware watchpoint 1: *(int)0x0804b30c

Old value = 134520264
New value = 134520192
@36.continue () at meow5.asm:247
247	    rep movsb       ; copy [esi]...[esi+ecx] into [edi]
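
    GDB shows the watched value in decimal, which hides
    what actually changed. A little Python helper (a
    made-up name, just for reading the output above)
    converts both values to hex and reports which of the
    four bytes differ:

```python
def changed_bytes(old: int, new: int) -> list:
    """Indexes of the bytes that differ, 0 being the lowest address."""
    a = old.to_bytes(4, "little")
    b = new.to_bytes(4, "little")
    return [i for i in range(4) if a[i] != b[i]]


old, new = 134520264, 134520192            # from the watchpoint output
print(f"{old:08x} -> {new:08x}")           # 08049dc8 -> 08049d80
print(changed_bytes(old, new))             # [0]
```

    Only the lowest byte changed (c8 to 80). That would fit
    a byte-at-a-time copy having just written its first
    byte over the link when the watchpoint fired.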
   
    Bingo! Well, it *is* inline then. Yeah, clearly it
    is. Ah, I see, but that's the first 'meow' inline. Which
    kinda explains why I missed it.

    So it's gotta be a [here] that wasn't updated
    correctly at some point.

    Wait, has it been staring me in the face this whole
    time?

semicolon done with [last]: 0804b30c
                    [here]: 0804b2c4

    Ah geez. Yeah, [here] should certainly be *after*
    [last]:

                    [last]: 0804b30c <-- 30c (after)
                    [here]: 0804b2c4 <-- 2c4 (before)
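
    With [here] sitting below [last], an inline's
    destination range can run right over the link field. A
    quick Python check using the destination and length
    from the DEBUG output above; copy_hits is a
    hypothetical helper, and the 4-byte link size is
    assumed from the 32-bit addresses:

```python
def copy_hits(dst: int, count: int, addr: int, size: int = 4) -> bool:
    """True if a copy to [dst, dst+count) overlaps the size bytes at addr."""
    return dst < addr + size and addr < dst + count


last = 0x0804B30C          # tail (link field) of 'meow'
dest = 0x0804B2E9          # "dest [here]" from the DEBUG output
length = 0x25              # "len" from the DEBUG output

print(copy_hits(dest, length, last))   # True
print(f"{dest + length:08x}")          # 0804b30e
```

    The copy ends at 0804b30e, two bytes past the start of
    the link at 0804b30c, so the link gets partly
    overwritten.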
   
    Dangit! Okay, so some more DEBUGs:

tail eax: 0804b35c
tail eax: 0804b360
tail eax: 0804b364
tail eax: 0804b368
tail eax: 0804b30c  <-- yup! lost some ground here :-(
tail eax: 0804b310

    Got it! So it was my decision to go against Chuck
    Moore's advice to always have words consume their
    parameters from the stack so you don't have to remember
    which words do and which words don't:

        %macro STRLEN_CODE 0
            mov eax, [esp] ; get string addr (without popping!)
            ...

    Sure enough, I forgot the extra pop to throw away the
    address in this one unique case where I really do just
    need the string length:

        ; Call strlen again so we know how much string name we
        ; wrote to the tail:
        push name_buffer
        STRLEN_CODE
        pop ebx ; get string len pushed by STRLEN_CODE
        pop eax ; get saved 'here' position

    That second pop was getting the name_buffer address I'd
    pushed before STRLEN_CODE.

    And now the novice is enlightened.

    I'll fix that behavior right now and always heed that
    particular bit of advice from here on out!
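
    The mix-up is easy to see with a Python list standing
    in for the stack. This is a sketch with made-up names
    and addresses (tail_pops, 0x1000, 0x2000), and the
    'consume' flag models one reading of the fix, where
    STRLEN_CODE pops its parameter like every other word:

```python
def strlen_code(stack: list, strings: dict, consume: bool) -> None:
    """Push the length of the string whose address is on top of the stack.

    consume=False models the original STRLEN_CODE (peeks at [esp]);
    consume=True models a version that pops its parameter first.
    """
    addr = stack.pop() if consume else stack[-1]
    stack.append(len(strings[addr]))


def tail_pops(saved_here: int, name_buffer: int, strings: dict, consume: bool) -> tuple:
    """Replay the snippet above; return (eax, ebx) after the two pops."""
    stack = [saved_here]                   # 'here' saved earlier
    stack.append(name_buffer)              # push name_buffer
    strlen_code(stack, strings, consume)   # STRLEN_CODE
    ebx = stack.pop()                      # get string len
    eax = stack.pop()                      # get saved 'here' position?
    return eax, ebx


strings = {0x1000: b"meow"}                # made-up address for name_buffer
print(tail_pops(0x2000, 0x1000, strings, consume=False))  # (4096, 4)
print(tail_pops(0x2000, 0x1000, strings, consume=True))   # (8192, 4)
```

    In the first call eax ends up holding the name_buffer
    address (0x1000) instead of the saved 'here' (0x2000),
    which is the "lost some ground" jump in the DEBUG
    output.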
   
    Okay, then it wouldn't find 'meow' after the *second*
    try:

Finding...0804b2fc
meow
find [last]: 0804b369
find edx: 0804b369
find [edx]: 0804a062
   flags okay: 00000001
   finding: 776f656d
   finding: 00776f65
   finding: 0000776f
   finding: 00000077
   finding: 00000000
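
    Those 'finding' values read better as bytes in memory
    order, since x86 is little-endian. A tiny Python helper
    (dword_bytes is a made-up name) shows the name being
    walked one byte at a time:

```python
def dword_bytes(value: int) -> bytes:
    """The 4 bytes of a 32-bit value in memory order (little-endian)."""
    return (value & 0xFFFFFFFF).to_bytes(4, "little")


for value in (0x776F656D, 0x00776F65, 0x0000776F, 0x00000077, 0x00000000):
    print(f"{value:08x}", dword_bytes(value))   # first line: 776f656d b'meow'
```

    So each line is the same 32-bit read, started one byte
    further into "meow" and running on into the zeros
    after it.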
   
    It turns out I had one more problem with my strlen:

        add eax, ebx ; advance 'here' by that amt
        inc eax      ; plus one for the null

    Had to add that last inc because strlen doesn't count
    the null terminator as a character. So why did it find
    'meow' the first time? Because I hadn't yet written
    anything to the compile area, and the "blank" memory
    acted as a terminator, but once I started to inline a
    copy of 'meow' right after 'meow's tail as the
    definition of 'meow5', that null was no longer there!
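
    Here's that off-by-one in miniature, as a Python
    sketch. The names (write_name, c_string) and the 0x68
    byte standing in for the next inlined code are
    assumptions for illustration:

```python
def c_string(mem: bytearray, addr: int) -> str:
    """Read the null-terminated string that starts at addr."""
    return mem[addr:mem.index(0, addr)].decode("ascii")


def write_name(mem: bytearray, here: int, name: bytes, count_null: bool) -> int:
    """Write name plus its null at 'here'; return the advanced 'here'."""
    mem[here:here + len(name) + 1] = name + b"\x00"
    here += len(name)            # add eax, ebx
    if count_null:
        here += 1                # inc eax
    return here


for count_null in (False, True):
    mem = bytearray(16)          # "blank" memory, all zeros
    here = write_name(mem, 0, b"meow", count_null)
    mem[here] = 0x68             # next thing compiled lands at 'here'
    print(c_string(mem, 0))      # meowh, then meow
```

    Without the extra increment, the next byte written
    lands on the terminator and the stored name silently
    grows, so a lookup for 'meow' no longer matches.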
   
    Now I'm gonna remove about two dozen DEBUG statements...

    And will this work?

        input_buffer:
            db ': meow "Meow." print ; '
            db ': meow5 meow meow meow meow meow ; '
            db 'meow5 '
            db 'newline '
            db 'exit',0

    Crossing fingers:

$ mr
Meow.Meow.Meow.Meow.Meow.

    At last!

    Guess pretty-printing the dictionary got super
    delayed, but this was vital stuff. I'll put those todos
    in a new log in just a bit. But leaving this much
    simpler todo for tomorrow night:

        [ ] Factor out a PRINTSTR macro from DEBUG, then use
            it *from* DEBUG and also anywhere else I'm
            currently hard-coding strings in the data
            section and printing them in the interpreter. Go
            ahead and push/pop the 4 registers in that one
            too.  Performance is totally not a concern with
            these convenience macros in the interpreter.
   
    Well, that was even easier than I expected.

    Now to test (I'm using PRINTSTR in DEBUG and
    stand-alone):

        PRINTSTR "Hello world!"
        NEWLINE_CODE

        DEBUG "[here] starting at 0x", [here]

    Run:

$ mr
Hello world!
[here] starting at 0x0804a114
Meow.Meow.Meow.Meow.Meow.

    And replaced all the strings in the data section with my
    PRINTSTR macro - which makes those parts at least 30%
    shorter and MUCH easier to read.
   
    Here's where I'm at with the TODOs in this log:

        [x] Replace return stack with single addr
        [ ] Pretty-print meta info about word!
        [ ] Loop through dictionary, list all names
        [ ] Loop through dictionary, pretty-print all
        [x] Add string literals.
        [x] Re-define 'meow' using a string literal.
        [x] New word: 'eat_spaces'
        [x] Factor out a PRINTSTR macro from DEBUG
            [x] Use it in DEBUG
            [x] Replace data strings + CALLWORD print

    So I'll start the next log where I started this one:
    With the fun dictionary pretty-printer words.

    This log's progress has been better than I'd hoped for
    and I think now I'm in a good position for the fun
    stuff!