    Now, having seen a correct response when nothing matches
    the idiotic input string "foo", let's take a look at input
    that does match. I'll start with a defined word. How about
    "FIND", since we know that exists.

    I'm not feeling particularly clever tonight. So I'm just
    going to step through everything and cut down the log to
    the good stuff.

    The "FIND" below is me typing "FIND" when the interpreter
    requests input.
    
(gdb) run
code_INTERPRET () at nasmjf.asm:209
_WORD.skip_non_words () at nasmjf.asm:308
_KEY () at nasmjf.asm:349
_KEY.get_more_input () at nasmjf.asm:359
    ...
364         int 0x80                ; syscall!
(gdb)
FIND

    Now let's see if FIND can find itself. :-)

code_INTERPRET () at nasmjf.asm:212
_FIND () at nasmjf.asm:485
485         push esi                ; _FIND! Save esi, we'll use this reg for string comparison
488         mov edx,[var_LATEST]    ; LATEST points to name header of the latest word in the diction
    
    As before, we alternate between .test_word and .prev_word
    for every word in the dictionary, starting with the latest
    and using the stored pointers as a linked list until we
    find a match or the beginning of the list.

_FIND.test_word () at nasmjf.asm:490
_FIND.prev_word () at nasmjf.asm:517
    ...
490             test edx,edx            ; NULL pointer?  (end of the linked list)
491         je .not_found
496         xor eax,eax
497         mov al, [edx+4]           ; al = flags+length field
498         and al,(F_HIDDEN|F_LENMASK) ; al = name length
499         cmp cl,al        ; Length is the same?
500         jne .prev_word          ; nope, try prev
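
    The walk above can be sketched in a few lines of Python. This
    is only a model, not the real code: `mem` is a stand-in byte
    array, `add_word` is a made-up helper, and the flag values are
    assumed to match JonesForth. The header layout (4-byte link,
    then a flags+length byte, then the name) is taken from the
    [edx+4] and [edx+5] offsets in the trace.

```python
import struct

F_HIDDEN = 0x20    # assumed value, as in JonesForth
F_LENMASK = 0x1F   # assumed value, as in JonesForth

def add_word(mem, latest, name, flags=0):
    """Hypothetical helper: append [link:4][flags+len:1][name], return its address."""
    while len(mem) % 4:              # keep headers 4-byte aligned
        mem.append(0)
    addr = len(mem)
    mem += struct.pack("<I", latest) # link to the previously defined word
    mem.append(flags | len(name))
    mem += name
    return addr

def find(mem, latest, name):
    """Follow the links from `latest`; return the header address, or 0 if not found."""
    edx = latest
    while edx != 0:                                  # test edx,edx / je .not_found
        al = mem[edx + 4] & (F_HIDDEN | F_LENMASK)   # a hidden word can never match
        if al == len(name) and mem[edx + 5:edx + 5 + al] == name:
            return edx                               # FOUND!
        edx = struct.unpack_from("<I", mem, edx)[0]  # .prev_word: follow the link
    return 0

mem = bytearray(4)   # address 0 is reserved so that 0 can mean NULL
latest = 0
addrs = {}
for w in (b"DROP", b"FIND", b"EXIT"):
    latest = addrs[w] = add_word(mem, latest, w)

print(find(mem, latest, b"FIND") == addrs[b"FIND"])  # a defined word
print(find(mem, latest, b"foo"))                     # nothing matches
```

    Keeping F_HIDDEN in the mask means a hidden word's "length"
    comes out larger than any real name length, so the length
    test alone rejects it.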
    
    The length of the word matched, so now we'll check the
    actual name string.

    The key to understanding the comparison is knowing that
    cmpsb implicitly uses the esi and edi registers as pointers
    to the data to compare. The repe prefix stands for
    "repeat while equal" and is a modifier for the cmpsb
    instruction.
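
    To make the implicit registers concrete, here is a small
    Python model of what repe cmpsb does. It is a sketch under
    assumptions: the direction flag is clear (pointers count up),
    `mem` is a stand-in byte array, and the function name is made
    up for the example.

```python
def repe_cmpsb(mem, esi, edi, ecx):
    """Model `repe cmpsb`; return (esi, edi, ecx, zf) as left afterwards."""
    # With ecx == 0 the real instruction leaves ZF untouched.
    # This model simply assumes ZF was already set in that case.
    zf = True
    while ecx != 0:
        zf = mem[esi] == mem[edi]   # cmpsb: compare one byte, set flags
        esi += 1                    # ...and advance both pointers
        edi += 1
        ecx -= 1                    # repe: count down
        if not zf:                  # repe: stop at the first mismatch
            break
    return esi, edi, ecx, zf

mem = bytearray(b"FIND....FIND....FINE")
print(repe_cmpsb(mem, 0, 8, 4))    # (4, 12, 0, True)   "FIND" vs "FIND"
print(repe_cmpsb(mem, 0, 16, 4))   # (4, 20, 0, False)  "FIND" vs "FINE"
```

    Both calls end with ecx at 0, so only the zero flag tells a
    match from a mismatch on the last byte. All three registers
    come back changed, which is why the caller has to save
    whichever ones it still needs.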
    
    I've always been in the RISC camp (versus CISC), because
    I love systems that compose from "simple" pieces. But I
    have to admit that these "string" operations in x86
    do make a lot of sense. After dwelling on this a bit
    last night, I think I had a dream where I designed some
    new hardware (like Ben Eater's 8-bit breadboard CPU or
    was it an FPGA?) and I was coming up with a new
    instruction set architecture (ISA) that was strictly
    "complex" instructions like these string operations,
    but I think it was also inspired by the array languages
    like APL and J.  Anyway, I've abandoned my simplistic
    noob stance on RISC: there are lots of different kinds
    of simple and RISC trades one kind for another.

    Let's see a repe cmpsb in action...
    
503         push ecx                ; Save the length
504         push edi                ; Save the address (repe cmpsb will move this pointer)
_FIND.test_word () at nasmjf.asm:505
505         lea esi,[edx+5]         ; Dictionary string we are checking against.
506         repe cmpsb              ; Compare the strings.
507         pop edi
_FIND.test_word () at nasmjf.asm:508
508         pop ecx
_FIND.test_word () at nasmjf.asm:509
509         jne .prev_word          ; nope, try prev
512         pop esi
_FIND.test_word () at nasmjf.asm:513
513         mov eax,edx
514         ret                     ; FOUND!

    Yay! We've got a match on the word.
    
    Now back in INTERPRET, we have to do another comparison to
    check the return value. Since FIND returned a non-zero
    address (a match), we can now act upon the matched word.
    
code_INTERPRET () at nasmjf.asm:215
215         test eax,eax            ; Found?
216         jz .try_literal
219         mov edi,eax             ; edi = dictionary entry YES WE HAVE MATCHED A WORD!!!
220         mov al,[edi+4]          ; Get name+flags.
221         push ax                 ; Just save it for now.
code_INTERPRET () at nasmjf.asm:222
222         call _TCFA              ; Convert dictionary entry (in %edi) to codeword pointer.
   
    So TCFA is the internal label for the Forth word
    ">CFA", which I read as "To CFA". Jones guesses that CFA
    probably means "Code Field Address". Its job is
    to take the given pointer to a word and return a
    pointer to the word's codeword. Neat.
   
_TCFA () at nasmjf.asm:386
386         xor eax,eax
387         add edi,4               ; Skip link pointer.
388         mov al,[edi]            ; Load flags+len into %al.
389         inc edi                 ; Skip flags+len byte.
390         and al,F_LENMASK        ; Just the length, not the flags.
391         add edi,eax             ; Skip the name.
392         add edi,3               ; The codeword is 4-byte aligned.
393         and edi,-3
394         ret
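
    The same address arithmetic as a minimal Python sketch (not
    the real code: `mem` is a stand-in byte array and F_LENMASK =
    0x1F is assumed from JonesForth). One thing the sketch makes
    visible: the usual round-up-to-4 idiom masks with ~3, which
    is -4, while the trace shows -3. The `mask` parameter exists
    only to compare the two. With a 4-byte aligned header and a
    4-letter name like FIND, both masks give the same address, so
    this particular run would not show a difference.

```python
F_LENMASK = 0x1F   # assumed value, as in JonesForth

def tcfa(mem, edi, mask=~3):
    """Header address -> codeword address, mirroring the _TCFA steps."""
    edi += 4                        # skip link pointer
    length = mem[edi] & F_LENMASK   # flags+len byte, keep just the length
    edi += 1                        # skip flags+len byte
    edi += length                   # skip the name
    return (edi + 3) & mask         # round up to a 4-byte boundary

mem = bytearray(32)
mem[20] = 4    # header at address 16, name length 4 (like FIND)
print(tcfa(mem, 16), tcfa(mem, 16, mask=-3))   # 28 28
mem[20] = 3    # same header, but a 3-letter name
print(tcfa(mem, 16), tcfa(mem, 16, mask=-3))   # 24 25
```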
   
    Then we return to INTERPRET again now that the edi
    register contains the address of the matched word's
    codeword (in this case, the codeword for FIND).

    The word can be flagged immediate and/or the interpreter
    can be in executing state.

    In this case, the word is NOT immediate.
   
code_INTERPRET () at nasmjf.asm:223
223         pop ax
code_INTERPRET () at nasmjf.asm:224
224         and al,F_IMMED          ; is IMMED flag set?
225         mov eax,edi
226         jnz .execute            ; If IMMED, jump straight to executing.

    We ARE in executing state.

227         jmp .check_state
code_INTERPRET.check_state () at nasmjf.asm:238
238         mov edx,[var_STATE]
239         test edx,edx
240         jz .execute             ; Jump if executing.
code_INTERPRET.execute () at nasmjf.asm:253
253         mov ecx,[interpret_is_lit] ; Literal?
254         test ecx,ecx               ; Literal?
255         jnz .do_literal
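
    The branching for a dictionary word boils down to a small
    decision, sketched here in Python. The function name and the
    returned strings are made up, F_IMMED = 0x80 is assumed from
    JonesForth, and the "compile" branch is an assumption too,
    since this trace never reaches it. The literal check at
    .execute is left out because the sketch only covers words
    found in the dictionary.

```python
F_IMMED = 0x80   # assumed value, as in JonesForth

def next_step(entry, flags_len, state):
    """Name the branch INTERPRET takes after FIND returns `entry`."""
    if entry == 0:
        return "try_literal"     # test eax,eax / jz .try_literal
    if flags_len & F_IMMED:
        return "execute"         # jnz .execute: immediate words always run
    if state == 0:
        return "execute"         # .check_state: jz .execute
    return "compile"             # assumption: not exercised in this trace

# This run: FIND was found, is not immediate, and STATE is 0 (executing).
print(next_step(entry=16, flags_len=4, state=0))
```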
   
    To execute the matched word, we simply jump
    to the code address...

259         jmp [eax]

    ...and now we're executing FIND, just as expected.
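
    One detail worth spelling out is that jmp [eax] is an
    indirect jump: eax holds the address of the codeword, and the
    codeword in turn holds the address of the code to run. A toy
    Python model, with two dicts standing in for memory and for
    machine code, and with addresses invented for the example:

```python
def jmp_indirect(mem32, code, eax):
    """Model `jmp [eax]`: read an address from memory, then go there."""
    target = mem32[eax]      # [eax]: fetch the 32-bit value stored at eax
    return code[target]()    # jmp: continue at the address just fetched

# Hypothetical addresses, chosen only for this example.
CODEWORD_OF_FIND = 0x0804A01C
CODE_FIND = 0x08049120

mem32 = {CODEWORD_OF_FIND: CODE_FIND}
code = {CODE_FIND: lambda: "code_FIND"}

landed_at = jmp_indirect(mem32, code, CODEWORD_OF_FIND)
print(landed_at)
```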
   
code_FIND () at nasmjf.asm:478
478         pop ecx                 ; length of word
479         pop edi                 ; buffer with word
480         call _FIND
_FIND () at nasmjf.asm:485
485         push esi                ; _FIND! Save esi, we'll use this reg for string comparison
    ...

    I'm not even sure what FIND is looking for now
    since I didn't bother examining any memory during
    the rest of the run. I was just happy to see the
    interpreter finding and executing the requested
    word!

    Eventually it got through the linked list and didn't
    match anything.

_FIND.test_word () at nasmjf.asm:490
490             test edx,edx            ; NULL pointer?  (end of the linked list)
491         je .not_found
_FIND.not_found () at nasmjf.asm:521
521         pop esi
_FIND.not_found () at nasmjf.asm:522
   
    And then Forth exited normally, having run out
    of code (the interpreter does not yet loop, so
    it always exits after the first bit of input).
   
190             mov eax, 1    ; exit syscall
191         int 80h       ; call kernel
[Inferior 1 (process 2531) exited normally]

    Next will be handling numeric literals.