    Hello! So it seems to me that there are two major paths
    to decide between for the thing to add next:

        * Control structures (if/else, loops)
        * Write compiled programs to ELF executables

    Both will be challenging. I'm leaning towards ELF at the
    moment. We'll see what I think when I come back tomorrow
    night.
    
    Two nights later: Yup, gonna try to write an ELF
    executable. This is gonna be cool!

    First, I need to test writing a file. Then write the ELF
    header, then the contents of a word.
    
    I'll start by having 'make_elf' take a token (which I'll
    use as the output filename for the executable and,
    later, as the word to get the machine code from).

    Then I'll write the string 'ELF' to that file. (Which is
    very appropriate because bytes 2-4 of a _real_ ELF
    header are that string.)
    
    Next night: So I've got my test 'make_elf' and it is
    supposed to be writing to whatever filename you want:

        make_elf foo

    That should write the string 'ELF' to a file called
    'foo', but it's not. So I've inserted a DEBUG to see
    what the fd returned from 'open' is:

$ mr
make_elf foo
new fd: fffffffe
Goodbye.
Exit status: 0
    
    Yeah, that's definitely an error.

    While looking for how to decode that error (the open(2)
    man page explains the errors, but they're all C mnemonic
    constants, of course), I came across this excellent
    suggestion on SO: https://stackoverflow.com/a/68155464

    Which was to use strace to decode the error for me!
    
$ strace ./meow5
execve("./meow5", ["./meow5"], 0x7fff2d4ec190 /* 60 vars */) = 0
[ Process PID=2579 runs in 32 bit mode. ]
read(0, make_elf foo
"make_elf foo\n", 1024)         = 13
open("foo", O_WRONLY|0xc)               = -1 ENOENT (No such file or directory)
write(1, "new fd: ", 8new fd: )                 = 8
write(1, "fffffffe\n", 9fffffffe
)               = 9
write(-2, "ELF", 3)                     = -1 EBADF (Bad file descriptor)
read(0, "", 1024)                       = 0
write(1, "Goodbye.\n", 9Goodbye.
)               = 9
exit(0)                                 = ?
+++ exited with 0 +++
    
    Huh, so something's wrong with my attempt to open the
    output file with write-only, create, and truncate flags.

    Here's what I'm sending:

        ; From open(2) man page:
        ;   A call to creat() is equivalent to calling open()
        ; with flags equal to O_CREAT|O_WRONLY|O_TRUNC.
        ; I got the flags by searching all of /usr/include and
        ; finding /usr/include/asm-generic/fcntl.h
        ; That yielded (along with bizarre comment "not fcntl"):
        ;   #define O_CREAT   00000100
        ;   #define O_WRONLY  00000001
        ;   #define O_TRUNC   00001000
        ; Hence this flag value for 'open':
        mov ecx, 1101b

    But from the strace above, it looks like it sees
    O_WRONLY and...0xC - which is, indeed 1100...

    Sounds like I've got a mystery for tomorrow night.
    
    Two nights later: I bet somebody out there is
    screaming. Ha ha. Those numbers are in octal, not binary
    (despite looking for all the world like bit flags).
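
    To make the mix-up concrete, here's a minimal Python
    sketch (Python rather than NASM, purely for illustration)
    that compares the binary literal with the octal flag
    values quoted from fcntl.h. The 'describe' helper is
    hypothetical; it only imitates the way strace printed
    the leftover bits:

```python
# Flag values copied from the fcntl.h comment quoted in the log.
# They are octal, so Python's 0o prefix is the right way to spell them.
O_WRONLY = 0o0000001
O_CREAT = 0o0000100
O_TRUNC = 0o0001000


def describe(flags: int) -> str:
    """Name the known flag bits and show any leftover bits in hex,
    roughly the way strace printed O_WRONLY|0xc."""
    names = []
    rest = flags
    for name, bit in (("O_WRONLY", O_WRONLY),
                      ("O_CREAT", O_CREAT),
                      ("O_TRUNC", O_TRUNC)):
        if rest & bit:
            names.append(name)
            rest &= ~bit
    if rest:
        names.append(hex(rest))
    return "|".join(names)


mistaken = 0b1101                         # what 'mov ecx, 1101b' loads
intended = O_WRONLY | O_CREAT | O_TRUNC   # 0o1101, not 0b1101

print(describe(mistaken))   # O_WRONLY|0xc
print(describe(intended))   # O_WRONLY|O_CREAT|O_TRUNC
print(oct(intended), hex(intended), intended)
```

    Same digits, very different numbers: 1101 in binary is
    13, while 1101 in octal is 577.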
    
    So I fixed that one night. Then I had to learn how to
    set the mode (permissions), which was, like, freakishly
    hard to find online. All the 'open' examples I found
    were opening existing files. But since CREAT is an
    option, obviously there was a way to do it...

    The search "32 x86 assembly linux syscall table" is the
    blessed way to ask the major search engines.
    
    The answer is: the mode bits (in the usual unix octal
    owner/group/other format) go in register edx. So:

        ; ebx contains null-terminated word name (see above)
        mov ecx, (0100o | 0001o | 1000o)  ; open flags
        mov edx, 666o                     ; mode (permissions)
        mov eax, SYS_OPEN
        int 80h ; now eax will contain the new file desc.
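
    For comparison, here's a rough Python sketch of the same
    call through os.open, which takes the same three flags
    and an octal mode. The function name and the temporary
    directory are made up for the example; only the flags,
    the 666 mode, and the 'ELF' string come from the log:

```python
import os
import tempfile


def write_elf_string(path: str) -> int:
    """Create (or truncate) path, write b'ELF', return the byte count."""
    # Same three flags and the same 666 mode as the assembly above.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        return os.write(fd, b"ELF")
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "foo")   # hypothetical output location
    written = write_elf_string(target)
    with open(target, "rb") as f:
        contents = f.read()

print(written, contents)
```

    The kernel masks the requested mode with the process
    umask, so the permissions on disk can be narrower than
    the value passed in.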
   
    And when I went to test it, I was sleepy and forgot that
    since I was running the binary from strace, it wasn't
    gonna re-build from source like my shell aliases 'mr',
    'mb', 'mt' do, so I couldn't figure out why it wasn't
    working...

    ...until I woke up in the middle of the night with the
    realization.

    Anyway, next morning, here goes:

$ strace ./meow5
execve("./meow5", ["./meow5"], 0x7fff56d5ec40 /* 60 vars */) = 0
[ Process PID=1377 runs in 32 bit mode. ]
read(0, make_elf foo
"make_elf foo\n", 1024)         = 13
open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(1, "new fd: ", 8new fd: )                 = 8
write(1, "00000003\n", 900000003
)               = 9
write(3, "ELF", 3)                      = 3
read(0, "", 1024)                       = 0
write(1, "Goodbye.\n", 9Goodbye.
)               = 9
exit(0)                                 = ?
+++ exited with 0 +++
   
    Awesome, we can see the flags being correctly decoded
    and the mode/permission param:

        open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3

    So I've learned that strace rules for this sort of thing!

    But did it work?

$ cat foo
ELF

    Yahoo! Ha ha, I have written a string to a new file.
    Jeez, that was way harder than I expected.

    But now I can actually try writing an ELF header. I'm
    excited.
   
    -------------------------------------------------------

    11 nights later: It's the holiday season, which is a lot
    of exhausting activity (if you're a parent) under the
    best of circumstances and this was an unusually hard one
    for the family. So what I could easily have done in a
    single night ended up stretching out for many nights.
    But I finally finished the header portion in the .data
    section and am writing it with the 'make_elf' word (I am
    *not* writing the word yet).
   
    Let's see what it does so far:

$ mr
make_elf exit
new fd: 00000003
Goodbye.
Exit status: 0

    The "new fd" message is a DEBUG statement I apparently
    left in there to make sure I was opening the file
    correctly.

    If I've done everything correctly, this will have
    written a file named "exit" with a more-or-less correct
    ELF header.
   
    Let's see what 'file' thinks of it:

$ file exit
exit: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), can't read elf program headers at 184, no section header

    Not bad! The program headers error might be due to a bug
    in my headers or just the fact that I'm not writing the
    program to the file yet.
   
    Let's see what 'readelf' says:

$ readelf -a exit
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048000
  Start of program headers:          184 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

    ...

readelf: exit: Error: Reading 32 bytes extends past end of
file for program headers

    Yeah, so it looks like my program header offset might be
    wrong. But otherwise, the decoding looks correct!
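
    The complaint boils down to simple arithmetic: the
    program header table has to fit inside the file. Here's
    a small Python sketch of that check. It's my own
    reconstruction, not readelf's actual code, and the
    84-byte file size is an assumption (52-byte ELF header
    plus one 32-byte program header), since the log never
    shows the size of this first file:

```python
def program_headers_fit(file_size: int, e_phoff: int,
                        e_phentsize: int, e_phnum: int) -> bool:
    """True when the whole program header table lies inside the file."""
    return e_phoff + e_phentsize * e_phnum <= file_size


# Values readelf printed above: offset 184, one 32-byte program header.
# ASSUMPTION: the file held only the 52-byte ELF header plus one 32-byte
# program header (84 bytes); the log does not show its actual size.
assumed_size = 52 + 32

print(program_headers_fit(assumed_size, 184, 32, 1))  # offset from the log
print(program_headers_fit(assumed_size, 52, 32, 1))   # table right after the ELF header
```

    With an offset of 184, the table would end at byte 216,
    which is past the end of a file that small.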
   
    Next night: Okay, I don't see anything wrong with my
    header data (program header offset), so I'm gonna try
    just writing out a program (word) and see what
    happens.

    I'm overwriting the program size portion of the program
    header in data and then writing the header, *then*
    writing the actual program after that. Every time I call
    'make_elf' my elf_header data will contain the last
    word's size that was written.
   
    Anyway, here goes:

$ mr
make_elf exit
prog bytes: 00000008
new fd: 00000003
Goodbye.

    My 'exit' word is 8 bytes, that sounds right.

    What does file say?

$ file exit
exit: ELF 32-bit LSB executable, Intel 80386, version 1
(SYSV), statically linked, no section header

    Ooh! No more errors there!
   
    And readelf?

$ readelf exit
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048000
  Start of program headers:          52 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

...

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048100 0x00000000 0x00008 0x00008 R E 0

...

    Cool! That looks good. My program takes up 8 bytes in
    memory TOTAL. It doesn't allocate ANY memory for a stack
    or data or anything, which is correct.
   
    Next morning (I fell asleep): Now for the moment of
    truth, does the program properly exit?

$ ./exit
bash: ./exit: Permission denied

    LOL. Yeah, it literally doesn't have execute permission:

-rw-r--r-- 1 dave users 92 Dec 30 09:04 exit

    Weird. That's not the permissions I thought I was
    setting via the edx register for the sys 'open' call:

        mov edx, 555o ; mode (permissions)

    Well, I'll figure that out in a bit. Right now I just
    wanna see if I can run this thing.
   
$ chmod +x exit
$ ./exit
Segmentation fault

    Oops, nope. Let's see what GDB says about this
    program.

    Looks like I have to break via explicit address since
    there's no debugging symbols...

$ gdb exit
...
(gdb) break *0x08048100
Breakpoint 1 at 0x8048100
(gdb) run
...
Segmentation fault.

    Argh. Shouldn't it have halted at the first instruction?
    Hmm...

    I'm thinking maybe my program section doesn't have
    execution permissions or something, in which case it
    might die before it can even look at the first
    instruction?

    Anyway, now I know what I'm gonna start looking at next
    time.
   
    Next night: no, the flags (pretty sure they're R=read,
    E=execute) look right for a text/executable segment. And
    at any rate, as near as I can tell (and meow5 wouldn't
    work the way it does if it weren't true), Linux ignores
    the flags anyway!
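
    For reference, the 'R E' that readelf prints is just a
    rendering of the p_flags bits defined in the ELF spec
    (PF_X=1, PF_W=2, PF_R=4). A tiny Python sketch of that
    decoding follows; the helper name is made up, and the
    output style only approximates readelf's:

```python
# Segment permission bits as defined by the ELF specification.
PF_X = 0x1  # execute
PF_W = 0x2  # write
PF_R = 0x4  # read


def flags_to_text(p_flags: int) -> str:
    """Render p_flags in readelf's three-column 'RWE' style."""
    return "".join(
        letter if p_flags & bit else " "
        for letter, bit in (("R", PF_R), ("W", PF_W), ("E", PF_X))
    )


print(repr(flags_to_text(PF_R | PF_X)))  # 'R E'
print(PF_R | PF_X)                       # 5
```

    So a read+execute segment carries the value 5 in its
    program header.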
   
    Instead, I had mis-typed the entry point address in the
    main header vs the program header. Now I've made them
    the same:

$ readelf -a exit
...
  Entry point address:               0x8048100
...
Program Headers:
  Type           Offset   VirtAddr    ...
  LOAD           0x000000 0x08048100  ...

    Kinda weird that there's a leading 0 on one, but not the
    other, right? But I don't see any harm per se. Also, the
    meow5 executable shows the same thing (though it
    executes starting in the second segment and I don't
    claim to entirely understand the program segment
    addressing yet, so I may well be missing something
    important. I need to read that chapter of the ELF
    document properly...)
   
    Anyway, does it work now?

$ ./exit
Segmentation fault

    Bah.

    Okay, let's see if I can figure out some stuff with GDB.

(gdb) file exit
Reading symbols from exit...
(No debugging symbols found in exit)
(gdb) info file
Symbols from "/home/dave/meow5/exit".

    Hmmm. I thought 'info file' would at least show the
    entry point, but no luck there.
   
(gdb) break *0x08048100
Breakpoint 1 at 0x8048100
(gdb) run
Starting program: /home/dave/meow5/exit
During startup program terminated with signal SIGSEGV,
Segmentation fault.

    Another mystery. Well, my meow5 executable starts each
    LOAD segment at even 0x1000 byte marks - which I guess
    has something to do with page sizes? (Again, I need to
    read that ELF document chapter, and I will, but I just
    wanna see this working!)

    So I updated my addresses to 0x08048000 at an even 1000
    (in hex). I double-checked them with 'readelf -hl exit',
    which I'll spare you from here.
   
    But running it:

$ ./exit
Segmentation fault

    Argh.

    I'll take a look with GDB:

(gdb) file exit
Reading symbols from exit...
(No debugging symbols found in exit)
(gdb) r
Starting program: /home/dave/meow5/exit

Program received signal SIGSEGV, Segmentation fault.
0x08048047 in ?? ()

    Wait a second! That *is* progress. Now it's showing me
    the address of the crash. I wasn't getting that before.
    And it looks like it's crashing 0x47 bytes into memory
    (which is way larger than my exit code). So it could be
    that my program just isn't executing correctly...
   
    So I'll set a breakpoint at the entry point (with GDB's
    '*' address syntax) and see if I can figure out how to
    view what's running.

(gdb) break *0x08048000
Breakpoint 1 at 0x8048000
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/dave/meow5/exit

Breakpoint 1, 0x08048000 in ?? ()

    Cool! I've finally paused the darn thing.
   
(gdb) disass *0x08048000
No function contains specified address.

    I guess without symbols, 'disassemble' won't cooperate?
    Can I at least step?

(gdb) s
Cannot find bounds of current function

    Oh, right. I know this one. There's a separate 'stepi'
    to step through the program at the instruction level
    since there are no 'lines' to step through!

(gdb) stepi
0x08048047 in ?? ()

    Huh? Why am I now at that '...8047' address?

    Turns out there's an 'i' format that will display
    whatever memory you want as an instruction. So, after
    the fact, here's that first instruction we just ran:

(gdb) x/i 0x08048000
   0x8048000:	jg     0x8048047

    Ha ha, well, that certainly explains what's happening.
    But how did that get there? Here's the bytes of that
    machine code:

(gdb) x/x 0x8048000
0x8048000:	0x464c457f
   
    Since it's so tiny, I'm just gonna hex dump exit
    entirely to see where that is:

00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 0300 0100 0000 0080 0408 3400 0000  ............4...
00000020: 0000 0000 0000 0000 3400 2000 0100 0000  ........4. .....
00000030: 0000 0000 0100 0000 0000 0000 0080 0408  ................
00000040: 0000 0000 0800 0000 0800 0000 0500 0000  ................
00000050: 0000 0000 5bb8 0100 0000 cd80            ....[.......

    Ha ha, I see it right away (though little-endian always
    makes it harder because the bytes are reversed).

    The memory we're trying to execute is the 'ELF' magic
    string from the header!
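
    Here's a short Python sketch of how those magic bytes
    turn into the 'jg' that GDB showed. It only relies on
    two x86 facts: opcode 0x7f is the two-byte 'jg rel8'
    instruction, and its target is relative to the address
    of the next instruction. The variable names are just
    for the example:

```python
import struct

magic = bytes.fromhex("7f454c46")   # first four bytes of the hex dump
entry = 0x08048000                  # address where execution started

# Little-endian view of those four bytes, matching gdb's x/x output.
(word,) = struct.unpack("<I", magic)
print(hex(word))                    # 0x464c457f

# Opcode 0x7f is 'jg rel8'; the next byte ('E' = 0x45) is a signed
# 8-bit displacement measured from the end of the 2-byte instruction.
opcode = magic[0]
(rel8,) = struct.unpack("b", magic[1:2])
target = entry + 2 + rel8
print(hex(opcode), hex(target))     # 0x7f 0x8048047
```

    Which is exactly the '...8047' address the crash and
    the 'stepi' landed on.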
   
    Okay, apparently I really need to read that chapter
    about program segments and how they're loaded into
    memory now.

    But I gotta say, I really don't regret getting this
    wrong to begin with. Now I have a concrete example of
    what's happening and the information in that chapter is
    going to make *so* much more sense to me. Sometimes
    getting it right the first time "by the book" doesn't
    teach me nearly as much as getting it wrong on my own
    and *then* learning how to do it properly. It just
    sticks better.
   
    Some number of nights later: First of all, the file
    creation permissions here _were_ working. I've also
    updated them to 755:

        mov edx, 755o ; mode (permissions)

    Which shows up correctly:

$ ls -l exit
-rwxr-xr-x 1 dave users 92 Jan  3 22:01 exit
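
    As a quick sanity check on how octal modes map to what
    'ls -l' prints, here's a Python sketch using the stat
    module. The umask of 022 is an assumption (a common
    default); the log never shows the real one, and both
    helper names are made up:

```python
import stat


def ls_mode(mode: int) -> str:
    """Format a regular file's permission bits the way 'ls -l' does."""
    return stat.filemode(stat.S_IFREG | mode)


def created_mode(requested: int, umask: int) -> int:
    """Mode a newly created file gets: requested bits minus umask bits."""
    return requested & ~umask


ASSUMED_UMASK = 0o022  # ASSUMPTION: a common default, not taken from the log

for requested in (0o666, 0o555, 0o755):
    final = created_mode(requested, ASSUMED_UMASK)
    print(oct(requested), "->", ls_mode(final))
```

    Also worth knowing: open(2) only applies the mode when
    it actually creates the file. An existing file that
    just gets truncated keeps whatever permissions it
    already had.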
   
    And as for my executable trying to run the ELF header
    itself...ha ha, well, I did read Part 2: "Program
    Loading and Dynamic Linking" of the System V ELF spec
    and the answer was so simple, it was downright silly.

    When you specify that the ELF executable wants to load
    a file segment into (one of) the program's virtual
    memory segments (which is what my single "LOAD" type
    program header is requesting), it loads the file
    starting at that segment's file offset. Mine is 0, so
    the ELF header itself gets loaded first, followed by
    whatever data (or machine code, in this case) follows
    the headers.

    So, with a segment like mine, you need to account for
    the ELF and program headers when determining the
    execution entry point address.
   
    In other words, where I was pointing to the very first
    byte of my requested virtual address:

      dd 0x08048000 ; entry     - Execution start address

    I needed to offset it by the elf header size:

      dd elf_va + elf_size ; entry - execution start address

    Oh, right, and I also made a NASM macro to contain that
    address so I wouldn't have the bare value in multiple
    places:

      %assign elf_va 0x08048000 ; elf virt mem start address
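
    To see the whole layout in one place, here's a Python
    sketch that rebuilds the 92-byte 'exit' image from the
    hex dump earlier in this log, with only the entry point
    changed. It is not how meow5 does it (that's NASM data
    plus 'make_elf'). It assumes elf_size covers both the
    52-byte ELF header and the 32-byte program header (84
    bytes), which matches the dump, where the code starts
    at offset 0x54. The names are hypothetical:

```python
import struct

ELF_VA = 0x08048000   # same value as the log's elf_va
EHDR_SIZE = 52        # "Size of this header" from readelf
PHDR_SIZE = 32        # "Size of program headers" from readelf


def build_elf(code: bytes) -> bytes:
    """Return a minimal 32-bit x86 ELF image: ELF header, one LOAD
    program header, then the machine code."""
    headers_size = EHDR_SIZE + PHDR_SIZE   # ASSUMPTION: plays the role of elf_size
    entry = ELF_VA + headers_size          # first byte after both headers
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # magic, 32-bit, LSB, v1
    ehdr = struct.pack(
        "<16sHHIIIIIHHHHHH",
        ident,
        2,            # e_type: EXEC
        3,            # e_machine: Intel 80386
        1,            # e_version
        entry,        # e_entry
        EHDR_SIZE,    # e_phoff: program header follows the ELF header
        0,            # e_shoff: no section headers
        0,            # e_flags
        EHDR_SIZE,    # e_ehsize
        PHDR_SIZE,    # e_phentsize
        1,            # e_phnum
        0, 0, 0,      # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<8I",
        1,            # p_type: LOAD
        0,            # p_offset: segment starts at the top of the file
        ELF_VA,       # p_vaddr
        0,            # p_paddr
        len(code),    # p_filesz (kept the same as in the log's dump)
        len(code),    # p_memsz
        5,            # p_flags: read + execute
        0,            # p_align
    )
    return ehdr + phdr + code


exit_code = bytes.fromhex("5bb801000000cd80")  # the 8 code bytes from the dump
image = build_elf(exit_code)
print(len(image), hex(struct.unpack_from("<I", image, 24)[0]))
```

    Those 8 code bytes disassemble to 'pop ebx', 'mov eax,
    1', 'int 0x80', and the entry point now lands on the
    first of them instead of on the 0x7f of the magic.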
   
    Okay, crossing my fingers and toes...

$ mr
make_elf exit
prog bytes: 00000008
new fd: 00000003
Goodbye.
Exit status: 0
$ ./exit
$

    Gasp! It worked! My executable exited cleanly! That can
    only happen if the exit syscall was called correctly.

    But a *real* test would be to call the exit syscall with
    a unique value so we can *see* it doing something.
   
    Do I dare hope? I'm going to try making a new word with
    a constant value and "calling" the 'exit' word and see
    if I can write that out as a new ELF executable:

$ mr
: foo 42 exit ;
make_elf foo
prog bytes: 0000000d
new fd: 00000003
Goodbye.
Exit status: 0

    Indeed, that wrote a 97 byte ELF file containing 0xD
    (13) bytes of machine code:

$ ls -l foo
-rwxr-xr-x 1 dave users 97 Jan  3 22:25 foo
   
    But does it work?!

    Drum roll...

$ ./foo
$ echo $?
42
$

    Ha ha! No way!

    It totally works.

    Initial ELF creation is a success!

    I think I'll figure out how to handle memory in my ELF
    output next. It would be amazing to be able to write a
    stand-alone executable that prints "Meow. Meow. Meow..."

    See you in the next log!