    In this log, I'd like to figure out how to handle memory
    access in my output ELF executable.

    The first challenge will be to figure out how to keep
    track of things like strings that have been stored in
    the running interpreter's memory...which need to be
    referenced by the written program.

    Quite frankly, my first shot at it might have to be a
    total hack. I don't really have any proper mechanism for
    keeping track of what's been used, so I'll probably
    just write ALL of the currently used memory, even if
    it's not actually referenced.

    Actually, just thinking about this is giving me all
    sorts of wild ideas about how you could "save" the
    current state of the whole interpreter as an executable
    that picks right up where you left off last time...

    That would certainly be unique.

    Anyway, first I gotta figure out how to write memory and
    make it accessible...


                    *********************
                    * Five months pass. *
                    *********************
    
29 
    30     Soooooooo.... Here's what happened.
    
31 
    32     In order to have my ELF executables allocate memory on
    
33     startup, I added a second "program" segment of type
    
34     LOAD.
    
35 
    36     Any strings would be stored in the "lower" part of that
    
37     segment. (And maybe additional memory would be allocated
    
38     to use as scratch space in the program?)
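
    (For reference: a 32-bit program header entry is eight
    little-endian 4-byte fields, 32 bytes in all. Below is a
    minimal Python sketch of packing one LOAD entry. The
    name pack_phdr and the example sizes are made up for
    illustration; this is not the actual Meow5 assembly.)

```python
import struct

PT_LOAD = 1  # segment type for a loadable segment

def pack_phdr(offset, vaddr, filesz, memsz, flags, align=0x1000):
    """Pack one Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align."""
    # paddr is not used for ordinary Linux processes, so this sketch mirrors vaddr
    return struct.pack("<8I", PT_LOAD, offset, vaddr, vaddr,
                       filesz, memsz, flags, align)

# hypothetical data segment: 5 bytes from the file, 10 bytes reserved in memory
phdr = pack_phdr(offset=0x142, vaddr=0x08049000, filesz=5, memsz=10, flags=6)
print(len(phdr), phdr.hex())
```

    Making memsz larger than filesz is how a segment can
    reserve scratch space beyond what is stored in the file.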
    
39 
    40     I was able to stumble my way into the single working
    
41     segment in the last log.
    
42 
    43     But adding a second segment taxed my feeble
    
44     understanding to the breaking point.
    
45 
    46     Furthermore, I had some extremely rare personal project
    
47     deadlines come up in January and February. So my
    
48     nighttime reserves were even more pathetic than usual.
    
49 
    50     Anyway, the resultant executable always segfaulted:
    
51 
    52 $ mr
    
53 : foo "Meow." say ;
    
54 make_elf foo
    
55 prog bytes: 000000ce
    
56 data offset: 00000142
    
57 new fd: 00000003
    
58 Goodbye.
    
59 Exit status: 0
    
60 
    61 $ ./foo
    
62 Segmentation fault
    
63 
    64     And it was clear that just trying to poke it from
    
65     different angles wasn't going to "accidentally" make it
    
66     work. I needed some real insight.
    
67 
    As always, GDB was resistant to considering anything but
    a C executable as worthy of examination.
    
70 
    71 
    72     I tried to find tools to assist me in understanding what
    
73     I was doing wrong, but nothing made it easy enough for
    
74     me to "get it". I was exhausted and the information just
    
75     wasn't penetrating my thick skull.
    
76 
    77     Note: One of the tools I tried out was Radare 2:
    
78 
    79       https://en.wikipedia.org/wiki/Radare2
    
80 
    81       "Radare2 (also known as r2) is a complete framework
    
82       for reverse-engineering and analyzing binaries;
    
83       composed of a set of small utilities that can be used
    
84       together or independently from the command line. Built
    
85       around a disassembler for computer software which
    
86       generates assembly language source code from
    
87       machine-executable code, it supports a variety of
    
88       executable formats for different processor
    
89       architectures and operating systems."
    
90 
    91     That was a really fun excursion and r2 is an amazing
    
92     tool (well, tools). But though it was doing a better job
    
93     than GDB with my crazy ELF executable output, it still
    
94     wasn't giving me any magical insight into the problem.
    
95 
    96     Reluctantly, I shelved it.
    
97 
    98     I finished some projects and started to feel better
    
99     about my place in the universe.
   
100 
   101     And then inspiration struck. Here is what I would do:
   
102 
   103         Write my own stupid tool to read the ELF file.
   
104         Write it in Zig.
   
105     
   106     This would solve three problems at once:
   
107 
   108         1. By writing the tool, I would be able to fully
   
109            understand the ELF header format. (Programming
   
110            and writing help me think.)
   
111         2. By writing it in Zig, I would finally have a
   
112            concrete project to kick-start me back into the
   
113            Zig world from which I'd been absent for (gasp)
   
114            nearly two years!
   
115         3. Hopefully the tool would actually help me figure
   
116            out how to correctly write the ELF header!
   
117 
   118     Well, it's been a little over two weeks of tiny,
   
119     incremental nighttime progress, and I'm pleased to say
   
120     that the tool is absolutely everything I had hoped it
   
121     would be...and it was not even remotely hard to make:
   
122 
   123         http://ratfactor.com/repos/mez/
   
124 
   125             MEZ = Meow5 + ELF + Zig
   
126 
   127     Here's the output (it automatically reads "foo" and foo
   
128     is the program example you saw above):
   
129 
   130 $ ./mez
   
131 -----------------------------------------
   
132 Main ELF Header
   
133   0-3 - four bytes of magic (0x7f,'ELF'): Matched!
   
134     4 - 32-bit, as expected.
   
135     5 - little-endian, as expected.
   
136 24-27 - Program entry addr: 0x08048000
   
137 28-31 - Program header offset (in this file): 0x34
   
138 32-35 - Section header offset (in this file): 0x0
   
139 40-41 - Size of this header: 52 bytes
   
140 42-43 - Size of program header entries: 32 bytes
   
141 44-45 - Number of program entries: 2
   
142 -----------------------------------------
   
143 Program Header @ 0x34
   
144   Segment type: 1 ('load', as expected)
   
145   File offset: 0x0
   
146   File size: 4096 bytes
   
147   Target memory start: 0x8048000
   
148   Target memory size: 4096 bytes
   
149   Memory mapping:
   
150     +--------------------+     +--------------------+
   
151     | File               | ==> | Memory             |
   
152     |====================|     |====================|
   
153     | 0x0                |     | 0x08048000         |
   
154     |   Load: 4096       |     |   Alloc: 4096      |
   
155     | 0x1000             |     | 0x08049000         |
   
156     +--------------------+     +--------------------+
   
157 -----------------------------------------
   
158 Program Header @ 0x54
   
159   Segment type: 1 ('load', as expected)
   
160   File offset: 0x142
   
161   File size: 5 bytes
   
162   Target memory start: 0x8049000
   
163   Target memory size: 10 bytes
   
164   Memory mapping:
   
165     +--------------------+     +--------------------+
   
166     | File               | ==> | Memory             |
   
167     |====================|     |====================|
   
168     | 0x142              |     | 0x08049000         |
   
169     |   Load: 5          |     |   Alloc: 10        |
   
170     | 0x147              |     | 0x0804900a         |
   
171     +--------------------+     +--------------------+
   
172 -----------------------------------------
   
173 
   174     Look at how pretty that is! Look at the ASCII art boxes
   
175     showing how the file is being mapped into memory!
   
176 
   177     Compare that to readelf's program header output:
   
178 
   179 $ readelf -l foo
   
180 
   181 Elf file type is EXEC (Executable file)
   
182 Entry point 0x8048000
   
183 There are 2 program headers, starting at offset 52
   
184 
   185 Program Headers:
   
186   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
   
187   LOAD           0x000000 0x08048000 0x00000000 0x01000 0x01000 RWE 0x1000
   
188   LOAD           0x000142 0x08049000 0x08049000 0x00005 0x0000a RWE 0x1000
   
189 
   190     It's the same info, but I can visualize mine.
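
    The same header walk can be sketched in a few lines of
    Python. This is NOT the mez source (that's Zig); the
    function names and the hand-built demo image are
    assumptions for illustration. The byte offsets are the
    ones shown in the mez output above.

```python
import struct

def read_elf32(data):
    """Return (entry, [segment dicts]) for a little-endian 32-bit ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # entry address and program header offset live at bytes 24-31;
    # program header entry size and count live at bytes 42-45
    entry, phoff = struct.unpack_from("<II", data, 24)
    phentsize, phnum = struct.unpack_from("<HH", data, 42)
    segments = []
    for i in range(phnum):
        fields = struct.unpack_from("<8I", data, phoff + i * phentsize)
        p_type, offset, vaddr, _paddr, filesz, memsz, flags, align = fields
        segments.append({"type": p_type, "offset": offset, "vaddr": vaddr,
                         "filesz": filesz, "memsz": memsz,
                         "flags": flags, "align": align})
    return entry, segments

def show(data):
    """Print the file-to-memory mapping of every program header."""
    entry, segments = read_elf32(data)
    print(f"entry: 0x{entry:08x}")
    for s in segments:
        print(f"file 0x{s['offset']:x}..0x{s['offset'] + s['filesz']:x}"
              f" ==> mem 0x{s['vaddr']:08x}..0x{s['vaddr'] + s['memsz']:08x}")

# Tiny hand-built image for demonstration. Header fields not read above
# are left as zero, so this is not a runnable executable.
demo = bytearray(52 + 32)
demo[:4] = b"\x7fELF"
struct.pack_into("<II", demo, 24, 0x08048000, 0x34)
struct.pack_into("<HH", demo, 42, 32, 1)
struct.pack_into("<8I", demo, 0x34, 1, 0, 0x08048000, 0, 4096, 4096, 7, 0x1000)
show(bytes(demo))
```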
   
191 
   192     Anyway, that's just me being excited about this little
   
193     tool.
   
194 
   195     Now let's see if I can fix this executable!
   
196 
   197     The first problem is that program entry address.
   
198 
   199     If I'm loading the whole file into memory at this
   
200     virtual address:
   
201 
   202         0x08048000
   
203 
   204     And I'm executing at that same address...well, then it's
   
205     trying to run whatever you get when you try to execute
   
206     the ELF header itself as an executable.
   
207 
    Apparently that was one of the last things I'd been
    mucking around with when I quit, because this totally
    worked in my previous single-segment executable.
   
211 
   212     Rather than load everything from the start of the file
   
213     into that address and then execute starting at an
   
214     offset, it would make sense to *load* from the offset so
   
215     the only thing in memory is my actual executable
   
216     program, right?
   
217 
   218     In other words, why load the whole file into memory like
   
219     this:
   
220 
   221         0x08048000 ELF Header
   
222         0x08048074 Program code
   
223 
   224     (And setting the program entry address to the 0x74 byte
   
225     offset.)
   
226 
   227     When I could load just the program code and keep the
   
228     program entry address the way it is:
   
229 
   230         0x08048000 Program code
   
231 
   232     I'm going to do that and also clean up the mess I left
   
233     for myself in these two program header definitions. It
   
234     looks much better (trust me). You can find them in the
   
235     assembly source with labels 'phdr1' and 'phdr2'.
   
236 
   237 $ mr
   
238 : foo "Meow." say ;
   
239 make_elf foo
   
240 prog bytes: 000000ce
   
241 data offset (header+prog): 00000142
   
242 new fd: 00000003
   
243 Goodbye.
   
244 Exit status: 0
   
245 
   246     It still segfaults:
   
247 
   248 $ ./foo
   
249 Segmentation fault
   
250 
   251     But my MEZ output looks right:
   
252 
   253     ...
   
254     24-27 - Program entry addr: 0x08048000
   
255     ...
   
256     Program Header @ 0x34
   
257         +--------------------+     +--------------------+
   
258         | File               | ==> | Memory             |
   
259         |====================|     |====================|
   
260         | 0x74               |     | 0x08048000         |
   
261         |   Load: 206        |     |   Alloc: 206       |
   
262         | 0x142              |     | 0x080480ce         |
   
263         +--------------------+     +--------------------+
   
264     ...
   
265     Program Header @ 0x54
   
266         +--------------------+     +--------------------+
   
267         | File               | ==> | Memory             |
   
268         |====================|     |====================|
   
269         | 0x142              |     | 0x08049000         |
   
270         |   Load: 11         |     |   Alloc: 11        |
   
271         | 0x14d              |     | 0x0804900b         |
   
272         +--------------------+     +--------------------+
   
273 
   274     The problem is that I'm running blind in terms of what
   
275     should be at that start address.
   
276 
   277     I just found out about the disassembler that comes with
   
278     NASM and I see that I can ask it to give me the
   
279     disassembly of a file starting at an offset (by
   
280     "skipping" bytes starting at offset 0):
   
281 
   282 $ ndisasm -k 0,0x74 foo
   
283 00000000  skipping 0x74 bytes
   
284 00000074  68E0C1            push word 0xc1e0
   
285 00000077  0408              add al,0x8
   
286 00000079  5E                pop si
   
287 0000007A  B90000            mov cx,0x0
   
288 ...
   
289 
   290     But it's been way too long since I was intimate enough
   
291     with my initial "code words" to recognize that
   
292     disassembly.
   
293 
   294     So... I think what I would like is to add a new word to
   
295     meow5 that dumps the raw machine code of a word so I can
   
296     simply *find* it in the executable.
   
297 
    I already have 'inspect', which prints info from a
    word's tail. It seems like it would be pretty
    straightforward to use that as the starting point for a
    'dump-word' word. I also have 'ps' (print stack) that I
    forgot about, which is a perfect example of printing a
    space-delimited list of numbers.
   
304 
   305     After a bit of trial-and-error (I'm rusty, but it's
   
306     coming back to me!), I've got _something_. Let's do a
   
307     word with a nice short definition:
   
308     
        DEFWORD inc
            pop ecx
            inc ecx
            push ecx
        ENDWORD inc, 'inc', (IMMEDIATE | COMPILE)
   
314 
   315 "inc" find dump-word
   
316 a1 51 41 59
   
317 
    I can write that as actual binary data by using xxd's
    reverse operation:

$ echo a1 51 41 59 | xxd -r -p > inc.bin

    But disassembling that isn't right:

$ ndisasm inc.bin
00000000  A15141            mov ax,[0x4151]
00000003  59                pop cx
   
328 
   329     I found a nice little x86 instruction chart:
   
330 
   331     http://sparksandflames.com/files/x86InstructionChart.html
   
332 
   333     Let's see...
   
334 
   335         51   push ecx
   
336         41   inc ecx
   
337         59   pop ecx
   
338 
   339     Those are right, but in reverse order. This is one of
   
340     those real dumb bugs, isn't it?
   
341 
   342     Yup, real dumb. It's getting late.
   
343 
   344     Second try:
   
345 
   346 "inc" find dump-word
   
347 59 41 51
   
348 
   349     That looks good! Now a different one as a sanity check
   
350     and let's see if it disassembles correctly:
   
351 
   352 $ echo 59 49 51 | xxd -r -p | ndisasm  -
   
353 00000000  59                pop cx
   
354 00000001  49                dec cx
   
355 00000002  51                push cx
   
356 
   357     Ha! Yup!
   
358 
    You know what? I could totally be piping repeated
    commands like this into meow5. I just need to get rid of
    that "Goodbye." message at the end (I think that was
    more of a diagnostic feel-good message when I added it
    anyway).
   
364 
   365     Done.
   
366 
   367 $ echo '"Hello command line!" say' | ./meow5
   
368 Hello command line!
   
369 
   370     Oh man, this is gonna save me so much time.
   
371 
   372     And I'll make a super simple word and test it:
   
373 
   374 $ echo ': foo 42 exit ; foo' | ./meow5
   
375 $ echo $?
   
376 42
   
377 
   378     Let's disassemble that foo:
   
379 
   380 $ echo ': foo 42 exit ; "foo" find dump-word' | ./meow5 > foo.hex
   
381 $ cat foo.hex
   
382 68 2a 0 0 0 5b b8 1 0 0 0 cd 80
   
383 $ xxd -r -p foo.hex | ndisasm -
   
384 00000000  682A00            push word 0x2a
   
385 00000003  05BB81            add ax,0x81bb
   
386 00000006  000C              add [si],cl
   
387 00000008  D8                db 0xd8
   
388 
   389     Hmmm... it starts off correct and then goes rapidly
   
390     downhill...oh, I think I see. Those single-digit 0s are
   
391     getting squished into nibbles rather than whole bytes?
   
392 
   393     Well, adding number formatting is way outside the scope
   
394     of this particular test, so I'm gonna just manually add
   
395     leading zeros on the file and see what happens.
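
    (Judging by the output above, xxd -r -p pairs up hex
    digits regardless of the spaces between them, which is
    why the lone 0s got merged. A small Python sketch of the
    conversion that treats each space-delimited token as one
    byte instead; the function name is made up for
    illustration.)

```python
def hex_tokens_to_bytes(text):
    """Turn space-delimited hex numbers (one per byte) into raw bytes."""
    # int() reads "0" and "00" the same way, so no leading zeros are needed
    return bytes(int(tok, 16) for tok in text.split())

dump = "68 2a 0 0 0 5b b8 1 0 0 0 cd 80"
raw = hex_tokens_to_bytes(dump)
# print it back with every byte zero-padded to two digits
print(" ".join(f"{b:02x}" for b in raw))
```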
   
396 
   397 $ vim foo.hex
   
398 $ xxd -r -p foo.hex | ndisasm -
   
399 00000000  682A00            push word 0x2a
   
400 00000003  0000              add [bx+si],al
   
401 00000005  5B                pop bx
   
402 00000006  B81000            mov ax,0x10
   
403 00000009  000C              add [si],cl
   
404 0000000B  D8                db 0xd8
   
405 
   406     Maybe? So let's see: push 0x2a (42) is correct.
   
407     Adding whatever is in al to the address at bx+si seems
   
408     weird.
   
409 
   410     Here's the source of the 'exit' word:
   
411 
   412         pop ebx ; param1: exit code
   
413         mov eax, SYS_EXIT
   
414         int 0x80
   
415 
   416     So that should be 1 for SYS_EXIT ...
   
417 
   418     Wait, wait, wait WAIT!
   
419 
   420     I just read the man page for ndisasm - it's in 16-bit
   
421     assembly mode by default! The -u switch puts it in
   
422     32-bit mode!
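
    To see why the mode matters: opcode 0x68 (push
    immediate) is followed by a 2-byte immediate in 16-bit
    mode and a 4-byte immediate in 32-bit mode, so a 16-bit
    disassembler stops reading the push too early and
    misreads the leftover bytes as new instructions. A toy
    Python sketch (handles only this one opcode; the
    function name is an assumption for illustration):

```python
def decode_push_imm(code, bits):
    """Decode opcode 0x68 (push immediate) for a given default operand size.

    Returns (instruction_length, immediate_value).
    """
    if code[0] != 0x68:
        raise ValueError("only opcode 0x68 is handled in this sketch")
    imm_len = bits // 8            # 2 bytes in 16-bit mode, 4 in 32-bit mode
    imm = int.from_bytes(code[1:1 + imm_len], "little")
    return 1 + imm_len, imm

code = bytes.fromhex("682a0000005b")
print(decode_push_imm(code, 16))   # stops after 3 bytes; 00 00 is left over
print(decode_push_imm(code, 32))   # consumes all 5 bytes of the push
```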
   
423 
   424 $ xxd -r -p foo.hex | ndisasm -u -
   
425 00000000  682A000000        push dword 0x2a
   
426 00000005  5B                pop ebx
   
427 00000006  B81000000C        mov eax,0xc000010
   
428 0000000B  D8                db 0xd8
   
429 
   430     Well, that's certainly closer!
   
431 
    Hmm... Okay, I want to see what that assembly should
    be:
   
434 
$ cat exit42.asm
section .text

global _start
_start:
    push 42
    pop ebx
    mov eax, 1
    int 0x80

$ nasm -w+all -g -f elf32 -o exit42.o exit42.asm
$ ld -m elf_i386 exit42.o -o exit42
$ ./exit42
$ echo $?
42
   
450 
   451     And evidently objdump is what I want to get the
   
452     disassembly of a portion of an ELF executable:
   
453 
   454 $ objdump -d exit42
   
455 exit42:     file format elf32-i386
   
456 Disassembly of section .text:
   
457 
   458 08049000 <_start>:
   
459  8049000:	6a 2a                	push   $0x2a
   
460  8049002:	5b                   	pop    %ebx
   
461  8049003:	b8 01 00 00 00       	mov    $0x1,%eax
   
462  8049008:	cd 80                	int    $0x80
   
463 
    And let's see that foo disassembly again:
   
465     
   466 00000000  682A000000        push dword 0x2a
   
467 00000005  5B                pop ebx
   
468 00000006  B81000000C        mov eax,0xc000010
   
469 0000000B  D8                db 0xd8
   
470 
    Okay, the beginning is different only by a push versus
    push dword (6a 2a versus 68 2a 00 00 00). I can't seem
    to get nasm to generate the dword version, but otherwise
    they're the same instruction mnemonic and the code still
    makes sense.
   
475 
   476     Then further down, clearly we're off by 01 versus 10
   
477     and then a different number of 0s.
   
478 
   479     Oh, this is just a leading 0 problem. I got all of the
   
480     single 0s, but didn't add a leading 0 to the single 1.
   
481     Okay, no problem:
   
482 
   483 $ vim foo.hex
   
484 $ xxd -r -p foo.hex | ndisasm -u -
   
485 00000000  682A000000        push dword 0x2a
   
486 00000005  5B                pop ebx
   
487 00000006  B801000000        mov eax,0x1
   
488 0000000B  CD80              int 0x80
   
489 
   490     That's the stuff! This is 100% the correct disassembly
   
491     of the "foo" word as defined.
   
492 
   493     So now I know what I should be seeing in the compiled
   
494     ELF created by Meow5.
   
495 
   496     And sure enough, that's exactly what's in there:
   
497 
   498 $ xxd -s 0x74 -l 0xc  foo
   
499 00000074: 682a 0000 005b b801 0000 00cd            h*...[......
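
    (The "find the word's bytes in the file" step can also
    be done as a plain substring search. A minimal Python
    sketch; the padded stand-in image below is an
    assumption, not the real foo file.)

```python
def find_code(image, code):
    """Return every file offset at which `code` appears in `image`."""
    hits = []
    pos = image.find(code)
    while pos != -1:
        hits.append(pos)
        pos = image.find(code, pos + 1)
    return hits

word = bytes.fromhex("682a0000005bb801000000cd80")
# stand-in for the real file: 0x74 bytes of zero padding, then the word
image = bytes(0x74) + word
print([hex(p) for p in find_code(image, word)])
```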
   
500 
   501     Why does this crash?
   
502 
   503     Next night:
   
504 
   505     I did some reading. Check this out:
   
506 
   507 $ ./foo
   
508 Segmentation fault
   
509 $ sudo dmesg | tail
   
510 ...
   
511 [   35.663071] process '/dave/meow5/foo' started with executable stack
   
512 
   513     This whole time, Linux has been logging an error for my
   
514     executable and I didn't even realize it.
   
515 
    I've got an "executable stack". (Uh, I don't have a
    stack at all, but I'm guessing this is what the Linux
    loader _thinks_ that second segment is for.)
   
519 
   520     Okay, so I'll just change the permission flags on the
   
521     second segment to remove exec:
   
522 
   523         ; flags: 1=exec, 2=write, 4=read (7=RWX)
   
524         dd         6 
   
525 
   526 $ ./foo
   
527 Segmentation fault
   
528 $ sudo dmesg | tail
   
529 ...
   
530 
   531     Okay, so maybe that wasn't it. The "executable stack"
   
532     message went away, but the segfault did not.
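
    (For reference, the flag bits from the comment above can
    be decoded into the same three-letter form readelf
    prints in its Flg column. A small Python sketch; the
    function name is made up for illustration.)

```python
PF_X, PF_W, PF_R = 1, 2, 4   # exec, write, read (same values as the comment above)

def flags_to_str(flags):
    """Render segment flags readelf-style, e.g. 7 -> 'RWE', 6 -> 'RW '."""
    return (("R" if flags & PF_R else " ")
            + ("W" if flags & PF_W else " ")
            + ("E" if flags & PF_X else " "))

for value in (7, 6, 5):
    print(value, repr(flags_to_str(value)))
```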
   
533 
    Okay, now I'm commenting out everything to do with the
    second segment (the whole second program header, the
    test string data, and anything that referenced them).
   
537 
   538     Here it is now:
   
539 
   540 $ echo ': foo 42 exit ; make_elf foo' | ./meow5
   
541 prog bytes: 0000000d
   
542 data offset (header+prog): 00000061
   
543 new fd: 00000003
   
544 
   545 $ readelf -h -l foo
   
546 ELF Header:
   
547 ...
   
548   Entry point address:               0x8048000
   
549   Start of program headers:          52 (bytes into file)
   
550 ...
   
551 Program Headers:
   
552   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
   
553   LOAD           0x000054 0x08048000 0x00000000 0x0000d 0x0000d RWE 0
   
554 
   555 $ ./foo
   
556 
   557 $ mez
   
558 -----------------------------------------
   
559 Main ELF Header
   
560   0-3 - four bytes of magic (0x7f,'ELF'): Matched!
   
561     4 - 32-bit, as expected.
   
562     5 - little-endian, as expected.
   
563 24-27 - Program entry addr: 0x08048000
   
564 28-31 - Program header offset (in this file): 0x34
   
565 32-35 - Section header offset (in this file): 0x0
   
566 40-41 - Size of this header: 52 bytes
   
567 42-43 - Size of program header entries: 32 bytes
   
568 44-45 - Number of program entries: 1
   
569 -----------------------------------------
   
570 Program Header @ 0x34
   
571   Segment type: 1 ('load', as expected)
   
572   File offset: 0x54
   
573   File size: 13 bytes
   
574   Target memory start: 0x8048000
   
575   Target memory size: 13 bytes
   
576   Memory mapping:
   
577     +--------------------+     +--------------------+
   
578     | File               | ==> | Memory             |
   
579     |====================|     |====================|
   
580     | 0x54               |     | 0x08048000         |
   
581     |   Load: 13         |     |   Alloc: 13        |
   
582     | 0x61               |     | 0x0804800d         |
   
583     +--------------------+     +--------------------+
   
584 
   585 $ ./foo
   
586 Segmentation fault
   
587 
    Well, blast it! Looks like my problem has something to
    do with some *other* change I've made since I last had
    ELF output working.

    And just about the only thing I changed was the segment
    file offset and entry point.
   
594 
   595     So I'm going to change those back to what I had before.
   
596     Here's the relevant parts from mez:
   
597 
   598 ...
   
599 24-27 - Program entry addr: 0x08048054
   
600 ...
   
601   Memory mapping:
   
602     +--------------------+     +--------------------+
   
603     | File               | ==> | Memory             |
   
604     |====================|     |====================|
   
605     | 0x0                |     | 0x08048054         |
   
606     |   Load: 13         |     |   Alloc: 13        |
   
607     | 0xd                |     | 0x08048061         |
   
608     +--------------------+     +--------------------+
   
609 
   610     So how about now?
   
611 
   612 $ ./foo
   
613 Segmentation fault
   
614 
   615     Ugh.
   
616 
   617     Well, on the plus side, I've certainly ruled a lot of
   
618     things out...
   
619 
   620     Wait, no, I didn't look carefully enough at it. The
   
621     virtual memory address is wrong. I was being too hasty
   
622     in changing it back. The virtual memory address should
   
623     still be 0x08048000, but the program entry address is
   
624     what will change to make up for the lack of file offset.
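
    In other words, the entry address is the segment's
    virtual address plus the distance from the segment's
    file offset to the first byte of code. A small Python
    sketch of that arithmetic (the function name is made up
    for illustration; 0x54 is the 52-byte ELF header plus
    one 32-byte program header, 0x74 is the header plus
    two):

```python
def entry_address(seg_vaddr, seg_offset, code_offset):
    """Virtual address of the byte at file offset `code_offset`,
    given a segment that maps file offset `seg_offset` to `seg_vaddr`."""
    return seg_vaddr + (code_offset - seg_offset)

# one program header: code starts at file offset 0x54
print(hex(entry_address(0x08048000, 0x0, 0x54)))
# two program headers: code starts at file offset 0x74
print(hex(entry_address(0x08048000, 0x0, 0x74)))
```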
   
625 
   626     Okay, I'll fix that:
   
627 
   628 ...
   
629 24-27 - Program entry addr: 0x08048054
   
630 ...
   
631   Memory mapping:
   
632     +--------------------+     +--------------------+
   
633     | File               | ==> | Memory             |
   
634     |====================|     |====================|
   
635     | 0x0                |     | 0x08048000         |
   
636     |   Load: 13         |     |   Alloc: 13        |
   
637     | 0xd                |     | 0x0804800d         |
   
638     +--------------------+     +--------------------+
   
639 
   640     And now?
   
641 
   642 $ ./foo
   
643 $ echo $?
   
644 42
   
645 
   646     Okay! So for some reason, loading from a tiny file
   
647     offset wasn't working. I could figure out why, but
   
648     frankly, that's not even remotely related to my quest
   
649     with this application.
   
650 
   651     Let's see if all is well when I add that second segment
   
652     back now...
   
653 
   654 $ mez
   
655 -----------------------------------------
   
656 Main ELF Header
   
657   0-3 - four bytes of magic (0x7f,'ELF'): Matched!
   
658     4 - 32-bit, as expected.
   
659     5 - little-endian, as expected.
   
660 24-27 - Program entry addr: 0x08048074
   
661 28-31 - Program header offset (in this file): 0x34
   
662 32-35 - Section header offset (in this file): 0x0
   
663 40-41 - Size of this header: 52 bytes
   
664 42-43 - Size of program header entries: 32 bytes
   
665 44-45 - Number of program entries: 2
   
666 -----------------------------------------
   
667 Program Header @ 0x34
   
668   Segment type: 1 ('load', as expected)
   
669   File offset: 0x0
   
670   File size: 13 bytes
   
671   Target memory start: 0x8048000
   
672   Target memory size: 13 bytes
   
673   Memory mapping:
   
674     +--------------------+     +--------------------+
   
675     | File               | ==> | Memory             |
   
676     |====================|     |====================|
   
677     | 0x0                |     | 0x08048000         |
   
678     |   Load: 13         |     |   Alloc: 13        |
   
679     | 0xd                |     | 0x0804800d         |
   
680     +--------------------+     +--------------------+
   
681 -----------------------------------------
   
682 Program Header @ 0x54
   
683   Segment type: 1 ('load', as expected)
   
684   File offset: 0x0
   
685   File size: 11 bytes
   
686   Target memory start: 0x8049000
   
687   Target memory size: 11 bytes
   
688   Memory mapping:
   
689     +--------------------+     +--------------------+
   
690     | File               | ==> | Memory             |
   
691     |====================|     |====================|
   
692     | 0x0                |     | 0x08049000         |
   
693     |   Load: 11         |     |   Alloc: 11        |
   
694     | 0xb                |     | 0x0804900b         |
   
695     +--------------------+     +--------------------+
   
696 
   697     And will it run?
   
698 
   699 $ ./foo ; echo $?
   
700 42
   
701 
   702     It does run!
   
703 
   704     Well, that is certainly interesting!
   
705 
   706     I wonder if you can't specify small LOAD segment file
   
707     offsets for alignment reasons or something?
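
    (For what it's worth, the ELF specification does say
    that a loadable segment's file offset and virtual
    address must be congruent modulo the page size. Whether
    that rule is what actually caused the segfaults here is
    a guess, not something verified against this binary. A
    quick Python sketch of the check, assuming 4096-byte
    pages and a made-up function name:)

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, as on typical x86 Linux

def offsets_congruent(p_offset, p_vaddr, page=PAGE_SIZE):
    """True when file offset and virtual address agree modulo the page size."""
    return p_offset % page == p_vaddr % page

for off, vaddr in [(0x54, 0x08048000),    # single segment that segfaulted
                   (0x0, 0x08048000),     # single segment that ran
                   (0x142, 0x08049000)]:  # second segment in a segfaulting run
    print(hex(off), hex(vaddr), offsets_congruent(off, vaddr))
```

    By this rule, a segment loaded from file offset 0x142
    would need a virtual address ending in 0x142 (for
    example 0x08049142) rather than a page-aligned one.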
   
708 
   709     I wish I had been able to figure out how to get GDB or
   
710     Radare 2 to show me a disassembly (or hex dump or
   
711     anything) of the memory that had *actually* been loaded
   
712     from my ELF file.
   
713 
   714     I'm sure r2 could have.
   
715 
   716     But this at least allows me to move forward.
   
717 
   718     So, I've loaded a hard-coded test string into my second
   
719     segment.
   
720 
   721     Let's see if it's there at the segment address I
   
722     specified:
   
723 
   724 $ gdb -q foo
   
725 Reading symbols from foo...
   
726 (No debugging symbols found in foo)
   
727 (gdb) info file
   
728 Symbols from "/home/dave/meow5/foo".
   
729 (gdb) break *0x08048074
   
730 Breakpoint 1 at 0x8048074
   
731 (gdb) r
   
732 Starting program: /home/dave/meow5/foo 
   
733 
   734 Breakpoint 1, 0x08048074 in ?? ()
   
735 
   736     I can debug again!
   
737 
   738 (gdb) info proc
   
739 process 2371
   
740 cmdline = '/home/dave/meow5/foo'
   
741 cwd = '/home/dave/meow5'
   
742 exe = '/home/dave/meow5/foo'
   
743 (gdb)
   
744 [1]+  Stopped                 gdb -q foo
   
745 $ ls /proc/2371/maps
   
746 /proc/2371/maps
   
747 $ cat /proc/2371/maps
   
748 08048000-08049000 rwxp 00000000 08:04 8537091                            /home/dave/meow5/foo
   
749 08049000-0804a000 rwxp 00000000 08:04 8537091                            /home/dave/meow5/foo
   
750 
   751     So I should be able to view the memory in that second
   
752     segment, right?
   
753 
   754 (gdb) x/s 0x08049000
   
755 0x8049000:	"\177ELF\001\001\001"
   
756 
   757     Huh? That's the start of the file.
   
758 
   759 (gdb) x/s 0x08048000
   
760 0x8048000:	"\177ELF\001\001\001"
   
761 
   762     Yeah, the segments are identical.
   
763 
   764     Oh, yeah, the file offset is 0 on both of these. Oops.
   
765     That's just a line I missed when I was uncommenting from
   
766     before.
   
767 
   768     How about now?
   
769 
   770 $ ./foo
   
771 Segmentation fault
   
772 
   773     Oh for...
   
774 
   775     Okay, you know what?
   
776 
   777     This whole multi-segment thing seemed like a great idea
   
778     four months ago, but it's extremely tangential to the
   
779     proof-of-concept that is Meow5.
   
780 
   781     I'm going back to one segment. I just want to see this
   
782     thing output an executable that can print a string!
   
783 
   784     I'll commit all the garbage I've got now just in case I
   
785     change my mind, but then I'm gonna basically revert to
   
786     what I had four months ago.
   
787 
   788     Hey, no regrets, though. It's all learning.
   
789 
   790     Next night: Okay, back in business with a fully
   
791     automated test of 'foo' ELF creation:
   
792 
   793 $ ./dbgfoo.sh 
   
794 Wrote to "foo".
   
795 24-27 - Program entry addr: 0x08048054
   
796   File offset: 0x0
   
797   Target memory start: 0x8048000
   
798   Target memory size: 13 bytes
   
799 Running...
   
800 (Exited with code 42)
   
801 Reading symbols from foo...
   
802 (No debugging symbols found in foo)
   
803 Breakpoint 1 at 0x8048054
   
804 
   805 Breakpoint 1, 0x08048054 in ?? ()
   
806 Dump of assembler code from 0x8048054 to 0x8048061:
   
807 => 0x08048054:	push   $0x2a
   
808    0x08048059:	pop    %ebx
   
809    0x0804805a:	mov    $0x1,%eax
   
810    0x0804805f:	int    $0x80
   
811 End of assembler dump.
   
812 A debugging session is active.
   
813 
   814 	Inferior 1 [process 1743] will be killed.
   
815 
   816 Quit anyway? (y or n) [answered Y; input not from terminal]
   
817 
    I'll probably destroy it eventually, so here's the
    current contents of dbgfoo.sh:

        #!/bin/bash

        # Have to comment out when testing non-0 return values
        # from test programs:
        # set -e # quit on errors

        F=meow5

        # rebuild
        ./build.sh

        # execute the 'foo' elf maker script in meow5!
        echo ': foo 42 exit ; make_elf foo' | ./$F

        # examine elf headers with mez
        ../mez/mez 2>&1 | ag 'entry|File offset|Target memory'

        # try to run it
        echo "Running..."
        ./foo
        echo "(Exited with code $?)"

        # debug it
        gdb foo -q --command=dbgfoo.gdb

    and its companion, dbgfoo.gdb:

        # entry point address hard-coded because
        break *0x08048054

        run

        disas 0x8048054,+13

        quit
   
857 
    (By the way, I think it's wild that when I saved the
    above with the .gdb extension, Vim applied GDB command
    syntax highlighting. Somebody made a syntax highlighter
    for GDB command scripts. And they had it detect the
    extension I happened to pick. It makes me feel like I
    live in a virtual village with all these other people
    who are doing stuff like this all the time.)

    I will now close this particular colossal misadventure
    and begin the next one in log12.txt.

    See you there!
   869     See you there!