# RubyLit - This README is a program!

*Note:* This literate program was extended by [Parker Glynn-Adey](https://pgadey.ca).
The interesting aspects of the extension are explained in the sections:

* [Usage](#label-Usage)
* [Batch Generation](#label-Batch+Generation)

<hr>

The output of this README.md file is a _program_ that turns documents into:

* <a href="html/README.rb.html">README.rb</a> - a Ruby program
* <a href="html/README.html.html">README.html</a> - a (hideously) formatted HTML document

It's a
[literate program](https://en.wikipedia.org/wiki/Literate_programming)
(wikipedia.org)
and the concept comes from Donald Knuth.
(Update: I've written about what it _felt like_ to make this little program:
[literate-program](http://ratfactor.com/cards/literate-programming).)

In order to get the process rolling, there's a non-literate `stage1.rb` script
that does the initial "tangling". You'll see what _that_ means in a moment.

To make this literate programming stuff work, all source is indented. That not
only makes it easy to read in source form, but by choosing that method,
I've also made this document a valid Markdown file, so the README will be
properly formatted when you view it on the Web as HTML output via tools such as
[RepoRat](http://ratfactor.com/repos/reporat/).

The program is executed in three steps:

    <<Parse Arguments>>
    <<Tangle>>
    <<Weave>>

"Parse Arguments" is the boring stuff, to make sure that we comply with [Usage](#label-Usage).
"Tangle" extracts a program (or some other text file) and "Weave" creates the documentation.
Those little `<<bracket things>>` are literate programming macros that
include source code from other sections of the document (identified by
Markdown subheadings).

## Usage

    puts "usage: ruby rubylit.rb INPUT-LIT [OUTPUT] [START]"

RubyLit accepts three arguments: `rubylit.rb INPUT-LIT [OUTPUT] [START]`.
The `INPUT-LIT` argument is mandatory and specifies the literate program that RubyLit should process.
It is specified as a literate Markdown file without the extension, for example `README` in place of `README.md`.

The arguments `[OUTPUT]` and `[START]` are optional.
If `OUTPUT` is present, then RubyLit will tangle the literate program into the file `OUTPUT`.
If `OUTPUT` is absent, then RubyLit will produce `INPUT-LIT.rb`.

If the argument `START` is present, then RubyLit will tangle the literate program into the file `OUTPUT` starting from the segment `START`.
This allows for the possibility of [Batch Generation](#label-Batch+Generation) of files.
The default behaviour, when `START` is absent, is to tangle the entire literate program `INPUT-LIT`.
Because the arguments are positional, `START` can only be given together with `OUTPUT`.

## Tangle

What's neat about literate programming is that the source can be presented
in any order, so you can explain it however you like. The tangling process
puts it back into order so it can actually run.

The _full_ literate programming concept as imagined by Knuth and implemented
in his initial 'WEB' system not only allows you to include bits of code,
but even lets you define parametric macros, so the literate document is
actually a **meta-language** on top of the underlying programming language!

I've just implemented a crude "include" macro for this demonstration, but that
alone gives me a ton of flexibility!

Here's how I've made that work. I've got a `segments` hash that will store
all of the lines of a "segment" in the literate program. I have the program
begin in a segment identified with the `:start` symbol:

    myseg = :start
    segments = {}
    segments[myseg] = []

Then I loop through all of the lines in the source document and handle
just two special cases. (All other lines are treated as the "document"
part of the literate program and are completely ignored here!)

    File.open(fname_input).each do |line|
        <<Handle segments>>
        <<Handle code lines>>
    end

And after gathering the lines into segments, I recursively follow
the includes to write out the final program to a file:

    <<Write the program>>

## Handle segments

When I see a "## " at the beginning of the line, I know it's a Markdown
level 2 heading, which I'm using to indicate new code segments. They
don't _have_ to include code and even if they do, they don't _have_ to
be used.

Here you can see that I'm setting the current segment to the name of
the heading and initializing a new array to store the code lines:

      if(line.start_with?('## '))
        myseg = line[3..].chomp
        segments[myseg] = []
      end

## Handle code lines

When I see a space at the beginning of the line, I know it's indented source
code. This is extremely strict and not flexible and is just one example of the
non-industrial nature of this demonstration program. :-)

      if(line.start_with?(' '))
        segments[myseg].push(line)
      end

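
## Sketch: collecting segments

Here is a minimal sketch of the two rules above, run against a string instead of a file. The helper name `collect_segments` and the sample document are made up for illustration and are not part of RubyLit. The sketch sits under its own heading so that its lines land in a segment that nothing includes.

```ruby
# Hypothetical helper (not part of RubyLit): gather indented lines into
# named segments, using the same two checks as the loop in "Tangle".
def collect_segments(text)
  current = :start
  segments = { current => [] }
  text.each_line do |line|
    if line.start_with?('## ')
      # A level 2 heading opens a new, initially empty segment.
      current = line[3..].chomp
      segments[current] = []
    elsif line.start_with?(' ')
      # Any line beginning with a space counts as code.
      segments[current].push(line)
    end
  end
  segments
end

# Made-up sample document; the squiggly heredoc strips the two-space margin.
sample = <<~DOC
  Intro prose is ignored.

      <<Greeting>>

  ## Greeting

  Prose under a heading is ignored too.

      puts "hello"
DOC

collect_segments(sample).each do |name, lines|
  puts "#{name.inspect}: #{lines.map(&:strip).inspect}"
end
```

Only the two indented lines survive: one under `:start` and one under `"Greeting"`. Everything else is the "document" half of the literate program.
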
## Write the program

Here's the fun part! I've got this recursive method called `put_lines` that
takes an open destination file, the hash of named code segments (each segment
being an array of lines), and a target segment to print.

I'm looking at each line to see if it's a `<<macro thingy>>` (include request).
If it is, I recurse into the requested segment. Otherwise, I just output the
current line to the file:

    def put_lines(file, segments, sname)
      segments[sname].each do |line|
        if(m = /^\s*<<([^>]+)>>\s*$/.match(line))
          put_lines(file, segments, m[1])
        else
          file.puts line
        end
      end
    end

To start the above recursive process, I open the output file and request the appropriate segment.
If the `START` argument was present as `ARGV[2]` at the time of execution, then we start the recursive process from there.
Otherwise, we start tangling from the beginning of the literate program at the `:start` symbol.

    File.open(fname_output, 'w') do |out|
      if ARGV[2]
        puts "Tangling #{fname_input} to output #{fname_output} from \"#{ARGV[2]}\"."
        put_lines(out, segments, ARGV[2])
      else
        puts "Tangling #{fname_input} to output #{fname_output} from :start."
        put_lines(out, segments, :start)
      end
    end

That's it!

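
## Sketch: expanding includes

To watch the recursion work without touching any files, here is a small sketch that expands a hand-written `segments` hash into a `StringIO`. The method name `put_lines_checked` and the sample segments are assumptions for illustration. The only thing it adds to `put_lines` is a readable error when a macro names a segment that was never collected. Like the other sketches, it lives under its own heading so the tangler never includes it.

```ruby
require 'stringio'

# Same recursion as put_lines, plus one assumed extra: a clear error
# when a macro asks for a segment that does not exist.
def put_lines_checked(file, segments, sname)
  lines = segments.fetch(sname) do
    raise KeyError, "unknown segment: #{sname.inspect}"
  end
  lines.each do |line|
    if (m = /^\s*<<([^>]+)>>\s*$/.match(line))
      # An include request: expand the named segment in place.
      put_lines_checked(file, segments, m[1])
    else
      file.puts line
    end
  end
end

# Made-up segments: :start includes "Setup" before using its result.
segments = {
  :start => ["    <<Setup>>\n", "    puts total\n"],
  'Setup' => ["    total = 1 + 2\n"]
}

out = StringIO.new
put_lines_checked(out, segments, :start)
print out.string
```

The `Setup` line is written first, followed by `puts total`, which is the order the macro asked for rather than the order the hash was written in.
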
## Weave

The "weave" part of the application is the documentation creation portion.

Since my scheme for this literate program is to encode it as pure Markdown,
I could just rely on an external tool to create the HTML (in fact, that's
probably how you're reading this README right now).

But since Ruby comes with a Markdown parser as part of its standard library,
I figured I might as well include it. The parser and generator are part of
the RDoc (Ruby documentation) module:

    require 'rdoc'

The Markdown source is the literate program document (yeah, we read it in
the Tangle process and we'll read it again for Weave):

    data = File.read(fname_input)

Then some boilerplate. RDoc is like a mini-Pandoc in that it can take input
and produce output in a bunch of different formats, and we pay for that
flexibility with some complexity:

    formatter = RDoc::Markup::ToHtml.new(RDoc::Options.new, nil)
    html = RDoc::Markdown.parse(data).accept(formatter)

And then I just write that out to a ".html" file, bookended by start and end document tags:

    File.open("#{fname}.html", 'w') do |out|
      puts "Weaving #{fname_input} to output #{fname}.html."
      out.puts("<html><body>")
      out.print(html)
      out.puts("</body></html>")
    end

And that's it!

## Running it!

Turning this README into a program starts with "stage1", which only includes
the "tangle" part of the process (no documentation output):

    $ ruby stage1.rb README

     Found segment 'Usage'
     Found segment 'Tangle'
     Found segment 'Handle segments'
     Found segment 'Handle code lines'
     Found segment 'Write the program'
     Found segment 'Weave'
     Found segment 'Running it!'
     Found segment 'Bootstrapping'
     Found segment 'Batch Generation'
     Found segment 'Parse Arguments'
     Fetching segment 'start'
     Fetching segment 'Parse Arguments'
     Fetching segment 'Usage'
     Fetching segment 'Tangle'
     Fetching segment 'Handle segments'
     Fetching segment 'Handle code lines'
     Fetching segment 'Write the program'
     Fetching segment 'Weave'

(As you can see, I also gave it some output to help me debug segment names.)

That produces `README.rb`, which is now the Ruby program we've described
above, which can be used to process itself again:

    ruby README.rb README

And running that _again_ proves that we're **fully bootstrapped**. We're
running the output of the README against the README:

    ruby README.rb README

(Note that this part of the document you're reading right now has indented
"code" blocks to show the command line and output. Those are not valid Ruby, so
why is that okay? That's okay because they're never explicitly included in the
program! The Ruby interpreter never sees them.)

## Bootstrapping

This repo includes two simple literate test programs (Markdown files) I used to
get the initial "stage1" program working:

* `hello.md`
* `hello-segments.md`

When stage1 was done, I copied it to use as the basis for the final
document/program you're reading now. Then I followed the "Running it!" process
exactly as shown above and it worked! :-)

## Batch Generation

RubyLit allows for creating multiple files from a single literate Markdown file.
We call this process batch generation.
It was inspired by the LaTeX package [docstrip](https://ctan.org/pkg/docstrip).
For example, one can extract a `batch-generation.sh` from this `README`.
The following script first creates the usual `README.rb` and `README.html`, and then produces a file: `batch-generation.sh`.

    #!/bin/bash

    ruby README.rb README
    ruby README.rb README rubylit.rb

    ruby rubylit.rb README batch-generation.sh "Batch Generation"

Once you can extract arbitrary segments as output files, the sky is the limit.
You can have all sorts of weird recursive things going on.
For example, you can extract other literate programs as Markdown files.
One issue with embedding Markdown files in a literate program is that Markdown cares about leading whitespace.
The batch generation system is _just_ a bash script, so we can clean up the leading whitespace with a bit of `sed`.
For example, we can do the following.

    ruby rubylit.rb README markdown-example.md "Markdown Example"
    sed --in-place 's/^    //g' markdown-example.md # clean-up initial whitespace
    ruby rubylit.rb markdown-example hello-world.sh

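
## Sketch: stripping indentation in Ruby

If your `sed` lacks the `--in-place` flag, the same clean-up can be done in Ruby. This is only a sketch of that one step: the helper name `strip_indent` and the sample text are made up, and the batch script itself still uses `sed`. It sits under its own heading so that its lines stay out of `batch-generation.sh`.

```ruby
# Hypothetical helper: remove exactly `width` leading spaces from every
# line that has them, mirroring the sed expression s/^    //.
def strip_indent(text, width = 4)
  text.each_line.map { |line| line.sub(/\A {#{width}}/, '') }.join
end

# Made-up stand-in for an extracted, still-indented Markdown file.
embedded = "    # Title\n\n        echo hi\n"
print strip_indent(embedded)
```

Lines indented by eight spaces keep four of them, so code blocks inside the embedded document remain code blocks after the clean-up. To rewrite a file in place, one could read it with `File.read` and write the result back with `File.write`.
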
## Markdown Example

We leave finding a use for this weird recursive literate programming as an exercise for the reader.

    # An Embedded Literate Program.

    This is itself a literate program!
    *Wow* it's so meta!

        #!/bin/bash
        echo "Hello, world!"

## Parse Arguments

This is the most boring and hacky part of the program.
Ideally, it would be replaced with something using [OptionParser](https://ruby-doc.org/stdlib-2.4.2/libdoc/optparse/rdoc/OptionParser.html).
(Note that the indentation here has to be spaces: the tangler skips any line that begins with a tab.)

    if ARGV[0]
      #puts "ARGV[0] is present: adopting value for fname"
      fname = ARGV[0]
      fname_input = "#{fname}.md"
      if ARGV[1]
        #puts "ARGV[1] is present: adopting value for fname_output"
        fname_output = ARGV[1]
        if ARGV[2]
          #puts "ARGV[2] is present: adopting value for initial_segment"
          initial_segment = ARGV[2]
        else
          #puts "No ARGV[2] is present: assuming default value :start"
          initial_segment = :start
        end
      else
        #puts "No ARGV[1] is present: default value"
        fname_output = "#{fname}.rb"
      end
    else
      <<Usage>>
      exit 1
    end
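
## Sketch: OptionParser

Here is a rough sketch of what the OptionParser version might look like. The method name `parse_rubylit_args` and its return shape are assumptions, not part of RubyLit. Unlike the nested ifs above, it always fills in a default for the starting segment. It sits under its own heading so the tangler leaves it out of the program.

```ruby
require 'optparse'

# Hypothetical replacement for the nested ifs in "Parse Arguments".
# Returns [fname, fname_input, fname_output, initial_segment],
# or nil when INPUT-LIT is missing.
def parse_rubylit_args(argv)
  parser = OptionParser.new do |opts|
    opts.banner = 'usage: ruby rubylit.rb INPUT-LIT [OUTPUT] [START]'
  end
  # parse returns the positional arguments left over after any options.
  fname, output, start = parser.parse(argv)
  return nil if fname.nil?

  [fname, "#{fname}.md", output || "#{fname}.rb", start || :start]
end

p parse_rubylit_args(%w[README])
p parse_rubylit_args(['README', 'batch-generation.sh', 'Batch Generation'])
```

A caller would print the usage line and `exit 1` when the method returns `nil`, which keeps the exit out of the parsing code and makes the defaults easy to test.
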