
rubylit

A literate programming system in 35 lines of Ruby
git clone http://ratfactor.com/repos/rubylit/rubylit.git

rubylit/README.html

<html><body>

<h1 id="label-RubyLit+-+This+README+is+a+program-21">RubyLit - This README is a program!<span><a href="#label-RubyLit+-+This+README+is+a+program-21">&para;</a> <a href="#top">&uarr;</a></span></h1>

<p>The output of this README.md file is a <em>program</em> that turns documents into:</p>
<ul><li>
<p><a href="html/README.rb.html">README.rb</a> - a Ruby program</p>
</li><li>
<p><a href="html/README.html.html">README.html</a> - a (hideously) formatted HTML document</p>
</li></ul>

<p>It&#39;s a <a href="https://en.wikipedia.org/wiki/Literate_programming">literate program</a> (wikipedia.org) and the concept comes from Donald Knuth.</p>

<p>In order to get the process rolling, there&#39;s a non-literate <code>stage1.rb</code> script that does the initial “tangling”. You&#39;ll see what <em>that</em> means in a moment.</p>

<p>To make this literate programming stuff work, all source is indented. That not only makes it easy to read in source form, but by choosing that method, I&#39;ve also made this document a valid Markdown file, so the README will be properly formatted when you view it on the Web as HTML output via tools such as <a href="http://ratfactor.com/repos/reporat/">RepoRat</a>.</p>

<p>Here&#39;s how it starts.
The program takes the root filename as a command-line argument:</p>

<pre class="ruby"><span class="ruby-identifier">fname</span> = <span class="ruby-constant">ARGV</span>[<span class="ruby-value">0</span>]
</pre>

<p>I initially wanted to use the file extension “.lit”, but the real magic happened when I realized I could use “.md” and have the README itself be the program:</p>

<pre class="ruby"><span class="ruby-identifier">fname_input</span> = <span class="ruby-node">&quot;#{fname}.md&quot;</span>
</pre>

<p>Next, I “tangle” and “weave” the input document:</p>

<pre>&lt;&lt;Tangle&gt;&gt;
&lt;&lt;Weave&gt;&gt;</pre>

<p>“Tangle” creates the program and “Weave” creates the documentation. Those little <code>&lt;&lt;bracket things&gt;&gt;</code> are literate programming macros that include source code from other sections of the document (identified by Markdown subheadings).</p>

<h2 id="label-Tangle">Tangle<span><a href="#label-Tangle">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>What&#39;s neat about literate programming is that the source can be presented in any order, so you can explain it however you like. The tangling process puts it back into order so it can actually run.</p>

<p>The <em>full</em> literate programming concept as imagined by Knuth and implemented in his initial &#39;WEB&#39; system not only allows you to include bits of code, but even lets you define parametric macros, so the literate document is actually a <strong>meta-language</strong> on top of the underlying programming language!</p>

<p>I&#39;ve just implemented a crude “include” macro for this demonstration, but that alone gives me a ton of flexibility!</p>

<p>Here&#39;s how I&#39;ve made that work. I&#39;ve got a <code>segments</code> hash that will store all of the lines of a “segment” in the literate program.
I have the program begin in a segment identified with the <code>:start</code> symbol:</p>

<pre class="ruby"><span class="ruby-identifier">myseg</span> = <span class="ruby-value">:start</span>
<span class="ruby-identifier">segments</span> = {}
<span class="ruby-identifier">segments</span>[<span class="ruby-identifier">myseg</span>] = []
</pre>

<p>Then I loop through all of the lines in the source document and handle just two special cases. (All other lines are treated as the “document” part of the literate program and are completely ignored here!)</p>

<pre>File.open(fname_input).each do |line|
  &lt;&lt;Handle segments&gt;&gt;
  &lt;&lt;Handle code lines&gt;&gt;
end</pre>

<p>And after gathering the lines into segments, I recursively follow the includes to write out the final program to a file:</p>

<pre>&lt;&lt;Write the program&gt;&gt;</pre>

<h2 id="label-Handle+segments">Handle segments<span><a href="#label-Handle+segments">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>When I see a “## ” at the beginning of the line, I know it&#39;s a Markdown level 2 heading, which I&#39;m using to indicate new code segments.
They don&#39;t <em>have</em> to include code and even if they do, they don&#39;t <em>have</em> to be used.</p>

<p>Here you can see that I&#39;m setting the current segment to the name of the heading and initializing a new array to store the code lines:</p>

<pre class="ruby">  <span class="ruby-keyword">if</span>(<span class="ruby-identifier">line</span>.<span class="ruby-identifier">start_with?</span>(<span class="ruby-string">&#39;## &#39;</span>))
    <span class="ruby-identifier">myseg</span> = <span class="ruby-identifier">line</span>[<span class="ruby-value">3</span><span class="ruby-operator">..</span>].<span class="ruby-identifier">chomp</span>
    <span class="ruby-identifier">segments</span>[<span class="ruby-identifier">myseg</span>] = []
  <span class="ruby-keyword">end</span>
</pre>

<h2 id="label-Handle+code+lines">Handle code lines<span><a href="#label-Handle+code+lines">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>When I see a space at the beginning of the line, I know it&#39;s indented source code. This is extremely strict and not flexible and is just one example of the non-industrial nature of this demonstration program. :-)</p>

<pre class="ruby">  <span class="ruby-keyword">if</span>(<span class="ruby-identifier">line</span>.<span class="ruby-identifier">start_with?</span>(<span class="ruby-string">&#39; &#39;</span>))
    <span class="ruby-identifier">segments</span>[<span class="ruby-identifier">myseg</span>].<span class="ruby-identifier">push</span>(<span class="ruby-identifier">line</span>)
  <span class="ruby-keyword">end</span>
</pre>

<h2 id="label-Write+the+program">Write the program<span><a href="#label-Write+the+program">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>Here&#39;s the fun part!
I&#39;ve got this recursive method called <code>put_lines</code> that takes an open destination file, the hash of named code segments (each segment being an array of lines), and a target segment to print.</p>

<p>I&#39;m looking at each line to see if it&#39;s a <code>&lt;&lt;macro thingy&gt;&gt;</code> (include request). If it is, I recurse into the requested segment. Otherwise, I just output the current line to the file:</p>

<pre class="ruby"><span class="ruby-keyword">def</span> <span class="ruby-identifier ruby-title">put_lines</span>(<span class="ruby-identifier">file</span>, <span class="ruby-identifier">segments</span>, <span class="ruby-identifier">sname</span>)
  <span class="ruby-identifier">segments</span>[<span class="ruby-identifier">sname</span>].<span class="ruby-identifier">each</span> <span class="ruby-keyword">do</span> <span class="ruby-operator">|</span><span class="ruby-identifier">line</span><span class="ruby-operator">|</span>

    <span class="ruby-keyword">if</span>(<span class="ruby-identifier">m</span> = <span class="ruby-regexp">/^\s*&lt;&lt;([^&gt;]+)&gt;&gt;\s*$/</span>.<span class="ruby-identifier">match</span>(<span class="ruby-identifier">line</span>))

      <span class="ruby-identifier">put_lines</span>(<span class="ruby-identifier">file</span>, <span class="ruby-identifier">segments</span>, <span class="ruby-identifier">m</span>[<span class="ruby-value">1</span>])

    <span class="ruby-keyword">else</span>

      <span class="ruby-identifier">file</span>.<span class="ruby-identifier">puts</span> <span class="ruby-identifier">line</span>

    <span class="ruby-keyword">end</span>

  <span class="ruby-keyword">end</span>
<span class="ruby-keyword">end</span>
</pre>

<p>To start the above recursive process, I open the “.rb” output file and request the <code>:start</code> segment:</p>

<pre class="ruby"><span class="ruby-constant">File</span>.<span class="ruby-identifier">open</span>(<span
class="ruby-node">&quot;#{fname}.rb&quot;</span>, <span class="ruby-string">&#39;w&#39;</span>) <span class="ruby-keyword">do</span> <span class="ruby-operator">|</span><span class="ruby-identifier">out</span><span class="ruby-operator">|</span>
  <span class="ruby-identifier">put_lines</span>(<span class="ruby-identifier">out</span>, <span class="ruby-identifier">segments</span>, <span class="ruby-value">:start</span>)
<span class="ruby-keyword">end</span>
</pre>

<p>That&#39;s it!</p>

<h2 id="label-Weave">Weave<span><a href="#label-Weave">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>The “weave” part of the application is the documentation creation portion.</p>

<p>Since my scheme for this literate program is to encode it as pure Markdown, I could just rely on an external tool to create the HTML (in fact, that&#39;s probably how you&#39;re reading this README right now).</p>

<p>But since Ruby comes with a Markdown parser as part of its Standard Library, I figured I might as well include it. The parser and generator are part of the RDoc (Ruby documentation) module:</p>

<pre class="ruby"><span class="ruby-identifier">require</span> <span class="ruby-string">&#39;rdoc&#39;</span>
</pre>

<p>The Markdown source is the literate program document (yeah, we read it in the Tangle process and we&#39;ll read it again for Weave):</p>

<pre class="ruby"><span class="ruby-identifier">data</span> = <span class="ruby-constant">File</span>.<span class="ruby-identifier">read</span>(<span class="ruby-identifier">fname_input</span>)
</pre>

<p>Then some boilerplate.
RDoc is like a mini-Pandoc in that it can take input and produce output in a bunch of different formats, and we pay for that flexibility with some complexity:</p>

<pre class="ruby"><span class="ruby-identifier">formatter</span> = <span class="ruby-constant">RDoc</span><span class="ruby-operator">::</span><span class="ruby-constant">Markup</span><span class="ruby-operator">::</span><span class="ruby-constant">ToHtml</span>.<span class="ruby-identifier">new</span>(<span class="ruby-constant">RDoc</span><span class="ruby-operator">::</span><span class="ruby-constant">Options</span>.<span class="ruby-identifier">new</span>, <span class="ruby-keyword">nil</span>)
<span class="ruby-identifier">html</span> = <span class="ruby-constant">RDoc</span><span class="ruby-operator">::</span><span class="ruby-constant">Markdown</span>.<span class="ruby-identifier">parse</span>(<span class="ruby-identifier">data</span>).<span class="ruby-identifier">accept</span>(<span class="ruby-identifier">formatter</span>)
</pre>

<p>And then I just write that out to a “.html” file, bookended by start and end document tags:</p>

<pre class="ruby"><span class="ruby-constant">File</span>.<span class="ruby-identifier">open</span>(<span class="ruby-node">&quot;#{fname}.html&quot;</span>, <span class="ruby-string">&#39;w&#39;</span>) <span class="ruby-keyword">do</span> <span class="ruby-operator">|</span><span class="ruby-identifier">out</span><span class="ruby-operator">|</span>
  <span class="ruby-identifier">out</span>.<span class="ruby-identifier">puts</span>(<span class="ruby-string">&quot;&lt;html&gt;&lt;body&gt;&quot;</span>)
  <span class="ruby-identifier">out</span>.<span class="ruby-identifier">print</span>(<span class="ruby-identifier">html</span>)
  <span class="ruby-identifier">out</span>.<span class="ruby-identifier">puts</span>(<span class="ruby-string">&quot;&lt;/body&gt;&lt;/html&gt;&quot;</span>)
<span class="ruby-keyword">end</span>
</pre>

<p>And that&#39;s it!</p>

<h2 id="label-Running+it-21">Running it!<span><a href="#label-Running+it-21">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>Turning this README into a program starts with “stage1”, which only includes the “tangle” part of the process (no documentation output):</p>

<pre>$ ruby stage1.rb README
Found segment &#39;Tangle&#39;
Found segment &#39;Handle segments&#39;
Found segment &#39;Handle code lines&#39;
Found segment &#39;Write the program&#39;
Found segment &#39;Weave&#39;
Found segment &#39;Comparing this document with its output&#39;
Fetching segment &#39;start&#39;
Fetching segment &#39;Tangle&#39;
Fetching segment &#39;Handle segments&#39;
Fetching segment &#39;Handle code lines&#39;
Fetching segment &#39;Write the program&#39;
Fetching segment &#39;Weave&#39;</pre>

<p>(As you can see, I also gave it some output to help me debug segment names.)</p>

<p>That produces <code>README.rb</code>, the Ruby program described above, which can now be used to process its own source:</p>

<pre class="ruby"><span class="ruby-identifier">ruby</span> <span class="ruby-constant">README</span>.<span class="ruby-identifier">rb</span> <span class="ruby-constant">README</span>
</pre>

<p>And running that <em>again</em> proves that we&#39;re <strong>fully bootstrapped</strong>. We&#39;re running the output of the README against the README:</p>

<pre class="ruby"><span class="ruby-identifier">ruby</span> <span class="ruby-constant">README</span>.<span class="ruby-identifier">rb</span> <span class="ruby-constant">README</span>
</pre>

<p>(Note that this part of the document you&#39;re reading right now has indented “code” blocks to show the command line and output. Those are not valid Ruby, so why is that okay? That&#39;s okay because they&#39;re never explicitly included in the program!
The Ruby interpreter never sees them.)</p>

<h2 id="label-Bootstrapping">Bootstrapping<span><a href="#label-Bootstrapping">&para;</a> <a href="#top">&uarr;</a></span></h2>

<p>This repo includes two simple literate test programs (Markdown files) I used to get the initial “stage1” program working:</p>
<ul><li>
<p><code>hello.md</code></p>
</li><li>
<p><code>hello-segments.md</code></p>
</li></ul>

<p>When stage1 was done, I copied it to use as the basis for the final document/program you&#39;re reading now. Then I followed the “Running it!” process exactly as shown above and it worked! :-)</p>
</body></html>
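
The tangle step walked through in the README above can be tried in isolation. What follows is only a minimal sketch, not the repo's actual README.rb: it assumes the Markdown source is passed in as a string instead of being read from a file, and the method names `collect_segments`, `expand`, and `tangle` are hypothetical names chosen for this illustration. It also uses `fetch`, so an include of an unknown segment raises an error, which is a small departure from the original.

```ruby
# Sketch only: the method names here are hypothetical, not taken from the repo.

# Gather indented code lines under each "## " heading.
# Lines before the first heading go into the :start segment.
def collect_segments(source)
  current = :start
  segments = { current => [] }
  source.each_line do |line|
    if line.start_with?('## ')
      current = line[3..].chomp
      segments[current] = []
    elsif line.start_with?(' ')
      segments[current].push(line)
    end
  end
  segments
end

# Recursively replace <<Name>> include lines with the named segment's lines.
# fetch raises KeyError for an unknown segment (an assumption of this sketch).
def expand(segments, name, out = [])
  segments.fetch(name).each do |line|
    if (m = /^\s*<<([^>]+)>>\s*$/.match(line))
      expand(segments, m[1], out)
    else
      out << line
    end
  end
  out
end

# Tangle a literate Markdown string into program text.
def tangle(source)
  expand(collect_segments(source), :start).join
end

# A tiny made-up literate document for demonstration.
doc = <<~MD
  # Demo

  Intro prose is ignored.

      puts "begin"
      <<Greet>>
      puts "end"

  ## Greet

  Say hello:

      puts "hello"
MD

puts tangle(doc)
```

Running the sketch prints the three `puts` lines in program order, with the line from the "Greet" segment spliced in where the include appeared. Prose lines and the heading itself never reach the output.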