Hi. I’m Ryan Moser.

Problem starter. Party solver.

The Baconator: Finding Erdős Numbers on Wikipedia

The Erdős number describes the collaborative distance between an author of mathematical papers and Paul Erdős. The Bacon number, which measures an actor’s degrees of separation from Kevin Bacon in cinema, is a well-known variation on this theme. I recently came across another interesting variation – what is the shortest possible path between any two articles on Wikipedia?

Finding the shortest path between two nodes suggests a breadth-first search. Given any starting page and a target page, our search should look something like this:

  1. Extract all links on the page
  2. Check if any of these are the target page. If so, return the path from the starting page to the target page
  3. Return to step 1, visiting all of the links found on the page (yay recursion!)
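
As a quick sanity check, the three steps above can be sketched with a toy in-memory link graph. Everything here – the graph, the page names, and the links_for helper – is an illustrative assumption standing in for real Wikipedia pages:

```ruby
require 'set'

# A hard-coded link graph standing in for Wikipedia (illustrative only)
LINKS = {
  'A' => ['B', 'C'],
  'B' => ['D'],
  'C' => ['D', 'E'],
  'D' => ['E'],
  'E' => []
}

def links_for(page)
  LINKS.fetch(page, [])
end

# Returns one shortest path from start to target, or nil if none exists
def shortest_path(start, target)
  queue   = [[start]]           # each entry is the path taken so far
  visited = Set.new([start])
  until queue.empty?
    path = queue.shift          # FIFO: dequeue from the front
    links_for(path.last).each do |page|
      return path + [page] if page == target
      next if visited.include?(page)
      visited << page
      queue << path + [page]    # enqueue the extended path at the back
    end
  end
  nil
end

shortest_path('A', 'E') # => ["A", "C", "E"]
```

The queue and visited set here foreshadow the pieces we need in the real solution below.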

This gives us a few problems to solve:

  1. Extracting all of the links from a Wikipedia page that point to other Wikipedia pages
  2. Returning the sequence of articles that makes up a shortest path. Many paths could tie for shortest; we are satisfied with returning one of them
  3. Keeping track of which links we have visited to avoid visiting them a second time

The first problem is readily solved with yokogiri (Nokogiri for Clojure), a lovely HTML/XML parser library. A solution might look something like this (error handling omitted for brevity):

(ns erdos-c.core
  [:require [yokogiri.core :refer :all]])

(def client (make-client :javascript false))

(defn page [uri]
  (get-page client (str "http://en.wikipedia.org" uri)))

(defn link-snippets [uri]
  (xpath (page uri) "//a"))

(defn links [uri]
  (->> uri
    (link-snippets)
    (map attrs)
    (map :href)
    (remove nil?)))

(defn wikifilter [uri]
  (re-find #"\A\/wiki\/" uri))

(defn wiki-links [uri]
  (filter wikifilter (links uri)))

The second problem is more interesting – how do we ensure the shortest path is returned? Our goal is to visit all of the distance-1 links first, then all of the distance-2 links, and so on, so that the first path to reach the target is a shortest one. The distance-2 links are all discovered from distance-1 links, hence the only challenge here is to ensure we visit links in order of their depth. We can accomplish this with a queue, taking the next link to test from the front of the queue and adding newly discovered links to the back. While you won’t find it on the cheatsheet, Clojure does include a persistent queue that can be created with clojure.lang.PersistentQueue/EMPTY.

As links are extracted from the page, they are compared against the visited set to ensure we do not visit the same link twice. New links are added to the visited set and the queue as they are discovered.

(defn search [start target]
  (loop [queue (conj (clojure.lang.PersistentQueue/EMPTY) (list start))
         visited #{}]
    (let [links (peek queue)
          test-link (first links)
          page-links (filter #(not (contains? visited %)) (wiki-links test-link))]
      (if (some #(= target %) page-links)
        links
        (recur (pop (apply conj queue (map #(cons % links) page-links)))
               (apply conj visited page-links))))))

Instead of enqueueing links, we enqueue lists of links to maintain knowledge of the path that got us to the next link to visit, which is the first link in each list. Hence, running (search "/wiki/Secondhand_Lions:_A_New_Musical" "/wiki/Kevin_Bacon") ultimately returns ("/wiki/Robert_Duvall" "/wiki/Secondhand_Lions" "/wiki/Secondhand_Lions:_A_New_Musical") giving Secondhand Lions: A New Musical a Wiki-Bacon number of 3.

If Clojure isn’t quite your speed and you’d prefer to see a Ruby implementation you can find it here.


Building a Work Queue With Redis and Lua

I recently encountered a situation that called for a work queue. The system seemed simple enough at the outset – only producers and consumers, with jobs prioritized by when they are enqueued. There is, however, a curious twist – the consumers are people doing work in a single page app. There are enough concurrent users to warrant a datastore with atomic operations.

Jobs cannot simply be dequeued when people start them because many jobs go unfinished for one reason or another. The behavior we want is a timeout – the person has some amount of time to complete the job after it was dequeued, else the system considers the job abandoned and must put it back in the inbound queue.

Redis does not have primitives for this specific use case, but it turns out we don’t need them – we can run Lua scripts on our Redis server to get the exact behavior we need.

First, a script to reserve jobs:

-- arguments are passed in as strings; we need our score to be a number
local time = tonumber(ARGV[1])
local val  = redis.call('zrange', KEYS[1], 0, 0)[1]
if val then
  redis.call('zadd', KEYS[2], time, val)
  redis.call('zremrangebyrank', KEYS[1], 0, 0)
end
return val

This takes jobs from our default queue and moves them to a reserved queue. Both queues are sorted sets – the time is passed in as a string (ARGV[1]), converted to a number and used as the score. Using a sorted set keeps the inbound queue ordered by the time when jobs were queued up and prevents duplicate jobs. We can also use the timestamp/score to determine which jobs in the reserved queue have timed out and move them back to the inbound queue:

local time = tonumber(ARGV[1])
local vals = redis.call('zrangebyscore', KEYS[1], 0, time)
redis.call('zremrangebyscore', KEYS[1], 0, time)
for i, val in ipairs(vals) do
  redis.call('zadd', KEYS[2], time+i, val)
end
return #vals

Redis scripts run as atomic operations and, as Antirez notes, are much more performant than using WATCH.
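
To make the two-queue lifecycle concrete without a running Redis server, here is an in-memory Ruby model of the same logic. Plain hashes (member => score) stand in for sorted sets; the class and method names are assumptions for this sketch, not a real Redis client:

```ruby
# Illustrative in-memory model of the reserve/clear lifecycle above.
# Hashes of member => score stand in for Redis sorted sets.
class FakeJobQueue
  def initialize
    @inbound  = {}  # job => enqueue timestamp
    @reserved = {}  # job => reservation timestamp
  end

  def enqueue(job, time)
    @inbound[job] = time
  end

  # Mirrors the reserve script: move the lowest-scored job to reserved
  def reserve(time)
    job, _score = @inbound.min_by { |_job, score| score }
    return nil unless job
    @inbound.delete(job)
    @reserved[job] = time
    job
  end

  # Mirrors the clear script: requeue reserved jobs at or below the cutoff
  def clear(cutoff)
    expired = @reserved.select { |_job, score| score <= cutoff }
    expired.each_with_index do |(job, _score), i|
      @reserved.delete(job)
      @inbound[job] = cutoff + i + 1
    end
    expired.size
  end
end
```

Enqueue two jobs, reserve one, and run clear past the timeout: the abandoned job lands back in the inbound queue behind anything still waiting, just as in the Lua scripts.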

The server used as a backend for our single page app would interact with Redis like this:

class MyQueue

  INBOUND_QUEUE =  "#{self}"
  RESERVED_QUEUE = "#{self}:reserved"

  def enqueue(data)
    $redis.zadd(INBOUND_QUEUE, timestamp, data)
  end

  def reserve
    $redis.eval zreserve, keys: [INBOUND_QUEUE, RESERVED_QUEUE], argv: [timestamp]
  end

  JOB_EXPIRES_IN = 300 # seconds

  def clear
    $redis.eval zclear_queue, keys: [RESERVED_QUEUE, INBOUND_QUEUE], argv: [timestamp - JOB_EXPIRES_IN]
  end

  def delete(key, queue=RESERVED_QUEUE)
    $redis.zrem(queue, key)
  end

  private

  # note - reading in the Lua script as a concatenated string!
  def zreserve
    File.readlines("lib/zreserve.lua").map(&:chomp).join(' ')
  end

  def zclear_queue
    File.readlines("lib/zclear_queue.lua").map(&:chomp).join(' ')
  end

  def timestamp
    Time.now.to_i
  end
end

This server needs a cron job or some other way to periodically call #clear to clear the reserved queue of timed-out jobs. A more fully baked implementation might also use SCRIPT LOAD to upload the script to our Redis server and then call it via EVALSHA instead of sending the entire script every time.

The tour of Lua on Learn X in Y Minutes and the examples on the Redis homepage are all you need to get started and extend the usefulness of Redis in a meaningful way within a couple of hours. I recommend giving this a shot the next time you encounter a unique problem where Redis seems close but not quite right for your specific use case.


Building Knod Part 2: Workflow and Testing

This is part 2 in a series about building Knod, a tiny HTTP server designed to help with front-end prototyping.

One of the first challenges I faced in building Knod was establishing a good workflow. Using the REPL as a development tool is one of the great joys of using Ruby and languages in the LISP family. I tend to experiment with different things on the REPL when the domain isn’t well known and I want to spike on a solution – the exact circumstance faced at the outset of this project. However, the REPL falls short when building a server because, while a server will certainly run in a REPL, it provides no useful output. We ultimately need a client to connect, send a request, and ensure it receives an appropriate response. Furthermore, since the server will block while it is waiting for a connection, we cannot run the server and test it in the same REPL. When I started building Knod, my workflow was as follows:

  1. Split the terminal
  2. Start the server in one terminal session
  3. Use cURL or Net::HTTP in the other session to make a request
  4. Use pry-debugger to step through the request/response cycle.
  5. Change the code as necessary
  6. Shut down and restart the server
  7. Repeat steps 1-6 until the desired result is achieved

Repeatedly stopping and starting the server and jumping back and forth between terminal sessions became tedious almost immediately, so I started looking for ways to streamline my workflow. There were two problems to solve – reloading the server every time I made a change, then rerunning a series of requests to the server to see if the changes had the desired effect without breaking anything that was already working. What I ultimately needed was a faster, more repeatable way to test the server’s behavior.

Ruby’s standard library includes almost everything one needs for behavior-driven development in the form of MiniTest::Spec. Its syntax will look immediately familiar to anyone used to RSpec:

it 'should add numbers' do
  (2 + 2).must_equal 4
end

MiniTest does not provide a way to send HTTP requests out of the box. I initially tried using Net::HTTP to make requests. It looked something like this:

describe Knod, 'a tiny HTTP server' do
  describe 'PUT' do
    before do
      path = '/items/81.json'
      @host = '0.0.0.0'
      header = {'Content-Type' => 'application/json'}
      @request = Net::HTTP::Put.new(path, header)
      @request.body = {id: 81, state: 'swell', favorite_animal: 'giant squid'}.to_json
    end

    it 'returns a 200 on success' do
      response = Net::HTTP.new(@host, 4444).start { |http| http.request(@request) }
      response.code.must_equal '200'
    end
    # more tests
  end
end

This is awful. Tons of setup for just a few tests. At this rate I would end up spending more time writing tests than on the actual server! I received an excellent piece of advice from my friend Eno Compton – create a wrapper class for Net::HTTP to cut down on boilerplate. Dan Knox provides a great rundown of creating such a wrapper class in this blog post. With a small wrapper class suited to my purposes, tests became much easier to read and write:

describe Knod, "a tiny http server" do
  let(:connection) {Connection.new("http://0.0.0.0:4444")} # Net::HTTP wrapper

  describe 'PUT' do
    let(:path) {"index/81.json"}
    let(:data) { {state: 'swell', favorite_animal: 'giant squid'} }

    it 'returns a 200 on success' do
      response = connection.put path, data
      response.code.must_equal '200'
    end

    # considerably more tests...

We could even take this a step further and delegate all of the HTTP verbs to our connection to almost exactly mimic the functionality of rspec-rails or minitest-rails:

describe Knod, 'a tiny http server' do
  extend Forwardable
  let(:connection) {Connection.new("http://0.0.0.0:4444")}
  def_delegators :connection, :get, :put, :post # ... all the verbs

  describe 'PUT' do
    # after setup:
    it 'writes to the local path' do
      put path, data
      File.file?(path).must_equal true
    end

With the Net::HTTP wrapper class complete, I was halfway to a good BDD setup – I could quickly write useful tests, but still had to stop and start the server between each test run to pick up code changes. I revisited Jesse Storimer’s Working With Ruby Threads for inspiration:

I’ve been saying the GIL prevents parallel execution of Ruby code, but blocking IO is not Ruby code… MRI doesn’t let a thread hog the GIL when it hits blocking IO.

This means we can just start the server in its own thread before the test suite starts running. At the top of our test file:

require 'minitest/autorun'
require 'knod'

Thread.new { Knod.start(port: 4444) }

The only issue with this implementation is that we are in trouble if another server is already running on the port specified when we start the server. The solution lies in an interesting property of Ruby’s socket library: a new instance of TCPServer initialized with a port of 0 will automatically select an open port in the ephemeral range:

server = TCPServer.new('0.0.0.0', 0)
port   = server.addr[1]
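
Combining the two ideas – a background thread plus an OS-assigned port – looks something like this. A bare TCPServer echoing one line stands in for Knod here, so everything below is a self-contained sketch rather than the gem’s actual test setup:

```ruby
require 'socket'

# Bind to port 0 so the OS picks a free ephemeral port
server = TCPServer.new('0.0.0.0', 0)
port   = server.addr[1]      # the port the OS actually assigned

# Serve from a background thread; blocking IO releases the GIL
Thread.new do
  client = server.accept
  client.write(client.gets)  # echo one line back
  client.close
end

# The main thread plays the role of the test client
socket = TCPSocket.new('127.0.0.1', port)
socket.puts('hello')
response = socket.gets       # => "hello\n"
socket.close
```

The same shape works for the test suite: start the server in a thread on a port chosen this way, then let every test talk to it over the loopback interface.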

Setting up tests this way took a couple of hours that I could have spent on the server, but I think it was time well spent. Subsequent features became much easier to build with the improved workflow a test suite provided and I gained insight into how DSLs like rspec-rails work in the process.

Next up in this series: A post on higher-level learnings that came out of the gem-building process.


Walking The Path: Notes on a Ten Year Journey From Stuttering to Fluency


While the lighting in the photo could use some work, you can still observe that I am:

  1. In front of a room giving a presentation
  2. Enjoying myself

I gave this presentation as part of an ongoing lunch and learn series at thredUP and it went better than I could have hoped. The crowd was very much engaged and I was still fielding follow up questions the next day. A 15 minute presentation in front of ~40 people is barely a blip on the radar for some of my peers, but for me, it is a milestone at least a decade in the making.

Until I was well into my 20s, I stuttered quite severely. It was difficult, and at times impossible, to speak on the phone, order in a restaurant, even say my own name – let alone engage in public speaking. Stress is a trigger for stuttering so it tends to self-perpetuate. Stuttering lends itself to traumatic experiences, which in turn can lead to a lot of fear and shame associated with speaking, which drives still more stuttering.

This cycle started pretty early in my case – when I was five, a nun at St. James Catholic Church in Pittsburgh decided physical violence was the most appropriate remedy for my speech impediment. I spoke only with great difficulty from that point forward. I spent my youth being bullied by peers, ridiculed by teachers, and frequently feeling unwelcome in my own family. Fortunately for my current self, I managed to persist despite often thinking my life was not one worth living.

Things took a turn for the better during my graduate studies at Georgia Tech when I finally took charge of my life. I cut out every negative influence, found a great speech therapist based on my own research, and most important of all, convinced myself the people who had treated me like shit for my entire life, not me, had the problem. Tim Mackesey planted the idea in my head that I could counteract the negative feedback cycle of stuttering with a positive one – I could take any good experience speaking, no matter how small, use it to gain confidence, then use that confidence to drive a series of ever-improving experiences. My first victory was ordering a drink in Starbucks without stuttering – I still remember it as one of the happiest moments I’ve ever experienced, and at age 23, one of the first times I ever felt really good about myself. A cup of coffee gave me the beachhead I needed to retake my life.

While life became easier after that breakthrough, two decades of harm were not simply undone overnight, nor did I expect them to be. In the intervening years I continued to improve my speech and grow as a person by viewing my situation as a perpetual journey, pacing myself accordingly, and limiting my traveling companions to people who believed in me as much as I believed in them. The pattern of discipline and consistent improvement I established when I took on stuttering carried over into every other part of my life, becoming the core of my identity. Now, when I reflect on the person I used to be I find I hardly know him. Yet I take no shame in him. He and I simply mark different points on a path whose destination is unknown, but whose direction I will continue to follow.


Building Knod, a tiny HTTP server. Part 1: Introduction

While working through Facebook’s excellent React tutorial, I came across a common problem when developing and testing AJAX applications: having some sort of rudimentary backend is necessary, or at the very least makes things easier. While options for static HTTP servers are many, these only respond to GET requests and hence only get you part of the way there. The ideal solution would also respond to PUT, POST, and DELETE and would work with almost zero configuration.

People build quick prototypes often enough that I assumed a tool must exist for this very purpose. When I could not find such a tool, I built it. Knod, a tiny HTTP server for front-end development, was released on April 12 on RubyGems. The code is available here on GitHub. I limited myself to using the Ruby standard library and learned a lot in doing so. Subsequent posts in this series will detail what I learned about the request/response cycle, writing server tests, and turning a library into a useful command line tool.


Juxtaposition in Ruby

4Clojure recently introduced me to Clojure’s curious juxt function. From the official docs:

Takes a set of functions and returns a function that is the juxtaposition of those functions.

Makes a bit more sense once you see it in practice:

((juxt + max min) 2 3 5 1 6 4)
; => [21 6 1]

This alone is pretty neat, but it gets better. Jay Fields observed that juxt has other interesting applications:

((juxt filter remove) even? [1 2 4 3 5 6])
; => [(2 4 6) (1 3 5)]

Implementing juxt in Ruby sounds like my idea of a good time. How might we go about this? We immediately run into a problem if we want to mimic the interface of Clojure’s juxt: the first example takes advantage of variadic functions that lack direct Ruby equivalents. Examining one of these functions, however, gives us a pretty strong hint:

(+ 1 2 3 4)
; => 10

This looks like a reduce function! Reduce is the most universal of functional programming’s three cornerstone functions (map, reduce and filter), so some sort of reduce-based solution should provide the desired results. In our first example, we pass three functions and get three results back, a pretty strong hint we should map our list of functions over the reduce operation. Our first implementation:

juxt = ->(*fns) do
  ->(*args) do
    fns.map do |fn|
      args.reduce {|acc, e| fn.to_proc.call acc, e}
    end
  end
end

max = ->(a, b) {a > b ? a : b}
min = ->(a, b) {a < b ? a : b}

juxt.(:+, max, min).(2, 3, 5, 1, 6, 4)
# => [21, 6, 1]

filter = ->(fn, list) {list.select {|e| fn.to_proc.call e}}
remove = ->(fn, list) {list.select {|e| !fn.to_proc.call e}}

juxt.(filter, remove).(:even?, [1, 2, 3, 4, 5, 6])
# => [[2, 4, 6], [1, 3, 5]]

Jackpot! This solution works and we even maintained the variadic interface from Clojure. The downside is that this solution requires us to recreate most of the functionality provided by Enumerable and Array. While I rarely turn down the opportunity to write a stabby lambda, let’s assume we want to avoid this for the sake of argument. We can send symbols to our list of arguments if we are willing to adapt the interface. One way to go about it:

juxt2 = ->(*fns) do
  ->(*args) do
    fns.map do |fn, *options|
      options.empty? ? args.send(fn) : args.send(fn, &:"#{options.first}")
    end
  end
end

This version of juxt brings a second meaningful improvement on top of being able to use Enumerable: Now we can juxtapose methods that require a block alongside those where a block is optional:

juxt2.([:select, :even?], :max, [:reduce, :*]).(1, 2, 3, 4, 5, 6)
# => [[2, 4, 6], 6, 720]

I find this diversion so interesting because I only considered doing things like this in Ruby after I discovered Clojure. There is much to be learned about Ruby by exploring functional programming languages.


Clojure Metaprogramming with Sequences

Generating methods based on the content of arrays, hashes, and other Enumerable things is a powerful metaprogramming technique in Ruby. To keep things relatively simple, let’s use an example problem from Katrina Owen’s fantastic site Exercism:

Write a program that, given an age in seconds, calculates how old someone is in terms of a given planet’s solar years.

We know the length of an Earth year in seconds and the length of every other planet’s orbital period in terms of Earth years. Here is an implementation in Ruby:

class SpaceAge
  attr_reader :seconds

  SECONDS_PER_EARTH_YEAR = 31557600.0

  ORBITAL_PERIODS = {mercury:  0.2408467,
                     venus:    0.61519726,
                     mars:     1.8808158,
                     jupiter:  11.862615,
                     saturn:   29.447498,
                     uranus:   84.016846,
                     neptune:  164.79132}

  def initialize(seconds)
    @seconds = seconds
  end

  def on_earth
    seconds / SECONDS_PER_EARTH_YEAR
  end

  ORBITAL_PERIODS.each do |planet, earth_years_per_orbit|
    define_method "on_#{planet}".to_sym do
      on_earth / earth_years_per_orbit
    end
  end
end

We use metaprogramming in the ORBITAL_PERIODS.each block at the bottom of the class to generate methods for ages on every planet other than Earth. This will make it super easy to change this class when we are done with Earth and want to define everything in terms of Martian years.
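
The same technique works for any hash-shaped domain knowledge. Here is the pattern in miniature – the Distance class and its unit table are illustrative assumptions, not part of the exercise:

```ruby
# Generating one conversion method per entry in a hash via define_method
class Distance
  METERS_PER_UNIT = {kilometers: 1000.0, miles: 1609.344, feet: 0.3048}

  def initialize(meters)
    @meters = meters
  end

  METERS_PER_UNIT.each do |unit, meters_per_unit|
    define_method "in_#{unit}" do
      @meters / meters_per_unit
    end
  end
end

Distance.new(5000).in_kilometers # => 5.0
```

Adding a new unit is now a one-line change to the hash; the corresponding method appears for free.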

Writing the Clojure equivalent of this implementation proved a bit more difficult than expected. Let’s set up the Clojure equivalent and work through the metaprogramming piece:

(ns space-age)

(defn on-earth [seconds]
  (/ seconds 31557600.0))

(def orbital-periods
  {:mercury 0.2408467
   :venus   0.61519726
   :mars    1.8808158
   :jupiter 11.862615
   :saturn  29.447498
   :uranus  84.016846
   :neptune 164.79132})

How might we generate functions from our orbital-periods hashmap? A list comprehension with for feels pretty close to the mark, but this cannot work because it yields a lazy sequence. We need to execute the contents of this sequence to get the functions we are creating into our namespace. Clojure’s doseq macro is purpose-built for this use case. Now we have the start of our solution:

(doseq [[planet period] orbital-periods]
  ;somehow make functions
  )

I had a bit of trouble wrapping my mind around this part of the problem because I taught myself Clojure with resources placing a heavy emphasis on lazy evaluation and side-effect free functions. This case runs totally counter to that, executing a sequence specifically for its side effects, which happen to be producing pure functions.

Now that we know something will execute, we must determine what to execute to generate a function from a key in the orbital-periods hashmap. My first instinct was to try something like this:

(defn (str "on-" (name planet)) [seconds] (/ (on-earth seconds) period))

This fails because the first argument to def or defn must be a symbol at read time. intern solves this problem by finding or creating a var named by the supplied symbol at runtime. From there, it is as easy as building the function we want bound to that var:

(doseq [[planet period] orbital-periods]
  (let [fn-name (symbol (str "on-" (name planet)))]
    (intern *ns* fn-name
      (fn [seconds] (/ (on-earth seconds) period)))))

Better Tests Through Metaprogramming

Duplication causes all of the same maintainability issues in test suites it does in production code, but I often see the DRY principle violated in the name of comprehensive test coverage and fidelity.

Let’s say we have a Rectangle class that normalizes a range of inputs and we want to test all of them. How can we accomplish this without:

  • Writing an individual test for each input (duplication)?
  • Writing one test with multiple assertions (lost fidelity)?

Consider this approach:

subject {Rectangle}
formats = {comma_delimited_string: '100, 100, 500, 500',
           array_with_strings: ["100", "100", "500", "500"],
           array_with_ints: [100, 100, 500, 500]}
formats.each do |(format_name, data)|
  it "accepts and normalizes data passed in as a #{format_name}" do
    subject.create(data).coordinates.should == formats[:array_with_ints]
  end
end

String interpolation in the test description provides nearly the fidelity of breaking out assertions into individual tests. Neat!


Destructuring in Clojure and Ruby

As a long time Rubyist who picked up Clojure earlier this year, I have noticed the following pattern repeats itself:

  1. Encounter some new concept in Clojure
  2. Become confused by said concept and spin my wheels for a bit
  3. Realize I was using this concept in Ruby without knowing its name and full potential
  4. Simultaneously level up Ruby and Clojure skills

Destructuring is the perfect example of one such concept. Per the official Clojure documentation:

Clojure supports abstract structural binding, often called destructuring, in let binding lists, fn parameter lists, and any macro that expands into a let or fn.

It was not immediately obvious to me based on this definition, but this is actually quite common in Ruby. Consider the following:

some_hash = {a: 1, b: 2, c: 1}

some_hash.each do |(k,v)|
  # do stuff
end

Using (k, v) effectively tells Ruby:

  1. This element has two parts
  2. Assign the first part of that element to “k”
  3. Assign the second part of that element to “v”

Destructuring within list comprehensions in Clojure looks remarkably similar:

(defn use-a-hashmap [some-hashmap]
  (for [[k v] some-hashmap]
    ; do amazing functional things
    ))

We can also destructure method arguments in Ruby. Consider a method that takes a three element array as an argument. Instead of this:

def do_thing_with_point(point)
  x, y, z = point
  # do stuff
end

We can use destructuring to skip a step:

def do_thing_with_point((x,y,z))
  # do stuff
end
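
Making the sketch above runnable shows the destructured names are ordinary locals inside the body; the magnitude computation here is just an illustrative stand-in for “do stuff”:

```ruby
# A three-element array is destructured directly in the parameter list
def magnitude((x, y, z))
  Math.sqrt(x**2 + y**2 + z**2)
end

magnitude([3, 4, 12]) # => 13.0
```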

As with many things in Ruby, the better question than “Can it be done this way?” is “Is it wise?”. I think this is a win because the second example gives (slightly!) more immediate insight into the argument this method accepts.