Map Your World empowers youth to explore issues and ideas that matter, like clean drinking water or food justice, then write surveys, collect data, and create maps to make change in their communities.
Yesterday I spoke at PyConZA 2014 about Ona’s work building the vote tallying system for the Libyan Constitutional Assembly Election last February.
The slides from my talk are below:
Here is the abstract:
Earlier this year Ona was given three weeks to write the software that will tally votes in the Libyan elections and decide who wins and who loses. This is not something we could get wrong. We combined agile development with best practices in testing and QA to build an open source tally system that was well tested, accurate, and easy to use. We will describe a success story of iterative behavior/test-driven-development under extreme conditions. Did the structure of the data change the day before the election? Yes. Did we have the tests to ensure that our implementation changes would not compromise the system’s integrity? Yes, and they didn’t.
This talk provides a narrative to both Software Engineers and Tech/Product Managers describing why best practices are essential for any organization and any project of any size. We will provide the audience with:
Real world examples they can implement in their own workflow and organizations,
Insight into what succeeded (quick iteration with prioritization) and what was challenging (nothing being static),
Anecdotes and coherent arguments they can take back to their organization to advocate for best practices.
At Ona we are rebuilding our data management platform. We are
starting with a lightweight front-end that will serve up content pulled
from the REST API of our current application.
We are aiming to have the back-end in Clojure, the front-end
in ClojureScript, and the infrastructure
in Clojure using Pallet. We are excited to have a
single (and a great) language handle all of these responsibilities.
We are still at a very early stage but we are a distributed team and like to
have our apps on development boxes as we go. This allows us to share a common
reference point, give mini-demos, and QA each other’s changes. Like Fabric for Python
and Capistrano for Ruby, Pallet lets us do quick
deploys of the latest master or branch code.
Even better, Pallet lets us write Clojure to bring up new clusters, much as we would with Puppet, Chef, or Ansible, but in Clojure. We deploy
to EC2 on AWS and are glad to avoid spending time mucking around in the AWS GUI.
A succinct pallet file
specifies the instance, the web application, and the deployment. Putting the current
code online and bringing up a server (if one doesn’t already exist) is a single command:
lein do uberjar, with-profile +pallet pallet up --phases install,configure,deploy
This tells Leiningen to first create an uberjar,
which puts all of our app’s dependencies in a single jar file. It then uses
the pallet profile to install, configure, and deploy our application. This
command is idempotent, making it easy to push the latest jar up.
A nuance we did not anticipate is that you cannot output logs to stdout in a
Jetty app. This is not particularly surprising, but using stdout was a development
configuration that we had not yet bothered to abstract.
Our fix was to wrap logging so that it does the normal logging if verbose? is true and otherwise does nothing.
When you run lein ring server-headless a handler is called which sets verbose?
to true. When you run the app through java -jar ..., as in our pallet configuration,
verbose? is set to false.
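A minimal sketch of such a wrapper follows. It is not the code from ona-viewer: the namespace, the atom holding the flag, and the names enable-verbose! and log are assumptions made for illustration, and it assumes the flag is flipped by an init hook such as the :init function in the lein-ring configuration.

```clojure
(ns example.logging)

;; Hypothetical flag. It defaults to false, which is the state we want
;; when the app is started with `java -jar ...`.
(def verbose? (atom false))

(defn enable-verbose!
  "Turn stdout logging on. Assumed to be called once at startup when the
  app runs under `lein ring server-headless`."
  []
  (reset! verbose? true))

(defn log
  "Print args to stdout when verbose? is true; otherwise do nothing."
  [& args]
  (when @verbose?
    (apply println args)))
```

Because the flag defaults to false, the uberjar deployed by Pallet never writes to stdout unless something explicitly opts in.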
The ona-viewer project is a work in progress and we would welcome any feedback. Check it out on GitHub.
While I was at Intent Media I led the data
engineering team in rebuilding and extending the Intent Media data platform. To structure and
simplify queries we relied on Cascalog, a Clojure
DSL built on top of the Cascading library that is
built on top of Apache Hadoop.
Cascalog is inspired by Datalog and
uses logic programming
to simplify query expression. Its query style is similar to that of
Datomic for Clojure and the recent
DataScript for ClojureScript. This
allows simple and concise queries: computing the average age per country, for example, takes a single short query.
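As a rough sketch of what such a query can look like, here is the average age per country over two small in-memory datasets. The data and the names age, location, and avg-age-per-country are made up for illustration, and the example assumes Cascalog 1.x on the classpath (in Cascalog 2.x the ops namespace is cascalog.logic.ops) with Hadoop available in local mode.

```clojure
(ns example.avg-age
  (:require [cascalog.api :refer :all]
            ;; Cascalog 1.x namespace (assumption); 2.x uses cascalog.logic.ops
            [cascalog.ops :as c]))

;; Hypothetical in-memory generators. In production these would be taps
;; reading from HDFS rather than vectors of tuples.
(def age
  [["alice" 28] ["bob" 33] ["chris" 40]])

(def location
  [["alice" "usa"] ["bob" "usa"] ["chris" "brazil"]])

(defn avg-age-per-country
  "Join age and location on ?person, then average ?age within each ?country.
  Returns a seq of [country average] tuples."
  []
  (??<- [?country ?avg]
        (location ?person ?country)   ; shared ?person is an implicit join
        (age ?person ?age)
        (c/count ?count)              ; aggregators group by the output
        (c/sum ?age :> ?sum)          ; fields they do not produce (?country)
        (div ?sum ?count :> ?avg)))   ; floating point division
```

The join and the grouping are both implicit: using ?person in two predicates joins the datasets, and listing ?country in the output alongside an aggregate groups by it.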
Jon Sondag, a data scientist at Intent Media, recently gave a presentation at
the NYC Clojure Meetup about Cascalog in production. His slides are embedded
below.
It is great to see Cascalog being used in production data platforms.
Ona, a company I co-founded, recently built the tallying
software used to aggregate votes in the Libyan
constitutional assembly elections. These votes were cast
throughout the country, on off-shore oil rigs, and at international voting
centers throughout the world.
The Libyan High National Election Commission has generously made
the tally system software open source. The application source code is on
GitHub, along with an overview of the tallying process and additional code
documentation. A description of the technologies used is posted on the Ona blog.