Mobile Technologies presentation to the Mount Sinai Global Health Program

Friday, January 17, 2014

Benedetta Simeonidis, Roger Wong, and I gave a lecture on Wednesday discussing mobile technologies and their intersection with global health. We demoed Formhub and talked about Drishti. We also talked about the importance of user-centric design in mobile technology.

The slides from our lecture are below:


Introduction to Clojure Presentation at Intent Media

Wednesday, November 27, 2013

Ivan Willig and I gave a presentation on Monday introducing Clojure and discussing some of the production Clojure code in the Intent Media data platform.

The full presentation is available on github and as interactive slides.


Ruby and Clojure client libraries for the Helioid API

Sunday, November 10, 2013

We recently released a Helioid API that returns categorized search results. To retrieve JSON results for a query like “data analytics”, simply append “?format=json” to the search URL, e.g.

http://www.helioid.com/searches/q/data+analytics?format=json
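
For readers who want to call the endpoint directly, here is a minimal sketch using only the Ruby standard library. The helper names `helioid_search_url` and `helioid_search` are hypothetical, and the only thing assumed about the API is the URL pattern shown above; the shape of the returned JSON is not assumed.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the JSON search URL for a query, following the URL pattern above.
# (Hypothetical helper; not part of any Helioid library.)
def helioid_search_url(query)
  # encode_www_form_component turns spaces into '+', matching the example URL.
  encoded_query = URI.encode_www_form_component(query)
  URI("http://www.helioid.com/searches/q/#{encoded_query}?format=json")
end

# Fetch the URL and parse the response body as JSON.
# This performs a network request when called.
def helioid_search(query)
  JSON.parse(Net::HTTP.get(helioid_search_url(query)))
end
```

Calling `helioid_search('data analytics')` would request the URL shown above and return the parsed response.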

To make this easier to use we have released open source Ruby and Clojure client libraries. Install the Ruby library with:

gem install heliapi

then load and fetch categories using:

require 'heliapi'

results = Heliapi.new.web('ruby apis')

results['categories'].keys

which returns:

=> ['Developer',
    'Access',
    'Provides',
    'Rails',
    'Building',
    'Install',
    'Google Api Ruby',
    'Ruby Client'
]

To install the Clojure library add heliapi to your Leiningen project.clj file:

[heliapi "0.0.1"]

then load and fetch categories with:

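;; At the REPL; inside an ns form use (:require [heliapi.core :as helioid]).
(require '[heliapi.core :as helioid])

;; Keywords are functions, so :name can be mapped over the categories directly.
(map :name
     (:categories (helioid/web "helioid")))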

which returns the results as:

=> ("search refinement"
    "search engine"
    "results"
    "helioid choroiditis"
    "intranuclear helioid inclusions"
    "intranuclear helioid"
    "new"
    "helioid search")

We will add features to the API and client libraries, and build libraries for other languages, as requested.


Distributed Classification with ADMM

Wednesday, October 09, 2013

Today Jon Sondag and I presented our paper on ADMM for Hadoop at the IEEE BigData 2013 conference.

The paper describes our implementation of Boyd's ADMM algorithm in Hadoop MapReduce. It covers the statistical details of implementing ADMM as well as the nuances of storing state on Hadoop.
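
To show why state has to be carried between iterations, here is a toy, in-memory sketch of consensus ADMM in Ruby. It is not the Hadoop implementation from the paper: it assumes a squared loss (estimating a shared mean) instead of logistic loss so that the local update has a closed form, and the names `local_update` and `consensus_admm` are made up for illustration. Each partition's dual variable and the shared consensus value must survive from one pass to the next, which is the state a MapReduce job would have to persist.

```ruby
# Consensus ADMM on a toy problem: estimate one shared mean from partitioned
# data. Partition i has loss f_i(x) = 1/2 * sum_j (x - a_j)^2 and all local
# estimates x_i are constrained to agree with the consensus value z.
RHO = 1.0 # penalty parameter (assumed value)

# "Map" step: closed-form local update for one partition,
# argmin_x f_i(x) + (RHO / 2) * (x - z + u)^2.
def local_update(data, z, u)
  (data.sum + RHO * (z - u)) / (data.size + RHO)
end

def consensus_admm(partitions, iterations: 500)
  z = 0.0                                    # consensus value (shared state)
  us = Array.new(partitions.size, 0.0)       # scaled dual variable per partition
  iterations.times do
    xs = partitions.each_with_index.map { |data, i| local_update(data, z, us[i]) }
    # "Reduce" step: with no regularizer, z is the average of x_i + u_i.
    z = xs.zip(us).sum { |x, u| x + u } / partitions.size
    # Dual update: accumulate each partition's disagreement with z.
    us = xs.zip(us).map { |x, u| u + x - z }
  end
  z
end

partitions = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
estimate = consensus_admm(partitions)
puts estimate # approaches the mean of all six values, 26 / 6
```

The consensus value converges to the mean of the pooled data even though no single step ever sees more than one partition.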

In our presentation we give background on the data pipeline we have built at Intent Media and explain why a Hadoop MapReduce job is the appropriate runtime for us. We also mention alternatives for building distributed logistic regression models, such as sampling the data, Apache Mahout, Vowpal Wabbit, and Spark.

We also discuss alternatives specifically designed for iterative computation on Hadoop, such as HaLoop and Twister.

Our presentation is below:

You may also read the full paper Practical Distributed Classification using the Alternating Direction Method of Multipliers Algorithm.

The paper describes our open source, Hadoop-based implementation of the ADMM algorithm and how to use it to compute a distributed logistic regression model.
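
For reference, the standard consensus form of ADMM from Boyd et al.'s monograph is written out below for logistic loss; the paper's exact formulation may differ. The notation is an assumption made here: partition i holds examples D_i with features a_j and labels b_j in {-1, 1}, there are N partitions, g is a regularizer, rho is the penalty parameter, and bars denote averages over partitions.

```latex
\begin{aligned}
f_i(x) &= \sum_{j \in D_i} \log\bigl(1 + \exp(-b_j a_j^\top x)\bigr) \\
x_i^{k+1} &= \operatorname*{argmin}_{x_i} \Bigl( f_i(x_i) + \tfrac{\rho}{2} \lVert x_i - z^k + u_i^k \rVert_2^2 \Bigr) \\
z^{k+1} &= \operatorname*{argmin}_{z} \Bigl( g(z) + \tfrac{N\rho}{2} \lVert z - \bar{x}^{k+1} - \bar{u}^k \rVert_2^2 \Bigr) \\
u_i^{k+1} &= u_i^k + x_i^{k+1} - z^{k+1}
\end{aligned}
```

The x-update touches only one partition's data, so it can run in parallel across partitions, while the z-update needs only the averages of the local results.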


Categorizing Text in Ruby

Thursday, May 23, 2013

We have open sourced the categorization library that powers the fast dynamic labels and clusters on the Helioid site. This library is built to prioritize performance over accuracy. It takes label quality into account by first generating a set of labels and then assigning documents to those labels. We have found that this increases the likelihood of producing meaningful labels.

The example below shows how to create a set of labeled clusters from documents. First, require the categorize library.

require 'categorize'

include Categorize

Then define your set of documents.

documents = [
  'lorem ipsum dolor',
  'sed perspiciatis unde',
  'vero eos accusamus',
  'vero eos accusamus iusto odio'
]

Now make a model based on an additional query term, 'lorem' in this case.

Model.make_model('lorem', documents)
=> {
   'ipsum'            => [0],
   'sed perspiciatis' => [1],
   'vero'             => [2, 3]
}

The model output is a map of cluster labels to documents within those clusters. Install the gem and try it out.
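
The label-first idea described at the top of this post (generate labels, then assign documents to them) can be illustrated with a toy sketch. This is not the algorithm the categorize gem uses: the function `label_first_clusters` is hypothetical, labels are single terms ranked by document frequency, and the output therefore differs from the gem's output above.

```ruby
# Toy label-first clustering: rank candidate labels first, then assign each
# document to a label. A simplified illustration, not the gem's algorithm.
def label_first_clusters(query, documents)
  tokenized = documents.map { |doc| doc.downcase.split }

  # Step 1: generate labels. Rank every term except the query by the number
  # of documents containing it, breaking ties alphabetically.
  doc_counts = Hash.new(0)
  tokenized.each do |tokens|
    tokens.uniq.each { |term| doc_counts[term] += 1 unless term == query.downcase }
  end
  labels = doc_counts.keys.sort_by { |term| [-doc_counts[term], term] }

  # Step 2: assign each document index to the highest-ranked label it contains.
  clusters = Hash.new { |hash, label| hash[label] = [] }
  tokenized.each_with_index do |tokens, index|
    label = labels.find { |candidate| tokens.include?(candidate) }
    clusters[label] << index if label
  end
  clusters
end

documents = [
  'lorem ipsum dolor',
  'sed perspiciatis unde',
  'vero eos accusamus',
  'vero eos accusamus iusto odio'
]

p label_first_clusters('lorem', documents)
# prints {"dolor"=>[0], "perspiciatis"=>[1], "accusamus"=>[2, 3]}
# (exact Hash formatting depends on the Ruby version)
```

Because labels are fixed before any document is assigned, a term shared by several documents outranks the rarer terms, which is why documents 2 and 3 land in the same cluster.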


Peter Lubell-Doughtie