Joslyn Esser

Consuming the Twitter Streaming API using Heroku and MongoDB

Consuming the Twitter Streaming API from a web application can be a challenge. Adam Wiggins of Heroku discussed a way to consume the API with EventMachine, and another post described using this method with Thin and Sinatra. I decided to take it one step further and created a demo application that you can deploy on Heroku, using MongoDB for fast and efficient storage.

Demo application

Code can speak for itself. Go ahead and clone my demo project:

git clone git://github.com/joslynesser/mongo-twitter-streaming.git
cd mongo-twitter-streaming

Create your Heroku application, add the MongoHQ addon, add your Twitter credentials, and deploy:

heroku create
heroku addons:add mongohq:free
heroku config:add TWITTER_USERNAME=username
heroku config:add TWITTER_PASSWORD=password
git push heroku

You can see a working demo here: http://mongo-twitter-streaming-demo.heroku.com/. The demo displays the latest tweets and stores a maximum of 10MB of data before dropping old tweets. This keeps our Heroku application completely free by staying under the free MongoHQ size limit.

Why use Heroku?

  • Thin web server provides the ability for asynchronous responses
  • Easy deployment
  • Simple MongoDB setup with MongoHQ

Why use MongoDB?

  • Very fast write performance
  • No schema
  • Tweets arrive as JSON (Mongo plays very nicely with JSON)
  • Capped collections
    • extremely fast write performance (due to having 0 indexes)
    • remove old tweets automatically after a maximum has been reached
    • stored in the order that tweets are received
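
To make the capped collection behavior above concrete, here is a minimal in-memory sketch in plain Ruby. It is only a toy model: the CappedBuffer class, its method names, and the 60-byte cap are made up for illustration, and it is not how MongoDB implements capped collections or how the demo project stores tweets. It shows the three properties listed above: writes are simple appends, the oldest documents are dropped once the byte budget is exceeded, and documents stay in the order they were received.

```ruby
require 'json'

# Toy in-memory model of a capped collection. Illustration only:
# this is not how MongoDB implements capped collections internally.
class CappedBuffer
  attr_reader :bytes_used

  def initialize(max_bytes)
    @max_bytes  = max_bytes
    @docs       = []   # oldest first, i.e. insertion order
    @bytes_used = 0
  end

  # Append a document, then evict the oldest ones until we fit the budget.
  def insert(doc)
    size = JSON.generate(doc).bytesize
    raise ArgumentError, 'document larger than the cap' if size > @max_bytes

    @docs << [doc, size]
    @bytes_used += size
    while @bytes_used > @max_bytes
      _, evicted_size = @docs.shift
      @bytes_used -= evicted_size
    end
    doc
  end

  # Documents in the order they were received.
  def to_a
    @docs.map(&:first)
  end

  # The n most recent documents, newest first.
  def latest(n)
    to_a.last(n).reverse
  end
end

tweets = CappedBuffer.new(60) # tiny hypothetical cap so eviction is easy to see
%w[first second third fourth].each do |text|
  tweets.insert('text' => text)
end

puts tweets.to_a.map { |t| t['text'] }.inspect
puts tweets.latest(1).first['text']
```

With a 60-byte cap, the fourth insert pushes the total over budget, so the "first" tweet is dropped and the remaining three stay in arrival order. The demo applies the same idea with a 10MB budget, with MongoDB doing the eviction on its own.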

3 Comments

  1. Alessandro — February 09, 2011

    Thank you!! You saved my project with this! I was exactly searching for something to use Twitter Streams without having a (paid) background worker on Heroku.

    Cheers,
    Alessandro

  2. Kevin — April 09, 2011

    Awesome code! One question, however. How do you get it to reconnect when the connection drops? I am tracking a low volume keyword, and Twitter is disconnecting me because of lack of activity. At that point, I have to run ‘heroku restart’ to get it to reconnect.

    Any ideas? Thanks.

  3. Wes — September 29, 2011

    Thanks very much for this. I’m a n00b so I was wondering how does this code get kicked off on the server? Does it consume a background worker?
