Consuming the Twitter Streaming API using Heroku and MongoDB
Consuming the Twitter Streaming API from a web application can be a challenge. Adam Wiggins of Heroku discussed a way to consume the API with EventMachine. Another post described using this method with Thin and Sinatra. I decided to take it one step further and created a demo application that you can deploy on Heroku and that uses MongoDB for fast and efficient storage.
Code can speak for itself. Go ahead and clone my demo project:
git clone git://github.com/joslynesser/mongo-twitter-streaming.git
Create your Heroku application, add the MongoHQ add-on, and add your Twitter credentials:
heroku create
heroku addons:add mongohq:free
heroku config:add TWITTER_USERNAME=username
heroku config:add TWITTER_PASSWORD=password
git push heroku
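Heroku exposes config vars to the running application as environment variables, so the credentials set with heroku config:add can be read from ENV at boot. Here is a minimal sketch of that lookup. The helper name twitter_credentials is hypothetical, and the demo project may organize this differently:

```ruby
# Hypothetical helper (not taken from the demo project): reads the Twitter
# credentials stored via `heroku config:add` and fails loudly with a KeyError
# if either one is missing.
def twitter_credentials(env = ENV)
  {
    username: env.fetch('TWITTER_USERNAME'),
    password: env.fetch('TWITTER_PASSWORD')
  }
end
```

Using fetch instead of [] means a missing config var surfaces immediately at startup rather than as a failed login later.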
You can see a working demo here: http://mongo-twitter-streaming-demo.heroku.com/. The demo displays the latest tweets and stores a maximum of 10 MB worth of data before dropping old tweets. This keeps our Heroku application completely free by staying under the free MongoHQ size limit.
Why use Heroku?
- The Thin web server supports asynchronous responses
- Easy deployment
- Simple MongoDB setup with MongoHQ
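A long-lived streaming connection delivers data in arbitrary chunks rather than as one complete response, which is part of why an asynchronous setup helps here. The sketch below shows one way to reassemble tweets from those chunks. It assumes the stream is newline-delimited JSON (one object per "\r\n"-terminated line, with blank lines sent as keep-alives). TweetBuffer is a hypothetical name and is not taken from the demo project:

```ruby
require 'json'

# Hypothetical helper: collects raw chunks from the stream and returns
# parsed tweets as soon as complete "\r\n"-terminated lines are available.
class TweetBuffer
  def initialize
    @buffer = +'' # unfrozen, mutable string
  end

  # Appends a chunk and returns the tweets (as Hashes) that it completed.
  def push(chunk)
    @buffer << chunk
    tweets = []
    # Cut one complete line at a time off the front of the buffer.
    while (line = @buffer.slice!(/\A.*?\r\n/m))
      line = line.strip
      next if line.empty? # blank keep-alive line
      tweets << JSON.parse(line)
    end
    tweets
  end
end
```

In an EventMachine-based client, push would typically be called from whatever callback receives data from the connection, and each returned Hash could then be stored as-is.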
Why use MongoDB?
- Very fast write performance
- No schema
- Tweets arrive as JSON (MongoDB plays very nicely with JSON)
- Capped collections
  - extremely fast write performance (due to having 0 indexes)
  - remove old tweets automatically after a maximum has been reached
  - stored in the order that tweets are received
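To make the capped-collection behavior concrete, here is a small in-memory stand-in that mimics the two properties listed above: insertion order is preserved, and the oldest entries are evicted once a byte limit is exceeded. This is only an illustration of the semantics. CappedStore is a hypothetical class, not the MongoDB driver API, and it treats each document as a raw JSON string:

```ruby
# Hypothetical in-memory stand-in for a capped collection. It only mimics
# the behavior; MongoDB enforces the cap on the server side.
class CappedStore
  def initialize(max_bytes)
    @max_bytes = max_bytes
    @docs = []
    @bytes = 0
  end

  # Appends a document (a String), then evicts the oldest documents
  # until the total size is back under the limit.
  def insert(doc)
    @docs << doc
    @bytes += doc.bytesize
    @bytes -= @docs.shift.bytesize while @bytes > @max_bytes
    self
  end

  # Returns the documents in insertion order, oldest first.
  def to_a
    @docs.dup
  end
end

store = CappedStore.new(10)
store.insert('aaaa').insert('bbbb').insert('cccc')
store.to_a # the oldest entry, 'aaaa', has been evicted
```

With a real capped collection the size limit is fixed in bytes when the collection is created, so the application never has to run its own cleanup job.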