The Hitchhiker’s Guide to A Kaggle Competition

Krishna Sankar (Tata America International)
Data: Roulette
Location: Oregon Ballroom 203
Average rating: 3.00 (3 ratings)

An introductory hands-on workshop, aimed at the amateur data scientists among us, on the Heritage Health Prize competition. First, we will quickly look at the classes of algorithms and what they do, using competition problems and datasets. Next, we will dig deeper into one competition, the Kaggle RTA Challenge (Ensemble/Random Forest). We will then dive into the Heritage Health Prize, work through the dataset, and submit an entry!

Note: While there is not enough time for participants to work through the different datasets, we will provide links to a hands-on tutorial that you can work through after the workshop.


  • Algorithms for the Amateur Data Scientist
    • A look at the broader algorithms leading to Trees & Random Forests
  • The Art of Analytics Competitions – The Kaggle challenges
  • Anatomy of a competition – How the RTA was won
    • Predicting traffic at RTA using Ensemble/Random Forest Trees
  • Competition in flight – The HHP
    • Dataset Organization
    • Analytics Walkthrough
    • Submit our entry
  • Conclusion

Krishna Sankar

Tata America International

Krishna Sankar is a Principal Architect/Data Scientist with the Tata Consultancy Services Digital Enterprise group. His focus includes Big Data, Analytics, and AI. Earlier stints include Director of Engineering & Data Science at a bioinformatics startup, Lead Architect for storage cloud at Egnyte Inc, and Distinguished Engineer at Cisco. Recently he was a member of the Industry Program Committee and a paper reviewer for KDD2013 and KDD2014. Krishna’s speaking engagements include PyData 2013 (Bayesian Machine Learning); OSCON talks on Social Media Analysis with Twitter (2012), Hitchhiker’s Guide to Kaggle (2011), and NOSQL (2010); and guest lecturing at the Naval Postgraduate School. His other passion is Lego Robotics, where he serves as a Technical Judge in FLL world competitions.

Comments on this page are now closed.


Krishna Sankar
07/27/2011 5:01pm PDT
There was a question from today’s workshop about good books on algorithms. The best lists I have seen are the answers at Quora and one at LinkedIn:
Krishna Sankar
07/25/2011 3:59pm PDT

I have uploaded a WIP snapshot at www.slideshare.net/ksankar/.... Would appreciate any comments. Beware – I have too many slides; it is intentional.