Personal schedule for David James
Semantic Technologies provide a simple, standardized methodology for representing, combining and sharing data and serve as the foundation for creating communities of open data. These technologies are both easy to learn and easy to use. This tutorial will introduce you to semantic programming using a variety of open source tools and programming techniques that you can use on your projects today.
Event
Location: Exhibit Hall 3
Winners of the Google O'Reilly Open Source Award will be announced during this fun evening event.
Event
Location: Meeting Room N
At the Sunlight Labs hackathon, Sunlight Labs will be working with developers on two major projects: 1. Parsing sites at for our 50 state project to get every state legislature in a common data format, and 2. Adding data into Sunlight's newest project, Congrelate.
Abstraction is a powerful servant, but a dangerous master. We code, design, think, debug ... on a tower of abstractions. Spolsky's Law tells us that "All abstractions leak". This talk explores why they leak, why that's often a problem, and what to do about it. I also cover why abstractions sometimes SHOULD "leak", and how best to produce and consume abstraction layers.
Come learn the fundamentals of how to leverage Gearman, the open-source, distributed job queuing system. Originally designed to scale LiveJournal.com, Gearman is now faster than ever and can help you build your own scalable applications. Gearman's generic design allows it to be used as a building block for almost any use - from speeding up your website to building your own Map/Reduce cluster.
Over the last few years, developments in the use of Open Source for creating efficient, verifiable, and trustworthy voting systems have presented viable approaches to solving technical problems in election systems. The next wave of development will build on these recent achievements by integrating them into the real, often messy, world of election administration and law.
Panel of movers and shakers in the movement to open government using the principles of Open Source.
Open source shares critical values with government and public education, values that make them function at their best: meritocracy of ideas, transparency, and collaboration. But where is the sweet spot in the confluence of these social, technical, and public policy ideals? And where is the opportunity for the citizen developer to get involved?
Hadoop is a powerful open source tool for analyzing large volumes of data. I'll provide an overview of Hadoop's architecture and describe some real-world use cases.
This panel will discuss accessing open government initiatives and creating new services around existing government data on the internet. The idea is to get a point of view from each step of the process for open government initiatives, from producer and publisher, to standards advocate, to consumer and user, and to elected representative.
The age of Big Data demands open-source tools that move beyond storage towards analytics: tools to turn terabytes into insights. R is an open-source language for statistical computing and graphics, and an extensible, embeddable tool for the analysis of large data sets. In this session, I showcase R's power by building predictive models for Brazilian soybean harvests and baseball slugger salaries.
Moderated by: Colin Evans
Are you interested in open data? How about connecting your data to other data sets using Semantic Web technology? We'll be sharing ideas and answering questions on these topics and more.
A graph db stores data in a network structure rather than in relational tables. This model is well suited for many web use cases such as tagging, metadata annotations, social networks, wikis and other network-shaped or hierarchical data sets. This talk will introduce Neo4j: a high-performance, transactional open source graph db, which frequently outperforms RDBMSs by more than 1000x for such use cases.
Cassandra is a third-generation open source distributed database that marries Bigtable's rich data model with Dynamo's aggressive simplicity to produce a uniquely compelling alternative to traditional relational databases.
The end of "scale-up" computing is near. The coming wave of web-scale data is too big to justify exponentially increasing hardware costs for decreasing returns. Apache's "Cloud Stack" (Hadoop, Lucene, HBase, etc.) is enabling Visible Technologies to move from a non-scalable MS-exclusive platform to a large cluster processing millions of pieces of content a day. Here's what we learned.
Event
Location: Meeting Room N
At the Sunlight Labs hackathon, Sunlight Labs will be working with developers on two major projects: 1. Parsing sites at for our 50 state project to get every state legislature in a common data format, and 2. Adding data into Sunlight's newest project, Congrelate.
Replication. Partitioning. Relational databases. Bigtable. Dynamo.
There is no one-size-fits-all approach to scaling your database, and the CAP theorem proved that there never will be. This talk will explain the advantages and limits of the approaches to scaling traditional relational databases, as well as the tradeoffs made by the designers of newer systems like Google's Bigtable.
The Haskell language makes it possible to write elegant code while achieving top-notch performance. We'll introduce you to the features that make fast code possible, focusing on one of the newest and most exciting techniques for number crunching and text processing: stream fusion.