Not only do we have our usual top-notch speakers talking about their first-hand experiences across the subjects of real-world deep learning, data and data systems engineering, and building scalable engineering culture, but all proceeds are going to the outstanding charity TechFugees. This one-day event will bring new perspectives on these three critical areas of modern software engineering, all the while helping refugees gain access to the knowledge economy.
Hakan Jakobsson
Piyush Narang
David Chaiken
Doug Loyer
Julien Le Dem
Mohsin Hussain
Yoav Zimmerman
Amr Awadallah
Justin Coffey
Gerben Stavenga
Ted Dunning
Ran Lei
Jie Li
We are witnessing a new revolution in data: the age of automated decisions. In this presentation, Cloudera cofounder and CTO Amr Awadallah will explain the historic importance of this wave and the common patterns with which it manifests itself in organizations today, then conclude by talking about the foundational capabilities required to enable it.
This talk describes the process of improving the quality of business metrics reporting at Pinterest. This process consisted of specifying core metrics, understanding the end-to-end architecture, executing a cross-functional improvement program, and creating a novel reporting tool. The talk extracts five tips for successfully improving metrics quality from this process: know your stakeholders; define core metrics; prioritize quality; fund test implementation; and measure progress. The talk focuses on the innovation that led to a new kind of metrics quality measurement report, which Pinterest has been using to track progress throughout the year.
A technical overview of the challenges and trade-offs in the design and use of Protocol Buffers within Google. This talk will discuss code size and CPU efficiency, and how different languages and platforms lead to different designs. I will touch on the use of arenas and on upcoming features that we are working to release, and compare Protocol Buffers with competitors like Cap'n Proto, FlatBuffers, and Thrift.
Despite enormous excitement about the potential of deep learning, building practical applications powered by deep learning remains an enormous challenge: the necessary expertise is scarce, the hardware requirements can be prohibitive, and current software tools are immature and limited in scope. In this talk, we will first describe how deep learning workflows are supported by existing software tooling. We will then describe several promising opportunities to drastically improve these workflows via novel algorithmic and software solutions, including reproducible workflow management and efficient utilization of deep learning cluster resources. This talk draws on our experiences at Determined AI, a startup that builds software to make deep learning engineers dramatically more productive.
We all know how hard Big Data stacks can be to build, use, and maintain. Gartner estimates that 85% of big data projects are killed before production release. In this talk, engineering leaders from Criteo's Data Reliability Engineering team will show how widespread use of SQL addressed the two biggest issues in data engineering: systems efficiency and developer productivity.
Deep learning on text and for recommendations has had some amazing successes in building very usable semantic models of words or behavior.
It is a little-known fact that many of these results can also be achieved with vastly simpler techniques based on simply finding words or actions that appear together. Recently developed algorithms allow large-scale cooccurrence analysis of this sort to be updated accurately and safely in hard real time. In contrast, this is particularly difficult with deep models. Models generated from cooccurrence analysis also retain sparsity, so they can often be deployed using very standard software such as text search engines like Elasticsearch or Solr.
This talk will be very approachable and will not require any advanced mathematics. It should interest a wide audience, but it won't be dumbed down, either. I will show examples of applications for these algorithms, walk through the key algorithms at a high level, and describe some open source implementations.
This talk delves into a few important questions:
1. Why does culture matter?
2. What makes a good engineering culture?
3. How does Criteo evolve its engineering culture?
We will take a deep dive into one example and review other cultural elements that have worked well for Criteo engineering.
Polar opposites abound in ML systems: big data vs. limited labeling capacity, offline vs. online learning, Newton vs. Hinton, and so on. These polar opposites bring about conflict, trade-offs, and decision difficulties. In this talk, we will discuss examples of polar opposites in our ML efforts towards autonomous driving, and how we handle them at TRI.
This talk describes a project to evaluate alternative database technologies as a partial replacement for Hive. The motivation for the project was that Hive is slow and inefficient, and it was felt that we could improve the productivity of our analysts with a technology with better response time while also saving money on hardware. We describe the evaluation process and the technology that was picked, Presto. We also describe some of the practical work that was done in order to deploy Presto on a 200-node production cluster, including frameworks for monitoring, testing, upgrading, failover, and end-user training.
Machine learning-based predictive systems are used for numerous business applications, such as advertising, transaction and customer churn prediction. Good predictive systems can result in better business outcomes. This talk addresses the practical problems and approaches to debugging and improving a predictive system and how they differ from other machine learning tasks.
Criteo Palo Alto
325 Lytton Street, Suite 200
Palo Alto, CA 94301