إظهار الرسائل ذات التسميات Publications. إظهار كافة الرسائل
إظهار الرسائل ذات التسميات Publications. إظهار كافة الرسائل
الأربعاء، 3 يوليو 2013
Conference Report: USENIX Annual Technical Conference (ATC) 2013
This year marks Google’s eleventh consecutive year as a sponsor of the USENIX Annual Technical Conference (ATC), just one of the co-located events at USENIX Federated Conference Week (FCW), which combines numerous conferences and workshops covering fields such as Autonomic Computing, Feedback Computing and much more in an intensive week of research, trends, and community interaction.
ATC provides a broad forum for computing systems research with an emphasis on implementations and experimental results. In addition to the Googlers presenting publications, we had two members on the program committee of ATC and several keynote speakers, invited speakers, panelists, committee members, and participants at the other co-located events at FCW.
In the paper Janus: Optimal Flash Provisioning for Cloud Storage Workloads, Googler Christoph Albrecht and co-authors demonstrated a system that allows users to make informed flash memory provisioning and partitioning decisions in cloud-scale distributed file systems that include both flash storage and disk tiers. As flash memory is still expensive, it is best to use it only for workloads that can make good use of it. Janus creates long term workload characterizations based on RPC samples and file age metadata. It uses these workload characterizations to formulate and solve an optimization problem that maximizes the reads sent to the flash tier. Based on evaluations from workloads using Janus, in use at Google for the past 6 months, the authors conclude that the recommendation system is quite effective, with flash hit rates using the optimized recommendations 47-76% higher than the option of using the flash as an unpartitioned tier.
In packetdrill: Scriptable Network Stack Testing, from Sockets to Packets, Google’s Neal Cardwell and co-authors showcased a portable, open-source scripting tool that enables testing the correctness and performance of network protocols. Despite their importance in modern computer systems, network protocols often undergo only ad hoc testing before their deployment, in large part due to their complexity. Furthermore, new algorithms have unforeseen interactions with other features, so testing has only become more daunting as TCP has evolved. The packetdrill tool was instrumental in the development of three new features for Linux TCP—Early Retransmit, Fast Open, and Loss Probes—and allowed the authors to find and fix 10 bugs in Linux. Furthermore, the team uses packetdrill in all phases of the development process for the kernel used in one of the world’s largest Linux installations. In the hope that sharing packetdrill with the community will make the process of improving Internet protocols an easier one, the source code and test scripts for packetdrill have been made freely available.
There were also additional refereed publications with Google co-authors at some of the co-located events at FCW, notably NicPic: Scalable and Accurate End-Host Rate Limiting, which outlines a system which enables accurate network traffic scheduling in a scalable fashion, and AGILE: Elastic Distributed Resource Scaling for Infrastructure-as-a-Service, a system that efficiently handles dynamic application workloads, reducing both penalties and user dissatisfaction.
Google is proud to support the academic community through conference participation and sponsorship. In particular, we are happy to mention one of the other interesting papers from this year’s USENIX FCW, co-authored by former Google PhD fellowship recipient Ashok Anand, MiG: Efficient Migration of Desktop VM Using Semantic Compression.
USENIX is a supporter of open access, so the papers and videos from the talks are available on the conference website.
الخميس، 27 يونيو 2013
Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
Posted by Tom Dean, Google Research
Humans can distinguish among approximately 10,000 relatively high-level visual categories, but we can discriminate among a much larger set of visual stimuli referred to as features. These features might correspond to object parts, animal limbs, architectural details, landmarks, and other visual patterns we don’t have names for, and it is this larger collection of features we use as a basis with which to reconstruct and explain our day-to-day visual experience. Such features provide the components for more complicated visual stimuli and establish a context essential for us to resolve ambiguous scenes.
Contrary to current practice in computer vision, the explanatory context required to resolve a visual detail may not be entirely local. A flash of red bobbing along the ground might be a child’s toy in the context of a playground or a rooster in the context of a farmyard. It would be useful to have a large number of feature detectors capable of signaling the presence of such features, including detectors for sandboxes, swings, slides, cows, chickens, sheep and farm machinery necessary to establish the context for distinguishing between these two possibilities.
This year’s winner of the CVPR Best Paper Award, co-authored by Googlers Tom Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan and Jay Yagnik, describes technology that will enable computer vision systems to extract the sort of semantically rich contextual information required to recognize visual categories even when a close examination of the pixels spanning the object in question might not be sufficient for identification in the absence of such contextual clues. Specifically, we consider a basic operation in computer vision that involves determining for each location in an image the degree to which a particular feature is likely to be present in the image at that particular location.
This so-called convolution operator is one of the key operations used in computer vision and, more broadly, all of signal processing. Unfortunately, it is computationally expensive and hence researchers use it sparingly or employ exotic SIMD hardware like GPUs and FPGAs to mitigate the computational cost. We turn things on their head by showing how one can use fast table lookup — a method called hashing — to trade time for space, replacing the computationally-expensive inner loop of the convolution operator — a sequence of multiplications and additions — required for performing millions of convolutions with a single table lookup.
We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing.
On Wednesday, June 26, the Google engineers responsible for the research were awarded Best Paper at a ceremony at the IEEE Conference on Computer Vision and Pattern Recognition held in Portland Oregon. The full paper can be found here.
Humans can distinguish among approximately 10,000 relatively high-level visual categories, but we can discriminate among a much larger set of visual stimuli referred to as features. These features might correspond to object parts, animal limbs, architectural details, landmarks, and other visual patterns we don’t have names for, and it is this larger collection of features we use as a basis with which to reconstruct and explain our day-to-day visual experience. Such features provide the components for more complicated visual stimuli and establish a context essential for us to resolve ambiguous scenes.
Contrary to current practice in computer vision, the explanatory context required to resolve a visual detail may not be entirely local. A flash of red bobbing along the ground might be a child’s toy in the context of a playground or a rooster in the context of a farmyard. It would be useful to have a large number of feature detectors capable of signaling the presence of such features, including detectors for sandboxes, swings, slides, cows, chickens, sheep and farm machinery necessary to establish the context for distinguishing between these two possibilities.
This year’s winner of the CVPR Best Paper Award, co-authored by Googlers Tom Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan and Jay Yagnik, describes technology that will enable computer vision systems to extract the sort of semantically rich contextual information required to recognize visual categories even when a close examination of the pixels spanning the object in question might not be sufficient for identification in the absence of such contextual clues. Specifically, we consider a basic operation in computer vision that involves determining for each location in an image the degree to which a particular feature is likely to be present in the image at that particular location.
This so-called convolution operator is one of the key operations used in computer vision and, more broadly, all of signal processing. Unfortunately, it is computationally expensive and hence researchers use it sparingly or employ exotic SIMD hardware like GPUs and FPGAs to mitigate the computational cost. We turn things on their head by showing how one can use fast table lookup — a method called hashing — to trade time for space, replacing the computationally-expensive inner loop of the convolution operator — a sequence of multiplications and additions — required for performing millions of convolutions with a single table lookup.
We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing.
On Wednesday, June 26, the Google engineers responsible for the research were awarded Best Paper at a ceremony at the IEEE Conference on Computer Vision and Pattern Recognition held in Portland Oregon. The full paper can be found here.
الخميس، 13 يونيو 2013
Excellent Papers for 2012
Posted by Corinna Cortes and Alfred Spector, Google Research
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the later half of 2010 and again in 2011. This time we honor the influential papers authored or co-authored by Googlers covering all of 2012 -- covering roughly 6% of our total publications. It’s tough choosing, so we may have left out some important papers. So, do see the publications list to review the complete group.
In the coming weeks we will be offering a more in-depth look at some of these publications, but here are the summaries:
Algorithms and Theory
Online Matching with Stochastic Rewards
Aranyak Mehta*, Debmalya Panigrahi [FOCS'12]
Online advertising is inherently stochastic: value is realized only if the user clicks on the ad, while the ad platform knows only the probability of the click. This paper is the first to introduce the stochastic nature of the rewards to the rich algorithmic field of online allocations. The core algorithmic problem it formulates is online bipartite matching with stochastic rewards, with known click probabilities. The main result is an online algorithm which obtains a large fraction of the optimal value. The paper also shows the difficulty introduced by the stochastic nature, by showing how it behaves very differently from the classic (non-stochastic) online matching problem.
Matching with our Eyes Closed
Gagan Goel*, Pushkar Tripathi* [FOCS'12]
In this paper we propose a simple randomized algorithm for finding a matching in a large graph. Unlike most solutions to this problem, our approach does not rely on building large combinatorial structures (like blossoms) but works completely locally. We analyze the performance of our algorithm and show that it does significantly better than the greedy algorithm. In doing so we improve a celebrated 18 year old result by Aronson et. al.
Simultaneous Approximations for Adversarial and Stochastic Online Budgeted Allocation
Vahab Mirrokni*, Shayan Oveis Gharan, Morteza Zadimoghaddam, [SODA'12]
In this paper, we study online algorithms that simultaneously perform well in worst-case and average-case instances, or equivalently algorithms that perform well in both stochastic and adversarial models at the same time. This is motivated by online allocation of queries to advertisers with budget constraints. Stochastic models are not robust enough to deal with traffic spikes and adversarial models are too pessimistic. While several algorithms have been proposed for these problems, each algorithm was known to perform well in one model and not both, and we present new results for a single algorithm that works well in both models.
Economics and EC
Polyhedral Clinching Auctions and the Adwords Polytope
Gagan Goel*, Vahab Mirrokni*, Renato Paes Leme [STOC'12]
Budgets play a major role in ad auctions where advertisers explicitly declare budget constraints. Very little is known in auctions about satisfying such budget constraints while keeping incentive compatibility and efficiency. The problem becomes even harder in the presence of complex combinatorial constraints over the set of feasible allocations. We present a class of ascending-price auctions addressing this problem for a very general class of (polymatroid) allocation constraints including the AdWords problem with multiple keywords and multiple slots.
HCI
Backtracking Events as Indicators of Usability Problems in Creation-Oriented Applications
David Akers*, Robin Jeffries*, Matthew Simpson*, Terry Winograd [TOCHI '12]
Backtracking events such as undo can be useful automatic indicators of usability problems for creation-oriented applications such as word processors and photo editors. Our paper presents a new cost-effective usability evaluation method based on this insight.
Talking in Circles: Selective Sharing in Google+
Sanjay Kairam, Michael J. Brzozowski*, David Huffaker*, Ed H. Chi*, [CHI'12]
This paper explores why so many people share selectively on Google+: to protect their privacy but also to focus and target their audience. People use Circles to support these goals, organizing contacts by life facet, tie strength, and interest.
Information Retrieval
Online selection of diverse results
Debmalya Panigrahi, Atish Das Sarma, Gagan Aggarwal*, and Andrew Tomkins*, [WSDM'12]
We consider the problem of selecting subsets of items that are simultaneously diverse in multiple dimensions, which arises in the context of recommending interesting content to users. We formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efficient algorithm. We prove that the algorithm always produces a nearly optimal solution and also perform experiments on real-world data that indicate that the algorithm performs even better in practice than the analytical guarantees.
Machine Learning
Large Scale Distributed Deep Networks
Jeffrey Dean, Greg S. Corrado*, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, NIPS 2012;
In this paper, we examine several techniques to improve the time to convergence for neural networks and other models trained by gradient-based methods. The paper describes a system we have built that exploits both model-level parallelism (by partitioning the nodes of a large model across multiple machines) and data-level parallelism (by having multiple replicas of a model process different training data and coordinating the application of updates to the model state through a centralized-but-partitioned parameter server system). Our results show that very large neural networks can be trained effectively and quickly on large clusters of machines.
Open Problem: Better Bounds for Online Logistic Regression
Brendan McMahan* and Matthew Streeter*, COLT/ICML'12 Joint Open Problem Session, JMLR: Workshop and Conference Proceedings.
One of the goals of research at Google is to help point out important open problems--precise questions that are interesting academically but also have important practical ramifications. This open problem is about logistic regression, a widely used algorithm for predicting probabilities (what is the probability an email message is spam, or that a search ad will be clicked). We show that in the simple one-dimensional case, much better results are possible than current theoretical analysis suggests, and we ask whether our results can be generalized to arbitrary logistic regression problems.
Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Borja Balle and Mehryar Mohri*, NIPS 2012.
Learning weighted automata from finite samples drawn from an unknown distribution is a central problem in machine learning and computer science in general, with a variety of applications in text and speech processing, bioinformatics, and other areas. This paper presents a new family of algorithms for tackling this problem for which it proves learning guarantees. The algorithms introduced combine ideas from two different domains: matrix completion and spectral methods.
Machine Translation
Improved Domain Adaptation for Statistical Machine Translation
Wei Wang*, Klaus Macherey*, Wolfgang Macherey*, Franz Och* and Peng Xu*, [AMTA'12]
Research in domain adaptation for machine translation has been mostly focusing on one domain. We present a simple and effective domain adaptation infrastructure that makes an MT system with a single translation model capable of providing adapted, close-to-upper-bound domain-specific accuracy while preserving the generic translation accuracy. Large-scale experiments on 20 language pairs for patent and generic domains show the viability of our approach.
Multimedia and Computer Vision
Reconstructing the World's Museums
Jianxiong Xiao and Yasutaka Furukawa*, [ECCV '12]
Virtual navigation and exploration of large indoor environments (e.g., museums) have been so far limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. This paper presents a novel vision-based 3D reconstruction and visualization system to automatically produce clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. For the first time, we enable users to easily browse a large scale indoor environment from a bird's-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panoramas, and zoom out again, all with seamless 3D transitions.
The intervalgram: An audio feature for large-scale melody recognition
Thomas C. Walters*, David Ross*, Richard F. Lyon*, [CMMR'12]
Intervalgrams are small images that summarize the structure of short segments of music by looking at the musical intervals between the notes present in the music. We use them for finding cover songs - different pieces of music that share the same underlying composition. Wedo this by comparing 'heatmaps' which look at the similarity between intervalgrams from different pieces of music over time. If we see a strong diagonal line in the heatmap, it's good evidence that the songs are musically similar.
General and Nested Wiberg Minimization
Dennis Strelow*, [CVPR'12]
Eriksson and van den Hengel’s CVPR 2010 paper showed that Wiberg’s least squares matrix factorization, which effectively eliminates one matrix from the factorization problem, could be applied to the harder case of L1 factorization. Our paper generalizes their approach beyond factorization to general nonlinear problems in two sets of variables, like perspective structure-from-motion. We also show that with our generalized method, one Wiberg minimization can also be nested inside another, effectively eliminating two of three sets of unknowns, and we demonstrated this idea using projective struture-from-motion
Calibration-Free Rolling Shutter Removal
Matthias Grundmann*, Vivek Kwatra*, Daniel Castro, Irfan Essa*, International Conference on Computational Photography '12. Best paper.
Mobile phones and current generation DSLR’s, contain an electronic rolling shutter, capturing each frame one row of pixels at a time. Consequently, if the camera moves during capture, it will cause image distortions ranging from shear to wobbly distortions. We propose a calibration-free solution based on a novel parametric mixture model to correct these rolling shutter distortions in videos that enables real-time rolling shutter rectification as part of YouTube’s video stabilizer.
Natural Language Processing
Vine Pruning for Efficient Multi-Pass Dependency Parsing
Alexander Rush, Slav Petrov*, The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL '12), Best Paper Award.
Being able to accurately analyze the grammatical structure of sentences is crucial for language understanding applications such as machine translation or question answering. In this paper we present a method that is up to 200 times faster than existing methods and enables the grammatical analysis of text in large-scale applications. The key idea is to perform the analysis in multiple coarse-to-fine passes, resolving easy ambiguities first and tackling the harder ones later on.
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
Oscar Tackstrom, Ryan McDonald*, Jakob Uszkoreit*, North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL '12), Best Student Paper Award.
This paper studies how to build meaningful cross-lingual word clusters, i.e., clusters containing lexical items from two languages that are coherent along some abstract dimension. This is done by coupling distributional statistics learned from huge amounts of language specific data coupled with constraints generated from parallel corpora. The resulting clusters are used to improve the accuracy of multi-lingual syntactic parsing for languages without any training resources.
Networks
How to Split a Flow
Tzvika Hartman*, Avinatan Hassidim*, Haim Kaplan*, Danny Raz*, Michal Segalov*, [INFOCOM '12]
Decomposing a flow into a small number of paths is a very important task arises in various network optimization mechanisms. In this paper we develop an an approximation algorithm for this problem that has both provable worst case performance grantees as well as good practical behavior.
Deadline-Aware Datacenter TCP (D2TCP)
Balajee Vamanan, Jahangir Hasan*, T. N. Vijaykumar, [SIGCOMM '12]
Some of our most important products like search and ads operate under soft-real-time constraints. They are architected and fine-tuned to return results to users within a few hundred milliseconds. Deadline-Aware Datacenter TCP is a research effort into making the datacenter networks deadline aware, thus improving the performance of such key applications.
Trickle: Rate Limiting YouTube Video Streaming
Monia Ghobadi, Yuchung Cheng*, Ankur Jain*, Matt Mathis* [USENIX '12]
Trickle is a server-side mechanism to stream YouTube video smoothly to reduce burst and buffer-bloat. It paces the video stream by placing an upper bound on TCP’s congestion window based on the streaming rate and the round-trip time. In initial evaluation Trickle reduces the TCP loss rate by up to 43% and the RTT by up to 28%. Given the promising results we are deploying Trickle to all YouTube servers.
Social Systems
Look Who I Found: Understanding the Effects of Sharing Curated Friend Groups
Lujun Fang*, Alex Fabrikant*, Kristen LeFevre*, [Web Science '12]. Best Student Paper award.
In this paper, we studied the impact of the Google+ circle-sharing feature, which allows individual users to share (publicly and privately) pre-curated groups of friends and contacts. We specifically investigated the impact on the growth and structure of the Google+ social network. In the course of the analysis, we identified two natural categories of shared circles ("communities" and "celebrities"). We also observed that the circle-sharing feature is associated with the accelerated densification of community-type circles.
Software Engineering
AddressSanitizer: A Fast Address Sanity Checker
Konstantin Serebryany*, Derek Bruening*, Alexander Potapenko*, Dmitry Vyukov*, [USENIX ATC '12].
The paper “AddressSanitizer: A Fast Address Sanity Checker” describes a dynamic tool that finds memory corruption bugs in C or C++ programs with only a 2x slowdown. The major feature of AddressSanitizer is simplicity -- this is why the tool is very fast.
Speech
Japanese and Korean Voice Search
Mike Schuster*, Kaisuke Nakajima*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP'12].
"Japanese and Korean voice search" explains in detail how the Android voice search systems for these difficult languages were developed. We describe how to segment statistically to be able to handle infinite vocabularies without out-of-vocabulary words, how to handle the lack of spaces between words for language modeling and dictionary generation, and how to deal best with multiple ambiguities during evaluation scoring of reference transcriptions against hypotheses. The combination of techniques presented led to high quality speech recognition systems--as of 6/2013 Japanese and Korean are #2 and #3 in terms of traffic after the US.
Google's Cross-Dialect Arabic Voice Search
Fadi Biadsy*, Pedro J. Moreno*, Martin Jansche*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP 2012].
This paper describes Google’s automatic speech recognition systems for recognizing several Arabic dialects spoken in the Middle East, with the potential to reach more than 125 million users. We suggest solutions for challenges specific to Arabic, such as the diacritization problem, where short vowels are not written in Arabic text. We conduct experiments to identify the optimal manner in which acoustic data should be clustered among dialects.
Deep Neural Networks for Acoustic Modeling in Speech Recognition
Geoffrey Hinton*, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew W. Senior*, Vincent Vanhoucke*, Patrick Nguyen, Tara Sainath, Brian Kingsbury, Signal Processing Magazine (2012)"
Survey paper on the DNN breakthrough in automatic speech recognition accuracy.
Statistics
Empowering Online Advertisements by Empowering Viewers with the Right to Choose
Max Pashkevich*, Sundar Dorai-Raj*, Melanie Kellar*, Dan Zigmond*, Journal of Advertising Research, vol. 52 (2012).
YouTube’s TrueView in-stream video advertising format (a form of skippable in-stream ads) can improve the online video viewing experience for users without sacrificing advertising value for advertisers or content owners.
Structured Data
Efficient Spatial Sampling of Large Geographical Tables
Anish Das Sarma*, Hongrae Lee*, Hector Gonzalez*, Jayant Madhavan*, Alon Halevy*, [SIGMOD '12].
This paper presents fundamental results for the "thinning problem": determining appropriate samples of data to be shown on specific geographical regions and zoom levels. This problem is widely applicable for a number of cloud-based geographic visualization systems such as Google Maps, Fusion Tables, and the developed algorithms are part of the Fusion Tables backend. The SIGMOD 2012 paper was selected among the best papers of the conference, and invited to a special best-papers issue of TODS.
Systems
Spanner: Google's Globally-Distributed Database
James C. Corbett*, Jeffrey Dean*, Michael Epstein*, Andrew Fikes*, Christopher Frost*, JJ Furman*, Sanjay Ghemawat*, Andrey Gubarev*, Christopher Heiser*, Peter Hochschild*, Wilson Hsieh*, Sebastian Kanthak*, Eugene Kogan*, Hongyi Li*, Alexander Lloyd*, Sergey Melnik*, David Mwaura*, David Nagle*, Sean Quinlan*, Rajesh Rao*, Lindsay Rolig*, Dale Woodford*, Yasushi Saito*, Christopher Taylor*, Michal Szymaniak*, Ruth Wang*, [OSDI '12]
This paper shows how a new time API and its implementation can provide the abstraction of tightly synchronized clocks, even on a global scale. We describe how we used this technology to build a globally-distributed database that supports a variety of powerful features: non-blocking reads in the past, lock-free snapshot transactions, and atomic schema changes.
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the later half of 2010 and again in 2011. This time we honor the influential papers authored or co-authored by Googlers covering all of 2012 -- covering roughly 6% of our total publications. It’s tough choosing, so we may have left out some important papers. So, do see the publications list to review the complete group.
In the coming weeks we will be offering a more in-depth look at some of these publications, but here are the summaries:
Algorithms and Theory
Online Matching with Stochastic Rewards
Aranyak Mehta*, Debmalya Panigrahi [FOCS'12]
Online advertising is inherently stochastic: value is realized only if the user clicks on the ad, while the ad platform knows only the probability of the click. This paper is the first to introduce the stochastic nature of the rewards to the rich algorithmic field of online allocations. The core algorithmic problem it formulates is online bipartite matching with stochastic rewards, with known click probabilities. The main result is an online algorithm which obtains a large fraction of the optimal value. The paper also shows the difficulty introduced by the stochastic nature, by showing how it behaves very differently from the classic (non-stochastic) online matching problem.
Matching with our Eyes Closed
Gagan Goel*, Pushkar Tripathi* [FOCS'12]
In this paper we propose a simple randomized algorithm for finding a matching in a large graph. Unlike most solutions to this problem, our approach does not rely on building large combinatorial structures (like blossoms) but works completely locally. We analyze the performance of our algorithm and show that it does significantly better than the greedy algorithm. In doing so we improve a celebrated 18 year old result by Aronson et. al.
Simultaneous Approximations for Adversarial and Stochastic Online Budgeted Allocation
Vahab Mirrokni*, Shayan Oveis Gharan, Morteza Zadimoghaddam, [SODA'12]
In this paper, we study online algorithms that simultaneously perform well in worst-case and average-case instances, or equivalently algorithms that perform well in both stochastic and adversarial models at the same time. This is motivated by online allocation of queries to advertisers with budget constraints. Stochastic models are not robust enough to deal with traffic spikes and adversarial models are too pessimistic. While several algorithms have been proposed for these problems, each algorithm was known to perform well in one model and not both, and we present new results for a single algorithm that works well in both models.
Economics and EC
Polyhedral Clinching Auctions and the Adwords Polytope
Gagan Goel*, Vahab Mirrokni*, Renato Paes Leme [STOC'12]
Budgets play a major role in ad auctions where advertisers explicitly declare budget constraints. Very little is known in auctions about satisfying such budget constraints while keeping incentive compatibility and efficiency. The problem becomes even harder in the presence of complex combinatorial constraints over the set of feasible allocations. We present a class of ascending-price auctions addressing this problem for a very general class of (polymatroid) allocation constraints including the AdWords problem with multiple keywords and multiple slots.
HCI
Backtracking Events as Indicators of Usability Problems in Creation-Oriented Applications
David Akers*, Robin Jeffries*, Matthew Simpson*, Terry Winograd [TOCHI '12]
Backtracking events such as undo can be useful automatic indicators of usability problems for creation-oriented applications such as word processors and photo editors. Our paper presents a new cost-effective usability evaluation method based on this insight.
Talking in Circles: Selective Sharing in Google+
Sanjay Kairam, Michael J. Brzozowski*, David Huffaker*, Ed H. Chi*, [CHI'12]
This paper explores why so many people share selectively on Google+: to protect their privacy but also to focus and target their audience. People use Circles to support these goals, organizing contacts by life facet, tie strength, and interest.
Information Retrieval
Online selection of diverse results
Debmalya Panigrahi, Atish Das Sarma, Gagan Aggarwal*, and Andrew Tomkins*, [WSDM'12]
We consider the problem of selecting subsets of items that are simultaneously diverse in multiple dimensions, which arises in the context of recommending interesting content to users. We formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efficient algorithm. We prove that the algorithm always produces a nearly optimal solution and also perform experiments on real-world data that indicate that the algorithm performs even better in practice than the analytical guarantees.
Machine Learning
Large Scale Distributed Deep Networks
Jeffrey Dean, Greg S. Corrado*, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, NIPS 2012;
In this paper, we examine several techniques to improve the time to convergence for neural networks and other models trained by gradient-based methods. The paper describes a system we have built that exploits both model-level parallelism (by partitioning the nodes of a large model across multiple machines) and data-level parallelism (by having multiple replicas of a model process different training data and coordinating the application of updates to the model state through a centralized-but-partitioned parameter server system). Our results show that very large neural networks can be trained effectively and quickly on large clusters of machines.
Open Problem: Better Bounds for Online Logistic Regression
Brendan McMahan* and Matthew Streeter*, COLT/ICML'12 Joint Open Problem Session, JMLR: Workshop and Conference Proceedings.
One of the goals of research at Google is to help point out important open problems--precise questions that are interesting academically but also have important practical ramifications. This open problem is about logistic regression, a widely used algorithm for predicting probabilities (what is the probability an email message is spam, or that a search ad will be clicked). We show that in the simple one-dimensional case, much better results are possible than current theoretical analysis suggests, and we ask whether our results can be generalized to arbitrary logistic regression problems.
Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Borja Balle and Mehryar Mohri*, NIPS 2012.
Learning weighted automata from finite samples drawn from an unknown distribution is a central problem in machine learning and computer science in general, with a variety of applications in text and speech processing, bioinformatics, and other areas. This paper presents a new family of algorithms for tackling this problem for which it proves learning guarantees. The algorithms introduced combine ideas from two different domains: matrix completion and spectral methods.
Machine Translation
Improved Domain Adaptation for Statistical Machine Translation
Wei Wang*, Klaus Macherey*, Wolfgang Macherey*, Franz Och* and Peng Xu*, [AMTA'12]
Research in domain adaptation for machine translation has been mostly focusing on one domain. We present a simple and effective domain adaptation infrastructure that makes an MT system with a single translation model capable of providing adapted, close-to-upper-bound domain-specific accuracy while preserving the generic translation accuracy. Large-scale experiments on 20 language pairs for patent and generic domains show the viability of our approach.
Multimedia and Computer Vision
Reconstructing the World's Museums
Jianxiong Xiao and Yasutaka Furukawa*, [ECCV '12]
Virtual navigation and exploration of large indoor environments (e.g., museums) have been so far limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. This paper presents a novel vision-based 3D reconstruction and visualization system to automatically produce clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. For the first time, we enable users to easily browse a large scale indoor environment from a bird's-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panoramas, and zoom out again, all with seamless 3D transitions.
The intervalgram: An audio feature for large-scale melody recognition
Thomas C. Walters*, David Ross*, Richard F. Lyon*, [CMMR'12]
Intervalgrams are small images that summarize the structure of short segments of music by looking at the musical intervals between the notes present in the music. We use them for finding cover songs - different pieces of music that share the same underlying composition. Wedo this by comparing 'heatmaps' which look at the similarity between intervalgrams from different pieces of music over time. If we see a strong diagonal line in the heatmap, it's good evidence that the songs are musically similar.
General and Nested Wiberg Minimization
Dennis Strelow*, [CVPR'12]
Eriksson and van den Hengel’s CVPR 2010 paper showed that Wiberg’s least squares matrix factorization, which effectively eliminates one matrix from the factorization problem, could be applied to the harder case of L1 factorization. Our paper generalizes their approach beyond factorization to general nonlinear problems in two sets of variables, like perspective structure-from-motion. We also show that with our generalized method, one Wiberg minimization can also be nested inside another, effectively eliminating two of three sets of unknowns, and we demonstrated this idea using projective struture-from-motion
Calibration-Free Rolling Shutter Removal
Matthias Grundmann*, Vivek Kwatra*, Daniel Castro, Irfan Essa*, International Conference on Computational Photography '12. Best paper.
Mobile phones and current generation DSLR’s, contain an electronic rolling shutter, capturing each frame one row of pixels at a time. Consequently, if the camera moves during capture, it will cause image distortions ranging from shear to wobbly distortions. We propose a calibration-free solution based on a novel parametric mixture model to correct these rolling shutter distortions in videos that enables real-time rolling shutter rectification as part of YouTube’s video stabilizer.
Natural Language Processing
Vine Pruning for Efficient Multi-Pass Dependency Parsing
Alexander Rush, Slav Petrov*, The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL '12), Best Paper Award.
Being able to accurately analyze the grammatical structure of sentences is crucial for language understanding applications such as machine translation or question answering. In this paper we present a method that is up to 200 times faster than existing methods and enables the grammatical analysis of text in large-scale applications. The key idea is to perform the analysis in multiple coarse-to-fine passes, resolving easy ambiguities first and tackling the harder ones later on.
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
Oscar Tackstrom, Ryan McDonald*, Jakob Uszkoreit*, North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL '12), Best Student Paper Award.
This paper studies how to build meaningful cross-lingual word clusters, i.e., clusters containing lexical items from two languages that are coherent along some abstract dimension. This is done by coupling distributional statistics learned from huge amounts of language specific data coupled with constraints generated from parallel corpora. The resulting clusters are used to improve the accuracy of multi-lingual syntactic parsing for languages without any training resources.
Networks
How to Split a Flow
Tzvika Hartman*, Avinatan Hassidim*, Haim Kaplan*, Danny Raz*, Michal Segalov*, [INFOCOM '12]
Decomposing a flow into a small number of paths is a very important task arises in various network optimization mechanisms. In this paper we develop an an approximation algorithm for this problem that has both provable worst case performance grantees as well as good practical behavior.
Deadline-Aware Datacenter TCP (D2TCP)
Balajee Vamanan, Jahangir Hasan*, T. N. Vijaykumar, [SIGCOMM '12]
Some of our most important products like search and ads operate under soft-real-time constraints. They are architected and fine-tuned to return results to users within a few hundred milliseconds. Deadline-Aware Datacenter TCP is a research effort into making the datacenter networks deadline aware, thus improving the performance of such key applications.
Trickle: Rate Limiting YouTube Video Streaming
Monia Ghobadi, Yuchung Cheng*, Ankur Jain*, Matt Mathis* [USENIX '12]
Trickle is a server-side mechanism to stream YouTube video smoothly to reduce burst and buffer-bloat. It paces the video stream by placing an upper bound on TCP’s congestion window based on the streaming rate and the round-trip time. In initial evaluation Trickle reduces the TCP loss rate by up to 43% and the RTT by up to 28%. Given the promising results we are deploying Trickle to all YouTube servers.
Social Systems
Look Who I Found: Understanding the Effects of Sharing Curated Friend Groups
Lujun Fang*, Alex Fabrikant*, Kristen LeFevre*, [Web Science '12]. Best Student Paper award.
In this paper, we studied the impact of the Google+ circle-sharing feature, which allows individual users to share (publicly and privately) pre-curated groups of friends and contacts. We specifically investigated the impact on the growth and structure of the Google+ social network. In the course of the analysis, we identified two natural categories of shared circles ("communities" and "celebrities"). We also observed that the circle-sharing feature is associated with the accelerated densification of community-type circles.
Software Engineering
AddressSanitizer: A Fast Address Sanity Checker
Konstantin Serebryany*, Derek Bruening*, Alexander Potapenko*, Dmitry Vyukov*, [USENIX ATC '12].
The paper “AddressSanitizer: A Fast Address Sanity Checker” describes a dynamic tool that finds memory corruption bugs in C or C++ programs with only a 2x slowdown. The major feature of AddressSanitizer is simplicity -- this is why the tool is very fast.
Speech
Japanese and Korean Voice Search
Mike Schuster*, Kaisuke Nakajima*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP'12].
"Japanese and Korean voice search" explains in detail how the Android voice search systems for these difficult languages were developed. We describe how to segment statistically to be able to handle infinite vocabularies without out-of-vocabulary words, how to handle the lack of spaces between words for language modeling and dictionary generation, and how to deal best with multiple ambiguities during evaluation scoring of reference transcriptions against hypotheses. The combination of techniques presented led to high quality speech recognition systems--as of 6/2013 Japanese and Korean are #2 and #3 in terms of traffic after the US.
Google's Cross-Dialect Arabic Voice Search
Fadi Biadsy*, Pedro J. Moreno*, Martin Jansche*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP 2012].
This paper describes Google’s automatic speech recognition systems for recognizing several Arabic dialects spoken in the Middle East, with the potential to reach more than 125 million users. We suggest solutions for challenges specific to Arabic, such as the diacritization problem, where short vowels are not written in Arabic text. We conduct experiments to identify the optimal manner in which acoustic data should be clustered among dialects.
Deep Neural Networks for Acoustic Modeling in Speech Recognition
Geoffrey Hinton*, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew W. Senior*, Vincent Vanhoucke*, Patrick Nguyen, Tara Sainath, Brian Kingsbury, Signal Processing Magazine (2012)"
Survey paper on the DNN breakthrough in automatic speech recognition accuracy.
Statistics
Empowering Online Advertisements by Empowering Viewers with the Right to Choose
Max Pashkevich*, Sundar Dorai-Raj*, Melanie Kellar*, Dan Zigmond*, Journal of Advertising Research, vol. 52 (2012).
YouTube’s TrueView in-stream video advertising format (a form of skippable in-stream ads) can improve the online video viewing experience for users without sacrificing advertising value for advertisers or content owners.
Structured Data
Efficient Spatial Sampling of Large Geographical Tables
Anish Das Sarma*, Hongrae Lee*, Hector Gonzalez*, Jayant Madhavan*, Alon Halevy*, [SIGMOD '12].
This paper presents fundamental results for the "thinning problem": determining appropriate samples of data to be shown on specific geographical regions and zoom levels. This problem is widely applicable for a number of cloud-based geographic visualization systems such as Google Maps, Fusion Tables, and the developed algorithms are part of the Fusion Tables backend. The SIGMOD 2012 paper was selected among the best papers of the conference, and invited to a special best-papers issue of TODS.
Systems
Spanner: Google's Globally-Distributed Database
James C. Corbett*, Jeffrey Dean*, Michael Epstein*, Andrew Fikes*, Christopher Frost*, JJ Furman*, Sanjay Ghemawat*, Andrey Gubarev*, Christopher Heiser*, Peter Hochschild*, Wilson Hsieh*, Sebastian Kanthak*, Eugene Kogan*, Hongyi Li*, Alexander Lloyd*, Sergey Melnik*, David Mwaura*, David Nagle*, Sean Quinlan*, Rajesh Rao*, Lindsay Rolig*, Dale Woodford*, Yasushi Saito*, Christopher Taylor*, Michal Szymaniak*, Ruth Wang*, [OSDI '12]
This paper shows how a new time API and its implementation can provide the abstraction of tightly synchronized clocks, even on a global scale. We describe how we used this technology to build a globally-distributed database that supports a variety of powerful features: non-blocking reads in the past, lock-free snapshot transactions, and atomic schema changes.
الأربعاء، 29 مايو 2013
Open Access for Publications
Posted by Alfred Spector, Vice President, Engineering
The Association for Computing Machinery (ACM) recently announced a new option for publication rights management, wherein researchers can choose to pay for the public to have perpetual open access to the publication. Google applauds this new option, and today we are announcing that we will pay the open access fees for all articles by Google researchers that are published in ACM journals. IEEE also has an open access option for some of its publications, and we also pay the open access fee for them and for publications in like organizations.
Google has always believed that by improving access to the world’s knowledge, we can help improve everyone’s lives. When it comes to scientific research, we have consistently said that open access to publications speeds up research, accelerates innovation, and helps grow the global economy.
Policies like ACM’s continue to demonstrate the sustainability of open access publishing. It will also provide better access to the papers that we write at Google. We encourage researchers everywhere to pursue open access options whenever publishing articles, and to continue to make publications available as widely as possible, within your rights.
The Association for Computing Machinery (ACM) recently announced a new option for publication rights management, wherein researchers can choose to pay for the public to have perpetual open access to the publication. Google applauds this new option, and today we are announcing that we will pay the open access fees for all articles by Google researchers that are published in ACM journals. IEEE also has an open access option for some of its publications, and we also pay the open access fee for them and for publications in like organizations.
Google has always believed that by improving access to the world’s knowledge, we can help improve everyone’s lives. When it comes to scientific research, we have consistently said that open access to publications speeds up research, accelerates innovation, and helps grow the global economy.
Policies like ACM’s continue to demonstrate the sustainability of open access publishing. It will also provide better access to the papers that we write at Google. We encourage researchers everywhere to pursue open access options whenever publishing articles, and to continue to make publications available as widely as possible, within your rights.
الاثنين، 4 يونيو 2012
Research at Google on G+: Featuring Excellent Papers for 2011
Posted by Corinna Cortes, Google Research
In March, we announced on the blog our Excellent Papers for 2011. Chosen papers comprise a tiny fraction of our total publications and were selected for their outstanding contributions to a diverse range of disciplines across the computer science field. In the past, we have offered more detailed discussions of each featured paper in subsequent postings. We are pleased to be able to continue this tradition through our Research at Google page on G+, which we unveiled last month.
Just as our publications highlight technical and algorithmic advances, share lessons we’ve learned as we developed our products and services, and denote some of the technical challenges we face, our Research at Google G+ page will continue the communication in a format that is better for mutual interaction. Add Research at Google to your circles to learn more about our research agenda, technology behind products, and innovative developments across the broader academic and technical community.
This week, we picked up on our excellent papers recognition with a deep dive into Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function, by Dick Lyon, Research Scientist. Tune into G+ regularly to learn more about the papers you’re most interested in.
In March, we announced on the blog our Excellent Papers for 2011. Chosen papers comprise a tiny fraction of our total publications and were selected for their outstanding contributions to a diverse range of disciplines across the computer science field. In the past, we have offered more detailed discussions of each featured paper in subsequent postings. We are pleased to be able to continue this tradition through our Research at Google page on G+, which we unveiled last month.
Just as our publications highlight technical and algorithmic advances, share lessons we’ve learned as we developed our products and services, and denote some of the technical challenges we face, our Research at Google G+ page will continue the communication in a format that is better for mutual interaction. Add Research at Google to your circles to learn more about our research agenda, technology behind products, and innovative developments across the broader academic and technical community.
This week, we picked up on our excellent papers recognition with a deep dive into Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function, by Dick Lyon, Research Scientist. Tune into G+ regularly to learn more about the papers you’re most interested in.
الخميس، 22 مارس 2012
Excellent Papers for 2011
Posted by Corinna Cortes and Alfred Spector, Google Research
UPDATE: Added Theo Vassilakis as an author for "Dremel: Interactive Analysis of Web-Scale Datasets"
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the later half of 2010. This time we honor the influential papers authored or co-authored by Googlers covering all of 2011 -- covering roughly 10% of our total publications. It’s tough choosing, so we may have left out some important papers. So, do see the publications list to review the complete group.
In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:
Audio processing
“Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function”, Richard F. Lyon, Journal of the Acoustical Society of America, vol. 130 (2011), pp. 3893-3904.
Lyon's long title summarizes a result that he has been working toward over many years of modeling sound processing in the inner ear. This nonlinear cochlear model is shown to be "good" with respect to psychophysical data on masking, physiological data on mechanical and neural response, and computational efficiency. These properties derive from the close connection between wave propagation and filter cascades. This filter-cascade model of the ear is used as an efficient sound processor for several machine hearing projects at Google.
Electronic Commerce and Algorithms
“Online Vertex-Weighted Bipartite Matching and Single-bid Budgeted Allocations”, Gagan Aggarwal, Gagan Goel, Chinmay Karande, Aranyak Mehta, SODA 2011.
The authors introduce an elegant and powerful algorithmic technique to the area of online ad allocation and matching: a hybrid of random perturbations and greedy choice to make decisions on the fly. Their technique sheds new light on classic matching algorithms, and can be used, for example, to pick one among a set of relevant ads, without knowing in advance the demand for ad slots on future web page views.
“Milgram-routing in social networks”, Silvio Lattanzi, Alessandro Panconesi, D. Sivakumar, Proceedings of the 20th International Conference on World Wide Web, WWW 2011, pp. 725-734.
Milgram’s "six-degrees-of-separation experiment" and the fascinating small world hypothesis that follows from it, have generated a lot of interesting research in recent years. In this landmark experiment, Milgram showed that people unknown to each other are often connected by surprisingly short chains of acquaintances. In the paper we prove theoretically and experimentally how a recent model of social networks, "Affiliation Networks", offers an explanation to this phenomena and inspires interesting technique for local routing within social networks.
“Non-Price Equilibria in Markets of Discrete Goods”, Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Noam Nisan, EC, 2011.
We present a correspondence between markets of indivisible items, and a family of auction based n player games. We show that a market has a price based (Walrasian) equilibrium if and only if the corresponding game has a pure Nash equilibrium. We then turn to markets which do not have a Walrasian equilibrium (which is the interesting case), and study properties of the mixed Nash equilibria of the corresponding games.
HCI
“From Basecamp to Summit: Scaling Field Research Across 9 Locations”, Jens Riegelsberger, Audrey Yang, Konstantin Samoylov, Elizabeth Nunge, Molly Stevens, Patrick Larvie, CHI 2011 Extended Abstracts.
The paper reports on our experience with a basecamp research hub to coordinate logistics and ongoing real-time analysis with research teams in the field. We also reflect on the implications for the meaning of research in a corporate context, where much of the value may be less in a final report, but more in the curated impressions and memories our colleagues take away from the the research trip.
“User-Defined Motion Gestures for Mobile Interaction”, Jaime Ruiz, Yang Li, Edward Lank, CHI 2011: ACM Conference on Human Factors in Computing Systems, pp. 197-206.
Modern smartphones contain sophisticated sensors that can detect rich motion gestures — deliberate movements of the device by end-users to invoke commands. However, little is known about best-practices in motion gesture design for the mobile computing paradigm. We systematically studied the design space of motion gestures via a guessability study that elicits end-user motion gestures to invoke commands on a smartphone device. The study revealed consensus among our participants on parameters of movement and on mappings of motion gestures onto commands, by which we developed a taxonomy for motion gestures and compiled an end-user inspired motion gesture set. The work lays the foundation of motion gesture design—a new dimension for mobile interaction.
Information Retrieval
“Reputation Systems for Open Collaboration”, B.T. Adler, L. de Alfaro, A. Kulshreshtha , I. Pye, Communications of the ACM, vol. 54 No. 8 (2011), pp. 81-87.
This paper describes content based reputation algorithms, that rely on automated content analysis to derive user and content reputation, and their applications for Wikipedia and google Maps. The Wikipedia reputation system WikiTrust relies on a chronological analysis of user contributions to articles, metering positive or negative increments of reputation whenever new contributions are made. The Google Maps system Crowdsensus compares the information provided by users on map business listings and computes both a likely reconstruction of the correct listing and a reputation value for each user. Algorithmic-based user incentives ensure the trustworthiness of evaluations of Wikipedia entries and Google Maps business information.
Machine Learning and Data Mining
“Domain adaptation in regression”, Corinna Cortes, Mehryar Mohri, Proceedings of The 22nd International Conference on Algorithmic Learning Theory, ALT 2011.
Domain adaptation is one of the most important and challenging problems in machine learning. This paper presents a series of theoretical guarantees for domain adaptation in regression, gives an adaptation algorithm based on that theory that can be cast as a semi-definite programming problem, derives an efficient solution for that problem by using results from smooth optimization, shows that the solution can scale to relatively large data sets, and reports extensive empirical results demonstrating the benefits of this new adaptation algorithm.
“On the necessity of irrelevant variables”, David P. Helmbold, Philip M. Long, ICML, 2011
Relevant variables sometimes do much more good than irrelevant variables do harm, so that it is possible to learn a very accurate classifier using predominantly irrelevant variables. We show that this holds given an assumption that formalizes the intuitive idea that the variables are non-redundant. For problems like this it can be advantageous to add many additional variables, even if only a small fraction of them are relevant.
“Online Learning in the Manifold of Low-Rank Matrices”, Gal Chechik, Daphna Weinshall, Uri Shalit, Neural Information Processing Systems (NIPS 23), 2011, pp. 2128-2136.
Learning measures of similarity from examples of similar and dissimilar pairs is a problem that is hard to scale. LORETA uses retractions, an operator from matrix optimization, to learn low-rank similarity matrices efficiently. This allows to learn similarities between objects like images or texts when represented using many more features than possible before.
Machine Translation
“Training a Parser for Machine Translation Reordering”, Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP '11).
Machine translation systems often need to understand the syntactic structure of a sentence to translate it correctly. Traditionally, syntactic parsers are evaluated as standalone systems against reference data created by linguists. Instead, we show how to train a parser to optimize reordering accuracy in a machine translation system, resulting in measurable improvements in translation quality over a more traditionally trained parser.
“Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation”, Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Och, Juri Ganitkevitch, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
We propose a general method to watermark and probabilistically identify the structured results of machine learning algorithms with an application in statistical machine translation. Our approach does not rely on controlling or even knowing the inputs to the algorithm and provides probabilistic guarantees on the ability to identify collections of results from one’s own algorithm, while being robust to limited editing operations.
“Inducing Sentence Structure from Parallel Corpora for Reordering”, John DeNero, Jakob Uszkoreit, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Automatically discovering the full range of linguistic rules that govern the correct use of language is an appealing goal, but extremely challenging. Our paper describes a targeted method for discovering only those aspects of linguistic syntax necessary to explain how two different languages differ in their word ordering. By focusing on word order, we demonstrate an effective and practical application of unsupervised grammar induction that improves a Japanese to English machine translation system.
Multimedia and Computer Vision
“Kernelized Structural SVM Learning for Supervised Object Segmentation”, Luca Bertelli, Tianli Yu, Diem Vu, Burak Gokturk,Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 2011.
The paper proposes a principled way for computers to learn how to segment the foreground from the background of an image given a set of training examples. The technology is build upon a specially designed nonlinear segmentation kernel under the recently proposed structured SVM learning framework.
“Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths”, Matthias Grundmann, Vivek Kwatra, Irfan Essa, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011).
Casually shot videos captured by handheld or mobile cameras suffer from significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos usually consist of carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our stabilization technique automatically converts casual shaky footage into more pleasant and professional looking videos by mimicking these cinematographic principles. The original, shaky camera path is divided into a set of segments, each approximated by either constant, linear or parabolic motion, using an algorithm based on robust L1 optimization. The stabilizer has been part of the YouTube Editor (youtube.com/editor) since March 2011.
“The Power of Comparative Reasoning”, Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin, International Conference on Computer Vision (2011).
The paper describes a theory derived vector space transform that converts vectors into sparse binary vectors such that Euclidean space operations on the sparse binary vectors imply rank space operations in the original vector space. The transform a) does not need any data-driven supervised/unsupervised learning b) can be computed from polynomial expansions of the input space in linear time (in the degree of the polynomial) and c) can be implemented in 10-lines of code. We show competitive results on similarity search and sparse coding (for classification) tasks.
NLP
“Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections”, Dipanjan Das, Slav Petrov, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11), 2011, Best Paper Award.
We would like to have natural language processing systems for all languages, but obtaining labeled data for all languages and tasks is unrealistic and expensive. We present an approach which leverages existing resources in one language (for example English) to induce part-of-speech taggers for languages without any labeled training data. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in a hidden Markov model trained with the Expectation Maximization algorithm.
Networks
“TCP Fast Open”, Sivasankar Radhakrishnan, Yuchung Cheng, Jerry Chu, Arvind Jain, Barath Raghavan, Proceedings of the 7th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2011.
TCP Fast Open enables data exchange during TCP’s initial handshake. It decreases application network latency by one full round-trip time, a significant speedup for today's short Web transfers. Our experiments on popular websites show that Fast Open reduces the whole-page load time over 10% on average, and in some cases up to 40%.
“Proportional Rate Reduction for TCP”, Nandita Dukkipati, Matt Mathis, Yuchung Cheng, Monia Ghobadi, Proceedings of the 11th ACM SIGCOMM Conference on Internet Measurement 2011, Berlin, Germany - November 2-4, 2011.
Packet losses increase latency of Web transfers and negatively impact user experience. Proportional rate reduction (PRR) is designed to recover from losses quickly, smoothly and accurately by pacing out retransmissions across received ACKs during TCP’s fast recovery. Experiments on Google Web and YouTube servers in U.S. and India demonstrate that PRR reduces the TCP latency of connections experiencing losses by 3-10% depending on response size.
Security and Privacy
“Automated Analysis of Security-Critical JavaScript APIs”, Ankur Taly, Úlfar Erlingsson, John C. Mitchell, Mark S. Miller, Jasvir Nagra, IEEE Symposium on Security & Privacy (SP), 2011.
As software is increasingly written in high-level, type-safe languages, attackers have fewer means to subvert system fundamentals, and attacks are more likely to exploit errors and vulnerabilities in application-level logic. This paper describes a generic, practical defense against such attacks, which can protect critical application resources even when those resources are partially exposed to attackers via software interfaces. In the context of carefully-crafted fragments of JavaScript, the paper applies formal methods and semantics to prove that these defenses can provide complete, non-circumventable mediation of resource access; the paper also shows how an implementation of the techniques can establish the properties of widely-used software, and find previously-unknown bugs.
“App Isolation: Get the Security of Multiple Browsers with Just One”, Eric Y. Chen, Jason Bau, Charles Reis, Adam Barth, Collin Jackson, 18th ACM Conference on Computer and Communications Security, 2011.
We find that anecdotal advice to use a separate web browser for sites like your bank is indeed effective at defeating most cross-origin web attacks. We also prove that a single web browser can provide the same key properties, for sites that fit within the compatibility constraints.
Speech
“Improving the speed of neural networks on CPUs”, Vincent Vanhoucke, Andrew Senior, Mark Z. Mao, Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.
As deep neural networks become state-of-the-art in real-time machine learning applications such as speech recognition, computational complexity is fast becoming a limiting factor in their adoption. We show how to best leverage modern CPU architectures to significantly speed-up their inference.
“Bayesian Language Model Interpolation for Mobile Speech Input”, Cyril Allauzen, Michael Riley, Interspeech 2011.
Voice recognition on the Android platform must contend with many possible target domains - e.g. search, maps, SMS. For each of these, a domain-specific language model was built by linearly interpolating several n-gram LMs from a common set of Google corpora. The current work has found a way to efficiently compute a single n-gram language model with accuracy very close to the domain-specific LMs but with considerably less complexity at recognition time.
Statistics
“Large-Scale Parallel Statistical Forecasting Computations in R”, Murray Stokely, Farzan Rohani, Eric Tassone, JSM Proceedings, Section on Physical and Engineering Sciences, 2011.
This paper describes the implementation of a framework for utilizing distributed computational infrastructure from within the R interactive statistical computing environment, with applications to timeseries forecasting. This system is widely used by the statistical analyst community at Google for data analysis on very large data sets.
Structured Data
“Dremel: Interactive Analysis of Web-Scale Datasets”, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis, Communications of the ACM, vol. 54 (2011), pp. 114-123.
Dremel is a scalable, interactive ad-hoc query system. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. Besides continued growth internally to Google, Dremel now also backs an increasing number of external customers including BigQuery and UIs such as AdExchange front-end.
“Representative Skylines using Threshold-based Preference Distributions”, Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Richard J. Lipton, Jim Xu, International Conference on Data Engineering (ICDE), 2011.
The paper adopts principled approach towards representative skylines and formalizes the problem of displaying k tuples such that the probability that a random user clicks on one of them is maximized. This requires mathematically modeling (a) the likelihood with which a user is interested in a tuple, as well as (b) how one negotiates the lack of knowledge of an explicit set of users. This work presents theoretical and experimental results showing that the suggested algorithm significantly outperforms previously suggested approaches.
“Hyper-local, directions-based ranking of places”, Petros Venetis, Hector Gonzalez, Alon Y. Halevy, Christian S. Jensen, PVLDB, vol. 4(5) (2011), pp. 290-30.
Click through information is one of the strongest signals we have for ranking web pages. We propose an equivalent signal for raking real world places: The number of times that people ask for precise directions to the address of the place. We show that this signal is competitive in quality with human reviews while being much cheaper to collect, we also show that the signal can be incorporated efficiently into a location search system.
Systems
“Power Management of Online Data-Intensive Services”, David Meisner, Christopher M. Sadler, Luiz André Barroso, Wolf-Dietrich Weber, Thomas F. Wenisch, Proceedings of the 38th ACM International Symposium on Computer Architecture, 2011.
Compute and data intensive Web services (such as Search) are a notoriously hard target for energy savings techniques. This article characterizes the statistical hardware activity behavior of servers running Web search and discusses the potential opportunities of existing and proposed energy savings techniques.
“The Impact of Memory Subsystem Resource Sharing on Datacenter Applications”, Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, Mary-Lou Soffa, ISCA, 2011.
In this work, the authors expose key characteristics of an emerging class of Google-style workloads and show how to enhance system software to take advantage of these characteristics to improve efficiency in data centers. The authors find that across datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing micro-architectural resources on a single machine (such as on-chip caches and bandwidth to memory). The impact of co-locating threads from multiple applications with diverse memory behavior changes the optimal mapping of thread to cores for each application. By employing an adaptive thread-to-core mapper, the authors improved the performance of the datacenter applications by up to 22% over status quo thread-to-core mapping, achieving performance within 3% of optimal.
“Language-Independent Sandboxing of Just-In-Time Compilation and Self-Modifying Code”, Jason Ansel, Petr Marchenko, Úlfar Erlingsson, Elijah Taylor, Brad Chen, Derek Schuff, David Sehr, Cliff L. Biffle, Bennet S. Yee, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2011.
Since its introduction in the early 90's, Software Fault Isolation, or SFI, has been a static code technique, commonly perceived as incompatible with dynamic libraries, runtime code generation, and other dynamic code. This paper describes how to address this limitation and explains how the SFI techniques in Google Native Client were extended to support modern language implementations based on just-in-time code generation and runtime instrumentation. This work is already deployed in Google Chrome, benefitting millions of users, and was developed over a summer collaboration with three Ph.D. interns; it exemplifies how Research at Google is focused on rapidly bringing significant benefits to our users through groundbreaking technology and real-world products.
“Thialfi: A Client Notification Service for Internet-Scale Applications”, Atul Adya, Gregory Cooper, Daniel Myers, Michael Piatek,Proc. 23rd ACM Symposium on Operating Systems Principles (SOSP), 2011, pp. 129-142.
This paper describes a notification service that scales to hundreds of millions of users, provides sub-second latency in the common case, and guarantees delivery even in the presence of a wide variety of failures. The service has been deployed in several popular Google applications including Chrome, Google Plus, and Contacts.
UPDATE: Added Theo Vassilakis as an author for "Dremel: Interactive Analysis of Web-Scale Datasets"
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the later half of 2010. This time we honor the influential papers authored or co-authored by Googlers covering all of 2011 -- covering roughly 10% of our total publications. It’s tough choosing, so we may have left out some important papers. So, do see the publications list to review the complete group.
In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:
Audio processing
“Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function”, Richard F. Lyon, Journal of the Acoustical Society of America, vol. 130 (2011), pp. 3893-3904.
Lyon's long title summarizes a result that he has been working toward over many years of modeling sound processing in the inner ear. This nonlinear cochlear model is shown to be "good" with respect to psychophysical data on masking, physiological data on mechanical and neural response, and computational efficiency. These properties derive from the close connection between wave propagation and filter cascades. This filter-cascade model of the ear is used as an efficient sound processor for several machine hearing projects at Google.
Electronic Commerce and Algorithms
“Online Vertex-Weighted Bipartite Matching and Single-bid Budgeted Allocations”, Gagan Aggarwal, Gagan Goel, Chinmay Karande, Aranyak Mehta, SODA 2011.
The authors introduce an elegant and powerful algorithmic technique to the area of online ad allocation and matching: a hybrid of random perturbations and greedy choice to make decisions on the fly. Their technique sheds new light on classic matching algorithms, and can be used, for example, to pick one among a set of relevant ads, without knowing in advance the demand for ad slots on future web page views.
“Milgram-routing in social networks”, Silvio Lattanzi, Alessandro Panconesi, D. Sivakumar, Proceedings of the 20th International Conference on World Wide Web, WWW 2011, pp. 725-734.
Milgram’s "six-degrees-of-separation experiment" and the fascinating small world hypothesis that follows from it, have generated a lot of interesting research in recent years. In this landmark experiment, Milgram showed that people unknown to each other are often connected by surprisingly short chains of acquaintances. In the paper we prove theoretically and experimentally how a recent model of social networks, "Affiliation Networks", offers an explanation to this phenomena and inspires interesting technique for local routing within social networks.
“Non-Price Equilibria in Markets of Discrete Goods”, Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Noam Nisan, EC, 2011.
We present a correspondence between markets of indivisible items, and a family of auction based n player games. We show that a market has a price based (Walrasian) equilibrium if and only if the corresponding game has a pure Nash equilibrium. We then turn to markets which do not have a Walrasian equilibrium (which is the interesting case), and study properties of the mixed Nash equilibria of the corresponding games.
HCI
“From Basecamp to Summit: Scaling Field Research Across 9 Locations”, Jens Riegelsberger, Audrey Yang, Konstantin Samoylov, Elizabeth Nunge, Molly Stevens, Patrick Larvie, CHI 2011 Extended Abstracts.
The paper reports on our experience with a basecamp research hub to coordinate logistics and ongoing real-time analysis with research teams in the field. We also reflect on the implications for the meaning of research in a corporate context, where much of the value may be less in a final report, but more in the curated impressions and memories our colleagues take away from the the research trip.
“User-Defined Motion Gestures for Mobile Interaction”, Jaime Ruiz, Yang Li, Edward Lank, CHI 2011: ACM Conference on Human Factors in Computing Systems, pp. 197-206.
Modern smartphones contain sophisticated sensors that can detect rich motion gestures — deliberate movements of the device by end-users to invoke commands. However, little is known about best-practices in motion gesture design for the mobile computing paradigm. We systematically studied the design space of motion gestures via a guessability study that elicits end-user motion gestures to invoke commands on a smartphone device. The study revealed consensus among our participants on parameters of movement and on mappings of motion gestures onto commands, by which we developed a taxonomy for motion gestures and compiled an end-user inspired motion gesture set. The work lays the foundation of motion gesture design—a new dimension for mobile interaction.
Information Retrieval
“Reputation Systems for Open Collaboration”, B.T. Adler, L. de Alfaro, A. Kulshreshtha , I. Pye, Communications of the ACM, vol. 54 No. 8 (2011), pp. 81-87.
This paper describes content based reputation algorithms, that rely on automated content analysis to derive user and content reputation, and their applications for Wikipedia and google Maps. The Wikipedia reputation system WikiTrust relies on a chronological analysis of user contributions to articles, metering positive or negative increments of reputation whenever new contributions are made. The Google Maps system Crowdsensus compares the information provided by users on map business listings and computes both a likely reconstruction of the correct listing and a reputation value for each user. Algorithmic-based user incentives ensure the trustworthiness of evaluations of Wikipedia entries and Google Maps business information.
Machine Learning and Data Mining
“Domain adaptation in regression”, Corinna Cortes, Mehryar Mohri, Proceedings of The 22nd International Conference on Algorithmic Learning Theory, ALT 2011.
Domain adaptation is one of the most important and challenging problems in machine learning. This paper presents a series of theoretical guarantees for domain adaptation in regression, gives an adaptation algorithm based on that theory that can be cast as a semi-definite programming problem, derives an efficient solution for that problem by using results from smooth optimization, shows that the solution can scale to relatively large data sets, and reports extensive empirical results demonstrating the benefits of this new adaptation algorithm.
“On the necessity of irrelevant variables”, David P. Helmbold, Philip M. Long, ICML, 2011
Relevant variables sometimes do much more good than irrelevant variables do harm, so that it is possible to learn a very accurate classifier using predominantly irrelevant variables. We show that this holds given an assumption that formalizes the intuitive idea that the variables are non-redundant. For problems like this it can be advantageous to add many additional variables, even if only a small fraction of them are relevant.
“Online Learning in the Manifold of Low-Rank Matrices”, Gal Chechik, Daphna Weinshall, Uri Shalit, Neural Information Processing Systems (NIPS 23), 2011, pp. 2128-2136.
Learning measures of similarity from examples of similar and dissimilar pairs is a problem that is hard to scale. LORETA uses retractions, an operator from matrix optimization, to learn low-rank similarity matrices efficiently. This allows to learn similarities between objects like images or texts when represented using many more features than possible before.
Machine Translation
“Training a Parser for Machine Translation Reordering”, Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP '11).
Machine translation systems often need to understand the syntactic structure of a sentence to translate it correctly. Traditionally, syntactic parsers are evaluated as standalone systems against reference data created by linguists. Instead, we show how to train a parser to optimize reordering accuracy in a machine translation system, resulting in measurable improvements in translation quality over a more traditionally trained parser.
“Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation”, Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Och, Juri Ganitkevitch, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
We propose a general method to watermark and probabilistically identify the structured results of machine learning algorithms with an application in statistical machine translation. Our approach does not rely on controlling or even knowing the inputs to the algorithm and provides probabilistic guarantees on the ability to identify collections of results from one’s own algorithm, while being robust to limited editing operations.
“Inducing Sentence Structure from Parallel Corpora for Reordering”, John DeNero, Jakob Uszkoreit, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Automatically discovering the full range of linguistic rules that govern the correct use of language is an appealing goal, but extremely challenging. Our paper describes a targeted method for discovering only those aspects of linguistic syntax necessary to explain how two different languages differ in their word ordering. By focusing on word order, we demonstrate an effective and practical application of unsupervised grammar induction that improves a Japanese to English machine translation system.
Multimedia and Computer Vision
“Kernelized Structural SVM Learning for Supervised Object Segmentation”, Luca Bertelli, Tianli Yu, Diem Vu, Burak Gokturk,Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 2011.
The paper proposes a principled way for computers to learn how to segment the foreground from the background of an image given a set of training examples. The technology is build upon a specially designed nonlinear segmentation kernel under the recently proposed structured SVM learning framework.
“Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths”, Matthias Grundmann, Vivek Kwatra, Irfan Essa, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011).
Casually shot videos captured by handheld or mobile cameras suffer from significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos usually consist of carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our stabilization technique automatically converts casual shaky footage into more pleasant and professional looking videos by mimicking these cinematographic principles. The original, shaky camera path is divided into a set of segments, each approximated by either constant, linear or parabolic motion, using an algorithm based on robust L1 optimization. The stabilizer has been part of the YouTube Editor (youtube.com/editor) since March 2011.
“The Power of Comparative Reasoning”, Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin, International Conference on Computer Vision (2011).
The paper describes a theory derived vector space transform that converts vectors into sparse binary vectors such that Euclidean space operations on the sparse binary vectors imply rank space operations in the original vector space. The transform a) does not need any data-driven supervised/unsupervised learning b) can be computed from polynomial expansions of the input space in linear time (in the degree of the polynomial) and c) can be implemented in 10-lines of code. We show competitive results on similarity search and sparse coding (for classification) tasks.
NLP
“Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections”, Dipanjan Das, Slav Petrov, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11), 2011, Best Paper Award.
We would like to have natural language processing systems for all languages, but obtaining labeled data for all languages and tasks is unrealistic and expensive. We present an approach which leverages existing resources in one language (for example English) to induce part-of-speech taggers for languages without any labeled training data. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in a hidden Markov model trained with the Expectation Maximization algorithm.
Networks
“TCP Fast Open”, Sivasankar Radhakrishnan, Yuchung Cheng, Jerry Chu, Arvind Jain, Barath Raghavan, Proceedings of the 7th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2011.
TCP Fast Open enables data exchange during TCP’s initial handshake. It decreases application network latency by one full round-trip time, a significant speedup for today's short Web transfers. Our experiments on popular websites show that Fast Open reduces the whole-page load time over 10% on average, and in some cases up to 40%.
“Proportional Rate Reduction for TCP”, Nandita Dukkipati, Matt Mathis, Yuchung Cheng, Monia Ghobadi, Proceedings of the 11th ACM SIGCOMM Conference on Internet Measurement 2011, Berlin, Germany - November 2-4, 2011.
Packet losses increase latency of Web transfers and negatively impact user experience. Proportional rate reduction (PRR) is designed to recover from losses quickly, smoothly and accurately by pacing out retransmissions across received ACKs during TCP’s fast recovery. Experiments on Google Web and YouTube servers in U.S. and India demonstrate that PRR reduces the TCP latency of connections experiencing losses by 3-10% depending on response size.
Security and Privacy
“Automated Analysis of Security-Critical JavaScript APIs”, Ankur Taly, Úlfar Erlingsson, John C. Mitchell, Mark S. Miller, Jasvir Nagra, IEEE Symposium on Security & Privacy (SP), 2011.
As software is increasingly written in high-level, type-safe languages, attackers have fewer means to subvert system fundamentals, and attacks are more likely to exploit errors and vulnerabilities in application-level logic. This paper describes a generic, practical defense against such attacks, which can protect critical application resources even when those resources are partially exposed to attackers via software interfaces. In the context of carefully-crafted fragments of JavaScript, the paper applies formal methods and semantics to prove that these defenses can provide complete, non-circumventable mediation of resource access; the paper also shows how an implementation of the techniques can establish the properties of widely-used software, and find previously-unknown bugs.
“App Isolation: Get the Security of Multiple Browsers with Just One”, Eric Y. Chen, Jason Bau, Charles Reis, Adam Barth, Collin Jackson, 18th ACM Conference on Computer and Communications Security, 2011.
We find that anecdotal advice to use a separate web browser for sites like your bank is indeed effective at defeating most cross-origin web attacks. We also prove that a single web browser can provide the same key properties, for sites that fit within the compatibility constraints.
Speech
“Improving the speed of neural networks on CPUs”, Vincent Vanhoucke, Andrew Senior, Mark Z. Mao, Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.
As deep neural networks become state-of-the-art in real-time machine learning applications such as speech recognition, computational complexity is fast becoming a limiting factor in their adoption. We show how to best leverage modern CPU architectures to significantly speed-up their inference.
“Bayesian Language Model Interpolation for Mobile Speech Input”, Cyril Allauzen, Michael Riley, Interspeech 2011.
Voice recognition on the Android platform must contend with many possible target domains - e.g. search, maps, SMS. For each of these, a domain-specific language model was built by linearly interpolating several n-gram LMs from a common set of Google corpora. The current work has found a way to efficiently compute a single n-gram language model with accuracy very close to the domain-specific LMs but with considerably less complexity at recognition time.
Statistics
“Large-Scale Parallel Statistical Forecasting Computations in R”, Murray Stokely, Farzan Rohani, Eric Tassone, JSM Proceedings, Section on Physical and Engineering Sciences, 2011.
This paper describes the implementation of a framework for utilizing distributed computational infrastructure from within the R interactive statistical computing environment, with applications to timeseries forecasting. This system is widely used by the statistical analyst community at Google for data analysis on very large data sets.
Structured Data
“Dremel: Interactive Analysis of Web-Scale Datasets”, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis, Communications of the ACM, vol. 54 (2011), pp. 114-123.
Dremel is a scalable, interactive ad-hoc query system. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. Besides continued growth internally to Google, Dremel now also backs an increasing number of external customers including BigQuery and UIs such as AdExchange front-end.
“Representative Skylines using Threshold-based Preference Distributions”, Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Richard J. Lipton, Jim Xu, International Conference on Data Engineering (ICDE), 2011.
The paper adopts principled approach towards representative skylines and formalizes the problem of displaying k tuples such that the probability that a random user clicks on one of them is maximized. This requires mathematically modeling (a) the likelihood with which a user is interested in a tuple, as well as (b) how one negotiates the lack of knowledge of an explicit set of users. This work presents theoretical and experimental results showing that the suggested algorithm significantly outperforms previously suggested approaches.
“Hyper-local, directions-based ranking of places”, Petros Venetis, Hector Gonzalez, Alon Y. Halevy, Christian S. Jensen, PVLDB, vol. 4(5) (2011), pp. 290-30.
Click through information is one of the strongest signals we have for ranking web pages. We propose an equivalent signal for raking real world places: The number of times that people ask for precise directions to the address of the place. We show that this signal is competitive in quality with human reviews while being much cheaper to collect, we also show that the signal can be incorporated efficiently into a location search system.
Systems
“Power Management of Online Data-Intensive Services”, David Meisner, Christopher M. Sadler, Luiz André Barroso, Wolf-Dietrich Weber, Thomas F. Wenisch, Proceedings of the 38th ACM International Symposium on Computer Architecture, 2011.
Compute and data intensive Web services (such as Search) are a notoriously hard target for energy savings techniques. This article characterizes the statistical hardware activity behavior of servers running Web search and discusses the potential opportunities of existing and proposed energy savings techniques.
“The Impact of Memory Subsystem Resource Sharing on Datacenter Applications”, Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, Mary-Lou Soffa, ISCA, 2011.
In this work, the authors expose key characteristics of an emerging class of Google-style workloads and show how to enhance system software to take advantage of these characteristics to improve efficiency in data centers. The authors find that across datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing micro-architectural resources on a single machine (such as on-chip caches and bandwidth to memory). The impact of co-locating threads from multiple applications with diverse memory behavior changes the optimal mapping of thread to cores for each application. By employing an adaptive thread-to-core mapper, the authors improved the performance of the datacenter applications by up to 22% over status quo thread-to-core mapping, achieving performance within 3% of optimal.
“Language-Independent Sandboxing of Just-In-Time Compilation and Self-Modifying Code”, Jason Ansel, Petr Marchenko, Úlfar Erlingsson, Elijah Taylor, Brad Chen, Derek Schuff, David Sehr, Cliff L. Biffle, Bennet S. Yee, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2011.
Since its introduction in the early 90's, Software Fault Isolation, or SFI, has been a static code technique, commonly perceived as incompatible with dynamic libraries, runtime code generation, and other dynamic code. This paper describes how to address this limitation and explains how the SFI techniques in Google Native Client were extended to support modern language implementations based on just-in-time code generation and runtime instrumentation. This work is already deployed in Google Chrome, benefitting millions of users, and was developed over a summer collaboration with three Ph.D. interns; it exemplifies how Research at Google is focused on rapidly bringing significant benefits to our users through groundbreaking technology and real-world products.
“Thialfi: A Client Notification Service for Internet-Scale Applications”, Atul Adya, Gregory Cooper, Daniel Myers, Michael Piatek,Proc. 23rd ACM Symposium on Operating Systems Principles (SOSP), 2011, pp. 129-142.
This paper describes a notification service that scales to hundreds of millions of users, provides sub-second latency in the common case, and guarantees delivery even in the presence of a wide variety of failures. The service has been deployed in several popular Google applications including Chrome, Google Plus, and Contacts.
التسميات:
Audio,
Electronic Commerce and Algorithms,
HCI,
Information Retrieval,
Machine Translation,
ML,
Networks,
NLP,
Publications,
Security and Privacy,
Speech,
statistics,
Structured Data,
Systems,
Vision Research
الأربعاء، 21 مارس 2012
Google at INFOCOM 2012
Posted by Emilie Danna, Google Research & Michal Segalov,Networking Software
The computer networking community will get together in Orlando, Florida the week of March 25th for INFOCOM 2012, the Annual IEEE International Conference on Computer Communications.
At the conference, we will discuss topics such as traffic engineering, traffic anomaly detection, and random walk algorithms for topology-aware networks. We serve so much internet traffic to Google users and exchange so much data between our data centers that computer networking is naturally something we care about. As traffic grows with richer content (photos, video, ...), new modes of engagement (cloud computing, social networking, ...) and an increasing number of users, engineering and research efforts are necessary to help networks scale.
The following papers were co-authored by Googlers from offices around the world:
If you are attending, stop by and say hi!
The computer networking community will get together in Orlando, Florida the week of March 25th for INFOCOM 2012, the Annual IEEE International Conference on Computer Communications.
At the conference, we will discuss topics such as traffic engineering, traffic anomaly detection, and random walk algorithms for topology-aware networks. We serve so much internet traffic to Google users and exchange so much data between our data centers that computer networking is naturally something we care about. As traffic grows with richer content (photos, video, ...), new modes of engagement (cloud computing, social networking, ...) and an increasing number of users, engineering and research efforts are necessary to help networks scale.
The following papers were co-authored by Googlers from offices around the world:
- Near-optimal random walk sampling in distributed networks by Atish Das Sarma, Anisur Molla, and Gopal Pandurangan
- How to split a flow by Tzvika Hartman, Avinatan Hassidim, Haim Kaplan, Danny Raz, and Michal Segalov
- Upward max-min fairness by Emilie Danna, Avinatan Hassidim, Haim Kaplan, Alok Kumar, Yishay Mansour, Danny Raz, and Michal Segalov (runner up for best paper)
- A practical algorithm for balancing the max-min fairness and throughput objectives in traffic engineering by Emilie Danna, Subhasree Mandal, and Arjun Singh
- Traffic anomaly detection based on the IP size distribution by Fabio Soldo and Ahmed Metwally
If you are attending, stop by and say hi!
الخميس، 5 مايو 2011
Google at CHI 2011
Posted by Yang Li, Research Scientist
Cross-posted with the Technical Programs and Events Blog
User-Defined Motion Gestures for Mobile Interaction by Jaime Ruiz, Yang Li*, Edward Lank
Many Bills: Engaging Citizens through Visualizations of Congressional Legislation by Yannick Assogba, Irene Ros, Joan DiMicco, Matt McKeon*
YouPivot: Improving Recall with Contextual Search by Joshua Hailpern, Nicholas Jitkoff*, Andrew Warr*, Karrie Karahalios, Robert Sesek, Nik Shkrob
Festschrift Panel in Honor of Stuart K. Card by Ed H. Chi*, Peter Pirolli, Bonnie John, Judith S Olson, Dan Russell*, Tom Moran
CHI Should be Replicating and Validating Results More: Discuss by Max L. Wilson, Wendy Mackay, Ed H. Chi*, Michael Bernstein, Dan Russell*, Harold Thimbleby
CASE STUDIES
Note: * denotes a Googler
Cross-posted with the Technical Programs and Events Blog
Google has an increasing presence at ACM CHI: Conference on Human Factors in Computing Systems, which is the premiere conference for Human Computer Interaction research. Eight Google papers will appear at the conference. These papers not only touch on our core areas such as Search, Chrome and Android but also demonstrate our growing effort in new areas where HCI is essential, such as new search user interfaces, gesture-based interfaces and cross-device interaction. They showcase our efforts to address user experiences in diverse situations. Googlers are playing active roles in the conference in many other ways too: participating in conference committees, hosting panels, organizing workshops and teaching courses, as well as running demos and 1:1 sessions at Google's booth.
This year's CHI takes place in Vancouver, BC, from May 7th - 12th.
PAPERS
Gesture Avatar: A Technique for Operating Mobile User Interfaces Using Gestures, by Hao Lü, Yang Li*
User-Defined Motion Gestures for Mobile Interaction by Jaime Ruiz, Yang Li*, Edward Lank
Experimental Analysis of Touch-Screen Gesture Designs in Mobile Environments by Andrew Bragdon, Eugene Nelson, Yang Li*, Ken Hinckley
Many Bills: Engaging Citizens through Visualizations of Congressional Legislation by Yannick Assogba, Irene Ros, Joan DiMicco, Matt McKeon*
YouPivot: Improving Recall with Contextual Search by Joshua Hailpern, Nicholas Jitkoff*, Andrew Warr*, Karrie Karahalios, Robert Sesek, Nik Shkrob
Oops, I Did It Again: Mitigating Repeated Access Control Errors on Facebook by Serge Egelman, Andrew Oates*, Shriram Krishnamurthi
Deep Shot: A Framework for Migrating Tasks Across Devices Using Mobile Phone Cameras by Tsung-Hsiang Chang, Yang Li*
DoubleFlip: A Motion Gesture Delimiter for Mobile Interaction by Jaime Ruiz, Yang Li*
WORKSHOPS
Crowdsourcing and Human Computation: Systems, Studies and Platforms by Michael Bernstein, Ed H. Chi*, Lydia B. Chilton, Björn Hartmann, Aniket Kittur, Robert C. Miller PANELS
Designing for User Experience: Academia & Industry by Joseph 'Jofish' Kaye, Elizabeth Buie, Jettie Hoonhout, Kristina Höök, Virpi Roto, Scott Jenson*, Peter Wright
Festschrift Panel in Honor of Stuart K. Card by Ed H. Chi*, Peter Pirolli, Bonnie John, Judith S Olson, Dan Russell*, Tom Moran
CHI Should be Replicating and Validating Results More: Discuss by Max L. Wilson, Wendy Mackay, Ed H. Chi*, Michael Bernstein, Dan Russell*, Harold Thimbleby
Transferability of Research Findings: Context-Dependent or Model-Driven by Ed H. Chi*, Mary Czerwinski, David Millen, Dave Randall, Gunnar Stevens, Volker Wulf, John Zimmerman
The Future of Child-Computer Interaction by Allison Druin, Gary Knell, Elliot Soloway, Dan Russell*, Elizabeth Mynatt, Yvonne Rogers
CASE STUDIES
From Basecamp to Summit: Scaling Field Research Across 9 Locations by Jens Riegelsberger*, Audrey Yang*, Konstantin Samoylov*, Elizabeth Nunge*, Molly Stevens*, Patrick Larvie*
COURSES
Design and Analysis of Large Scale Log Studies by Susan Dumais, Robin Jeffries*, Dan Russell*, Diane Tang*, Jaime Teevan
SIG MEETING
Participatory Culture in the Age of Social Media by Dana Rotman, Sarah Vieweg, Sarita Yardi, Ed H. Chi*, Jenny Preece, Ben Shneiderman, Peter Pirolli, Tom GlaisyerNote: * denotes a Googler
الجمعة، 25 فبراير 2011
Where does my data live?
Posted by Daniel Ford, Senior Mathematician
Have you ever wondered what happens when you upload a photo to Picasa, or where all your Gmail or YouTube videos are stored? How it is that you can read or watch them from anywhere at any time?
If you stored your data on a single hard disk, like the one in your personal computer, then the disk would eventually fail and your data would be lost forever. If you want to protect your data from the possibility of such a failure, you can store copies across many different disks so that if any one fails then you just access the data from another.
However, once storage systems get large enough, anything and everything can and does go wrong. You have to plan not just for disk failures but for server, network, and entire datacenter failures. Add to this software bugs and maintenance operations and you have a whole lot more failures.
Using measurements from dozens of Google data centers, we found that almost-simultaneous failure of many servers in a data center has the greatest impact on availability. On the other hand, disk failures have relatively little impact because our systems are specifically designed to cope with these failures.
Once you have a model of failures, you can also look at the impact of various design choices. Where exactly should you place your data replicas? How fast do you need recover from losing a disk or server? What encoding scheme or number of replicas of the data is enough, given a desired level of availability? For example, we found that storing data across multiple data centers reduces data unavailability by many orders of magnitude compared to having the same number of replicas in a single data center. The added complexity and potential for slower recovery times is worth it to get better availability, or use less storage space, or even both at the same time.
As you can see, something as simple as storing your photos, mail, or videos becomes a lot more involved when you want to be sure it's always available.
In our paper, Availability in Globally Distributed Storage Systems, we characterize the availability of cloud storage systems, based on extensive monitoring of Google's main storage infrastructure, and the sources of failure which affect availability. We also present statistical models for reasoning about the impact of design choices such as data placement, recovery speed, and replication strategies, including replication across multiple data centers.
Have you ever wondered what happens when you upload a photo to Picasa, or where all your Gmail or YouTube videos are stored? How it is that you can read or watch them from anywhere at any time?
If you stored your data on a single hard disk, like the one in your personal computer, then the disk would eventually fail and your data would be lost forever. If you want to protect your data from the possibility of such a failure, you can store copies across many different disks so that if any one fails then you just access the data from another.
However, once storage systems get large enough, anything and everything can and does go wrong. You have to plan not just for disk failures but for server, network, and entire datacenter failures. Add to this software bugs and maintenance operations and you have a whole lot more failures.
Using measurements from dozens of Google data centers, we found that almost-simultaneous failure of many servers in a data center has the greatest impact on availability. On the other hand, disk failures have relatively little impact because our systems are specifically designed to cope with these failures.
Once you have a model of failures, you can also look at the impact of various design choices. Where exactly should you place your data replicas? How fast do you need recover from losing a disk or server? What encoding scheme or number of replicas of the data is enough, given a desired level of availability? For example, we found that storing data across multiple data centers reduces data unavailability by many orders of magnitude compared to having the same number of replicas in a single data center. The added complexity and potential for slower recovery times is worth it to get better availability, or use less storage space, or even both at the same time.
As you can see, something as simple as storing your photos, mail, or videos becomes a lot more involved when you want to be sure it's always available.
In our paper, Availability in Globally Distributed Storage Systems, we characterize the availability of cloud storage systems, based on extensive monitoring of Google's main storage infrastructure, and the sources of failure which affect availability. We also present statistical models for reasoning about the impact of design choices such as data placement, recovery speed, and replication strategies, including replication across multiple data centers.
A Runtime Solution for Online Contention Detection and Response
Posted by Jason Mars, Software Engineering Intern
In our recent paper, Contention Aware Execution: Online Contention Detection and Response, we have made a big step forward in addressing an important and pressing problem in the field of Computer Science today. This work appears in the 2010 Proceedings of the International Symposium on Code Generation and Optimization (CGO) and was awarded the CGO 2010 Best Presentation Award at the conference.
One of the greatest challenges when using multicore processors arise when critical resources, such as the on-chip caches, are shared by multiple executing programs. If these programs simultaneously place heavy demands on shared resources, the may be forced to "take turns," and as a result, unpredictable and abrupt slowdowns may occur. This unexpected "cross-core interference" is especially problematic when considering the latency sensitive applications that are found in Google's datacenters, such as web-search. The commonly used solution is to dedicate separate machines to each application, however this leaves the processing capabilities of multicore processors underutilized. In our work, we present the Contention Aware Execution Runtime (CAER) environment that provides a lightweight runtime solution that minimizes cross-core interference, while maximizing utilization. CAER leverages the ubiquitous performance monitoring capabilities present in current state-of-the-art multicore processors to infer and respond to cross-core interference and requires no added hardware support. Our experiments show that when using our CAER system, we are able to increase the utilization of the multicore CPU by 58% on average. Meanwhile CAER brings the performance penally due to allowing co-location from 17% down to just 4% on average.
In our recent paper, Contention Aware Execution: Online Contention Detection and Response, we have made a big step forward in addressing an important and pressing problem in the field of Computer Science today. This work appears in the 2010 Proceedings of the International Symposium on Code Generation and Optimization (CGO) and was awarded the CGO 2010 Best Presentation Award at the conference.
One of the greatest challenges when using multicore processors arise when critical resources, such as the on-chip caches, are shared by multiple executing programs. If these programs simultaneously place heavy demands on shared resources, the may be forced to "take turns," and as a result, unpredictable and abrupt slowdowns may occur. This unexpected "cross-core interference" is especially problematic when considering the latency sensitive applications that are found in Google's datacenters, such as web-search. The commonly used solution is to dedicate separate machines to each application, however this leaves the processing capabilities of multicore processors underutilized. In our work, we present the Contention Aware Execution Runtime (CAER) environment that provides a lightweight runtime solution that minimizes cross-core interference, while maximizing utilization. CAER leverages the ubiquitous performance monitoring capabilities present in current state-of-the-art multicore processors to infer and respond to cross-core interference and requires no added hardware support. Our experiments show that when using our CAER system, we are able to increase the utilization of the multicore CPU by 58% on average. Meanwhile CAER brings the performance penally due to allowing co-location from 17% down to just 4% on average.
الخميس، 17 فبراير 2011
Query Language Modeling for Voice Search
Posted by Ciprian Chelba, Research Scientist
About three years ago we set a goal to enable speaking to the Google Search engine on smart-phones. On the language modeling side, the motivation was that we had access to large amounts of typed text data from our users. At the same time, that meant that the users also had a clear expectation for how they would interact with a speech-enabled version of the Google Search application.
The challenge lay in the scale of the problem and the perceived sparsity of the query data. Our paper, Query Language Modeling for Voice Search, describes the approach we took, and the empirical findings along the way.
Besides data availability, the project succeeded due to our excellent computational platform, the culture built around teams that wholeheartedly tackle such challenges with the conviction that they will set a new bar, and a collaborative mindset that leverages resources across the company. In this case we used training data made available by colleagues working in query spelling correction, query stream sampling procedures devised for search quality evaluation, the open finite state tools, and distributed language modeling infrastructure built for machine translation.
Perhaps the most satisfying part of this research project was its impact on the end-user: when presenting the poster at SLT 2010 in Berkeley I offered to demo Google Voice Search, and often got the answer “Thanks, I already use it!”.
About three years ago we set a goal to enable speaking to the Google Search engine on smart-phones. On the language modeling side, the motivation was that we had access to large amounts of typed text data from our users. At the same time, that meant that the users also had a clear expectation for how they would interact with a speech-enabled version of the Google Search application.
The challenge lay in the scale of the problem and the perceived sparsity of the query data. Our paper, Query Language Modeling for Voice Search, describes the approach we took, and the empirical findings along the way.
Besides data availability, the project succeeded due to our excellent computational platform, the culture built around teams that wholeheartedly tackle such challenges with the conviction that they will set a new bar, and a collaborative mindset that leverages resources across the company. In this case we used training data made available by colleagues working in query spelling correction, query stream sampling procedures devised for search quality evaluation, the open finite state tools, and distributed language modeling infrastructure built for machine translation.
Perhaps the most satisfying part of this research project was its impact on the end-user: when presenting the poster at SLT 2010 in Berkeley I offered to demo Google Voice Search, and often got the answer “Thanks, I already use it!”.
الخميس، 27 يناير 2011
Google at NIPS 2010
Posted by Slav Petrov, Doug Aberdeen, and Lisa McCracken, Google Research
The machine learning community met in Vancouver in December for the 24th Neural Information Processing Systems Conference (NIPS). As always, the single-track program of the main conference featured a number of outstanding talks, followed by interesting late night poster sessions. A record number of workshops covered a wide variety of topics, while allocating sufficient time for skiing in Whistler - after all, many of the most interesting research conversations happen while riding the lift in-between ski runs. This year’s conference also featured a symposium dedicated to Sam Roweis, providing a retrospective on Sam’s life and work. Sam, a fellow Googler and professor at NYU, was at the heart of the NIPS community and is terribly missed.
As always, Google was involved in various ways with NIPS. Here at Google, we take a data-driven approach when solving problems. Therefore, Machine Learning is in one way or another at the core of most of the things that we do. It is therefore unsurprising that many Googlers helped shape the program of the conference or were in the audience. This year, three Googlers served as area chairs and even more were reviewers. Googlers also co-authored the following papers:
Additionally, Googlers co-organized three well attended workshops:
Finally, Yoram Singer gave a great talk on Learning Structural Sparsity at the Sam Roweis symposium and Googlers presented the following talks during the workshops:
Overall, it was a very successful conference and it was good to be back in Vancouver one last time. This coming year NIPS 2011 will be in Granada, Spain. Hasta luego!
The machine learning community met in Vancouver in December for the 24th Neural Information Processing Systems Conference (NIPS). As always, the single-track program of the main conference featured a number of outstanding talks, followed by interesting late night poster sessions. A record number of workshops covered a wide variety of topics, while allocating sufficient time for skiing in Whistler - after all, many of the most interesting research conversations happen while riding the lift in-between ski runs. This year’s conference also featured a symposium dedicated to Sam Roweis, providing a retrospective on Sam’s life and work. Sam, a fellow Googler and professor at NYU, was at the heart of the NIPS community and is terribly missed.
As always, Google was involved in various ways with NIPS. Here at Google, we take a data-driven approach when solving problems. Therefore, Machine Learning is in one way or another at the core of most of the things that we do. It is therefore unsurprising that many Googlers helped shape the program of the conference or were in the audience. This year, three Googlers served as area chairs and even more were reviewers. Googlers also co-authored the following papers:
- Label Embedding Trees for Large Multi-Class Tasks by Samy Bengio and Jason Weston
- Learning Bounds for Importance Weighting by Corinna Cortes, Yishay Mansour, and Mehryar Mohri
- Online Learning in the Manifold of Low-Rank Matrices by Uri Shalit, Daphna Weinshall, and Gal Chechik
- Deterministic Single–Pass Algorithm for LDA by Issei Sato, Kenichi Kurihara, and Hiroshi Nakagawa
- Distributed Dual Averaging In Networks by John Duchi, Alekh Agarwal, and Martin Wainwright
Additionally, Googlers co-organized three well attended workshops:
- Coarse–to–Fine Learning and Inference by Ben Taskar, David Weiss, Benjamin Sapp, and Slav Petrov
- Low–rank Methods for Large–scale Machine Learning by Arthur Gretton, Michael Mahoney, Mehryar Mohri, and Ameet Talwalkar
- Learning on Cores, Clusters, and Clouds by John Duchi, Ofer Dekel, John Langford, Lawrence Cayton, and Alekh Agarwal
Finally, Yoram Singer gave a great talk on Learning Structural Sparsity at the Sam Roweis symposium and Googlers presented the following talks during the workshops:
- Online Learning in the Manifold of Low–Rank Matrices by Uri Shalit, Daphna Weinshall, and Gal Chechik
- Distributed MAP Inference for Undirected Graphical Models by Sameer Singh, Amar Subramanya, Fernando Pereira, and Andrew McCallum
- MapReduce/Bigtable for Distributed Optimization by Keith Hall, Scott Gilpin and Gideon Mann
- Self-Pruning Prediction Trees by Sally Goldman
- Web Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings by Jason Weston, Samy Bengio, and Nicolas Usunier
- Coarse–to–fine Decoding for Parsing and Machine Translation by Slav Petrov
Overall, it was a very successful conference and it was good to be back in Vancouver one last time. This coming year NIPS 2011 will be in Granada, Spain. Hasta luego!
الاثنين، 18 أكتوبر 2010
Google at the Conference on Empirical Methods in Natural Language Processing (EMNLP '10)
Posted by Slav Petrov, Research Scientist
The Conference on Empirical Methods in Natural Language Processing (EMNLP '10) was recently held at the MIT Stata Center in Massachusetts. Natural Language Processing is at the core of many of the things that we do here at Google. Googlers have therefore been traditionally part of this research community, participating as program committee members, paper authors and attendees.
At this year's EMNLP conference Google Fellow, Amit Singhal gave an invited keynote talk on "Challenges in running a commercial search engine" where he highlighted some of the exciting opportunities, as well as challenges, that Google is currently facing. Furthermore, Terry Koo (who recently joined Google), David Sontag (former Google PhD Fellowship recipient) and their collaborators from MIT received the Fred Jelinek Best Paper Award for their innovative work on syntactic parsing with the title "Dual Decomposition for Parsing with Non-Projective Head Automata".
Here is a complete list of the papers presented by Googlers at the conference:
The Conference on Empirical Methods in Natural Language Processing (EMNLP '10) was recently held at the MIT Stata Center in Massachusetts. Natural Language Processing is at the core of many of the things that we do here at Google. Googlers have therefore been traditionally part of this research community, participating as program committee members, paper authors and attendees.
At this year's EMNLP conference Google Fellow, Amit Singhal gave an invited keynote talk on "Challenges in running a commercial search engine" where he highlighted some of the exciting opportunities, as well as challenges, that Google is currently facing. Furthermore, Terry Koo (who recently joined Google), David Sontag (former Google PhD Fellowship recipient) and their collaborators from MIT received the Fred Jelinek Best Paper Award for their innovative work on syntactic parsing with the title "Dual Decomposition for Parsing with Non-Projective Head Automata".
Here is a complete list of the papers presented by Googlers at the conference:
- Dual Decomposition for Parsing with Non-Projective Head Automata (Fred Jelinek Best Paper Award) by Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, and David Sontag
- "Poetic" Statistical Machine Translation: Rhyme and Meter (see also here) by Dmitriy Genzel, Jakob Uszkoreit, and Franz Och
- Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models by Amarnag Subramanya, Slav Petrov, and Fernando Pereira
- Uptraining for Accurate Deterministic Question Parsing by Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, and Hiyan Alshawi
- Self-training with Products of Latent Variable Grammars by Zhongqiang Huang, Mary Harper, and Slav Petrov
الاثنين، 11 أكتوبر 2010
Making an Impact on a Thriving Speech Research Community
Posted by Vincent Vanhoucke, Google Research
While we continue to launch exciting new speech products--most recently Voice Actions and Google Search by Voice in Russian, Czech and Polish--we also strive to contribute to the academic research community by sharing both innovative techniques and experiences with large-scale systems.
This year’s gathering of the world’s experts in speech technology research, Interspeech 2010 in Makuhari, Japan, which Google co-sponsored, was a fantastic demonstration of the momentum of this community, driven by new challenges such as mobile voice communication, voice search, and the increasing international reach of speech technologies.
Googlers published papers that showcased the breadth and depth of our speech recognition research. Our work addresses both fundamental problems in acoustic and language modeling, as well as the practical issues of building scalable speech interfaces that real people use everyday to make their lives easier.
Here is a list of the papers presented by Googlers at the conference:
This year’s gathering of the world’s experts in speech technology research, Interspeech 2010 in Makuhari, Japan, which Google co-sponsored, was a fantastic demonstration of the momentum of this community, driven by new challenges such as mobile voice communication, voice search, and the increasing international reach of speech technologies.
Googlers published papers that showcased the breadth and depth of our speech recognition research. Our work addresses both fundamental problems in acoustic and language modeling, as well as the practical issues of building scalable speech interfaces that real people use everyday to make their lives easier.
Here is a list of the papers presented by Googlers at the conference:
- Direct Construction of Compact Context-Dependency Transducers From Data, David Rybach and Michael Riley (Computer Speech & Language Best Paper Award).
- Voice Search for Development, Etienne Barnard, Johan Schalkwyk, Charl van Heerden and Pedro J. Moreno.
- Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models, Françoise Beaufays, Vincent Vanhoucke and Brian Strope.
- Search by Voice in Mandarin Chinese, Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche and Pedro J. Moreno.
- On-Demand Language Model Interpolation for Mobile Speech Input, Brandon Ballinger, Cyril Allauzen, Alexander Gruenstein, and Johan Schalkwyk.
- Building Transcribed Speech Corpora Quickly and Cheaply for Many Languages, Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno and Mike LeBeau.
- Say What? Why Users Choose to Speak their Web Queries, Maryam Kamvar and Doug Beeferman.
- Study on Interaction between Entropy Pruning and Kneser-Ney Smoothing, Ciprian Chelba, Thorsten Brants, Will Neveitt and Peng Xu.
- Decision Tree State Clustering with Word and Syllable Features, Hank Liao, Chris Alberti, Michiel Bacchiani and Olivier Siohan.
الاشتراك في:
الرسائل (Atom)