Google CEO Sundar Pichai stood on stage at the company’s yearly developer conference on Tuesday and rolled out some of its most advanced technology: an assistant that can schedule appointments for you over the phone, customized suggestions in Google Maps, and even a new feature that can help finish your sentences as you type an email.
It’s all underpinned by the same thing: the massive trove of data that Google is collecting on billions of people every day.
Until recently, most users may have either been unaware their data was being used like this or been fine with the tradeoff. Google has seven products that each have at least 1 billion monthly active users, and they couldn’t work as well without access to users’ data.
That has helped make Google one of the world’s most well-regarded brands, according to a Morning Consult poll. But in a post-Cambridge Analytica world that is growing increasingly leery of how major tech companies track people, the data collection practices of the world’s leading digital advertising company have come under renewed scrutiny.
“Google is walking a very fine line,” David Yoffie, a professor at the Harvard Business School, said in an email. “Search, plus Android gives Google amazing insight into individual behavior. Google’s stated privacy policies seem adequate, but the question that I cannot answer is whether Google’s stated policy and actual behavior are one and the same. Facebook had a stated policy for the last three years which most of us found acceptable until Cambridge Analytica came to light.”
Where does the data come from?
The more Google products you use, the more Google can gather about you. Whether it’s Gmail, the Android smartphone operating system, YouTube, Google Drive, Google Maps or, of course, Google Search, the company is collecting gigabytes of data about you.
Google offers free access to these tools and in return shows you super-targeted advertising, which is how it made $31.2 billion in revenue in just the first three months of 2018.
The company’s data collection practices also include scanning your email to extract keyword data for use in other Google products and services and to improve its machine learning capabilities, Google spokesman Aaron Stein confirmed in an email to NBC News.
How Google collects data from Gmail users and what it uses that data for has been a particularly sensitive topic. In June 2017, Google said it would stop scanning Gmail messages in order to sell targeted ads. After this article was published, Google’s confirmation that it still collects data from the email of Gmail users drew the attention of some journalists who cover technology and digital privacy.
Google reached out to NBC to clarify that the company’s spokesperson was referring to “narrow use cases” in Gmail.
“First, since 2012, we’ve enabled people to use Google Search to find information from their Gmail accounts by answering questions like ‘When is my restaurant reservation?’” Stein, the Google spokesperson, wrote in an email. “We present customized search results containing this information if someone is signed-in and asks us for it. Second, like other email providers, our systems may also automatically process email messages to detect spam, malware and phishing patterns, to help us stop this abuse and protect people’s inboxes. We have the most secure email service because of these systems – and they are powered by machine learning technology.”
It doesn’t stop there, though. Google says it also leverages some of its datasets to “help build the next generation of ground-breaking artificial intelligence solutions.” On Tuesday, Google rolled out “Smart Compose,” in which artificial intelligence helps users finish sentences.
The extent of the information Google has can be eyebrow-raising even for technology professionals. Dylan Curran, an information technology consultant, recently downloaded everything Facebook had on him and got a 600-megabyte file. When he downloaded the same kind of file from Google, it was 5.5 gigabytes, about nine times as large. His tweets highlighting each kind of information Google had on him, and therefore on other users, got nearly 170,000 retweets.