In a bid to enhance user privacy and control, Meta has rolled out a new tool that allows Facebook users to manage the personal data that the company uses in training generative artificial intelligence (AI) models. The recent update to the Facebook help center resource section includes a dedicated form named “Generative AI Data Subject Rights.” The form lets users submit requests concerning third-party information about them that is used to train generative AI models.
This move comes at a time when generative AI technology is gaining momentum across the tech landscape, with companies actively developing advanced chatbots and tools that turn basic text inputs into intricate responses and images. Recognizing the significance of user data, Meta is now offering people the option to access, modify, or erase any personal information that might have been incorporated into the various third-party data sources the company employs to train its large language models and related AI models.
The “Generative AI Data Subject Rights” form defines third-party information as data that is publicly available on the internet or sourced from licensed providers. This kind of data, Meta states, constitutes a substantial portion of the “billions of pieces of data” harnessed to train generative AI models, which employ predictive techniques and patterns to generate novel content.
In a related blog post detailing its data usage practices for generative AI, Meta elaborates on its approach. The company explains that it collects public information from the web and also licenses data from other providers. For instance, a blog post that Meta collects might contain personal details such as the author’s name and contact information.
It’s important to note that the form doesn’t cover a user’s activity on Meta-owned platforms, such as Facebook comments or Instagram photos. This leaves open the possibility that the company could use first-party data to train its generative AI models.
A Meta spokesperson clarified that the company’s latest open-source large language model, Llama 2, was not trained on Meta user data, and that the company has yet to introduce any consumer features related to generative AI on its systems.
The spokesperson further noted that, depending on their geographic location, users might be able to exercise their data subject rights and object to specific data being used to train AI models. This refers to various data privacy regulations outside the U.S. that give consumers greater control over how tech companies can use their personal data.
Similar to its peers in the tech industry, such as Microsoft, OpenAI, and Google’s parent company Alphabet, Meta accumulates vast amounts of third-party data to train its models and related AI software.
Meta defended its data collection practices in the blog post, stating that gathering significant amounts of information from publicly available and licensed sources is necessary to develop effective models that drive advancements in the field. The company expressed its commitment to transparency regarding the legal basis for processing such data.
Recently, however, data privacy advocates have raised concerns about the practice of aggregating extensive amounts of publicly available information for AI model training.
Last week, a coalition of data protection agencies from countries including the U.K., Canada, and Switzerland issued a joint statement addressed to companies such as Meta, Alphabet, and Microsoft. The statement highlighted concerns about data scraping and the protection of user privacy. The agencies reminded these companies of their obligation to adhere to data protection and privacy laws worldwide.
The statement emphasized the role of social media and tech companies in safeguarding personal information from data scraping and enabling users to interact with their services while maintaining privacy. This underscores the growing importance of responsible data-handling practices in the tech industry.
To exercise their rights, users can follow these steps:
- Click the link labeled “Learn more and submit requests here.”
- Choose the one of the three options that best matches the issue or objection:
  - The first option lets users access, download, or correct their personal information sourced from third-party providers for generative AI model training.
  - The second option lets users delete their personal information from third-party data sources used in training.
  - The third option is for those with different concerns.
- After selecting an option, users will need to complete a security check test. Some users have encountered software bugs that prevent them from completing the form.
In conclusion, Meta’s introduction of the “Generative AI Data Subject Rights” form reflects a broader industry shift toward granting users more control over their personal data, especially in the context of AI model training. As the conversation around data privacy continues to evolve, tech companies are compelled to strike a balance between innovation and user protection.