Microsoft has announced it will use customer data from Copilot, Bing, and Microsoft Start (MSN) to train its generative AI models for Copilot.
The company believes that "real-world consumer interactions provide greater breadth and diversity in training data" to create more "inclusive, relevant products" and improve the user experience, according to the announcement.
Microsoft says the AI models could learn from aggregated user ratings to give better answers in the future. The company also plans to incorporate colloquial expressions and local references from Copilot conversations. Advertising data will help the models learn which ads are most effective.
Microsoft apparently plans forced opt-in
While Microsoft states twice in the announcement that it will "always ask first," the company appears to be planning an opt-out system rather than the opt-in system that a true "ask first" approach would require. In effect, user data will be used for training by default unless the user objects. For Microsoft, that apparently counts as "asking first."
"We will also make it simple for consumers to opt-out of their data being used for training, with clear notices displayed in Copilot, Bing, and Microsoft Start," the company writes. The opt-out controls will be available in October, and AI training will not begin until at least 15 days after users are notified of these controls.
This means users who don't want their data used for AI training will have to proactively opt out. Meta tried a similar approach with Meta AI, but faced data protection issues in the EU. That's likely why, for now, Microsoft won't be using consumer data from the European Economic Area (EEA) for Copilot training.
Microsoft has stated that it will comply with certain privacy obligations when training the models. Identifying information such as names or addresses will be removed before training, and the data will remain private and not be shared. The company says nothing will change for its commercial customers in terms of data management.