Reddit’s Sale of User Data for AI Training Draws FTC Inquiry

Reddit stated forward of its IPO subsequent week that licensing consumer posts to Google and others for AI tasks may usher in $203 million of income over the subsequent few years. The community-driven platform was compelled to reveal Friday that US regulators have already got questions on that new line of enterprise.

In a regulatory submitting, Reddit stated that it obtained a letter from the US Federal Trade Commision on Thursday asking about “our sale, licensing, or sharing of user-generated content with third parties to train AI models.”

The FTC, the US authorities’s main antitrust regulator, has the ability to sanction firms discovered to interact in unfair or misleading commerce practices. The concept of licensing user-generated content material for AI tasks has drawn questions from lawmakers and rights teams about privateness dangers, equity, and copyright.

Reddit isn’t alone in making an attempt to make a buck off licensing information, together with that generated by customers, for AI. Programming Q&A web site Stack Overflow has signed a cope with Google, the Associated Press has signed one with OpenAI, and Tumblr proprietor Automattic has stated it’s working “with select AI companies” however will enable customers to choose out of getting their information handed alongside. None of the licensors instantly responded to requests for remark. Reddit additionally isn’t the one firm receiving an FTC letter about information licensing, Axios reported on Friday, citing an unnamed former company official.

It’s unclear whether or not the letter to Reddit is straight associated to overview into every other firms.

Reddit stated in Friday’s disclosure that it doesn’t consider that it engaged in any unfair or misleading practices however warned that coping with any authorities inquiry might be pricey and time-consuming. “The letter indicated that the FTC staff was interested in meeting with us to learn more about our plans and that the FTC intended to request information and documents from us as its inquiry continues,” the submitting says. Reddit stated the FTC letter described the scrutiny as associated to “a non-public inquiry.”

Reddit, whose 17 billion posts and feedback are seen by AI specialists as helpful for coaching chatbots within the artwork of dialog, introduced a deal final month to license the content material to Google. Reddit and Google didn’t instantly reply to requests for remark. The FTC declined to remark. (Advance Magazine Publishers, dad or mum of WIRED’s writer Condé Nast, owns a stake in Reddit.)

AI chatbots like OpenAI’s ChatGPT and Google’s Gemini are seen as a aggressive risk to Reddit, publishers, and different ad-supported, content-driven companies. In the previous 12 months the prospect of licensing information to AI builders emerged as a possible upside of generative AI for some firms.

But the usage of information harvested on-line to coach AI fashions has raised a lot of questions winding by boardrooms, courtrooms, and Congress. For Reddit and others whose information is generated by customers, these questions embrace who really owns the content material and whether or not it’s honest to license it out with out giving the creator a minimize. Security researchers have discovered that AI fashions can leak private information included within the materials used to create them. And some critics have recommended the offers may make highly effective firms much more dominant.