Incentivizing Collaboration in Machine Learning via Synthetic Data Rewards

Tay, Sebastian Shenghong; Xu, Xinyi; Foo, Chuan Sheng; Low, Bryan Kian Hsiang


arXiv:2112.09327 (cs)

[Submitted on 17 Dec 2021]



Abstract: This paper presents a novel collaborative generative modeling (CGM) framework that incentivizes self-interested parties to contribute data to a pool for training a generative model (e.g., a GAN), from which synthetic data are drawn and distributed to the parties as rewards commensurate with their contributions. Distributing synthetic data as rewards (instead of trained models or money) offers task- and model-agnostic benefits for downstream learning tasks and is less likely to violate data privacy regulations. To realize the framework, we first propose a data valuation function based on maximum mean discrepancy (MMD) that values data by its quantity and by its quality, measured as closeness to the true data distribution, and we provide theoretical results guiding the kernel choice in this MMD-based valuation function. We then formulate the reward scheme as a linear optimization problem that, when solved, guarantees certain incentives such as fairness in the CGM framework. We devise a weighted sampling algorithm for generating the synthetic data distributed to each party as its reward, such that the combined value of the party's own data and its synthetic data matches the reward value assigned to it by the reward scheme. We empirically show, using simulated and real-world datasets, that the parties' synthetic data rewards are commensurate with their contributions.
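
As a rough illustration of the quantity underlying the valuation function, the sketch below computes the standard biased estimate of squared MMD between two samples. It is not the paper's valuation function, whose exact form is not given in the abstract. The squared-exponential kernel, the `lengthscale` parameter, and the function names `rbf_kernel`, `mean_kernel`, and `mmd_squared` are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two equal-length vectors.

    The kernel family and lengthscale are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def mean_kernel(xs, ys, lengthscale=1.0):
    """Average kernel value over all pairs (x, y), x from xs and y from ys."""
    total = sum(rbf_kernel(x, y, lengthscale) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, lengthscale=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, lengthscale)
        - 2.0 * mean_kernel(xs, ys, lengthscale)
        + mean_kernel(ys, ys, lengthscale)
    )


# Hypothetical one-dimensional data: a reference sample, a sample close to it,
# and a sample far from it.
reference = [(0.0,), (1.0,), (2.0,)]
near = [(0.1,), (0.9,), (2.1,)]
far = [(5.0,), (6.0,), (7.0,)]

near_gap = mmd_squared(near, reference)
far_gap = mmd_squared(far, reference)
```

In this toy setup, `near_gap` comes out smaller than `far_gap`, which matches the intuition that data closer to the reference distribution would be valued more highly. How the paper accounts for data quantity and selects the kernel is described in the full text, not here.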

Comments:	36th AAAI Conference on Artificial Intelligence (AAAI 2022), Extended version with derivations, 42 pages
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2112.09327 [cs.LG]
	(or arXiv:2112.09327v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2112.09327

Submission history

From: Sebastian Tay Shenghong
[v1] Fri, 17 Dec 2021 05:15:30 UTC (21,243 KB)
