A Universal Metric for Robust Evaluation of Synthetic Tabular Data

Lahoti, Mukund; Narang, Pratik

dc.contributor.author	Lahoti, Mukund
dc.contributor.author	Narang, Pratik
dc.date.accessioned	2024-09-18T09:03:10Z
dc.date.available	2024-09-18T09:03:10Z
dc.date.issued	2024-01
dc.identifier.uri	https://ieeexplore.ieee.org/document/9984938
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/15617
dc.description.abstract	Synthetic tabular data generation becomes crucial when real data are limited, expensive to collect, or simply cannot be used due to privacy concerns. However, producing good-quality synthetic data is challenging. Several probabilistic, statistical, generative adversarial network-based, and variational autoencoder-based approaches have been presented for synthetic tabular data generation. Once generated, evaluating the quality of the synthetic data is quite challenging. Some traditional metrics have been used in the literature, but there is a lack of a common, robust, and single metric. This makes it difficult to properly compare the effectiveness of different synthetic tabular data generation methods. In this article, we propose a new universal metric, TabSynDex, for the robust evaluation of synthetic data. The proposed metric assesses the similarity of synthetic data with real data through different component scores, which evaluate the characteristics that are desirable for “high-quality” synthetic data. Being a single-score metric and having an implicit bound, TabSynDex can also be used to observe and evaluate the training of neural network-based approaches. This would help in obtaining insights that were not possible earlier. We present several baseline models for comparative analysis of the proposed evaluation metric with existing generative models. We also give a comparative analysis between TabSynDex and existing synthetic tabular data evaluation metrics. This shows the effectiveness and universality of our metric over the existing metrics.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	Civil Engineering	en_US
dc.subject	Evaluation metrics	en_US
dc.subject	Generative adversarial networks (GANs)	en_US
dc.subject	Tabular data synthesis	en_US
dc.title	A Universal Metric for Robust Evaluation of Synthetic Tabular Data	en_US
dc.type	Article	en_US

This item appears in the following Collection(s)

Department of Civil Engineering [1259]
