Exploring the Impact of Synthetic Data Generation on Texture-based Image Classification Tasks

Borislav Yordanov, Carlo Harvey, Ian Williams, Craig Ashley, Paul Fairbrass

    Research output: Contribution to journal › Article › peer-review


    In this study, we introduce a novel pipeline for synthetic data generation of textured surfaces, motivated by the limitations of conventional methods such as Generative Adversarial Networks (GANs) and Computer-Aided Design (CAD) models in our specific context. We also investigate the pipeline's role in an image classification task. The primary objective is to determine the impact of synthetic data generated by our pipeline on classification performance. Using EfficientNetV2-S as our image classifier and a dataset of three texture classes, we find that synthetic data can significantly enhance classification performance when the amount of real data is scarce, corroborating previous research. However, we also observe that the balance between synthetic and real data is crucial, as excessive synthetic data can negatively impact performance when sufficient real data is available. We theorize that this might stem from imperfections in the synthetic data generation process that distort fine details essential for accurate classification, and propose possible improvements to the synthetic data generation pipeline. Furthermore, we acknowledge the potential limitations of our study and provide several promising avenues for future research. This work illuminates the advantages and potential drawbacks of synthetic data in image classification tasks, emphasizing the importance of high-quality, realistic synthetic data that complements, rather than undermines, the use of real data.
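The balance between synthetic and real data described above can be made concrete with a small sketch. It is not the authors' pipeline: the function name `mix_training_set`, the `synthetic_fraction` parameter, and the choice to keep every real sample are all assumptions made for illustration. It only shows one way to fix the share of synthetic samples in a training set so that the effect of different ratios could be compared.

```python
import random


def mix_training_set(real, synthetic, synthetic_fraction, seed=0):
    """Build a shuffled training list with a target share of synthetic samples.

    Hypothetical helper (not from the paper): keeps all real samples and
    draws just enough synthetic ones so that they make up roughly
    `synthetic_fraction` of the result, capped by how many are available.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")

    rng = random.Random(seed)

    # Solve n_s / (n_r + n_s) = f for n_s, given n_r real samples.
    n_synth = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    n_synth = min(n_synth, len(synthetic))

    chosen = rng.sample(synthetic, n_synth)

    # Tag each sample with its origin so the ratio can be audited later.
    mixed = [(x, "real") for x in real] + [(x, "synthetic") for x in chosen]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    real_images = [f"real_{i}.png" for i in range(100)]        # placeholder file names
    synthetic_images = [f"synth_{i}.png" for i in range(300)]  # placeholder file names
    train = mix_training_set(real_images, synthetic_images, synthetic_fraction=0.5)
    print(len(train))
```

Sweeping `synthetic_fraction` over several values while holding the real subset fixed is one simple way to probe the trade-off the abstract reports; the classifier and the evaluation protocol are left out here because the abstract does not specify them.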


    • Synthetic Data
    • Image Classification
    • Textured Surface

