Abstract:
Sport action recognition is an active research area in computer vision. Categorizing sport actions, which involve difficult and complex body postures, is regarded as a fine-grained visual classification problem. Convolutional Neural Networks (CNNs) have outperformed conventional feature descriptors in recognizing various sport activities. Although deep learning has brought clear improvements to sport action recognition, the recognition of women’s sport activities has not been widely explored. Moreover, no benchmark dataset depicting women’s sport actions with sufficient variation is yet available for study. Hence, fine-grained image classification of diverse sport categories involving female athletes requires immediate research attention. To address this gap, this paper introduces an image dataset comprising 50 globally popular sport categories featuring women players only. A simple deep learning model is also proposed that extracts high-level deep features using a backbone CNN. These features are then pooled from a collection of regular regions that represent local discriminative information. Spatial pyramid pooling is applied to mine semantic information and enhance feature aggregation for classification. The proposed method achieves satisfactory performance on the Women Sports dataset with four standard backbone CNNs. Moreover, it achieves better accuracy on the Yoga-82 pose recognition dataset by a significant margin, e.g., an 11.6% gain using a ResNet-50 backbone.
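To make the spatial pyramid pooling step concrete, the following is a minimal NumPy sketch, not the implementation used in this paper. It assumes a backbone feature map of shape (C, H, W), max pooling within each bin, and pyramid levels of 1, 2, and 4 bins per side; the function name `spatial_pyramid_pool` and all of these choices are illustrative assumptions. The backbone CNN and the regular-region pooling that precede this step are omitted.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over an n x n grid for each n in
    `levels` and concatenate the results into one fixed-length vector.

    The levels and the use of max pooling are assumptions for illustration.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so that every bin
            # is non-empty even when H or W is smaller than n.
            row_start = (i * height) // n
            row_end = -((-(i + 1) * height) // n)
            for j in range(n):
                col_start = (j * width) // n
                col_end = -((-(j + 1) * width) // n)
                region = feature_map[:, row_start:row_end, col_start:col_end]
                # One value per channel for this bin.
                pooled.append(region.max(axis=(1, 2)))
    return np.concatenate(pooled)


# Feature maps of different spatial sizes map to the same output length:
# channels * (1 + 4 + 16) = 8 * 21 = 168 for the default levels.
small = spatial_pyramid_pool(np.zeros((8, 7, 7)))
large = spatial_pyramid_pool(np.zeros((8, 14, 10)))
assert small.shape == large.shape == (168,)
```

The property this illustrates is that the output length depends only on the channel count and the pyramid levels, not on the spatial size of the input, which is what lets the aggregated descriptor feed a fixed-size classifier.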