1 The Influence of NPMI and TF-IDF-Based Automatic Stopword Generation on Semantic Consistency
Hye-soo Cho(Department of Sports Science, Hanyang University ERICA) ; Eun-Hyung Cho(Korea Institute of Sports Science) ; Hong-suk Kim(Department of Sports Science, Hanyang University) ; Soo-Kyung Cho(Department of Sports Science, Hanyang University) ; Ji-Yong Park(Department of Sports Science, Hanyang University) Vol.36, No.4, pp.557-567 https://doi.org/10.24985/kjss.2025.36.4.557
Abstract

PURPOSE This study optimized stopword removal to enhance topic modeling performance. We propose an objective method combining normalized pointwise mutual information (NPMI) with median-based term frequency–inverse document frequency (TF–IDF) to automatically generate stopwords. METHODS Using text data from 443 research papers on “Taekwondo sparring,” we selected stopword candidates based on NPMI and identified 30 words with the lowest TF–IDF scores. We examined the impact of removing 1–30 stopwords on u_mass coherence scores. RESULTS The NPMI–TF–IDF method significantly improved coherence (R² = .456; p < .001). However, excessive removal led to diminishing returns, with the optimal coherence score (−11.442) achieved at 200 stopwords. In contrast, manually selected stopwords yielded a lower coherence score (−16.001). The findings indicate that integrating TF–IDF with NPMI effectively preserves meaningful words and outperforms PMI2 and PMI3 approaches. CONCLUSIONS Manual stopword selection can reduce reproducibility. Optimizing stopword removal based on domain-specific characteristics is essential. Future research should validate this method across diverse fields to establish a more generalizable standard.
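
The two quantities named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how NPMI and TF–IDF were combined, so the function names (`npmi`, `median_tfidf`), the document-level co-occurrence counting, and the toy corpus are all assumptions.

```python
import math
from collections import Counter, defaultdict
from statistics import median


def npmi(docs, w1, w2):
    """Document-level NPMI of two words; ranges from -1 to 1.

    Assumption: co-occurrence means both words appear in the same document.
    """
    n = len(docs)
    sets = [set(d) for d in docs]
    p1 = sum(w1 in s for s in sets) / n
    p2 = sum(w2 in s for s in sets) / n
    p12 = sum(w1 in s and w2 in s for s in sets) / n
    if p12 == 0:
        return -1.0  # the words never co-occur
    if p12 == 1:
        return 0.0  # both words are in every document; NPMI is undefined, use 0
    pmi = math.log(p12 / (p1 * p2))
    return pmi / -math.log(p12)


def median_tfidf(docs):
    """Median TF-IDF of each term over the documents that contain it."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = defaultdict(list)
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n / df[term])
            scores[term].append(tf * idf)
    return {term: median(vals) for term, vals in scores.items()}


# Hypothetical toy corpus, tokenized in advance.
docs = [
    ["taekwondo", "sparring", "study"],
    ["taekwondo", "kick", "study"],
    ["taekwondo", "score", "analysis"],
]
tfidf = median_tfidf(docs)
# Terms with the lowest median TF-IDF are the stopword candidates.
candidates = sorted(tfidf, key=tfidf.get)[:2]
```

In this toy corpus "taekwondo" appears in every document, so its IDF and therefore its median TF–IDF are zero, which places it first among the candidates.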


2 Development of a Machine Learning-Based System for Classifying and Predicting Golf Players' Playing Styles
Hong-suk Kim(Department of Sports Science, Hanyang University) ; Hye-soo Cho(Department of Sports Science, Hanyang University ERICA) ; Ji-Yong Park(Department of Sports Science, Hanyang University) ; Hyeon-su Park(Department of Sports Science, Hanyang University) Vol.36, No.4, pp.592-604 https://doi.org/10.24985/kjss.2025.36.4.592
Abstract

PURPOSE This study sought to classify the playing styles of KPGA players based on performance-related technical factors and develop a supervised learning model that automatically predicts and classifies these styles. METHODS Performance data were gathered from KPGA Korean Tour players between 2015 and 2024, focusing on six key technical indicators. Distinct playing styles were identified by standardizing the variables using z-scores and then clustering them using the K-means algorithm. Based on the clustering results, predictive classification models were built by applying five supervised learning algorithms: decision tree, random forest, K-nearest neighbors (KNN), support vector machine (SVM), and multinomial logistic regression. Model performance was then evaluated using accuracy, precision, recall, and F1-score, with generalizability assessed via five-fold cross-validation. RESULTS Four playing style clusters were obtained, each labeled according to players’ technical characteristics: “overall weakness type,” “distance-deficient but technically proficient type,” “accuracy-oriented type,” and “power and risk-management type.” The multinomial logistic regression model showed the highest predictive performance, followed by SVM, KNN, random forest, and decision tree. CONCLUSIONS This study confirmed that KPGA players can be characterized into four distinct playing styles based on their technical performance data and that these styles can be effectively classified and predicted by supervised learning models. These findings highlight the models’ practical applicability in personalizing training strategies, developing course-specific game plans, and contributing to the advancement of AI-based sports analytics systems.
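
The standardization step and the four evaluation metrics mentioned in the abstract can be sketched in plain Python. This is an illustrative sketch only: the abstract does not say how precision, recall, and F1 were averaged across the four classes, so macro averaging is an assumption here, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1.

    A constant column has zero spread, so its values are mapped to 0.0.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        [(x - m) / s if s else 0.0 for x, m, s in zip(row, mus, sds)]
        for row in rows
    ]


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": mean(precisions),
        "recall": mean(recalls),
        "f1": mean(f1s),
    }


# Hypothetical indicator values for three players (two indicators each).
standardized = zscore_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])

# Hypothetical cluster labels versus a classifier's predictions.
scores = macro_scores([0, 0, 1, 1], [0, 1, 1, 1])
```

Standardizing before K-means matters because the algorithm relies on Euclidean distance, so an indicator measured on a larger scale would otherwise dominate the clustering.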

