PURPOSE This study optimized stopword removal to enhance topic modeling performance. We propose an objective method combining normalized pointwise mutual information (NPMI) with median-based term frequency–inverse document frequency (TF–IDF) to automatically generate stopwords. METHODS Using text data from 443 research papers on “Taekwondo sparring,” we selected stopword candidates based on NPMI and identified 30 words with the lowest TF–IDF scores. We examined the impact of removing 1–30 stopwords on u_mass coherence scores. RESULTS The NPMI–TF–IDF method significantly improved coherence (R² = .456; p < .001). However, excessive removal led to diminishing returns, with the optimal coherence score (−11.442) achieved at 200 stopwords. In contrast, manually selected stopwords yielded a lower coherence score (−16.001). The findings indicate that integrating TF–IDF with NPMI effectively preserves meaningful words and outperforms PMI2 and PMI3 approaches. CONCLUSIONS Manual stopword selection can reduce reproducibility. Optimizing stopword removal based on domain-specific characteristics is essential. Future research should validate this method across diverse fields to establish a more generalizable standard.
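The two quantities combined in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how NPMI candidates were selected or how the median was taken, so document-level co-occurrence, the median over documents that contain a term, and the function names `npmi` and `median_tfidf` are all assumptions made here for illustration.

```python
import math
from statistics import median


def npmi(docs, x, y):
    """Document-level NPMI of terms x and y; docs is a list of token lists."""
    n = len(docs)
    sets = [set(d) for d in docs]
    p_x = sum(x in s for s in sets) / n
    p_y = sum(y in s for s in sets) / n
    p_xy = sum(x in s and y in s for s in sets) / n
    if p_xy == 0:
        return -1.0  # conventional value when the terms never co-occur
    if p_xy == 1:
        return 0.0  # degenerate case (both terms in every document); 0 is an assumption
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)


def median_tfidf(docs):
    """Median TF-IDF per term, taken over the documents that contain the term (assumption)."""
    n = len(docs)
    df = {}
    for d in docs:
        for t in set(d):
            df[t] = df.get(t, 0) + 1
    scores = {}
    for d in docs:
        for t in set(d):
            tf = d.count(t) / len(d)      # relative term frequency in this document
            idf = math.log(n / df[t])     # unsmoothed inverse document frequency
            scores.setdefault(t, []).append(tf * idf)
    return {t: median(v) for t, v in scores.items()}
```

Under these assumptions, a term that appears in every document receives a median TF-IDF of zero, which is the kind of low-scoring word a stopword list would be expected to capture.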
PURPOSE This study sought to classify the playing styles of KPGA players based on performance-related technical factors and develop a supervised learning model that automatically predicts and classifies these styles. METHODS Performance data were gathered from KPGA Korean Tour players between 2015 and 2024, focusing on six key technical indicators. Distinct playing styles were identified by standardizing the variables using z-scores and then clustering them using the K-means algorithm. Based on the clustering results, predictive classification models were built by applying five supervised learning algorithms: decision tree, random forest, K-nearest neighbors (KNN), support vector machine (SVM), and multinomial logistic regression. Model performance was then evaluated using accuracy, precision, recall, and F1-score, with generalizability assessed via five-fold cross-validation. RESULTS Four playing style clusters were obtained, each labeled according to players’ technical characteristics: “overall weakness type,” “distance-deficient but technically proficient type,” “accuracy-oriented type,” and “power and risk-management type.” The multinomial logistic regression model showed the highest predictive performance, followed by SVM, KNN, random forest, and decision tree. CONCLUSIONS This study confirmed that KPGA players can be characterized into four distinct playing styles based on their technical performance data and that these styles can be effectively classified and predicted by supervised learning models. These findings highlight the models’ practical applicability in personalizing training strategies, developing course-specific game plans, and contributing to the advancement of AI-based sports analytics systems.
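Two of the steps named in this abstract, z-score standardization and the F1-score, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the abstract does not say whether the population or sample standard deviation was used, nor how per-class F1 values were averaged, so the population form and macro averaging are chosen here, and `zscores` and `macro_f1` are hypothetical helper names.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one variable to mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
    return mean(f1s)
```

Standardizing each indicator before clustering keeps variables measured on large scales from dominating the Euclidean distances that K-means relies on.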
[Purpose] This study evaluated the predictive power of Body Mass Index (BMI) for metabolic syndrome in older adults across pre-, during-, and post-COVID-19 periods, and examined the effects of metabolic syndrome factors on BMI by income level, aiming to inform elderly health management and crisis-related policies. [Methods] Data from 6,242 older adults (aged 65–80) were drawn from the 2019–2022 Korea National Health and Nutrition Examination Survey. Income was divided into quartiles, and time was segmented into pre-, during-, and post-pandemic periods. Multiple linear regression was used to assess the effects of metabolic syndrome factors (diabetes, abdominal obesity, low HDL, hypertension, hypertriglyceridemia) on BMI by income and period. Receiver Operating Characteristic (ROC) analysis evaluated BMI’s predictive power for metabolic syndrome. Significance was set at .05. [Results] Abdominal obesity and low HDL consistently influenced BMI across all groups. In the lowest income group, hypertension increasingly affected BMI during and after the pandemic. BMI Area Under the Curve (AUC) values peaked during the pandemic in this group, while the highest income group showed stable predictive power. [Conclusion] The COVID-19 pandemic had a differential impact on the association between BMI and metabolic syndrome among older adults according to income level. In low-income older adults, the predictive power of BMI for metabolic syndrome increased during the mid-pandemic period, while it remained stable across all periods in high-income groups. Systematic health management programs and policy interventions targeting low-income older adults are required to reduce health disparities during public health crises.
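The AUC reported in this abstract summarizes how well a single score, here BMI, separates people with and without metabolic syndrome. A minimal sketch of that quantity follows, using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the 0/1 label coding are assumptions for illustration, and the abstract does not state which software or estimator the authors used.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive case outscores a random negative case.

    scores: numeric predictor values (e.g. BMI); labels: 1 = condition present, 0 = absent.
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 means the score discriminates no better than chance and 1.0 means perfect separation, so comparing AUC values across periods and income groups shows where BMI is a more or less useful screening signal.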