Forecasting daily customer flow in restaurants: a multifactor machine learning approach

dc.contributor.author: Shah, Himat
dc.contributor.author: Myller, Niko
dc.contributor.author: Islas Sedano, Carolina
dc.contributor.organization: fi=ohjelmistotekniikka|en=Software Engineering|
dc.contributor.organization-code: 1.2.246.10.2458963.20.71310837563
dc.converis.publication-id: 508199286
dc.converis.url: https://research.utu.fi/converis/portal/Publication/508199286
dc.date.accessioned: 2026-01-21T14:48:14Z
dc.date.available: 2026-01-21T14:48:14Z
dc.description.abstract: This paper presents a case study on predicting daily customer flow in a university's self-service restaurant. We systematically compare multiple machine learning techniques and feature sets to identify the best-performing model within our experimental scope, with the aim of improving forecasting accuracy in this dynamic environment. We analyze real-time data collected via RFID sensors from spring 2019 to 2024. To ensure high data quality, we apply a robust preprocessing pipeline, followed by careful feature engineering that yields 10 distinct feature sets, labeled M1 to M10. These feature sets include temporal attributes such as day, month, year, and season, as well as external factors such as local weather conditions, public holidays, and menu choices. Comparing all feature sets, we identify M10 as the best-performing combination. A key finding is that handling missing data, particularly during the COVID-19 period, is one of the most critical preprocessing steps. To evaluate the predictive power of the selected features, we test several machine learning models: linear regression, random forest, extreme gradient boosting (XGBoost), and long short-term memory (LSTM). XGBoost achieves the lowest mean absolute error (MAE) and mean squared error (MSE) and outperforms the other models across all feature sets, M1 to M10. It is particularly effective when given past customer counts and exponentially smoothed values, which capture short-term customer behavior. Our analysis identifies the most influential features as the previous day's customer count, exponential smoothing outputs, holidays, day of the week, and weather data. Overall, we recommend XGBoost as the most effective model for predicting daily customer numbers in similar contexts, given its superior accuracy across diverse feature sets.
dc.format.pagerange: 168
dc.format.pagerange: 190
dc.identifier.eissn: 2771-392X
dc.identifier.olddbid: 213721
dc.identifier.oldhandle: 10024/196739
dc.identifier.uri: https://www.utupub.fi/handle/11111/55817
dc.identifier.url: https://doi.org/10.3934/aci.2025011
dc.identifier.urn: URN:NBN:fi-fe202601215900
dc.language.iso: en
dc.okm.affiliatedauthor: Myller, Niko
dc.okm.affiliatedauthor: Islas Sedano, Carolina
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: American Institute of Mathematical Sciences (AIMS)
dc.publisher.country: United States [en_GB]
dc.publisher.country: Yhdysvallat (USA) [fi_FI]
dc.publisher.country-code: US
dc.relation.doi: 10.3934/aci.2025011
dc.relation.ispartofjournal: Applied Computing and Intelligence
dc.relation.issue: 2
dc.relation.volume: 5
dc.source.identifier: https://www.utupub.fi/handle/10024/196739
dc.title: Forecasting daily customer flow in restaurants: a multifactor machine learning approach
dc.year.issued: 2025
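
The abstract names the previous day's customer count and exponential smoothing outputs among the most influential features, and MAE and MSE as the evaluation metrics. The following is a minimal sketch of how such features and metrics could be computed, not the authors' pipeline: the function names, the smoothing factor `alpha`, and the sample counts are all assumptions made for illustration, and no XGBoost model is trained here.

```python
from typing import Dict, List, Sequence


def exponential_smoothing(values: Sequence[float], alpha: float) -> List[float]:
    """Simple exponential smoothing: s[0] = x[0], s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # initialise with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def build_features(counts: Sequence[float], alpha: float = 0.5) -> List[Dict[str, float]]:
    """Build one row per day t >= 1 using only information available up to day t-1."""
    smoothed = exponential_smoothing(counts, alpha)
    rows = []
    for t in range(1, len(counts)):
        rows.append({
            "prev_day": float(counts[t - 1]),   # previous day's customer count
            "smoothed_prev": smoothed[t - 1],   # smoothed level up to the previous day
            "target": float(counts[t]),         # value to be predicted
        })
    return rows


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    daily_counts = [100, 120, 80, 110]  # hypothetical daily customer counts
    rows = build_features(daily_counts, alpha=0.5)
    targets = [r["target"] for r in rows]
    baseline = [r["smoothed_prev"] for r in rows]  # naive forecast: the smoothed level
    print("MAE:", mae(targets, baseline))
    print("MSE:", mse(targets, baseline))
```

Each row only uses values up to the previous day, so the features could be passed to any regressor without leaking the target into its own inputs.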

Files

Name: 10.3934_aci.2025011.pdf
Size: 3.72 MB
Format: Adobe Portable Document Format