MULTI-LAYER PERCEPTRON BASED TRANSFER PASSENGER FLOW PREDICTION IN ISTANBUL TRANSPORTATION SYSTEM

Predicting passenger movement in transportation networks is a critical aspect of public transportation systems. It allows for a better understanding of traffic patterns, as well as efficient system evaluation and monitoring. It can also support a timely response to emergencies or important events, and help to remedy weaknesses in the urban transport system and improve service quality. In this study, transfer passenger demand in Istanbul, Turkey's biggest and most developed metropolis, was used to construct a real-world forecasting model. The number of transfer passengers has been forecasted using popular machine learning methods: kNN (k-Nearest Neighbours), LR (Linear Regression), RF (Random Forest), SVM (Support Vector Machine), XGBoost and MLP (Multi-Layer Perceptron). The dataset is made up of hourly passenger transfer counts gathered at two public transportation transfer stations in Istanbul in January 2020. Each model's experimental results have been evaluated using the MSE, RMSE, MAE and R² metrics. According to the experimental results, MLP outperformed the other machine learning algorithms on the majority of transportation lines.


Introduction
Istanbul, which straddles the Bosporus and lies in both Europe and Asia, had a population of over 15 million people in 2020, accounting for 20 percent of Turkey's total population (TUIK, 2021). According to world demographics data, Istanbul is the most crowded city in Europe and the world's fifteenth most densely populated metropolis (Statista, 2020). As a result of this high population density, the number of passengers using public transportation is very high (Pamucar et al., 2020). Nearly 11.5 million people use public transportation in Istanbul every day; highway transportation (metrobus, public urban transportation, private bus, etc.) accounts for nearly 84% of passengers, followed by railway transportation (metro, light metro, tram, etc.) at around 14% and sea transportation at just under 2% (IETT, 2021). Alongside the constant increase in the number of people and vehicles, the number of cars per thousand people is also constantly rising, which indicates the growing traffic density in Istanbul. According to the traffic index published by TomTom, Europe's largest navigation systems company, Istanbul ranks fifth in the world with a traffic density of 51% (TomTom Traffic Index, 2021). With the growth of urban transportation challenges, forecasting the number of people entering and departing Istanbul's transit terminals has become more difficult. Passenger flow forecasting provides a better understanding of travel patterns and enables efficient monitoring and evaluation of the status of the Istanbul transportation system. It may also help in the prompt response to crises or special events, as well as in correcting defects and enhancing public transportation service quality.
Several forecasting methodologies have been proposed to enhance the effectiveness of passenger forecasting models, encompassing mathematical modelling methods, statistical methods, and non-parametric methods. The machine learning (ML) framework is one of the most well-known non-parametric approaches today (Boukerche and Wang, 2020). It is a subfield of AI that relates the problem of learning from data samples to the general concept of reasoning (Mitchell, 2006; Boukerche and Wang, 2020). There are two stages to any learning process: (i) given a dataset, estimating the unknown relationships in a system, and (ii) using the estimated relationships to predict new system outputs. Machine learning has also been shown to be an interesting topic of study in passenger demand prediction, with several applications (Gummadi and Edara, 2018; Zheng et al., 2021; Wang et al., 2021; Hayadi et al., 2021; Kamandanipour et al., 2022; Müller-Hannemann et al., 2022). The ability to predict passenger traffic in transportation networks is critical to public transportation management. It helps to improve transportation services, provide early warnings for unusual traffic situations, and make cities smarter and safer. Furthermore, transfer passenger flow prediction can improve transfer operation efficiency, reduce transfer waiting time and enhance passenger satisfaction. To address this problem, a prediction model for the flow of passengers transferring between various transportation modes (metro & tram, bus & metrobus, rail, and ferries & sea-bus) in Istanbul has been developed for the first time in the literature.
The following are some of the study's contributions:
I. This paper offers a clear theoretical foundation and decision support for the practical work of using intelligent technologies to improve the prediction of the number of passengers moving between different modes, including "metro and tram," "bus and metrobus," "rail," and "ferries and sea-bus."
II. As Istanbul has very heavy traffic, the number of lines can be increased or decreased according to the number of passengers. An accurate estimate of transfer passenger volume is fundamental to transportation scheduling in Istanbul.
III. This enhances the service standards of the urban public transportation system and provides passengers with real-time transfer passenger demand information across several routes, allowing them to make better travel decisions.
IV. Predicting transfer passenger flow assists the Istanbul IETT authorities and management in increasing the reliability of the public transit system, improving passenger experience, and optimizing routing plans.
The motivation of this paper is the prediction of the number of passengers transferring on various lines in Istanbul, recorded at 1-hour intervals between 1 and 31 January 2020. Because Istanbul is Turkey's biggest and most developed metropolis, the dataset used comprises passenger transfer numbers on several transportation lines in Istanbul. The number of transfer passengers was determined from passenger data gathered at one-hour intervals. The Istanbul Public Urban Transport Company (IETT), Private Public Bus (ÖHO), motor/boat, and IETT tunnel lines have been subjected to empirical investigation. In this way, hourly, daily, and weekly time patterns have been revealed on certain lines. The goal of this research is to use machine learning techniques, namely the kNN, LR, RF, SVM, XGBoost and MLP methods, to predict the number of transfer passengers on these transportation lines.

Literature Review
With the development of big data technology, using machine learning algorithms to uncover the principles of urban passenger movement has become one of the research hotspots in the field of public transportation. In recent decades, there has been a large body of work on passenger flow forecasting using statistical approaches and, notably, machine learning. Xie et al. (2014) employed a combination of Seasonal Decomposition (SD) and Least Squares Support Vector Regression (LSSVR) to forecast short-term air passenger volume. Sun et al. (2015) proposed a hybrid Wavelet and Support Vector Machine (SVM) method consisting of three stages to predict the number of people entering and leaving the subway in Beijing. Roos et al. (2017) proposed a forecasting technique based on a dynamic Bayesian Network (BN), built to function even when passenger flow data is missing or uncertain. Ni et al. (2017) Gummadi and Edara (2018) employed ARIMA and seasonal ARIMA to estimate short-term bus passenger flow in India's transport industry. Ye et al. (2019) aimed to predict daily bus passenger traffic using the ARIMA method and examined the predictions for complete weekday non-peak data collected from January to March 2018. Li et al. (2020) predicted shared passenger demand in various locations with a hybrid algorithm based on WT-FCBF-LSTM (Wavelet Transform, Fast Correlation-based Filter, and Long Short-term Memory). Another study focused on a short-term estimation model for local bus passenger flow using SVM. Hayadi et al. (2021) proposed a Random Forest (RF) model using location data from GPS devices in the buses, the locations of the bus stops used for operation management, and the volume of traffic estimated by an image processing method. Li et al. (2021) adopted seasonal ARIMA and SVM to predict the periodic flow of railway passengers. Guo et al. (2021) proposed a regression tree combined with copula-based simulations, using passenger-level data to generate real-time distributional estimates of travel in an airport. Rajendran et al. (2021) 2022) analysed the use of traditional LR analysis and an RF model to predict future passenger occupancy on a bus when it reaches the next stops, using real-time bus operation data and meteorological data. Reitmann and Schultz (2022) employed the gradient boosting (XGBoost) algorithm and the point-of-interest (POI) model, which helps to reduce the total training time of the passenger flow forecast model, to forecast bus passenger flow in Beijing. Comparisons of these models are listed in detail in Table 1.

Machine Learning-Based Passenger Flow Prediction
Thanks to the growth of big data, the Internet of Things, sensor networks, and cloud computing applications, the amount of real-time data produced by urban transportation systems is also expanding. Passenger flow forecasting in urban transportation networks is critical in areas such as safety management, emergency response efficiency, and urban traffic management. Passenger flow planning is important for concerns including scheduling, traffic planning, and passenger flow control. The goal of this research is to predict the number of transfer passengers in Istanbul, Turkey's largest and most developed metropolis, using passenger flow data. The dataset comprises passenger numbers by boarding type (transfer and normal boarding) on various transportation lines in Istanbul, recorded for one month between January 1, 2020 and January 31, 2020.
The objective of this research is to use machine learning algorithms to forecast the number of transfer passengers on transportation lines. In practice, kNN (k-Nearest Neighbors), LR (Linear Regression), RF (Random Forest), SVM (Support Vector Machine), XGBoost (eXtreme Gradient Boosting), and MLP (Multi-layer Perceptron) have been applied, and each model's experimental findings have then been evaluated using the MSE, RMSE, MAE, and R² metrics.
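The four evaluation metrics named above follow their standard definitions. The sketch below is illustrative only: the function name and the sample counts are hypothetical and are not taken from the study's code or data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observed series is constant.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}

observed = [120, 150, 180, 160]    # hypothetical hourly transfer counts
predicted = [110, 155, 170, 165]   # hypothetical model outputs
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MSE, RMSE and MAE indicate smaller errors, while an R² closer to 1 indicates that the model explains more of the variance in the observed counts.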

Original Data Analysis
In this study, a dataset of passenger numbers by boarding type (transfer and normal boarding) on different transportation lines in Istanbul, recorded at 1-hour intervals between 1 and 31 January 2020 by the Istanbul Metropolitan Municipality, has been used. The dataset consists of 23,163 rows of transportation data. It contains the id, date_time, transport_type_id, transport_type_desc, line, transfer_type_id, transfer_type and number_of_passenger fields.
In this study, IETT, ÖHO, motor/boat and IETT tunnel transfer lines have been selected for prediction because they have the highest number of transfer passengers. IETT transfer line refers to all bus lines offered by the Istanbul Metropolitan Municipality. ÖHO transfer line refers to all bus lines offered by private public bus companies. Motor/boat, on the other hand, refers to all sea transportation made by marine vehicles. IETT tunnel refers to all transfers made using the underground metro. Table 2 shows the first 10 rows of the dataset used as an example.
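As an illustration of how hourly transfer series per line could be extracted from a table with these columns, consider the minimal sketch below. Only the column names come from the dataset description; the sample rows, the "Transfer" label and the function name are hypothetical assumptions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; only the column names come from the paper.
SAMPLE = """id,date_time,transport_type_id,transport_type_desc,line,transfer_type_id,transfer_type,number_of_passenger
1,2020-01-01 08:00:00,1,Highway,IETT,1,Transfer,420
2,2020-01-01 08:00:00,1,Highway,IETT,2,Normal,900
3,2020-01-01 08:00:00,1,Highway,ÖHO,1,Transfer,130
4,2020-01-01 09:00:00,1,Highway,IETT,1,Transfer,510
"""

def hourly_transfer_series(csv_text, transfer_label="Transfer"):
    """Map each line name to a time-ordered list of (date_time, passengers)."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["transfer_type"] != transfer_label:
            continue  # keep transfer boardings only
        totals[row["line"]][row["date_time"]] += int(row["number_of_passenger"])
    # ISO-style timestamps sort chronologically as plain strings.
    return {line: sorted(by_hour.items()) for line, by_hour in totals.items()}

series = hourly_transfer_series(SAMPLE)
print(series["IETT"])
```

Each resulting list is a univariate hourly time series for one transfer line, which is the form assumed by the forecasting steps that follow.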

Methodology
In this study, popular machine learning algorithms commonly used in the literature, namely kNN, LR, RF, SVM, XGBoost and MLP, have been applied. The dataset has been pre-processed before being passed to the models, and possible blank or incorrect fields in the data have been checked. After the data pre-processing step, the training, validation, and test datasets have been selected. 80% of the dataset has been used for training and 20% for testing. 10% of the training data has been set aside for validation. The validation data has been used for the optimization of model parameters.
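The split described above can be sketched as follows. This is a minimal illustration that assumes the split is chronological (no shuffling) and that min-max normalization is fitted on the training part only; neither detail is stated in the text, and the function names are hypothetical.

```python
def chronological_split(series, test_ratio=0.2, val_ratio=0.1):
    """Split a time-ordered list into train, validation and test parts."""
    n_test = int(round(len(series) * test_ratio))
    cut = len(series) - n_test
    train_full, test = series[:cut], series[cut:]
    n_val = int(round(len(train_full) * val_ratio))
    val_cut = len(train_full) - n_val
    return train_full[:val_cut], train_full[val_cut:], test

def fit_min_max(train):
    """Return a scaling function fitted on the training values only."""
    lo, hi = min(train), max(train)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    return lambda x: (x - lo) / span

counts = list(range(100))            # hypothetical hourly counts
train, val, test = chronological_split(counts)
scale = fit_min_max(train)
scaled_test = [scale(x) for x in test]
print(len(train), len(val), len(test))
```

Fitting the scaler on the training part alone keeps information from the validation and test periods out of the model inputs; test values may therefore fall slightly outside the [0, 1] range.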
Time series data refers to a series of numbers ordered according to a time index. In supervised learning problems, the aim is to estimate the output from the inputs by using a function such as y = f(x). Time series data can be transformed into a supervised learning problem by using the values from the previous time steps to predict the value in the next time step. In this study, the time series data has been converted into a supervised learning problem by using the sliding window method, as seen in Figure 1. The number of previous time steps determines the size of the sliding window. In this study, the size of the sliding window has been set to 3 as a result of the experimental studies.
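The sliding window transformation can be illustrated with a short sketch. The function name and the sample values are hypothetical; only the window size of 3 comes from the text.

```python
def make_windows(series, window=3):
    """Turn a series into (inputs, targets) pairs for supervised learning.

    Each input holds `window` consecutive values and the target is the
    value that immediately follows them.
    """
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

X, y = make_windows([10, 20, 30, 40, 50], window=3)
print(X)  # [[10, 20, 30], [20, 30, 40]]
print(y)  # [40, 50]
```

A series of length n yields n - 3 training pairs with a window of size 3, so the first three hourly observations serve only as inputs.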
In order to optimize the parameters of the machine learning algorithms used, 10% of the training data has been used for validation. By using the optimized parameters, the algorithms have been applied and prediction values have been obtained. The pseudo code of the developed system is presented below:
Input: Passenger transfer data on IETT, ÖHO, motor/boat and IETT tunnel lines
Output: Predicted passenger numbers
1: Start.
2: Checking the missing and incorrect fields in the data (data pre-processing).
3: Splitting training, validation and test sets and normalizing the data.
4: Optimizing model parameters using validation data.
5: Walk forward validation.
6: Have the parameters with the lowest MSE value been selected? If yes go to step 7, if no go to step 4.
7: Creation of the model.
8: Making predictions using the created model.
9: Calculation of MSE, RMSE, MAE and R² values according to the prediction results.
10: Finish.
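Steps 4 to 6 of the procedure above (parameter search with walk-forward validation and selection by lowest MSE) can be sketched as follows. A moving-average predictor stands in for the actual models purely for illustration; the predictor, the candidate parameter values and the sample data are all hypothetical.

```python
def walk_forward_mse(train, holdout, predict):
    """Score one-step-ahead forecasts, revealing each true value in turn."""
    history = list(train)
    squared_errors = []
    for actual in holdout:
        forecast = predict(history)
        squared_errors.append((actual - forecast) ** 2)
        history.append(actual)  # the true value becomes available afterwards
    return sum(squared_errors) / len(squared_errors)

def moving_average(k):
    """Stand-in model: forecast the mean of the last k observations."""
    return lambda history: sum(history[-k:]) / k

train = [100, 120, 110, 130, 125, 140]   # hypothetical training counts
validation = [135, 150, 145]             # hypothetical validation counts

# Evaluate every candidate parameter and keep the one with the lowest MSE.
scores = {k: walk_forward_mse(train, validation, moving_average(k))
          for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
print(best_k, scores[best_k])
```

In walk-forward validation the model never sees a value before it has been forecast, which mirrors how an hourly prediction system would operate in practice.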

Developed Model
In this study, a comparative analysis of the developed MLP-based model against popular machine learning algorithms is presented for the passenger number estimation problem. MLP is a neural network model inspired by the structure of neurons in the brain. An MLP is a combination of perceptrons that are connected in different ways and operate with different activation functions. It consists of input nodes, hidden nodes and output nodes. Input nodes provide input information to the network. No computation is performed in any of the input nodes; they only relay information to the hidden nodes. Hidden nodes are not directly connected to the outside world; they perform calculations and transmit information from the input nodes to the output nodes. A hidden layer is a collection of hidden nodes. While a network has only a single input layer and a single output layer, it can have zero or multiple hidden layers. An MLP has one or more hidden layers. Output nodes, on the other hand, are responsible for processing information and transferring it from the network to the outside world.
The developed MLP model takes the passenger flow data in the training dataset as input and predicts the passenger numbers in the test dataset. The training process has been continued according to the obtained results. The architecture of the developed model is shown in Figure 2. It has an input layer, three hidden layers and an output layer. Hidden layers represent an intermediate processing step whose outputs are combined using weighted sums to obtain the prediction result. The developed model is a sequential model with linear layers. There is a dropout layer between the input layer and the hidden layers. In the output layer, there are two output units that return the prediction. The ReLU activation function is used in the input layer and hidden layers, and the sigmoid activation function is used in the output layer.
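The forward pass of a network with this general shape (a ReLU input layer, dropout, three ReLU hidden layers and a sigmoid output) can be sketched in plain Python. This is not the authors' implementation: the layer widths, the dropout rate, the random initialization and the single output unit used here are illustrative assumptions, and training is omitted.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def dense(v, weights, biases):
    """Fully connected layer; `weights` has one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dropout(v, rate, rng, training):
    """Inverted dropout: active only during training."""
    if not training or rate == 0.0:
        return list(v)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in v]

def forward(window, layers, rng, rate=0.2, training=False):
    (w_in, b_in), *hidden, (w_out, b_out) = layers
    h = relu(dense(window, w_in, b_in))      # input layer with ReLU
    h = dropout(h, rate, rng, training)      # dropout after the input layer
    for w, b in hidden:                      # three hidden layers with ReLU
        h = relu(dense(h, w, b))
    return sigmoid(dense(h, w_out, b_out))   # sigmoid output in (0, 1)

rng = random.Random(0)
sizes = [3, 16, 16, 16, 16, 1]  # hypothetical widths; 3 matches the window size
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
prediction = forward([0.2, 0.4, 0.3], layers, rng)
print(prediction)
```

Because the sigmoid output lies between 0 and 1, a design like this presupposes that the passenger counts are normalized to that range before training and rescaled afterwards.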

Experimental Results
In this study, one month of passenger numbers by boarding type (transfer and normal boarding) on different transportation lines in Istanbul, recorded at 1-hour intervals in January 2020, has been used. The IETT, ÖHO, motor/boat and IETT tunnel transfer lines, which have the highest transfer numbers, have been selected for prediction. The kNN, LR, RF, SVM, XGBoost and MLP algorithms, which are widely used in the literature, have been applied to the dataset. For each algorithm, the experimental results obtained using the MSE, RMSE, MAE and R² metrics have been compared.

The IETT transfer line covers all bus lines offered by the Istanbul Metropolitan Municipality. The IETT transfer line data consists of passenger flow information recorded at 687 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 6070 rows of data have been used for training and 1518 rows for testing. Figure 3 shows the change in the number of transfer passengers on the IETT line over time. Table 3 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the IETT line.

The ÖHO line covers all passenger transfers offered by private public bus companies. The ÖHO transfer line data consists of passenger flow information recorded at 716 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 572 rows of data have been used for training and 144 rows for testing. Figure 4 shows the change in the number of transfer passengers on the ÖHO line over time. Table 4 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the ÖHO line.

The motor/boat transfer line refers to all transfers made by vessels that provide sea transportation. The motor/boat transfer line data consists of passenger flow information recorded at 618 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 494 rows of data have been used for training and 124 rows for testing. Figure 5 shows the change in the number of transfer passengers on the motor/boat line over time. Table 5 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the motor/boat line. The MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 57453.940, 48366.810, 55962.547, 34556.885, 45494.010 and 30629.115, respectively. The RMSE values are 239.695, 219.924, 236.565, 185.894, 213.293 and 175.012, respectively. The MAE values are 160.711, 168.390, 159.844, 136.343, 144.571 and 125.101, respectively. The R² values are 0.884, 0.903, 0.887, 0.930, 0.907 and 0.938, respectively.

The IETT tunnel transfer line refers to all transfers made using the underground metro. The IETT tunnel transfer line data consists of passenger flow information recorded at 502 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 401 rows of data have been used for training and 101 rows for testing. Figure 6 shows the change in the number of transfer passengers on the IETT tunnel line over time. Table 6 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the IETT tunnel line. The MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 1909.902, 2619.355, 2070.691, 2336.181, 2832.245 and 1904…, respectively.

The predictions of the MLP-based model are shown for the IETT line in Figure 7.a, the ÖHO line in Figure 7.b, the motor/boat line in Figure 7.c and the IETT tunnel line in Figure 7.d. As can be seen in Figure 7, the MLP-based model successfully predicted the patterns in the training and test data.
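As a consistency check on the motor/boat figures reported in this section, the sketch below verifies that each RMSE is the square root of the corresponding MSE and ranks the algorithms by MSE. It assumes that the six values in each list are ordered as kNN, LR, RF, SVM, XGBoost and MLP; that column order is an assumption made for illustration.

```python
import math

# Assumed column order for the six reported values.
ALGORITHMS = ["kNN", "LR", "RF", "SVM", "XGBoost", "MLP"]
MSE = [57453.940, 48366.810, 55962.547, 34556.885, 45494.010, 30629.115]
RMSE = [239.695, 219.924, 236.565, 185.894, 213.293, 175.012]

# Largest difference between sqrt(MSE) and the reported RMSE.
max_gap = max(abs(math.sqrt(m) - r) for m, r in zip(MSE, RMSE))

# Algorithms ordered from lowest (best) to highest MSE.
ranking = [name for _, name in sorted(zip(MSE, ALGORITHMS))]

print(f"largest |sqrt(MSE) - RMSE| = {max_gap:.4f}")
print(ranking)
```

Under that assumption, the reported RMSE values agree with the square roots of the MSE values to within rounding, and the ranking by MSE is MLP, SVM, XGBoost, LR, RF, kNN, which matches the ordering given for the motor/boat line in the conclusions.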

Conclusions and Future Studies
In this study, a comparative analysis of popular machine learning algorithms, namely kNN, LR, RF, SVM, XGBoost and MLP, for passenger flow prediction is presented. The experimental results for the IETT, ÖHO, motor/boat and IETT tunnel lines have been evaluated using MSE, RMSE, MAE and R².
For the IETT line, MLP was more successful than the other models compared, followed by SVM, kNN, RF, XGBoost and LR, in that order. For the ÖHO line, MLP was again the most successful model, followed by RF, kNN, SVM, LR and XGBoost. For the motor/boat line, MLP was the most successful model, followed by SVM, XGBoost, LR, RF and kNN. For the IETT tunnel line, MLP was the most successful model, followed by kNN, RF, SVM, XGBoost and LR.
Experimental results show that these machine learning methods can be used in passenger flow prediction problems. Among the compared algorithms, MLP achieved successful results on all of the transportation lines. MLP is a neural network model developed based on biological neural network structures. It consists of interconnected processing units, similar to the functioning of neurons. MLP's ability to model both linearly and non-linearly distributed data makes it perform well on most datasets. XGBoost is a machine learning model based on decision trees and the gradient boosting framework. It typically works best on structured (tabular) data rather than on unstructured data such as images, text and audio. kNN may be inefficient in terms of performance on small datasets. SVM is successful when only a limited set of points is available. SVM is robust to outliers, as it uses only the most relevant points to find the support vectors. For this reason, SVM has successful results in this study. LR is expected to be successful when the dataset is truly linear, especially when there are many features with a very low signal-to-noise ratio. However, RF may fail to model linear combinations of many features.
All methods compared in this study had successful results. All methods had R² values above 0.90 for the IETT line, above 0.94 for the ÖHO line, above 0.88 for the motor/boat line, and above 0.84 for the IETT tunnel line. Experimental results showed that the developed MLP-based model gives better results than the compared models for all transfer lines used in the prediction of the number of passengers. The prediction of the number of passengers is an important factor for urbanization and city management. Transportation planning is also important in terms of avoiding disruptions in transportation and reducing the traffic load. The developed model can be applied to real-world problems in the field of transportation planning through effective passenger prediction. In future studies, longer-term predictions can be made using passenger data over a larger time period. In addition, the results can be evaluated by applying different models such as deep learning.
In this study, traditional machine learning methods and MLP, which is a neural network-based model, are compared in practice. The aim is to benefit from the prominent features of neural networks in the time series prediction problem. The ability of a neural network to process data in detail stems from its ability to reveal hidden patterns between input and output data. An important advantage of neural networks is their ability to learn and generalize information. MLP is tolerant of missing values and can model complex relationships such as nonlinear trends. It can also support multiple inputs.
One of the important limitations of this study is that it only considers the prediction of transfer passenger volume. For this reason, external factors such as transfer time, rush hours and holidays could be examined in the passenger prediction model in the future. Secondly, ML algorithms such as kNN, LR, RF, SVM, XGBoost and MLP were employed for short-term prediction. In further studies, a state-of-the-art deep neural network algorithm could be developed to improve the prediction results for the number of transferring passengers.
Author Contributions: Conceptualization, Software, Methodology, Validation, Writing, Visualization, Editing, A.U.; Review, Writing, Original draft preparation, Resources, Editing, S.K.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.