THE TOA ESTIMATION OF CELLULAR NETWORK SIGNALS BASED ON MACHINE LEARNING IN COMPLEX URBAN ENVIRONMENTS

ABSTRACT: Precise location-based services in complex environments remain a challenge in the field of navigation and positioning. With the continuous development of wireless communication technology in recent years, cellular network signals such as LTE and 5G have shown unique advantages in navigation and positioning applications. This paper presents a time-of-arrival (TOA) estimation method based on machine learning that uses cellular network signals to obtain accurate ranging results in low signal-to-noise ratio conditions. We first present the cellular network signals that can be applied in navigation and positioning. Then, we describe in detail the process of TOA estimation based on machine learning. Finally, we carry out vehicular experiments in an urban environment to test the performance of the proposed method. The test results demonstrate the feasibility of the proposed method, which achieves metre-level ranging accuracy.


INTRODUCTION
Location-based services (LBSs) in complex scenarios have been a key focus of scholars and research institutions in recent years. Accurate location information has irreplaceable value in areas such as autonomous driving, precision marketing and emergency rescue. Traditional location services mostly provide location information to users through a global navigation satellite system (GNSS) in open outdoor scenes. However, in complex scenarios such as cities, canyons and indoor spaces, the performance of LBSs is degraded by the fading and refraction of GNSS signals.
Acquiring high-precision navigation observations from signals of opportunity (SOPs) is one way to assist GNSS in high-precision positioning. At present, WiFi (Shu et al., 2016, Yan et al., 2018, Gao et al., 2021), Bluetooth (Chen et al., 2013, Faragher and Harle, 2015, Zhuang et al., 2018) and cellular network signals (Driusso et al., 2017, Liu et al., 2023b, Shamaei and Kassas, 2021, Chen et al., 2022, Liu et al., 2023a, Ruan et al., 2022, Liu et al., 2022) are the most widely used signals of opportunity in wireless positioning technology. Although WiFi and Bluetooth have the advantages of low cost and low power consumption, they cannot provide high-precision LBSs to a large number of users over a wide area because of the limited coverage area of the base station (BS).
With the emergence and commercial application of the latest generation of cellular network technology, the introduction of multiple-input multiple-output (MIMO) and ultra-dense network (UDN) has enabled 5G signals to show unique advantages in wireless positioning technology. Researchers are gradually shifting their focus to positioning technologies based on cellular network signals. In (Driusso et al., 2017, Liu et al., 2023b), the researchers developed several high-precision software-defined receivers (SDRs) for time-of-arrival (TOA) estimation, based on LTE signals, that can be used in complex environments. In (Shamaei and Kassas, 2021), Kimia Shamaei et al. de-
At present, SDRs for wireless signal tracking are commonly developed on the principle of the delay-locked loop (DLL) or the phase-locked loop (PLL). However, this relatively mature wireless positioning technique still suffers from large errors in environments with a low signal-to-noise ratio (SNR) and severe multipath. With the development of machine learning (ML) technology in recent years, deep learning methods represented by the convolutional neural network (CNN) have been widely used in indoor fingerprint positioning because of their accurate regression and classification ability on large data sets (Wang et al., 2015, Wang et al., 2020, Wang et al., 2021). Fingerprint positioning systems such as DeepFi (Wang et al., 2015), CiFi (Wang et al., 2020) and ResLoc (Wang et al., 2021) have high positioning accuracy, but the limitations on their range of application are still unresolved. In wireless signal tracking, because of the complex and variable channel state and the physical structure of wireless signals, no research has yet been conducted on obtaining navigation observations from wireless signals using ML methods.
In this paper, we introduce ML into the wireless signal tracking method and develop an opportunistic navigation tracking system based on ML to improve the accuracy of TOA estimation of SDRs in a low SNR environment. As shown in Fig. 1, the developed SDR takes advantage of the accurate regression of the support vector machine (SVM) on complex problems with large amounts of data to address the effect of noise on the TOA estimation of the received signal in traditional SDRs. The specific contributions of this paper are as follows:
• ML is introduced into the TOA estimation of commercial downlink LTE signals to achieve highly accurate and stable signal tracking in the form of an SDR, without changing the hardware device architecture.
• Navigation reference information is acquired at the device terminal for commercial downlink LTE signals, which reduces the computational pressure on the BSs while ensuring user privacy and data security.
• The proposed method was evaluated through field tests, in which it performed better than the traditional method.
The remainder of this paper is organized as follows: Section II introduces the cellular network signals that can be used in navigation and positioning. Section III details the main process of the proposed method. Section IV presents the field test. Section V concludes the paper.

CELLULAR SIGNALS IN NAVIGATION AND POSITIONING
With the development of wireless communication technology, cellular networks have undergone a radical change from 1G to 5G. The application of each generation of cellular network signals to LBSs has received a lot of attention from researchers.
In this section, we first introduce the cellular network signals that can be applied in navigation and positioning. Then, we provide a detailed description of the signal model in navigation and positioning.

Signals in Navigation Positioning
At present, the cellular network signals represented by LTE and 5G both include a synchronization signal and a reference signal (Shamaei and Kassas, 2021, Liu et al., 2023a). In the communication field, the synchronization signal is mainly used in cell search to complete the coarse synchronization, and the reference signal is used for channel estimation. In the field of navigation and positioning, the reference signal in cellular networks is well suited to ranging and angle measurement of wireless signals because of its large bandwidth and periodic emission.
In LTE networks, the cell reference signal (CRS) is uniformly distributed throughout the LTE downlink channel and transmits periodically at a high rate, which has been used by researchers in navigation and positioning applications (Liu et al., 2023b).
The demodulation reference signal (DMRS) in 5G networks is also an ideal choice in the field of navigation and positioning, and recent research has demonstrated that meter-level TOA estimation results can be obtained in indoor environments with the DMRS (Chen et al., 2022, Liu et al., 2023a). In the presently available cellular networks, both of the above reference signals can be used in the field of navigation and positioning.

Signal Model
In LTE signals, each orthogonal frequency division multiplexing (OFDM) symbol consists of N subcarriers. Let {tn | n = 0, ..., N − 1} denote the subcarrier symbols, where n is the subcarrier number. The samples S(k) of the transmitted baseband signal are obtained by applying the inverse fast Fourier transform (iFFT) to every OFDM symbol and prepending Ncp guard samples, where j = √−1 is the imaginary unit used in the transform.
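For reference, one common textbook form of the iFFT output that is consistent with the definitions above is sketched below. The 1/√N scaling and the index range k = −Ncp, ..., N − 1 are assumptions; a different normalization convention would change only the constant factor.

```latex
S(k) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} t_n \, e^{j 2\pi n k / N},
\qquad k = -N_{cp}, \ldots, N-1
```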
During the transmission of LTE signals, S(k) changes in amplitude, delay and phase as the transmission distance increases. The LTE signal received at the receiver is also affected by noise. Here, we consider that the signal is transmitted over a frequency-selective fading channel of length L, with paths l = 0, 1, ..., L − 1. Hence, the received baseband signal can be written as the sum of the L path contributions plus noise, where αl(k), τl and ϕl(k) stand for the amplitude, delay, and phase of the lth path of the received baseband signal, and n(k) is a sample of a zero-mean complex Gaussian noise process with variance σ². The phase ϕl(k) is determined by fl(k), the Doppler frequency of the lth path signal normalized by the subcarrier spacing, and by ϕ0, the initial phase of the carrier.

TOA ESTIMATION BASED ON MACHINE LEARNING
This section describes the system flow for continuous tracking of cellular network signals using ML methods, and provides a detailed description of the main steps in the system. In this paper, we use LTE signals as the basis for testing and analysis. Therefore, LTE is used as the example throughout this section. As shown in Fig. 2, the system mainly includes synchronization and demodulation, discriminator calculation, feature database definition, model training and online tracking.

Synchronization and Demodulation
In LTE networks, coarse synchronization of the signal can be completed by cell search. In this process, the cell ID of the received signal and the frame timing εmax are obtained at the same time. The cell ID is used in conjunction with the LTE protocol to extract the pilot signal in the subsequent demodulation process, while the coarse synchronization of the signal is completed according to the frame timing εmax.
According to the result of the coarse synchronization, the time offset of the received time-domain signal can be corrected directly. After the cyclic prefix of length Ncp is removed, the time-domain signal is converted to the frequency domain by the fast Fourier transform (FFT), and the pilot signal is extracted. For this purpose, the samples of each OFDM symbol of the LTE signal are transformed with the discrete-time transform operator FFT{·}.
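The demodulation step can be illustrated with a small NumPy sketch. It is only a toy under stated assumptions: the sizes N = 64 and Ncp = 16, the QPSK-like symbols, the unitary FFT scaling and the helper names `ofdm_modulate` and `ofdm_demodulate` are all hypothetical and not taken from the paper. It shows that removing the cyclic prefix and applying the FFT recovers the subcarrier symbols, and that a residual timing offset appears as a phase ramp across subcarriers.

```python
import numpy as np

def ofdm_modulate(t, n_cp):
    """iFFT of one OFDM symbol followed by cyclic-prefix insertion."""
    s = np.fft.ifft(t) * np.sqrt(len(t))      # unitary scaling (assumed convention)
    return np.concatenate([s[-n_cp:], s])     # prepend the last n_cp samples

def ofdm_demodulate(r, n_fft, n_cp, timing=0):
    """Remove the cyclic prefix at the given frame timing and apply the FFT."""
    start = timing + n_cp
    return np.fft.fft(r[start:start + n_fft]) / np.sqrt(n_fft)

rng = np.random.default_rng(0)
N, N_CP = 64, 16                                      # toy sizes, not LTE values
t = np.exp(1j * np.pi / 2 * rng.integers(0, 4, N))    # QPSK-like subcarrier symbols
tx = ofdm_modulate(t, N_CP)

# Correct timing: the subcarrier symbols are recovered.
rx_ok = ofdm_demodulate(tx, N, N_CP)

# FFT window opened 3 samples early (still inside the cyclic prefix): the
# symbol looks delayed by 3 samples, so subcarrier n is rotated by
# exp(-j*2*pi*n*3/N).
delay = 3
rx_shifted = ofdm_demodulate(tx, N, N_CP, timing=-delay)
ramp = np.exp(-2j * np.pi * np.arange(N) * delay / N)
```

The phase ramp in the last step is the property that the discriminator calculation relies on: a delay in the time domain is a phase rotation in the frequency domain.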

Discriminator Calculation
According to the LTE protocol, the CRS is mapped into an all-zero sequence of length N according to its position, which yields the received pilot signal R(d); the local reference pilot signal S(d) on each OFDM symbol is generated correspondingly, where d is the position of the CRS on each OFDM symbol. Because a time delay in the time domain is equivalent to a phase rotation in the frequency domain, the early and late code signals Se(d) and Sl(d) are obtained by rotating S(d) by e^(−j2πdξ/N) and e^(+j2πdξ/N), respectively, where ξ (0 < ξ < 1/2) is the advanced (and retarded) interval, normalized to the OFDM sample interval. When tracking the received signal continuously, we perform a phase rotation of the received pilot signal using the normalized symbol delay τ. The rotated pilot signal is then correlated with the early and late code signals to give the early and late correlation branch outputs in the frequency domain, ℜe(∆τ) and ℜl(∆τ), where ∆τ is the normalized symbol delay variation of the received signal pilot with respect to τ, and G is the number of pilot signals in an OFDM symbol. In a channel without multipath and noise, the Early-Minus-Late Power (EMLP) discriminator function is defined as the difference between the early and late branch powers, and it is expressed in terms of the signal gain A and the normalized S-curve S(∆τ, ξ) of the received signal pilot.
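The discriminator can be sketched numerically as follows. This is a minimal illustration under assumptions, not the authors' implementation: N = 2048, the pilot positions `d` (every sixth subcarrier, G = 200), unit-amplitude pilot symbols, the de-rotation sign, and the normalization by G² are all hypothetical choices. The early and late code signals follow the rotations e^(−j2πdξ/N) and e^(+j2πdξ/N) stated in the text.

```python
import numpy as np

N = 2048                                  # FFT size (assumed)
d = np.arange(0, 1200, 6)                 # hypothetical CRS positions, G = 200
XI = 0.5                                  # early/late interval in samples
S = np.ones(d.size, dtype=complex)        # placeholder local pilot symbols

def received_pilots(delta_tau):
    """Noise-free single-path pilots delayed by delta_tau samples."""
    return S * np.exp(-2j * np.pi * d * delta_tau / N)

def emlp(R, tau=0.0, xi=XI):
    """Early-minus-late power of pilots R after de-rotating by the delay tau."""
    R_rot = R * np.exp(2j * np.pi * d * tau / N)      # phase rotation by tau
    S_e = np.exp(-2j * np.pi * d * xi / N) * S        # early code signal
    S_l = np.exp(+2j * np.pi * d * xi / N) * S        # late code signal
    corr_e = np.vdot(S_e, R_rot)                      # sum(conj(S_e) * R_rot)
    corr_l = np.vdot(S_l, R_rot)                      # sum(conj(S_l) * R_rot)
    return (abs(corr_e) ** 2 - abs(corr_l) ** 2) / d.size ** 2

# Sample the S-curve over residual delays of up to half a sample.
offsets = np.linspace(-0.5, 0.5, 41)
s_curve = np.array([emlp(received_pilots(dt)) for dt in offsets])
```

With these choices the curve is zero at zero residual delay, odd-symmetric, and monotonic over the sampled range, which is what makes it usable as an input for delay estimation.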

Feature Database Definition
In an additive white Gaussian noise (AWGN) channel, the white noise introduces a new error V_EMLP in the EMLP discriminator function (Yang et al., 2000). The covariance term 2Cov[|ℜe(∆τ)|², |ℜl(∆τ)|²] between the early and late branch powers should be nonnegative to guarantee that the inequality in (11) holds, so that ∆τ is 0 and 0 < ξ ≤ 1/2. ξ should be set to 0.5 to ensure that |ℜe(∆τ)|² and |ℜl(∆τ)|² are uncorrelated (Yang et al., 2000), which simplifies (11). In an AWGN channel, the error of the EMLP discriminator function is closely related to the number of pilots G and the SNR (sn = A/σ²). In order to make the trained model more applicable in low SNR environments, we generate simulated signals with different SNRs at the same time delay to build the feature database. The specific process is as follows:
First, we define the noise-free simulation signal according to the LTE protocol and add different normalized symbol delays τ to it, so that the signal S(τ) with different time delays is obtained. Here, we set the minimum normalized symbol delay variation resolution Υ = 0.025 and nb = 0, 1, 2, ..., 20. This means that the trained ML model can identify the fractional delay within 1 sample point with a resolution of 0.025 sample points.
Next, noise at different SNRs is added to S(τ) to increase the stability and anti-interference capability of the ML model, which gives the signal Ssn(τ) with different SNRs. Considering the SNR required of usable LTE signals in the communication network, we set sn = 0, 5, 10, ..., 50 in dB.
Finally, the pilot signal of Ssn(τ) is extracted according to the steps of synchronization and demodulation. Combined with the method of discriminator calculation, the S-curve is calculated as the feature data to build the feature database.
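The three steps above can be sketched as follows. Several details are assumptions made only for illustration: the delays are taken as τ = nb·Υ, noise is added directly to unit-amplitude pilots in the frequency domain rather than to a full time-domain LTE signal, the feature vector is the EMLP output sampled on a hypothetical grid `TRIAL_OFFSETS`, and N, the pilot positions `d` and the number of noise realizations are placeholders.

```python
import numpy as np

N = 2048                                     # FFT size (assumed)
d = np.arange(0, 1200, 6)                    # hypothetical CRS positions
UPSILON = 0.025                              # delay resolution from the text
DELAYS = UPSILON * np.arange(21)             # n_b = 0, 1, ..., 20
SNRS_DB = np.arange(0, 55, 5)                # 0, 5, ..., 50 dB
TRIAL_OFFSETS = np.linspace(-0.5, 0.5, 11)   # hypothetical feature grid

def emlp(R, tau, xi=0.5):
    """Early-minus-late power of pilots R de-rotated by tau (unit pilots)."""
    R_rot = R * np.exp(2j * np.pi * d * tau / N)
    early = np.sum(R_rot * np.exp(+2j * np.pi * d * xi / N))   # conj of early code
    late = np.sum(R_rot * np.exp(-2j * np.pi * d * xi / N))    # conj of late code
    return (abs(early) ** 2 - abs(late) ** 2) / d.size ** 2

def build_feature_database(realizations=5, seed=0):
    """Return (features, labels): one S-curve sample vector per noisy signal."""
    rng = np.random.default_rng(seed)
    features, labels = [], []
    for nb, tau in enumerate(DELAYS):
        clean = np.exp(-2j * np.pi * d * tau / N)              # delayed pilots
        for snr_db in SNRS_DB:
            # Complex noise with total variance 10^(-snr/10) for unit signal power.
            sigma = np.sqrt(10 ** (-snr_db / 10) / 2)
            for _ in range(realizations):
                noise = sigma * (rng.standard_normal(d.size)
                                 + 1j * rng.standard_normal(d.size))
                noisy = clean + noise
                features.append([emlp(noisy, t) for t in TRIAL_OFFSETS])
                labels.append(nb)                              # class index n_b
    return np.array(features), np.array(labels)

X, y = build_feature_database()
```

Each row of `X` is one feature vector and `y` holds the delay index nb, so the delay label in samples is `UPSILON * y`.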

Model Training
Considering the high transmission rate of the CRS in LTE signals and the requirement that an SDR based on ML methods deliver signal tracking results in real time, we train on the feature database using a simple SVM method. The main purpose is to verify the feasibility of the proposed ML-based signal tracking method while keeping the developed SDR lightweight.
In the process of ML model training, τ is used as the label and EMLP(τ) is used as the training data, and both are input to the SVM algorithm for training. In the SVM algorithm, a multiclass error-correcting output codes (ECOC) model is used to fit the labels to the training data.
The training is terminated when the fitting accuracy of the ML model stops improving and the testing accuracy is stable, which yields the corresponding ML model.
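A multiclass ECOC model built on SVMs can be sketched with scikit-learn, which is a substituted library here since the paper's SDR was developed in MATLAB. The toy features below (eight values that grow with the delay, plus noise) are a hypothetical stand-in for the S-curve feature database, and the kernel, `code_size` and train/test split are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
UPSILON = 0.025
labels = np.repeat(np.arange(21), 40)        # n_b = 0..20, 40 samples per class
tau = UPSILON * labels                       # delay label in samples

# Toy stand-in features: 8 values that scale with the delay, plus small noise.
slopes = np.linspace(0.5, 2.0, 8)
X = np.outer(tau, slopes) + 0.002 * rng.standard_normal((labels.size, 8))

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0)

# ECOC: each class gets a binary code word, and one SVM is trained per code bit.
model = OutputCodeClassifier(SVC(kernel="rbf", gamma="scale"),
                             code_size=2.0, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
accuracy = float(np.mean(pred == y_test))
delay_estimate = UPSILON * pred              # map class index back to a delay
```

Treating the delay as a class index keeps the output on the 0.025-sample grid used to build the database; comparing training and test accuracy across runs is one way to apply the stopping rule described above.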

Online Tracking
There are two cases when tracking the received signal: first tracking and continuous tracking. The time delay is estimated and updated differently in the two cases.
During the first tracking of the received signal, we set the time delay τ to 0 when calculating the S-curve of the received signal, because the coarse synchronization of the signal has already been completed. The S-curve is input into the trained ML model to obtain the time delay variation ∆τ of the received signal at this moment. Therefore, the first time delay estimate of the received signal can be expressed as τ(1) = ∆τ.
During continuous tracking of the signal at moment k, we perform a phase rotation of the received signal with the time delay estimate τ(k − 1) from moment (k − 1). This ensures that the delay variation to be estimated during continuous tracking always stays within the identification range of the trained ML model. Then, the S-curve of the received signal is again input into the trained ML model to obtain the delay variation ∆τ of the received signal at this moment. The delay of the received signal can be expressed as τ(k) = τ(k − 1) + ∆τ.
A corresponding expression gives the phase of the received signal.
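The delay update rule can be traced with a short sketch. The function `model_predict` is a hypothetical stand-in for the trained ML model: instead of taking an S-curve it takes the residual delay directly, clips it to an assumed identification range of ±0.5 samples and rounds it to the 0.025-sample resolution. The drifting delay sequence is also invented for illustration.

```python
UPSILON = 0.025   # model resolution in samples, from the feature database

def model_predict(residual):
    """Hypothetical stand-in for the trained model: clip the residual delay to
    the assumed identification range and round it to the model resolution."""
    clipped = max(-0.5, min(0.5, residual))
    return round(clipped / UPSILON) * UPSILON

def track(true_delays):
    """Apply tau(1) = delta and tau(k) = tau(k-1) + delta over a delay series."""
    estimates = []
    tau = 0.0                       # first tracking: coarse sync done, tau = 0
    for true_tau in true_delays:
        # Rotating by tau leaves only the residual delay for the model to see.
        delta = model_predict(true_tau - tau)
        tau = tau + delta
        estimates.append(tau)
    return estimates

true_delays = [0.30 + 0.04 * k for k in range(50)]   # slowly drifting delay
estimates = track(true_delays)
max_error = max(abs(e - t) for e, t in zip(estimates, true_delays))
```

Although the true delay drifts to more than two samples, each update only has to resolve a small residual, so the accumulated estimate stays within half a resolution step of the truth in this idealized setting.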

FIELD TEST
In order to evaluate the proposed method, a field experiment was carried out with commercial LTE signals on urban roads in Wuhan, Hubei Province, China. In this section, the experimental hardware and software setup is presented first. Then, the experimental results are presented.

Experimental Hardware and Software Setup
As shown in Fig. 4(c), the receiver antenna was fixed to the top of the test vehicle. During the test, the experimenter collected the LTE signal in the vehicle with a universal software radio peripheral (USRP) X310 driven by a GPS-disciplined oscillator (GPSDO), mixing and sampling at 20 MSps. A laptop computer connected to the USRP X310 recorded the data using GNU Radio. The collected data were processed by an SDR developed in MATLAB. The trajectory of the vehicle during the test is shown in Fig. 4(a). As shown in Fig. 4(b), the vehicle was traveling under an overpass more than half of the time. In this case, GPS cannot observe enough satellites for navigation and positioning. We acquired the reference trajectory with an Xsens MTi-G-710, which can provide positioning results with an accuracy of 1 m under ideal conditions.

Figure 4. The process of field test.

Test Result
According to the method proposed in this paper, we obtained the field test results shown in Fig. 5. Fig. 5(a) shows the TOA estimation based on the samples of time delay, and Fig. 5(b) shows the TOA estimation error. In order to assess the effectiveness of the proposed method, we also show the TOA estimation results based on the DLL algorithm in Fig. 5 as a reference. The errors of the two methods are obtained by fitting the reference trajectory to the respective TOA estimation results. The root mean square error (RMSE) of the ML-based results is 9.4 m. Fig. 6 shows the statistical results of the ranging errors for both algorithms. It can be clearly seen that the ranging error at 95% confidence is reduced from 52.7 m with the DLL algorithm to 31.6 m with the proposed method. Meanwhile, compared with the DLL algorithm, the maximum error (ME) of the ML-based test results is reduced from 197.3 m to 51.2 m.

Figure 5. The result of field test.

Figure 6. Comparison of error in calculation result.

Table 1 shows the detailed statistics of the errors for both methods. In this test, the overall performance of the proposed method is better than that of the DLL algorithm. Compared with traditional TOA estimation methods, TOA estimation based on ML can obtain more stable results in a low SNR environment.

Table 1. The performance statistics of the field test.

CONCLUSION
In this paper, an SDR based on ML is developed for TOA estimation of cellular network signals. We detailed the process of TOA estimation by ML methods for cellular network signals that can be used in navigation and positioning. A vehicle test was carried out in complex urban environments to demonstrate the feasibility of the proposed method. The RMSE of the proposed method in the vehicle test is 9.4 m. Meanwhile, the test results show that the ranging error of the proposed method at 95% confidence is 31.6 m, compared with 52.7 m for the DLL algorithm. We believe that the ML-based TOA method proposed in this paper performs better than the traditional TOA method in low SNR environments. In the future, we will develop machine learning models that are better suited to cellular network signal feature data, and construct a complete positioning system.

ACKNOWLEDGMENT
The research is supported by the Special Fund of Hubei Luojia Laboratory under grant number 220100008, the National Natural Science Foundation of China under grant number 42171417, the Key Research and Development Program of Hubei Province under grant number 2021BAA166, and the Guangxi Science and Technology Major Project under grant number AA22068072.

Figure 1. Schematic of TOA estimation from cellular network signals based on ML methods.

Figure 2. The flow of TOA estimation based on machine learning methods.

In the discriminator calculation, the received pilot signal is denoted R(d). Meanwhile, the local reference pilot signal S(d) on each OFDM symbol can be generated correspondingly, where d is the position of the CRS on each OFDM symbol. Since a time delay in the time domain is equivalent to a phase rotation in the frequency domain, the early and late code signals of S(d) can be obtained respectively as:
$$S_e(d) = e^{-j2\pi d\xi/N}\, S(d), \qquad S_l(d) = e^{+j2\pi d\xi/N}\, S(d)$$

Figure 3. The S-curves of discriminator function with different fractional delays.