Ground motion prediction using Artificial Neural Network in Pakistan

The goal of this research project is to design, build, and validate an artificial neural network (ANN) model that predicts ground motion from past earthquake data for seismic events in Pakistan. Ground motion prediction is essential for assessing seismic risk and, consequently, for devising measures to mitigate it. The ANN model uses activation functions in its hidden neurons to represent the relationships present in the seismic data and outputs logarithmic PGA values. Model performance metrics, namely Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and correlation coefficients, indicate the accuracy and robustness of the ANNPGA model. The ANNPGA model shows strong predictive power for all PGA values in the validation dataset. The MSE value is 0.00264, which indicates the model's accuracy in capturing ground motion variability. A comparative analysis with existing empirical and physics-based models indicates that the ANNPGA model gives more accurate predictions in most cases, especially where nonlinear relationships are involved.


Introduction
Earthquakes are natural phenomena in which the Earth's crust experiences sudden, violent shaking. They have been a part of human history since ancient times and have had significant impacts on societies and communities worldwide. Pakistan is located at the boundary of the Indian and Eurasian plates in the north and of the Eurasian and Arabian plates in the south, along Balochistan. The plate boundary in the north is convergent, while from north to south it becomes transform in nature; along Makran, the Arabian and Eurasian plates collide, forming an active subduction zone with a relatively high seismicity rate (Saffari et al., 2012).
The Dalbandin earthquake in 2011, the Quetta earthquake in 2008, the Kashmir earthquake in 2005, the Astor earthquake in 2002, the Makran earthquake in 1945, and the Quetta earthquake in 1935 are examples of the active tectonics of this region. This unique tectonic setting makes Pakistan seismically very active.
Seismic design codes include parameters for carrying out seismic design in order to mitigate the seismic actions expected at the sites where structures are built. These structures include all types of buildings, bridges, dams, nuclear power plants, and other essential facilities.
Using an adopted equation has two possible outcomes: (1) overestimation of ground motion values or (2) underestimation of ground motion values.
A number of recent investigations have found that ANN models work well in predicting seismic activity and can simulate the nonlinear behavior of seismic events and the resulting ground motion. Torky et al. (2021) designed ANN-based PGA prediction models for a seismically active region with geological similarities to Pakistan. ANN studies have shown that these models outperform traditional regression techniques in representing the complicated links between seismic quantities and ground motions (Yaghmaei-Sabegh et al., 2012). Maqsoom et al. (2022) undertook a thorough assessment of the prevalent PGA prediction models, and the need for a region-specific methodology rooted in the subsurface geology and tectonics of the study area has been made evident (Cho et al., 2022; Gerstenberger et al., 2020; Ahmed et al., 2008). To date, numerous studies support incorporating modern machine learning techniques such as ANNs into seismic prediction to improve disaster preparedness in earthquake-prone areas such as Pakistan.
In our country, we cannot afford to compromise on safety: Pakistan is an earthquake-prone region with a history of deadly earthquakes, and damaging earthquakes are also expected in the future. At the same time, as a developing country, we cannot spend heavily on seismic design. Therefore, we need a reasonable estimate of ground shaking that is close to the actual expected shaking and is economical. Thus, the development of ground motion prediction equations based on local data is a very important step towards mitigating the earthquake hazard. In the literature, the study by Shah et al. (2012) addresses ground motion equations, and part of the study by Waseem et al. (2017) is dedicated to the selection and use of ground motion prediction equations. Apart from these two studies, no other study is available in the literature. There is therefore a scientific gap regarding ground motion prediction equations for Pakistan, which needs to be filled by deriving equations based on local strong motion data and by identifying global equations applicable to Pakistan.
The objective of this research study is to develop and propose a ground motion prediction model for Pakistan using an artificial neural network (ANN). The destruction caused by this hazard depends on the magnitude and depth of the ground motion that occurs during the earthquake. The predicted design ground motion needs to be as close as possible to the hazard that actually exists. One of the main obstacles is the lack of the dataset required for regression analysis and ANN development.
In this study, we derive an approximation function with an ANN to estimate peak ground motion parameters, i.e., PGA. The development of ground motion prediction equations (GMPEs) using ANNs has not previously been attempted for Pakistan. This work is an original scientific contribution for the technical community working in Pakistan and worldwide. The development of GMPEs for seismically active regions like Pakistan is an active area of research in the field of seismology.

Materials and methods
In this research, strong motion data will be collected from different agencies, such as the Pakistan Meteorological Department (PMD), that have installed strong motion arrays in Pakistan, in order to establish a common databank. The collected data will be used to develop a ground motion prediction equation. If the collected data are not sufficient, they will be combined with data from outside Pakistan with a similar regional tectonic setting, and regression analysis will be carried out to derive the equations.
An ANN model was applied to the data to develop the ANN prediction model. Different agencies in Pakistan are involved in compiling the strong motion dataset needed for the ANN. Data available in the literature will also be used to compile the database. The strong motion attenuation relation is used to develop a mathematical model relating the independent and dependent variables of the ground motion by regression analysis. The ground motion parameters can be predicted using two approaches.
After the dataset was collected, an ANN model was trained to predict the ground motion parameters for a given set of input parameters. The input parameters were earthquake magnitude, source-to-site distance, and site characteristics (such as soil type). The output was the peak ground acceleration (PGA).
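As a rough illustration of the training setup described above, the following sketch fits a one-hidden-layer network to the three inputs (magnitude, log distance, site class code) with log PGA as the output. It is a minimal NumPy sketch under stated assumptions, not the implementation used in this study: the network size, the tanh activation, the learning rate, the function names `make_synthetic_records` and `train_mlp`, and above all the synthetic records with made-up coefficients are hypothetical and only stand in for the strong motion database.

```python
import numpy as np

def make_synthetic_records(n, rng):
    """Hypothetical stand-in for a strong motion catalogue (NOT real data)."""
    magnitude = rng.uniform(4.0, 7.6, n)
    distance_km = rng.uniform(5.0, 300.0, n)
    site_code = rng.integers(1, 4, n)  # assumed integer site class codes 1..3
    # Made-up attenuation-like target so that the sketch has something to fit.
    log_pga = (-1.5 + 0.4 * magnitude
               - 1.2 * np.log10(distance_km + 10.0)
               + 0.1 * site_code
               + rng.normal(0.0, 0.05, n))
    X = np.column_stack([magnitude, np.log10(distance_km), site_code])
    return X, log_pga.reshape(-1, 1)

def train_mlp(X, y, hidden=8, epochs=2000, lr=0.05, seed=0):
    """One hidden layer (tanh), linear output, full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma                      # standardize the inputs
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    history = []                               # MSE recorded at every epoch
    for _ in range(epochs):
        H = np.tanh(Xs @ W1 + b1)              # hidden activations
        pred = H @ W2 + b2                     # predicted log PGA
        err = pred - y
        history.append(float(np.mean(err ** 2)))
        # Backpropagate the MSE loss through both layers.
        g_pred = 2.0 * err / len(y)
        g_W2 = H.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_Z = (g_pred @ W2.T) * (1.0 - H ** 2)  # derivative of tanh
        g_W1 = Xs.T @ g_Z
        g_b1 = g_Z.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(X_new):
        H_new = np.tanh(((X_new - mu) / sigma) @ W1 + b1)
        return H_new @ W2 + b2

    return predict, history

rng = np.random.default_rng(42)
X, y = make_synthetic_records(500, rng)
predict, history = train_mlp(X, y)
print(f"MSE at first epoch: {history[0]:.4f}, at last epoch: {history[-1]:.4f}")
```

The returned `predict` function takes rows of (magnitude, log10 distance, site class code) and returns log PGA, so a trained network can be queried for new scenarios in the same way.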
Ground motion parameters typically increase with increasing earthquake magnitude and decrease with increasing source-to-site distance. Once the network is trained, it can be used to predict the ground motion parameters for new earthquakes with known input parameters. Training is carried out until the average sum of squared errors over all the training patterns is minimized.
The predicted PGA values give an estimate of the expected level of ground shaking at a specific site based on the input variables (distance, magnitude, and site class). The closer the predicted value is to the actual value, the more accurate the model is at predicting PGA levels.
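The closeness of predicted to actual values can be quantified with the metrics named in this paper (MSE, RMSE, and the correlation coefficient). The snippet below is a small sketch of how they might be computed; the function name `evaluate_predictions` and the four log PGA values are hypothetical and chosen only for illustration.

```python
import numpy as np

def evaluate_predictions(actual, predicted):
    """Return MSE, RMSE, and the Pearson correlation coefficient."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mse = float(np.mean((predicted - actual) ** 2))
    rmse = float(np.sqrt(mse))
    r = float(np.corrcoef(actual, predicted)[0, 1])
    return {"mse": mse, "rmse": rmse, "r": r}

# Hypothetical log PGA values, for illustration only (not study data).
actual = [-1.20, -0.85, -1.60, -0.40]
predicted = [-1.15, -0.90, -1.55, -0.45]
metrics = evaluate_predictions(actual, predicted)
print(metrics)
```

Because every prediction in this toy example is off by 0.05 log units, the MSE is 0.0025 and the RMSE is 0.05, which makes the relation between the two metrics easy to check by hand.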
The average shear wave velocity in the top 30 meters of soil (Vs30) is a crucial factor in identifying the site class in soil classification. The soils are categorized into groups based on their Vs30 values in accordance with the National Earthquake Hazards Reduction Program-Uniform Building Code (NEHRP-UBC) (Table 4.1). In order to incorporate soil classes as an input into the ANN analysis, the soil classes in this study are transformed into categorical numbers. The relevant data are displayed in Table 4
The PGA amplification curves for different magnitudes are given in the figures above. The amplification factor for PGA (the ratio of stiff soil PGA to rock PGA) decreases with distance from the fault, which is consistent with nonlinear soil behavior.
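The conversion of Vs30 into a site class and then into a categorical number for the ANN input can be sketched as follows. The thresholds are the commonly cited NEHRP Vs30 boundaries (1500, 760, 360, and 180 m/s); the treatment of values falling exactly on a boundary, the integer codes in `SITE_CLASS_CODE`, and the function names are assumptions made for illustration and may differ from the encoding used in this study.

```python
def nehrp_site_class(vs30_m_per_s):
    """Map Vs30 (m/s) to a NEHRP site class letter using the usual boundaries."""
    if vs30_m_per_s <= 0:
        raise ValueError("Vs30 must be positive")
    if vs30_m_per_s > 1500:
        return "A"  # hard rock
    if vs30_m_per_s > 760:
        return "B"  # rock
    if vs30_m_per_s > 360:
        return "C"  # very dense soil and soft rock
    if vs30_m_per_s >= 180:
        return "D"  # stiff soil
    return "E"      # soft soil

# Assumed integer encoding of the classes for use as an ANN input.
SITE_CLASS_CODE = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def encode_site(vs30_m_per_s):
    """Return the categorical number for a site with the given Vs30."""
    return SITE_CLASS_CODE[nehrp_site_class(vs30_m_per_s)]

for vs30 in (2000, 800, 400, 250, 150):
    print(vs30, nehrp_site_class(vs30), encode_site(vs30))
```

An integer code imposes an ordering on the classes, which is reasonable here because the classes are ordered by stiffness; a one-hot encoding would be an alternative if no ordering is to be assumed.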

Results and Discussion
Arias intensity is not a very good predictor of perceived intensity when the ANN model's performance is compared with that of other ANN models that employ PGA, PGV, and PGD. In comparison with the ANN-based prediction models of Tselentis and Danciu (2008), the present ANN model produced better performance. As a result, adding further parameters such as Mw and FD enhanced the performance.

Conclusion
The implications of this research can be summarized as follows: the model takes into account the given magnitude, distance, and site class when determining the log PGA value. The main objective of this study was to show that ANN models can effectively predict PGA for earthquake events in Pakistan.
These factors indicate the accuracy of the ANNPGA model in emulating the complicated interrelationships between the input parameters and PGA.
The outcomes of this study will facilitate accurate seismic safety assessment and risk management in Pakistan. The model can estimate the PGA of an earthquake, which will help in developing robust infrastructure and plans to reduce earthquake losses to infrastructure as well as to people's lives. The trained ANNPGA model can be used for seismic hazard analysis, earthquake engineering, and related applications. Such data will be useful in designing earthquake-resistant structures and infrastructure in areas that are more likely to experience earthquakes. It will make our cities safer in areas where the earthquake risk is high.

3.1. Predicted PGA
The predicted log PGA (Peak Ground Acceleration) obtained from the trained ANNPGA model represents the model's estimation or prediction of the logarithm of the PGA value for a given set of input variables. Each point on the plot represents a pair of predicted and actual PGA values. The x-axis represents the predicted PGA values and the y-axis represents the actual PGA values.

Figure 1. Predicted PGA vs. Actual PGA
The distance vs. PGA graph represents the relationship between the logarithmically transformed distance (log10(Distance)) and the actual PGA values from the 'PGA' dataset. It helps visualize the predicted PGA values from the trained ANNPGA model and compare them to the actual PGA values.

Figure 2. Attenuation Curve for different site classes
The attenuation curves for the rock soil class are shown. The curves are plotted for each soil class at different magnitudes (5, 6.4, and 7.6) and varying distances from the seismic source.
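Curves of this kind are obtained by evaluating a prediction model on a grid of distances for each magnitude and a fixed site class. The sketch below shows only that evaluation step. The function `stand_in_log_pga` is a hypothetical placeholder with made-up coefficients, used so that the example runs without the trained network, and the distance range of 1 to 300 km is likewise an assumption.

```python
import numpy as np

def stand_in_log_pga(magnitude, distance_km, site_code):
    """Hypothetical placeholder for the trained network (made-up coefficients)."""
    return (-1.5 + 0.4 * magnitude
            - 1.2 * np.log10(distance_km + 10.0)
            + 0.1 * site_code)

def attenuation_curves(model, magnitudes, distances_km, site_code):
    """Return {magnitude: predicted log PGA along distances_km} for one site class."""
    distances_km = np.asarray(distances_km, dtype=float)
    return {m: model(m, distances_km, site_code) for m in magnitudes}

# Magnitudes taken from the text; the distance grid is an assumed range.
distances_km = np.logspace(0.0, np.log10(300.0), 50)
curves = attenuation_curves(stand_in_log_pga, (5.0, 6.4, 7.6), distances_km, site_code=1)

for magnitude, log_pga in curves.items():
    print(f"M {magnitude}: log PGA from {log_pga[0]:.2f} to {log_pga[-1]:.2f}")
```

Replacing `stand_in_log_pga` with the trained model's prediction function, and repeating the call for each site class code, would produce one family of curves per soil class.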

Figure 3. ANN model of validation
The plot shows the predicted PGA values against the actual PGA values for the training, validation, and test sets. The mean squared error (MSE) of this model is 0.00264, which supports the validity of the model. In a multivariate connection proposal, Tselentis and Danciu (2008) incorporated Ia, Epicentral Distance (Repi), along with Soil Class (SC).

Table 1.1. Classification of Vs30 with Soil Type
During the training of the ANN model, the MSE is calculated for each epoch or iteration. The goal of the training process is to minimize the MSE.
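For reference, the MSE minimized during training can be written out and related to the reported value of 0.00264. The symbols are introduced here for illustration: y_i is the actual log PGA of record i, \hat{y}_i is the predicted value, and N is the number of records. The conversion to a multiplicative factor assumes that the logarithm is base 10, which the text does not state explicitly for PGA.

```latex
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{0.00264} \approx 0.0514,
\qquad
10^{0.0514} \approx 1.13
```

Under that assumption, an RMSE of about 0.05 log units corresponds to predictions that typically fall within a factor of roughly 1.13 of the actual PGA.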