Study area

The Karun is the longest and highest-discharge river in Iran. This 950-km-long river originates in the north of Khuzestan province (at the coordinates ), forks into two branches in the city of Khorramshahr, and finally empties into the Persian Gulf (at the coordinates ) (Fig. 1). Ahvaz is the most populated (1,136,989 people) and most important city along the course of the river, and it contributes significantly to the deterioration of the river's water quality.

The water quality of the river is better in its northern (upstream) reaches and deteriorates as the river passes through successive cities. To assess the quality of the Karun River, data from two monitoring stations were considered, one located upstream and the other downstream of the city of Ahvaz. The data employed in this study included calcium, magnesium, nitrite, and nitrate. Most parameters have been measured monthly by the Khuzestan Water and Power Authority since 1995. Because the records collected at these stations in the early years were incomplete, the more recent and more complete records from 2013 to 2015 were used. Nevertheless, some parameters were not measured in a few months, and the data from those months were removed.

Hence, the data sets of the first and second stations were finally reduced to 36 and 38 records, respectively. In the present study, three scenarios were considered to predict water quality. The first scenario uses the parameters collected at the first station, the second scenario uses the parameters collected at the second station, and the third scenario uses the data from both stations.

Fig. 1: Location of the study area in Khuzestan province of Iran

The minimum, maximum, mean, standard deviation, and skewness coefficient can describe the quality parameters of a water body. The statistics of the collected parameters are therefore presented in Table 1.

2.2. Artificial neural network

The structure of a neural network includes three separate layers: 1) the input layer, which introduces the data to the model; 2) the hidden layer(s), where the data are processed; and 3) the output layer, which produces the results. Each layer comprises one or more elements known as neurons. A schematic view of a neural network is shown in Fig. 2. The number of neurons in the input, hidden, and output layers depends on the problem type and is determined by the difficulty of the problem. If too few neurons are selected, the network may not have enough degrees of freedom for training.

On the other hand, if a large number of neurons is selected for the hidden layer, the learning process can take a considerably long time to complete. The number of neurons in the input and output layers is fixed and depends on the number of input and output parameters. The gamma test can be used to determine the optimal parameters for the input layer. Although the number of neurons in the hidden layers is determined through trial and error (Salehnia et al., 2013), it is suggested that the number of neurons in the hidden layer should be within the range n-m, where n and m are the numbers of neurons in the input and output layers, respectively.

Table 1: Basic statistics of the measured water quality variables in Karun River, Iran

| Number for embedding | Variable | Unit | Min (a) | Max (b) | Mean | SD (c) | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| 1 | Turbidity | NTU | 2.00 | 344.00 | 40.00 | 45.76 | 4.077 | 26.743 |
| 2 | SS | mg/L | 20.00 | 420.00 | 64.00 | 53.68 | 4.443 | 26.496 |
| 3 | TA | mg/L | 133.00 | 206.00 | 168.49 | 14.50 | 0.226 | 0.492 |
| 4 | PO4 | mg/L | 0.01 | 0.12 | 0.02 | 0.02 | 4.024 | 8.503 |
| 5 | Temperature | °C | 12.00 | 35.70 | 22.41 | 5.70 | 0.010 | -1.066 |
| 6 | NO3 | mg/L | 1.92 | 13.53 | 6.65 | 1.65 | 0.074 | 3.777 |
| 7 | NO2 | mg/L | 0.010 | 0.05 | 0.01 | 0.006 | 0.398 | 9.539 |
| 8 | NH4 | mg/L | 0.28 | 0.85 | 0.51 | 0.14 | -0.496 | -0.641 |
| 9 | Total coliform bacteria | U/100mL* | 2100.00 | 110000.00 | 61887.83 | 46540.76 | -0.282 | -1.833 |
| 10 | TDS | mg/L | 856.00 | 2585.00 | 1711.51 | 366.88 | 2.112 | -0.323 |
| 11 | EC | S/m? | 1300.00 | 4040.00 | 2692.04 | 573.79 | 0.351 | -0.294 |
| 12 | pH | - | 6.90 | 8.00 | 7.60 | 0.22 | -1.631 | 0.280 |
| 13 | SO4 | mg/L | 2.83 | 15.51 | 9.24 | 2.26 | -0.110 | 0.625 |
| 14 | HCO3 | mg/L | 1.76 | 4.34 | 3.24 | 0.39 | -0.240 | 3.233 |
| 15 | Cl | mg/L | 7.55 | 27.00 | 15.47 | 4.46 | -0.485 | -0.616 |
| 16 | Ca | mg/L | 3.37 | 15.60 | 7.71 | 1.63 | 0.112 | 6.986 |
| 17 | Mg | mg/L | 1.51 | 7.80 | 4.64 | 1.30 | -0.472 | 0.000 |
| 18 | Na | mg/L | 7.04 | 26.48 | 15.81 | 4.43 | 0.431 | -0.585 |
| 19 | TH | mg/L | 268.50 | 905.00 | 618.05 | 101.45022 | 1.669 | 1.804 |
| - | BOD | mg/L | 1.08 | 6.22 | 3.3115 | 0.96 | 0.675 | 0.558 |
| - | COD | mg/L | 8.40 | 28.40 | 15.81 | 5.09 | 0.078 | -0.314 |
| - | DO | mg/L | 3.80 | 10.00 | 7.42 | 1.22 | -0.248 | 0.558 |

N = 74. (a) Min: minimum. (b) Max: maximum. (c) SD: standard deviation. * Unit is count per 100 mL.

Fig. 2: A typical artificial neural network

Employing artificial neural networks (ANNs) is the most common method for solving complex, nonlinear mathematical problems. Similarly, the multilayer perceptron (MLP) is the most widely used type of neural network for solving such problems. To create an MLP neural network, the appropriate threshold function, weight, and bias should be determined for each neuron. During training, the weight and bias of each neuron are adjusted until favorable values are obtained. The most important threshold functions used in the development of MLP models include the Gaussian, sigmoid, and tangent sigmoid functions.

$$f(x) = e^{-x^{2}} \tag{1}$$

$$f(x) = \frac{1}{1 + e^{-x}} \tag{2}$$

$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{3}$$

In this study, the calcium and magnesium parameters were selected as the inputs, and the nitrate and nitrite parameters were selected as the outputs. In line with the literature, no randomization was applied to the water quality data. Therefore, to predict the water quality of the Karun River, the data were divided into two categories following Basant et al. (2010).
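The three threshold functions named in this section, and the way a layer of neurons applies one of them to a weighted sum plus a bias, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the Gaussian is taken in the common form exp(-x^2), the function names (`gaussian`, `sigmoid`, `tansig`, `layer`, `mlp_forward`) are hypothetical, and the 2-3-1 layout with its weights and biases is made up for illustration and is not the network trained in the study.

```python
import math


def gaussian(x: float) -> float:
    """Gaussian threshold function in the assumed form exp(-x^2)."""
    return math.exp(-x * x)


def sigmoid(x: float) -> float:
    """Logistic sigmoid; the two branches avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tansig(x: float) -> float:
    """Tangent sigmoid (hyperbolic tangent), with outputs in (-1, 1)."""
    return math.tanh(x)


def layer(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input -> hidden layer (tansig) -> linear output layer."""
    hidden = layer(inputs, hidden_w, hidden_b, tansig)
    return layer(hidden, out_w, out_b, lambda v: v)


# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output (made-up weights).
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2]]
out_b = [0.05]
prediction = mlp_forward([0.25, 0.75], hidden_w, hidden_b, out_w, out_b)
```

Training would then consist of adjusting `hidden_w`, `hidden_b`, `out_w`, and `out_b` until the predictions match the observations; that step is not shown here.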

These categories were the training and validation data, comprising 50 items (80 percent) and 24 items (20 percent) of the total data, respectively. For the first station, the two categories included 29 and 8 items, and for the second station, 31 and 8 items.

The performance of the neural network may suffer because the range between the minimum and maximum differs from one parameter to another and because the variables are of different types. It is therefore necessary to convert the parameters to a dimensionless interval so as to standardize them. The general formula for standardization within the interval (a, b) is as follows:

$$x_s = a + (b - a)\,\frac{x_o - x_{\min}}{x_{\max} - x_{\min}} \tag{4}$$

where $x_o$ and $x_s$ are the original and standardized observational parameters, respectively, and a and b represent the lower and upper limits of standardization.
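As a rough sketch of the standardization described above, the function below rescales a series linearly so that its smallest value maps to the lower limit `a` and its largest value to the upper limit `b`. The function name `min_max_scale` and the sample values are hypothetical; the defaults a = 0 and b = 1 are chosen here because they are the simplest limits, and the sample series is not the measured data.

```python
def min_max_scale(values, a=0.0, b=1.0):
    """Rescale values linearly so the minimum maps to a and the maximum to b."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant series")
    span = x_max - x_min
    return [a + (b - a) * (x - x_min) / span for x in values]


# Illustrative values only (not the measured Karun River data).
sample = [2.0, 40.0, 173.0, 344.0]
scaled = min_max_scale(sample)
```

Scaling every variable to the same interval keeps parameters with large numeric ranges (for example, coliform counts) from dominating parameters with small ranges (for example, nitrite) during training.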

$x_{\min}$ and $x_{\max}$ indicate the minimum and maximum values of parameter x, respectively. Since a and b are taken as zero and one in the present study, respectively, the formula simplifies to:

$$x_s = \frac{x_o - x_{\min}}{x_{\max} - x_{\min}} \tag{5}$$

Moreover, the Marquardt algorithm was used to train the neural network since, according to the literature, this method is more powerful and faster than the other existing methods. The optimal number of hidden layers was obtained through trial and error, based on the range proposed by Ehteshami (2014) for the Karun River.

2.3. Gamma test

As described in the previous section, the gamma test (GT) is helpful for determining the optimum neurons of the input layer. This method is one of the most important procedures for selecting useful predictors from a database.

Since the GT has been used in many studies in the field of ANNs (Tian et al., 2016), it was also used in the present study. A formal proof of the GT was extended by Chang et al. (2010). Suppose a set of observations of the following form:

$$\{(x_i, y_i),\ 1 \le i \le M\} \tag{6}$$

where the $x_i$ are input vectors confined to some closed bounded set and the $y_i$ are the corresponding outputs. The system underlying the GT can be expressed in the following form:

$$y = f(x) + r \tag{7}$$

where f is a smooth function and r is a random variable representing noise. In general, the mean of the distribution of r is assumed to be 0 and the variance of the noise is bounded (Kim and Kim, 2008).

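The gamma statistic ($\Gamma$) is the main parameter and estimates the noise variance of the model output. For each vector $x_i$, let $x_{N[i,k]}$ denote its kth nearest neighbor, with $1 \le k \le p$. The delta function of the input vectors is:

$$\delta_M(k) = \frac{1}{M}\sum_{i=1}^{M}\left| x_{N[i,k]} - x_i \right|^{2}, \quad 1 \le k \le p \tag{8}$$

where M is the number of observations and $|\cdot|$ denotes the Euclidean distance. The corresponding gamma function of the output values is:

$$\gamma_M(k) = \frac{1}{2M}\sum_{i=1}^{M}\left| y_{N[i,k]} - y_i \right|^{2}, \quad 1 \le k \le p \tag{9}$$

where $y_{N[i,k]}$ is the y-value corresponding to the kth nearest neighbor of $x_i$ in Equation (8). To compute $\Gamma$, a univariate least-squares regression line is fitted to the p points $(\delta_M(k), \gamma_M(k))$:

$$\gamma = A\,\delta + \Gamma \tag{10}$$

The value of $\Gamma$ is the intercept of Equation (10). A is the gradient of the line and describes the complexity of the model.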
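The procedure described above (nearest-neighbor distances in the input space, the matching squared output differences, and a regression line whose intercept is the gamma statistic) might be sketched as follows. This is a minimal illustration rather than the software used in the study: the function name `gamma_test`, the default of p = 3 neighbors, and the brute-force neighbor search are all assumptions made here for brevity.

```python
def gamma_test(xs, ys, p=3):
    """Return (Gamma, A): intercept and gradient of gamma regressed on delta.

    xs is a list of input vectors (lists of floats), ys the matching outputs,
    and p the number of nearest neighbors considered (an assumed default).
    """
    n_points = len(xs)
    if not 2 <= p < n_points:
        raise ValueError("p must satisfy 2 <= p < number of points")
    delta = [0.0] * p
    gamma = [0.0] * p
    for i in range(n_points):
        # Squared Euclidean distance from x_i to every other input vector.
        neighbours = sorted(
            (sum((u - v) ** 2 for u, v in zip(xs[i], xs[j])), j)
            for j in range(n_points)
            if j != i
        )
        for k in range(p):
            dist2, j = neighbours[k]
            delta[k] += dist2 / n_points
            gamma[k] += (ys[j] - ys[i]) ** 2 / (2 * n_points)
    # Least-squares line gamma = A * delta + Gamma through the p points.
    d_mean = sum(delta) / p
    g_mean = sum(gamma) / p
    sxx = sum((d - d_mean) ** 2 for d in delta)
    sxy = sum((d - d_mean) * (g - g_mean) for d, g in zip(delta, gamma))
    gradient = sxy / sxx
    intercept = g_mean - gradient * d_mean
    return intercept, gradient


# Noise-free illustrative data: y = 2x on an evenly spaced grid.
xs = [[i * 0.1] for i in range(20)]
ys = [2.0 * x[0] for x in xs]
big_gamma, gradient_a = gamma_test(xs, ys)
```

For noise-free data such as this example, the intercept should be close to zero, which is the behavior expected of an estimate of noise variance.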

A high value of A indicates greater complexity, and a low value indicates less complexity. Another quantity, the $V_{ratio}$, describes the noise in a scale-invariant form:

$$V_{ratio} = \frac{\Gamma}{\sigma^{2}(y)} \tag{11}$$

where $\Gamma$ is the gamma statistic and $\sigma^{2}(y)$ is the variance of output y. By this definition, a $V_{ratio}$ close to 0 indicates a high degree of predictability of the given output y. In addition, the estimate of the noise variance on the given output is more credible if the standard error (SE) is close to 0. The GT estimates the mean square error (MSE) that cannot be modeled by the smoothest possible model, that is, the noise variance (Goyal et al., 2013).

2.4. Performance evaluation of models

To assess the generated neural networks, three metrics, namely RMSE, MAE, and R2, were employed. RMSE represents the error of the model and is defined in Equation (12). MAE quantifies the average magnitude of over- and underestimation. The coefficient of determination (R2) represents the proportion of the variation in the observations that can be estimated by the model. The metrics are calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^{2}} \tag{12}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - O_i\right| \tag{13}$$

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)\right]^{2}}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^{2}\,\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^{2}} \tag{14}$$

In the above equations, $P_i$, $O_i$, and n are the predicted values, the observed values, and the number of data, respectively, and $\bar{P}$ and $\bar{O}$ are the means of the predicted and observed values.

3. Results and discussion

In this study, the gamma test was used to omit less-effective parameters, which in turn reduced the number of input parameters of the neural network. Table 2 shows the results of the gamma test for BOD simulation under Scenario 1, Scenario 2, and Scenario 3.

In the "Embedding" row of the table, the combination of input parameters for each station is specified. Here, 0 is assigned to a parameter that is not used as an input to the ANN model, and 1 is assigned to a parameter that is used as an input. The ordering of the 0s and 1s given in Tables 2 to 4 (first row) is the same as the ordering of the parameters given in Table 1.
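Reading an embedding mask by hand is error-prone, so the mapping just described can be sketched in Python. The variable order below follows the "Number for Embedding" column of Table 1, and the example mask is the station 2 mask listed for BOD in Table 2; the names `VARIABLES` and `decode_mask` are hypothetical.

```python
# Variable order follows the "Number for Embedding" column of Table 1.
VARIABLES = [
    "Turbidity", "SS", "TA", "PO4", "Temperature", "NO3", "NO2", "NH4",
    "Total coliform bacteria", "TDS", "EC", "pH", "SO4", "HCO3", "Cl",
    "Ca", "Mg", "Na", "TH",
]


def decode_mask(mask: str) -> list[str]:
    """Return the names of the variables flagged with 1 in an embedding mask."""
    if len(mask) != len(VARIABLES) or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a 19-character string of 0s and 1s")
    return [name for name, bit in zip(VARIABLES, mask) if bit == "1"]


# Station 2 mask for BOD, as listed in Table 2.
station2_bod = decode_mask("1011011101111100000")
```

The same helper can be applied to any mask in Tables 2 to 4, since all of them share the 19-position ordering of Table 1.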

As shown in Table 2, the best inputs for developing an ANN model to estimate BOD at station 1 are Turbidity, SS, TA, Temperature, NO2, Total coliform bacteria, TDS, EC, pH, SO4, HCO3, Cl, and Ca. At this station, the number of neural network inputs decreased from 19 to 13. At station 2, the number of input parameters needed to achieve the best gamma was reduced to 11: Turbidity, TA, PO4, NO3, NO2, NH4, TDS, EC, pH, SO4, and HCO3 were determined as the optimal input parameters for the ANN model to estimate BOD. Furthermore, if the data from stations 1 and 2 are used simultaneously in the gamma test, the number of input parameters is reduced to 10. Applying the gamma test to the data from station 1, station 2, and both stations showed that Turbidity, TA, NO2, EC, pH, SO4, and HCO3 were selected as input parameters in all three scenarios, whereas magnesium, sodium, and TH were not selected as inputs in any scenario.
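The counts and overlaps discussed above can be checked directly from the three BOD masks in Table 2. The sketch below works with the 1-based variable numbers of Table 1 rather than names; the identifiers (`BOD_MASKS`, `selected_numbers`, and the result names) are hypothetical.

```python
# BOD embedding masks as listed in Table 2.
BOD_MASKS = {
    "station 1": "1110101011111111000",
    "station 2": "1011011101111100000",
    "both stations": "1011001110111100000",
}


def selected_numbers(mask: str) -> set[int]:
    """Return the 1-based Table 1 numbers of the variables flagged with 1."""
    return {position for position, bit in enumerate(mask, start=1) if bit == "1"}


selections = {name: selected_numbers(mask) for name, mask in BOD_MASKS.items()}
input_counts = {name: len(numbers) for name, numbers in selections.items()}
in_every_scenario = set.intersection(*selections.values())
in_no_scenario = set(range(1, 20)) - set.union(*selections.values())
```

With these masks, the numbers left unselected in every scenario are 17, 18, and 19, which correspond to Mg, Na, and TH in Table 1.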

Table 2: The best selective masks and their performance criteria for BOD

| Parameters | Station 1 | Station 2 | Both stations |
|---|---|---|---|
| Embedding | 1110101011111111000 | 1011011101111100000 | 1011001110111100000 |
| Gamma statistic | 0.0242 | 0.0001 | 0.0550 |
| Gradient | 0.0700 | 0.0843 | 0.0963 |
| Standard error | 0.0692 | 0.0239 | 0.0359 |
| V ratio | 0.0970 | 0.0005 | 0.2200 |

Table 3: The best selective masks and their performance criteria for COD

| Parameters | Station 1 | Station 2 | Both stations |
|---|---|---|---|
| Embedding | 0111010010100000100 | 1111110100011010000 | 0011110010000101000 |
| Gamma statistic | 0.0001 | 0.0001 | 0.0379 |
| Gradient | 0.1377 | 0.0902 | 0.1394 |
| Standard error | 0.0264 | 0.0349 | 0.0350 |
| V ratio | 0.0001 | 0.0001 | 0.1518 |

Table 4: The best selective masks and their performance criteria for DO

| Parameters | Station 1 | Station 2 | Both stations |
|---|---|---|---|
| Embedding | 1111100000001001000 | 1001000111101010000 | 1101000111101000000 |
| Gamma statistic | 0.0001 | 0.0145 | 0.0440 |
| Gradient | 0.1642 | 0.1985 | 0.1782 |
| Standard error | 0.0516 | 0.0609 | 0.0302 |
| V ratio | 0.0001 | 0.0582 | 0.1761 |

The results of the gamma test for COD estimation are shown in Table 3. Applying the gamma test to the data from station 1 showed that the best parameters for COD simulation are SS, TA, PO4, Temperature, NH4, TDS, and Mg, whereas the results for station 2 indicated that Turbidity, SS, TA, PO4, Temperature, NO3, NH4, pH, SO4, and Cl are the best input parameters for COD simulation. Furthermore, aggregating the data from stations 1 and 2 for the gamma test showed that TA, PO4, Temperature, NO3, Total coliform bacteria, HCO3, and Ca should be selected as inputs for COD simulation (Table 3). Moreover, Table 4 shows the different inputs for DO simulation obtained by applying the gamma test under the three scenarios.

Table 5: Results of ANN to predict BOD, COD and DO

| Variable | Station | Stage | RMSE (mg/L) | MAE (mg/L) | Correlation coefficient |
|---|---|---|---|---|---|
| BOD | St. 1 (a) | Training | 0.0090 | 0.0411 | 0.89 |
| BOD | St. 1 (a) | Testing | 0.0112 | 0.0395 | 0.91 |
| BOD | St. 2 (b) | Training | 0.0084 | 0.0452 | 0.88 |
| BOD | St. 2 (b) | Testing | 0.0142 | 0.0421 | 0.84 |
| BOD | Both stations | Training | 0.0093 | 0.0357 | 0.81 |
| BOD | Both stations | Testing | 0.0132 | 0.0574 | 0.80 |
| COD | St. 1 | Training | 0.0112 | 0.0573 | 0.89 |
| COD | St. 1 | Testing | 0.0089 | 0.0596 | 0.85 |
| COD | St. 2 | Training | 0.0183 | 0.0695 | 0.89 |
| COD | St. 2 | Testing | 0.0297 | 0.0609 | 0.87 |
| COD | Both stations | Training | 0.0401 | 0.0984 | 0.81 |
| COD | Both stations | Testing | 0.0114 | 0.0594 | 0.79 |
| DO | St. 1 | Training | 0.0085 | 0.0325 | 0.82 |
| DO | St. 1 | Testing | 0.0086 | 0.0338 | 0.81 |
| DO | St. 2 | Training | 0.0184 | 0.0500 | 0.84 |
| DO | St. 2 | Testing | 0.0528 | 0.1606 | 0.86 |
| DO | Both stations | Training | 0.0086 | 0.0626 | 0.82 |
| DO | Both stations | Testing | 0.0125 | 0.0770 | 0.84 |

(a) St. 1: station 1. (b) St. 2: station 2.

Based on the gamma test analysis, it can be inferred that SS, TA, and temperature are the common parameters for COD simulation in all three scenarios, while phosphate was the only common parameter for DO simulation under the three scenarios. In a study on the Karun River, Emamgholizadeh et al. (2014) investigated the sensitivity of the MLP model to its input parameters by omitting them one by one. Although they used fewer parameters, they reported similar results regarding the small impact of parameters such as Ca and Mg in predicting BOD, COD, and DO.

Phosphate and turbidity were used in most scenarios to predict BOD, COD, and DO, which indicates their effectiveness in determining these parameters. These results agree with the findings of Emamgholizadeh et al. (2014) and Singh et al. (2009). Phosphate plays an important role in oxidation as well as in the energy-release process, and an increase in phosphate increases the number of microorganisms (Singh et al., 2009). Turbidity is an important parameter in determining the self-purification capacity and the amount of dissolved oxygen in the river (Talib and Amat, 2012).

Therefore, it plays an important role in simulating the water quality of the Karun River.

Fig. 3: Scatter plots of observed and predicted BOD using the data of station 1 (top panel), station 2 (middle panel), and both stations (bottom panel): a) training and b) testing.

The ANN results for the simulation of BOD, COD, and DO are shown in Table 5. The RMSE, MAE, and correlation coefficient statistics were used to compare the simulation results with the observed data. The RMSE and MAE values show that the neural network could predict these parameters well. These results are consistent with those of Emamgholizadeh et al. (2014); however, the present study improved the RMSE and MAE. In the present study, two phases, "training" and "testing", were employed.
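The three comparison statistics named above can be computed from paired observed and predicted values as in the following sketch. The function names are hypothetical, the correlation is taken to be the Pearson coefficient (an assumption, since the text does not specify the variant), and the sample values are illustrative only and are not results from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def correlation(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    o_var = sum((o - o_mean) ** 2 for o in observed)
    p_var = sum((p - p_mean) ** 2 for p in predicted)
    return cov / math.sqrt(o_var * p_var)


# Illustrative values only (not results from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]
sample_rmse = rmse(observed, predicted)
sample_mae = mae(observed, predicted)
sample_r = correlation(observed, predicted)
```

Computing the statistics separately for the training and testing subsets, as in Table 5, shows whether a network that fits the training data also generalizes to unseen records.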

Comparing the corresponding results of each phase shows that the current networks have sufficient accuracy for simulating the desired parameters. The RMSE and MAE values given in Table 5 indicate that the ANNs for simulating BOD perform better than the other ANNs under the three scenarios. However, all the ANNs perform acceptably in simulating BOD, COD, and DO in the training and testing phases. Figures 3 to 5 show how well the predicted values of BOD, COD, and DO match the measured values