Flood frequency analysis using synthetic samples

Orsini-Zegada, Luis; Escalante-Sandoval, Carlos

doi:10.20937/ATM.2016.29.04.02

Información del artículo

Resumen

Texto completo

Bibliografía

Descargar PDF

Estadísticas

Figuras (12)

Mostrar másMostrar menos

Tablas (2)

Table I. Selected characteristics of the annual peak flow series.

Table II. CPFs applied in the conventional FFA for this study.

Mostrar másMostrar menos

Resumen

El flujo de diseño es la base para la planeación y el diseño de obras hidráulicas. La precisión en el cálculo de flujos es importante para el análisis de viabilidad de dichas estructuras porque el valor estimado influye directamente en la evaluación de los efectos de falla. Sin embargo, en razón de la variabilidad, la precisión del cálculo se reduce de manera drástica cuando se utilizan muestras pequeñas en el análisis de frecuencia de inundaciones (FFA, por sus siglas en inglés) convencional. En este trabajo se plantea un nuevo enfoque basado en la simulación combinada de flujos anuales máximos y medios. El método se evaluó tomando en consideración submuestras de 10, 20, 30, 40 y 50 años obtenidas a partir de 13 estaciones pluviométricas ubicadas en la cuenca del río Susquehanna. Los resultados se compararon con los obtenidos mediante FFA y el análisis regional de estaciones-año. Este enfoque novedoso puede reducir la incertidumbre en las estimaciones del flujo de diseño cuando los datos asequibles son escasos.

Abstract

The design flow is the basis for planning and designing different hydraulic works. The precision in estimated flows is important when analyzing the feasibility of such structures because the value directly influences the evaluation of the failure effects. However, due to flow variability, the precision of the estimate is drastically reduced when small samples are used in a conventional flood frequency analysis (FFA). This paper proposes a new approach based on a combined simulation of the annual peak and mean flows. The method was evaluated by considering 10-, 20-, 30-, 40- and 50-yr subsamples obtained from 13 gauging stations located in the Susquehanna River basin. The results were compared with those obtained by FFA and the regional station-year method. This new approach can reduce the uncertainty in estimating the design flow when few data are available.

Keywords:

Flood frequency analysis

small samples

synthetic samples

uncertainty

Texto completo

1Introduction

Flood frequency analysis (FFA) is the basis for planning and designing bridges, culverts and flood control structures (Chow et al., 1998). The maximum capacity of these structures is defined by the design flow, which is the annual peak flow (APF) with a certain probability of being exceeded at least once during operation. This probability is known as risk and is usually expressed as a return period. Furthermore, this probability is selected by considering the economic, social and environmental effects that would be produced by the failure of the structure. Therefore, precise design flow estimates are important when evaluating the feasibility of a structure. However, due to stream flow variability, the precision of the estimates is drastically reduced when small samples are used in a conventional FFA.

Conventional estimates of the design flow are achieved via a frequency analysis of the APFs measured over a long period at a single gauging station. Assuming that the APFs are independent and identically distributed (IID), a relation between their magnitudes and non-exceedance probabilities can be achieved by fitting a cumulative probability function (CPF).

A regional flood frequency analysis (RFFA) is a well-known option for reducing the uncertainty in quantile estimates when sufficient and homogeneous data are available. A RFFA requires all available information from neighboring sites to obtain at-site estimates for a specific return period. Several previous papers have described the advantages of RFFA methods. Darlymple (1960) first introduced the index flood method. Chander et al. (1978) suggested a regional box-cox transformation. Wallis (1980) proposed the use of regional probability weighted moments. Boes et al. (1989) considered the Weibull distribution as a regional distribution. Moreover, Hosking and Wallis (1993) introduced select discordancy, heterogeneity and fitting measures. Cunnane (1998) suggested using the station-year method, whereas Sveinsson et al. (2001) proposed the population index flood method. The method introduced by Hosking and Wallis (1993) appears to display more acceptability in RFFA; its application can be found in Lim (2007), Saf (2009), Notto and Loggia (2009), Hussain (2011), and Rostami (2013).

Other options to reduce the uncertainty in estimating quantiles are “transfer methods”. Zaidman et al. (2003) suggested two methods to transfer information from a donor basin to an object basin. In the first method, the standardized shape of the donor CPF is transferred. In the second method, the plotting positions of the events, occurring over the same period of time (year), are transferred. Dong et al. (2013) introduced a non-parametrical transfer method that uses an iterative procedure to gradually approximate the optimal CPF.

This paper proposes a new approach to reduce the uncertainty in the estimation of APF quantiles resulting from small sample sizes, and to dispense with neighboring information. This approach consists of simulating multiple APF samples to achieve a more accurate frequency distribution. To improve the accuracy, these synthetic samples must be conditionally simulated from a steadier variable. In this paper, APF samples are conditionally simulated using the annual mean flows (AMFs).

Hydrometric information from 13 gauging stations located in the Susquehanna River basin was employed in this study. APF quantiles were estimated according to the following methods: (1) a conventional FFA method, (2) a station-year method, and (3) the proposed method. The uncertainty of each method was measured by computing the coefficients of variation (CV) between the quantiles estimated from 10-, 20-, 30-, 40- and 50-year subsamples with those obtained from historical information.

2Materials and methods2.1Study region

The Susquehanna River is located in the northeastern United States. With a length of 715 km, the Susquehanna is one of the longest rivers along the east coast. It has a normal flow of approximately 15 926 million m3/yr (monitored at Havre de Grace in Maryland) (SRBC, 2015).

The Susquehanna River basin has an area of 71 244 km2 and is divided into six major sub-basins: lower Susquehanna, middle Susquehanna, upper Susquehanna, Juniata, west branch Susquehanna, and Chemung. This basin belongs to the hydrological region II or Middle Atlantic and is one of the most flood-prone areas in the United States (SRBC, 2015).

2.2Data

The stream flow time series used in this study were obtained from the National Water Information System of the United States (NWIS, 2015); 13 gauging stations in the Susquehanna River basin were selected. The locations of the study sites are shown in Figure 1.

Fig. 1.

Location of the gauging stations used in this study.

(0.15MB).

To evaluate the independence of the series, the autocorrelation functions (ACFs) were contrasted with the limits proposed by Anderson (1942). To identify possible non- homogeneities, such as change points or trends in the time series, the Pettitt (1979) and Mann-Kendall tests (Kendall, 1938;Mann, 1945) were performed. A brief description of the series and the results are summarized in Table I.

Table I.

Selected characteristics of the annual peak flow series.

Site	Length [years]	Coefficient of			Auto-correlation			Pettitt [p-value]	Mann-Kendall [p-value]
Site	Length [years]	Variation [ad]	Skew [ad]	Kurtosis [ad]	Lag-1 [ad]	Lag-2 [ad]	Lag-3 [ad]	Pettitt [p-value]	Mann-Kendall [p-value]
1503000	100	0.36	1.12	4.72	0.05	0.04	-0.01	0.05	0.09
1512500	100	0.45	3.32	21.40	0.14	-0.02	-0.04	0.04	0.06
1531500	100	0.40	1.44	6.68	-0.09	-0.11	0.06	0.20	0.23
1534000	99	0.49	0.91	4.05	-0.04	0.12	0.01	0.55	0.32
1536500	114	0.40	1.27	5.53	-0.13	-0.05	0.03	0.65	0.46
1540500	108	0.39	1.41	6.01	-0.10	-0.07	0.06	0.38	0.36
1541000	100	0.51	2.55	12.21	-0.14	0.04	-0.13	0.48	0.24
1543000	100	0.69	2.30	10.60	-0.08	-0.11	-0.05	0.35	0.34
1545500	104	0.56	2.46	11.77	-0.07	-0.02	0.06	0.00	0.00
1550000	99	0.66	1.99	7.94	-0.05	0.03	0.00	0.13	0.04
1551500	118	0.45	1.35	5.60	-0.14	-0.04	0.06	0.03	0.01
1562000	100	0.62	3.24	19.54	0.05	-0.04	-0.14	0.29	0.39
1570500	123	0.44	2.31	11.98	-0.16	-0.07	0.05	0.24	0.08

2.3Conventional FFA method

Let X represent the APFs and X = {xl,...,xn} be a sample of X at any site in the study region. Then, a conventional FFA method for APF quantile estimation can be defined as follows:

Step 1: Sort in ascending order X, i.e., X = {x(1) ≤…≤ x(n)}.

Step 2: Obtain the empirical non-exceedance probability of each x(1) in X, i.e., Pr[X ≤ x(n)], using the Weibull's plotting position formula:

Step 3: Estimate a theoretical CPF of X, i.e., Pr[X ≤ x] (from Table II), by minimizing the sum square error (SSE) of the sample quantiles:

Table II.

CPFs applied in the conventional FFA for this study.

Distribution	CPF	Restrictions
Log-normal	FXx;θ=∫ξx1t−ξσ2πexp−12lnt−ξ−μσ2dt	x≥ξσ>0
Pearson III	FXx;θ=∫ξx1μσΓσt−ξμσ−1exp−t−ξμdt	x≥ξμ>0;σ>0
Log-Pearson III	FXx;θ=∫ξx1tμσΓσlnt−ξμσ−1exp−lnt−ξμdt	lnx≥ξμ>0;σ>0
Weibull	FXx;θ=1−exp−x−ξμσ where ξ, μ, σ are the location, scale and shape parameters	x≥ξμ>0;σ>0

where xpi=infx∈ℝ:pi≤Fx;θ.

Step 4: Select (as the best CPF of X) the CPF with the smallest standard error of fit (SEF) (Kite, 1988) using the following relationship:

where k is the number of distribution parameters.

Step 5: Estimate the quantiles of X as xq=infx∈ℝ:q≤Fx;θ.

2.4Station-year method

Let X represent the APFs and Xj=x1j,...,xnjj (for j =1,.., m) correspond to samples of X at m sites from any homogeneous region inside the study region. Then, a station-year method for APF quantile estimation can be defined as follows:

Step 1:
Standardize each Xj in the homogeneous region, i.e., yij=xij/x¯j, where x¯j is the sample mean of Xj. Therefore, the m series Yj=y1j,...,ynji can be defined.
Step 2:
Join all Yj at the homogeneous region in one station-year series Y = {Y1 | ... | Ym}.
Step 3:
Apply the conventional FFA to series Y; find the best CPF of Y, i.e., F(y;θ).
Step 4:
Estimate the quantiles of Y as yq=infy∈ℝ:q≤Fy;θ.
Step 5:
Estimate the quantiles of X as xjq=yq·x¯j.

2.5Proposed method

Let Y represent the AMFs, Y = {y1,..., yn} be a sample of Y at any site in the study region, X represent the APFs and X = {x1,..., xn} be a sample of X at the same site over a consistent recording period. The proposed method for APF quantile estimation utilizes the following steps:

Step 1:
Sort Yin ascending order, i.e., Y=y1≤···≤yn, maintaining each xt occurring at the same time t as y(i). Thus, an empirical relation y1,xt1,···,yn,xtncan be defined.
Step 2:
Obtain the APF ratios Θ {θ1,..., θn} by dividing each xt by its corresponding y(i), i.e.;θi=xti/yi. Thus, an empirical relation {(y(1)), θ1),..., (y(n), θn} can also be defined.
Step 3:
Apply the conventional FFA to Y to determine the best CPF of Y, i.e., F(y;θ).
Step 4:
Generate 100 000 random synthetic AMF defined as yu=infy∈ℝ:u≤Fy;θ by sampling u from a continuous uniform distribution, i.e. u~U[0,1].
Step 5:
For each generated y(u), find its closest y(i) in Y and its corresponding θi in Θ. Then using a window of size h = [n2/3] centered on y(i), define Φ = {φ1,...φm} as a subsample of Θ, where φ1=θmax1,i+h and φm=θmaxn,i+h (Fig. 2). The specific window size was obtained after a trial and error process. It was observed that a window of this width provides a reasonable balance between flexibility and precision of the results.

Fig. 2.
Window centered on the closest generated AMF.
(0.04MB).
Step 6:
Assume that every φj in Φ are equally likely to occur, i.e., Prφ=φj=1/m, and extract (from each Φ) a φ value, such that φυ=supφj∈Φ;υ≤j/m, by sampling υ from a continuous uniform distribution, i.e., υ~U[0,1].
Step 7:
Multiply each generated y(u) by the extracted φ(υ), i.e., x(y) = x(u) · φ(υ).
Step 8:
Estimate the quantiles of X as the q-th percentiles of the generated x(y).

3Reliability of the methods

To evaluate the uncertainty in the estimated quantiles determined using the former methods, historical quantiles were first estimated from historical information. Then, different scenarios of available information were simulated by extracting 10-, 20-, 30-, 40- and 50-yr subsamples, and new quantiles were estimated. Finally, the coefficient of variation of the quantiles estimated from the 10-, 20-, 30-, 40- and 50-yr subsamples were computed using the following expression:

where x¯mjq and Smjq are the mean and standard deviation of the q-th quantiles xi,mjq estimated from each subsample i of size m at site j, respectively, and xj (q) are the q-th quantiles estimated from the historical information available from the same site j.

3.1Reliability of the conventional FFA method

Let X represent the APFs and Xj=x1j,...,xnjj(for j = 1,...,13) correspond to the samples at the 13 sites in the study region. The uncertainty of the conventional FFA method was computed through the following steps:

Step 1:
Apply the conventional FFA method to each Xj for all 13 study sites to estimate the historical quantiles xj (q) for different non-exceedance probabilities q.
Step 2:
From each Xj extract i = 1,..., nj – m + 1 subsamples with size m equal to 10, 20, 30, 40 and 50 years for each site, i.e., Xi,mj=xij,...,xi+m−1j.
Step 3:
Apply the conventional FFA method to each Xi,mj for all nj - m + 1 subsamples from the 13 study sites to estimate the quantiles xi,mjq for the identical non-exceedance probabilities q (in Step 1).
Step 4:
Compute the CV of each xi,mjq for all nj - m + 1 subsamples with respect to each xj (q) for all 13 study sites, using Eq. (4).

3.2Reliability of the station-year method

Let X represent the APFs and Xj=x1j,...,xnjj (for j = 1,...,13) correspond to the samples from the 13 sites in the study region. The uncertainty of the station-year method was computed through the following steps:

Step 1:
Apply the conventional FFA method to each Xj, for all 13 study sites to estimate the historical quantiles xj (q) for different non-exceedance probabilities q.
Step 2:
From each Xj extract i = 1,..., nj - m + 1 subsamples with size m equal to 10, 20, 30, 40 and 50 years for each site, i.e., Xi,mj=xij,...,xi+m−1j.
Step 3:
Randomly select 500 combinations of three subsamples Xi,mj with equal size m thus, 500 different regional information scenarios were simulated.
Step 4:
Apply the station-year method on each combination (in Step 3) to estimate the quantiles xi,mjq for the identical non-exceedance probabilities q (in Step 1).
Step 5:
Compute the CV of each xi,mjq, which was estimated before (in Step 4), with respect to each xj (q) for all 13 study sites using Eq. (4).

3.3Reliability of the proposed method

Let X represent the APFs and Xj=x1j,...,xnjj (for j = 1,...,13) correspond to the samples from the 13 sites in the study region. The uncertainty of the conventional FFA method was computed through the following steps:

Step 1:
Apply the conventional FFA method to each Xj for all 13 study sites to estimate the historical quantiles xj (q) for different non-exceedance probabilities.
Step 2:
From each Xj extract i = 1,...,nj - m + 1 subsamples with size m equal to 10, 20, 30, 40 and 50 years for all sites, i.e., Xi,mj=xij,...,xi+m−1j.
Step 3:
Apply the proposed method to each Xi,mj for all nj - m + 1 subsamples from the 13 study sites to estimate the quantiles xi,mjq for the identical non-exceedance probabilities q (in Step 1).
Step 4:
Compute the CV of each xi,mjq for all nj - m + 1 subsamples with respect to each xj (q) for all 13 study sites using Eq. (4).

4Results and discussion

The CV values for each subsample size (10, 20, 30, 40 and 50 years) by applying (1) the conventional FFA method, (2) the station-year method and (3) the proposed method were computed and contrasted.

Concerning all 13 study sites, approximately 200 CV values were computed for different q-th quantiles from each subsample size. Therefore, nearly 1000 CV values for each return period were computed for each method.

The variations in the set of estimated quantiles for the shortest length of records (i.e., 10 years) are shown in Figures 3-7. In these figures, the 50th percentile is the median of the estimates, and the 10th and 90th percentiles are considered as lower and upper bounds, respectively. These figures show that the proposed method generates the narrowest limits compared with those obtained using the conventional FFA approach.

Fig. 3.

Quantiles obtained from 10-yr subsamples at stations 1503000, 1512500 and 1531500.

(0.31MB).

Quantiles obtained from 10-yr subsamples at stations 1503400, 1536500 and 1540500.

Fig. 4.

Quantiles obtained from 10-yr subsamples at stations 1503400, 1536500 and 1540500.

Quantiles obtained from 10-yr subsamples at stations 1541000, 1543000 and 1545500.

Fig. 5.

Quantiles obtained from 10-yr subsamples at stations 1541000, 1543000 and 1545500.

Fig. 6.

Quantiles obtained from 10-yr subsamples at stations 1550000, 1551500 and 1552000.

(0.35MB).

Fig. 7.

Quantiles obtained from 10-yr subsamples at stations 1541000, 1543000 and 1545500.

(0.11MB).

In general, lower CV values were obtained in 71% of the cases using the proposed method instead of the conventional FFA method. Lower CV values were obtained for a return of approximately two years in 33% of the cases, whereas 99% of the cases showed lower CV values for return periods exceeding 100 years.

Furthermore, in 67% of the cases, lower CV values were obtained using the proposed method instead of the station-year method. In 50% of the cases, lower CV values were obtained for a return period of approximately two years, whereas lower CV values were found for return periods exceeding 100 years in 93% of the cases.

The geometric means of all CV values for the same return period obtained from (1) the conventional FFA, (2) the station-year method and (3) the proposed method are contrasted in Figures 8-12.

Fig. 8.

CV obtained using 10-yr subsamples.

(0.07MB).

Fig. 9.

CV obtained using 20-yr subsamples.

(0.07MB).

Fig. 10.

CV obtained using 30-yr subsamples.

(0.07MB).

Fig. 11.

CV obtained using 40-year subsamples.

(0.07MB).

Fig. 12.

CV obtained using 50-yr subsamples.

(0.07MB).

5Conclusions

A new approach for estimating APFs for different return periods is presented in this study. This approach consists of a conditional simulation process of synthetic samples of APFs and AMFs to achieve the frequency distribution. Thirteen gauging stations located in the Susquehanna River basin, which is along the east coast of the United States, were used in this study.

To evaluate the proposed method, the uncertainties in the quantiles estimated using (1) a conventional FFA, (2) a station-year method and (3) the newly proposed method were compared by computing.

The results indicated that the quantiles estimated using the proposed method varied less than those estimated via the conventional FFA, especially when they were estimated from 10-yr subsamples and for return periods exceeding 100 years. Therefore, the proposed method can reasonably reduce the uncertainty in quantile estimation from small sample sizes.

The results also showed that the quantiles estimated using the proposed method are equal or less than those computed by the station-year method, even if only a third of the information is used.

The analysis also demonstrated that the proposed method performed adequately when quantiles were estimated from the gathered samples. Moreover, more flexible frequency distributions were simulated using the proposed method than with the conventional frequency distributions.

References

[Anderson, 1942]

R.L. Anderson.

Distribution of the serial correlation coefficients.

Ann. Math. Stat., 13 (1942), pp. 1-13

doi:10.1214/aoms/1177731638

[Boes et al., 1989]

D.C. Boes, H. Jun, J.D. Salas.

Regional flood quantile estimation for Weibull model.

Water. Resour. Res., 25 (1989), pp. 979-990

doi:10.1029/WR025i005p00979

[Chander et al., 1978]

S. Chander, S.K. Spolia, A. Krumar.

Flood frequency analysis by power transformation.

J. Hydraul. Eng.-ASCE, 104 (1978), pp. 1495-1504

[Chow et al., 1988]

V.T. Chow, D.R. Maidment, L.W. Mays.

Applied hydrology.

McGraw-Hill, (1988), pp. 572

[Cunnane, 1998]

C. Cunnane.

Methods and merits of regional flood frequency analysis.

J. Hydrol., 100 (1998), pp. 269-290

doi:10.1016/0022-1694(88)90188-6

[Darlymple, 1960]

T. Darlymple.

Flood frequency methods.

U.S. Geological Survey Water-Supply paper, 1543 (1960), pp. 11-51

[Dong et al., 2013]

J. Dong, Y. Diao, G. Wang.

Flood frequency analysis transformed model for small sample.

Adv. Mat. Res., 610–613 (2013), pp. 2635-2639

doi:10.4028/www.scientific.net/AMR.610-613.2635

[Hosking and Wallis, 1993]

J.M.R. Hosking, J. Wallis.

Some statistics useful in regional frequency analysis.

Water. Resour. Res., 29 (1993), pp. 271-282

doi:10.1029/92WR01980

[Hussain, 2011]

Z. Hussain.

Application of the regional flood frequency analysis to the upper and lower basins of the Indus River, Pakistan.

Water. Resour. Manag., 25 (2011), pp. 2797-2822

doi:10.1007/s11269-011-9839-5

[Kendall, 1938]

M. Kendall.

A new measure of rank correlation.

Biometrika, 30 (1938), pp. 81-89

doi:10.2307/2332226

[Kite, 1988]

G.W. Kite.

Frequency and risk analyses in hydrology.

Water Resources Publications, (1988), pp. 187

[Lim, 2007]

Y. Lim.

Regional flood frequency analysis of the Red River basin using L-moments approach.

In: World Environmental and Water Resources Congress 2007: Restoring Our Natural Habitat, pp. 1-10

[Mann, 1945]

H.B. Mann.

Non-parametric tests against trend.

Econometrica, 13 (1945), pp. 163-171

doi:10.2307/1907187

[Noto and La Loggia, 2009]

L.V. Noto, G. La Loggia.

Use of L-moments Approach for Regional Flood Frequency Analysis in Sicily, Italy, Water.

Resour. Manag., 23 (2009), pp. 2207-2222

doi:10.1007/s11269-008-9378-x

[NWIS, 2015]

NWIS, 2015. National water information system. Available at: waterdata.usgs.gov/nwis (last accessed on January 2015).

[Pettitt, 1979]

A.N. Pettitt.

A Non-Parametric Approach to the Change Point Problem.

J. R. Stat. Soc. Ser. C., 28 (1979), pp. 126-135

doi:10.2307/2346729

[Rostami, 2013]

R. Rostami.

Regional Flood Frequency Analysis based on L-moment Approach (Case Study West Azarbayjan Basins).

J. Civil. Eng. Urb., 3 (2013), pp. 107-113

[Saf, 2009]

B. Saf.

Regional Flood Frequency Analysis Using L-moments for the West Mediterranean Region of Turkey, Water.

Resour. Manag., 23 (2009), pp. 531-551

doi:10.1007/s11269-008-9287-z,

[SRBC, 2015]

SRBC, 2015. Susquehanna River Basin Commission. Available at: http://www.srbc.net (last accessed on January 2015).

[Sveinsson et al., 2001]

O.G.B. Sveinsson, D.C. Boes, J.D. Salas.

Population Index Flood Method for Regional Frequency Analysis.

Water. Resour. Res., 37 (2001), pp. 2733-2748

doi:10.1029/2001WR000321

[Wallis, 1980]

J.R. Wallis.

Risk and Uncertainties in the Evaluation of Flood Events for the Design of Hydrological Structures.

In: Seminar on Hydrological Extreme Events-Flood and Droughts, pp. 33

[Zaidman et al., 2003]

M.D. Zaidman, V. Keller, A.R. Young, A. Wall.

Adapting Low-Flow Frequency Analysis for Use with Short-Period Record.

The Journal., 17 (2003), pp. 73-79

doi:10.1111/j.1747-6593.2003.tb00437.x

Indexada en:

Síguenos:

Indexada en:

Síguenos:

Suscríbase a la newsletter