Comput Stat
https://doi.org/10.1007/s00180-018-0809-8

ORIGINAL PAPER

Rank-based Liu regression

Mohammad Arashi¹ · Mina Norouzirad¹ · S. Ejaz Ahmed² · Bahadır Yüzbaşı³

Received: 14 June 2017 / Accepted: 23 March 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

¹ Department of Statistics, Faculty of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran
² Department of Mathematics and Statistics, Brock University, St. Catharines, Canada
³ Department of Econometrics, Inonu University, Malatya, Turkey

Correspondence: Bahadır Yüzbaşı, [email protected]
Abstract Due to the complicated mathematical and nonlinear nature of the ridge regression estimator, the Liu (Linear-Unified) estimator has received much attention as a useful alternative for overcoming the weakness of the least squares estimator in the presence of multicollinearity. In situations where the errors of the linear model are far from normal or the data contain outliers, the construction of the Liu estimator can be revisited using a rank-based score test, in the spirit of robust regression. In this paper, we define Liu-type rank-based and restricted Liu-type rank-based estimators when a sub-space restriction on the parameter of interest holds. Accordingly, some improved estimators are defined and their asymptotic distributional properties are investigated. Conditions on the biasing parameter under which the proposed estimators are superior are given. Some numerical computations support the findings of the paper.

Keywords Liu estimator · Multicollinearity · Preliminary test · Rank-based estimator · Ridge regression · Shrinkage estimator
1 Introduction

Consider the sample $\{(X_i, y_i)\}_{i=1}^{n}$ following the linear model
$$ y_i = X_i^\top \beta + \epsilon_i, \qquad (1.1) $$
where for i = 1, …, n, y_i denotes the ith response variable, X_i is a p-vector of explanatory variables, β is the vector of unknown regression coefficients, and ε_i is independent of X_i, having a continuous cumulative distribution function (c.d.f.) F(·) and finite Fisher information I(f), where
$$ I(f) = \int_{\mathbb{R}} \left( \frac{f'(x)}{f(x)} \right)^2 dF(x) < \infty, \qquad f(x) = \frac{dF(x)}{dx}, \quad f'(x) = \frac{d^2 F(x)}{dx^2}. $$
The problem of our study is to test the following hypotheses in the presence of multicollinearity:
$$ H_o: \mathbf{H}\beta = h \quad \text{vs.} \quad H_A: \mathbf{H}\beta \neq h, \qquad (1.2) $$
where H is a q × p (q < p) matrix and h is a q-vector of known constants. The sub-space restriction Hβ = h may be (a) a fact known from theoretical or experimental considerations, (b) a hypothesis that may have to be tested, or (c) an artificially imposed condition to reduce or eliminate redundancy in the description of the model (see Sengupta and Jammalamadaka 2003 for details).

From a general viewpoint, the literature dealing with hypotheses of type (1.2) is growing. We refer to Kibria (2004), Roozbeh and Arashi (2013), Tabatabaey et al. (2004) and Xu and Yang (2012), to mention a few. For shrinkage estimation strategies, the book of Ahmed (2014) is an excellent source. Very recently, Yüzbaşı et al. (2017a) proposed ridge-type shrinkage estimators, and to handle autoregressive errors, Yüzbaşı et al. (2017b) proposed quantile shrinkage estimators. In all the aforementioned references, the proposed methods can be regarded as variants of ridge/Liu-type estimates in which the original observations y_i are used. Hence, their statistical properties, designed to perform well under the normality assumption, can be (highly) affected when the errors are far from normal or the data contain outliers. In this respect, Saleh and Kibria (2011) proposed rank-based ridge regression and exhibited preliminary test and shrinkage rank-based (R) estimators for the regression coefficient β. We consider a slightly different approach for Liu-type regression and study the effect of the biasing parameter in Liu-type R-estimates.

For our purpose, in Sect. 2 we give some preliminaries about R-estimates. Section 3 is devoted to the proposal of different Liu-type R-estimates of β, while some penalized estimation strategies are given in Sect. 4 for the sake of comparison. The asymptotic characteristics of the proposed Liu-type estimators are derived in Sect. 5. Section 6 contains some numerical illustrations, and the paper is concluded in Sect. 7.
2 Preliminaries

For testing $H_o: \beta = 0$, the rank-based score test (see Hettmansperger and McKean 1998, Ch. 3) is given by
$$ \big(a(R(y))\big)^\top \mathbf{H}\, \big(a(R(y))\big), \qquad y = (y_1, \ldots, y_n)^\top, $$
where $a(R(y)) = (a(R(y_1)), \ldots, a(R(y_n)))^\top$, $R(y_i)$ is the rank of $y_i$, i = 1, …, n, and $a(1) \le a(2) \le \cdots \le a(n)$ is a set of scores generated as $a(i) = \psi(i/(n+1))$ for some square-integrable and non-decreasing score function ψ(u) defined on the unit interval, satisfying
$$ \int_0^1 \psi(u)\, du = 0 \quad \text{and} \quad \int_0^1 \psi^2(u)\, du = 1, $$
and $\mathbf{H} = \mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top$ is the projection matrix onto the space $\mathcal{F}$, the column space spanned by the columns of $\mathbf{X} = (X_1, \ldots, X_n)^\top$. For the construction of the rank-based score test for (1.2), we need the R-estimator of β. According to Hettmansperger and McKean (1998), the R-estimate of β is given by
$$ \tilde{\beta}_\psi = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top \tilde{y}_\psi, \qquad (2.1) $$
where $\tilde{y}_\psi$ is the minimizer of $D_\psi(\eta) = \|y - \eta\|_\psi$ over $\eta \in \mathcal{F}$, with $\|v\|_\psi = \sum_{i=1}^n a(R(v_i)) v_i$. Thus, $\tilde{\beta}_\psi$ is the solution to the rank-normal equations $\mathbf{X}^\top a(R(y - \mathbf{X}\beta)) = 0$, and $D_\psi(\tilde{\beta}_\psi) = \|y - \tilde{y}_\psi\|_\psi$. Denote the ith diagonal element of $\mathbf{H}$ by $h_{iin}$, and let $\mathbf{C}$ be a nonnegative definite matrix. The following conditions/assumptions are required:

(A1) f is absolutely continuous, with 0 < I(f) < ∞.
(A2) $\lim_{n\to\infty} \max_{1\le i\le n} h_{iin} = 0$.
(A3) $\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^n X_i X_i^\top = \lim_{n\to\infty} \mathbf{C}_n = \mathbf{C}$, $\mathbf{C}_n = \frac{1}{n}\mathbf{X}^\top\mathbf{X}$.

According to Lemma 3.4.1 of Hettmansperger and McKean (1998), Huber's condition (A2) implies the general Noether condition. Under these assumptions, from Puri and Sen (1986), we have the following fundamental asymptotic result, which initiates the construction of the test statistic for (1.2):
$$ \sqrt{n}\,(\tilde{\beta}_\psi - \beta) \xrightarrow{D} N_p(0, \tau_\psi^2 \mathbf{C}^{-1}), $$
where $\xrightarrow{D}$ denotes convergence in distribution as $n \to \infty$,
$$ \tau_\psi^2 = A_\psi^2 \left( \int \psi(u)\,\psi_f(u)\, du \right)^{-2}, \qquad A_\psi^2 = \int \left( \psi(u) - \int \psi(v)\, dv \right)^2 du, $$
and
$$ \psi_f(u) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))}. $$
Thus, by the utilities provided in Saleh (2006), the rank-based score test for $H_o: \mathbf{H}\beta = h$ is given by
$$ T_n = \frac{n\, (\mathbf{H}\tilde{\beta}_\psi - h)^\top \mathbf{G}_n^{-1} (\mathbf{H}\tilde{\beta}_\psi - h)}{\tau_\psi^2}, $$
where $\mathbf{G}_n = \mathbf{H}\mathbf{C}_n^{-1}\mathbf{H}^\top$. Under the null hypothesis, the restricted R-estimator is defined as
$$ \hat{\beta}_\psi = \tilde{\beta}_\psi - \mathbf{C}_n^{-1}\mathbf{H}^\top \mathbf{G}_n^{-1}(\mathbf{H}\tilde{\beta}_\psi - h). \qquad (2.2) $$
See Saleh and Shiraishi (1989) for more details. Further, we have the following asymptotic result:
$$ T_n \xrightarrow{P} \tau_\psi^{-2}\,(\tilde{\beta}_\psi - \hat{\beta}_\psi)^\top \mathbf{C}\,(\tilde{\beta}_\psi - \hat{\beta}_\psi) = \tau_\psi^{-2}\,(\mathbf{H}\tilde{\beta}_\psi - h)^\top \mathbf{G}^{-1}(\mathbf{H}\tilde{\beta}_\psi - h), \qquad \mathbf{G} = \mathbf{H}\mathbf{C}^{-1}\mathbf{H}^\top. $$
Also,
$$ \lim_{n\to\infty} P(T_n \le x \mid H_o) = \mathcal{H}_q(x; 0), $$
where $\mathcal{H}_\nu(x; 0)$ is the c.d.f. of the central chi-square distribution with ν degrees of freedom (d.f.).
3 Proposed Liu-type R-estimators

The construction of Liu-type R-estimators is naturally motivated by the ordinary Liu-type estimators: the building block is to start from the well-known Liu-type estimator and replace its kernel by an R-type counterpart. Hence, in this section, we propose rank-type counterparts of the estimators considered in Arashi et al. (2014). The usual least squares estimator (LSE) of β, given by $\hat{\beta} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top y$, heavily depends on the characteristics of the matrix $\mathbf{X}^\top\mathbf{X}$; if $\mathbf{X}^\top\mathbf{X}$ is ill-conditioned, the LSE produces unduly large sampling variances. To resolve this problem, Hoerl and Kennard (1970) suggested using $\mathbf{X}^\top\mathbf{X} + k\mathbf{I}_p$ (k ≥ 0) rather than $\mathbf{X}^\top\mathbf{X}$ in the estimation of β. The resulting estimator is known as the ridge regression estimator; recent applications of ridge regression are given in Malthouse (1999) and the references therein. The ridge estimator is a complicated and nonlinear function of k, which is a drawback of the method. To overcome this problem, Liu (1993) proposed the following estimator, which combines the benefits of Hoerl and Kennard (1970) and Stein (1956):
$$ \hat{\beta}(d) = (\mathbf{X}^\top\mathbf{X} + \mathbf{I}_p)^{-1}(\mathbf{X}^\top y + d\hat{\beta}) = (\mathbf{X}^\top\mathbf{X} + \mathbf{I}_p)^{-1}(\mathbf{X}^\top\mathbf{X} + d\mathbf{I}_p)\hat{\beta}, $$
where 0 < d < 1 is a biasing parameter. The Liu estimator has been considered by Akdeniz and Akdeniz Duran (2010), Akdeniz Duran and Akdeniz (2012), Akdeniz Duran et al. (2011) and Kibria (2012), to mention a few.
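As a quick numerical illustration of the estimators just described (a sketch under our own simulated design, not the paper's code), the following compares the LSE, the ridge estimator and the Liu estimator on collinear data; all parameter choices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 5
# Strongly collinear design: second column nearly duplicates the first
Z = rng.normal(size=(n, p))
Z[:, 1] = Z[:, 0] + 0.05 * rng.normal(size=n)
beta = np.ones(p)
y = Z @ beta + rng.normal(size=n)

XtX, Xty, I = Z.T @ Z, Z.T @ y, np.eye(p)

beta_ls = np.linalg.solve(XtX, Xty)                   # LSE
k, d = 0.5, 0.5                                       # illustrative k and d
beta_ridge = np.linalg.solve(XtX + k * I, Xty)        # Hoerl-Kennard ridge
# Liu: (X'X + I)^{-1}(X'X + dI) beta_ls = (X'X + I)^{-1}(X'y + d beta_ls)
beta_liu = np.linalg.solve(XtX + I, (XtX + d * I) @ beta_ls)
print(beta_ls, beta_ridge, beta_liu, sep="\n")
```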
With this in hand, we define the unrestricted Liu-type R-estimator (ULRE) as
$$ \tilde{\beta}_\psi(d) = \mathbf{F}_n(d)\,\tilde{\beta}_\psi, \qquad (3.1) $$
where $\mathbf{F}_n(d) = (\mathbf{C}_n + \mathbf{I}_p)^{-1}(\mathbf{C}_n + d\mathbf{I}_p)$. Moreover, we suppose that the following assumption also holds:

(A4) $\lim_{n\to\infty} \mathbf{F}_n(d) = \mathbf{F}_d$, $\mathbf{F}_d = (\mathbf{C} + \mathbf{I}_p)^{-1}(\mathbf{C} + d\mathbf{I}_p)$.

Incorporating the sub-space prior information Hβ = h into the ULRE, which results in the restricted Liu-type R-estimator (RLRE), should improve the performance. We define the RLRE as
$$ \hat{\beta}_\psi(d) = \mathbf{F}_n(d)\,\hat{\beta}_\psi, \qquad (3.2) $$
where $\hat{\beta}_\psi$ is the restricted R-estimator (RRE) given by (2.2). However, when the prior information is doubtful, one may combine the restricted and unrestricted estimators to obtain an estimator with better performance, which leads to preliminary test estimation. Thus, by the utilities in Saleh (2006), we define the preliminary test Liu-type R-estimator (PTLRE) as
$$ \hat{\beta}^{PT}_\psi(d) = \mathbf{F}_n(d)\,\hat{\beta}^{PT}_\psi, \qquad (3.3) $$
where $\hat{\beta}^{PT}_\psi$ is the preliminary test R-estimator (PTRE) given by
$$ \hat{\beta}^{PT}_\psi = \tilde{\beta}_\psi - (\tilde{\beta}_\psi - \hat{\beta}_\psi)\, I(T_n \le \chi^2_{q,\alpha}), \qquad (3.4) $$
where I(A) is the indicator function of the set A and $\chi^2_{q,\alpha}$ is the upper α-level critical value of the central χ²-distribution with q d.f. The PTLRE has two extreme choices, namely the ULRE and the RLRE. A compromise can be suggested by using the Stein-type shrinkage structure, where the factor $I(T_n \le \chi^2_{q,\alpha})$ is replaced by $(q-2)T_n^{-1}$, q ≥ 3, to propose the Stein-type shrinkage Liu-type R-estimator (SSLRE)
$$ \hat{\beta}^{S}_\psi(d) = \mathbf{F}_n(d)\,\hat{\beta}^{S}_\psi, \qquad (3.5) $$
where $\hat{\beta}^{S}_\psi$ is the Stein-type shrinkage R-estimator (SSRE) given by
$$ \hat{\beta}^{S}_\psi = \tilde{\beta}_\psi - (q-2)(\tilde{\beta}_\psi - \hat{\beta}_\psi)\, T_n^{-1}. \qquad (3.6) $$
Since the shrinkage factor $(1 - (q-2)T_n^{-1})$ in the SSLRE becomes negative for $T_n < q-2$, its positive part motivates the positive-rule shrinkage Liu-type R-estimator (PRSLRE), given by
$$ \hat{\beta}^{S+}_\psi(d) = \mathbf{F}_n(d)\,\hat{\beta}^{S+}_\psi, \qquad (3.7) $$
where $\hat{\beta}^{S+}_\psi$ is the positive-rule shrinkage R-estimator (PRSRE) defined as
$$ \hat{\beta}^{S+}_\psi = \hat{\beta}^{S}_\psi - \big(1 - (q-2)T_n^{-1}\big) I(T_n \le q-2)\,(\tilde{\beta}_\psi - \hat{\beta}_\psi) = \hat{\beta}^{S}_\psi - (\hat{\beta}^{S}_\psi - \hat{\beta}_\psi)\, I(T_n \le q-2). \qquad (3.8) $$
4 Penalized estimation

In this paper, we only consider penalized least squares regression methods to obtain estimators of the parameters in model (1.1). The key idea in penalized regression is to minimize an objective function $L_{\rho,\lambda}(\beta)$ of the form
$$ L_{\rho,\lambda}(\beta) = (y - \mathbf{X}\beta)^\top(y - \mathbf{X}\beta) + \lambda P_\rho(\beta) \qquad (4.1) $$
to obtain the estimates of the parameters. The first term in the objective function is the sum of squared errors, the second term of (4.1) is a penalty function, and λ is a tuning parameter that controls the trade-off between the two components of $L_{\rho,\lambda}(\beta)$. The penalty function is usually chosen as a norm on $\mathbb{R}^p$, such as
$$ P_\rho(\beta) = \sum_{j=1}^{p} |\beta_j|^q, \quad q > 0. $$
This class of estimators, proposed by Frank and Friedman (1993), is called the bridge estimators.
4.1 Ridge

A member of the class of bridge estimators is the ridge estimator, obtained by taking q = 2. Ridge regression (Hoerl and Kennard 1970) minimizes the residual sum of squares subject to an $L_2$-penalty, $\sum_{j=1}^p \beta_j^2$; that is,
$$ \hat{\beta}_n^{\text{Ridge}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^p \beta_j^2 \right\}. \qquad (4.2) $$
4.2 LASSO

The LASSO was proposed by Tibshirani (1996); it performs variable selection and parameter estimation simultaneously thanks to the $L_1$-penalty, $\sum_{j=1}^p |\beta_j|$. The LASSO estimates are defined by
$$ \hat{\beta}_n^{\text{LASSO}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^p |\beta_j| \right\}. \qquad (4.3) $$
This estimator belongs to the class of bridge estimators with q = 1.

4.3 Adaptive LASSO (ALASSO)

Zou (2006) introduced the ALASSO by modifying the LASSO penalty, using adaptive weights on the $L_1$-penalty of the regression coefficients. The ALASSO estimates are obtained by
$$ \hat{\beta}_n^{\text{ALASSO}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^p \hat{w}_j |\beta_j| \right\}, \qquad (4.4) $$
where the weight function is
$$ \hat{w}_j = \frac{1}{|\hat{\beta}_j^{*}|^{\gamma}}, \quad \gamma > 0, $$
and $\hat{\beta}_j^{*}$ is a root-n-consistent estimator of $\beta_j$. The minimization for the ALASSO solution does not pose any computational difficulty and can be carried out very efficiently; for details see Section 3.5 in Zou (2006).

4.4 Smoothly clipped absolute deviation (SCAD)

Although the LASSO does both shrinkage and variable selection, owing to the nature of the $L_1$-penalty which sets many coefficients identically to zero, it does not possess the oracle properties, as discussed in Fan and Li (2001). To overcome the inefficiency of traditional variable selection procedures, Fan and Li (2001) proposed the SCAD, which selects variables and estimates their coefficients automatically and simultaneously. This estimator is defined as
$$ \hat{\beta}_n^{\text{SCAD}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^p p_{\alpha,\lambda}(\beta_j) \right\}. \qquad (4.5) $$
Here $p_{\alpha,\lambda}(\cdot)$ is the SCAD penalty, a symmetric quadratic spline on [0, ∞) with knots at λ and αλ, given by
$$ p_{\alpha,\lambda}(\beta) = \begin{cases} \lambda|\beta|, & |\beta| \le \lambda, \\ -\big(\beta^2 - 2\alpha\lambda|\beta| + \lambda^2\big) \big/ \big[2(\alpha-1)\big], & \lambda < |\beta| \le \alpha\lambda, \\ (\alpha+1)\lambda^2/2, & |\beta| > \alpha\lambda. \end{cases} \qquad (4.6) $$
Here λ > 0 and α > 2 are the tuning parameters. For α = ∞, expression (4.6) is equivalent to the $L_1$-penalty.
4.5 Minimax concave penalty (MCP)

Zhang (2010) introduced a new penalization method for variable selection, given by
$$ \hat{\beta}_n^{\text{MCP}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \sum_{j=1}^p \rho(|\beta_j|; \lambda) \right\}, \qquad (4.7) $$
where $\rho(\cdot; \lambda)$ is the MCP penalty
$$ \rho(t; \lambda) = \lambda \int_0^{t} \left( 1 - \frac{x}{\gamma\lambda} \right)_{+} dx, $$
where γ > 0 and λ are the regularization and penalty parameters, respectively, and $z_+ = \max(0, z)$.
4.6 Elastic-Net

The Elastic-Net (ENET) was proposed by Zou and Hastie (2005) to overcome the limitations of the LASSO and ridge methods:
$$ \hat{\beta}_n^{\text{ENET}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \lambda_1 \sum_{j=1}^p |\beta_j| + \lambda_2 \sum_{j=1}^p \beta_j^2 \right\}, \qquad (4.8) $$
where $\lambda_1$ and $\lambda_2$ are the LASSO and ridge penalty parameters, respectively.
Table 1 Outline of the estimators

| Abbreviation | Meaning | Notation | Formula |
|---|---|---|---|
| RE | R estimator | β̃_ψ | Eq. (2.1) |
| RRE | Restricted R estimator | β̂_ψ | Eq. (2.2) |
| PTRE | Preliminary test R estimator | β̂_ψ^PT | Eq. (3.4) |
| SSRE | Stein-type shrinkage R estimator | β̂_ψ^S | Eq. (3.6) |
| PRSRE | Positive-rule shrinkage R estimator | β̂_ψ^S+ | Eq. (3.8) |
| ULRE (LRE) | (Unrestricted) Liu-type R estimator | β̃_ψ(d) | Eq. (3.1) |
| RLRE | Restricted Liu-type R estimator | β̂_ψ(d) | Eq. (3.2) |
| PTLRE | Preliminary test Liu-type R estimator | β̂_ψ^PT(d) | Eq. (3.3) |
| SSLRE | Stein-type shrinkage Liu-type R estimator | β̂_ψ^S(d) | Eq. (3.5) |
| PRSLRE | Positive-rule shrinkage Liu-type R estimator | β̂_ψ^S+(d) | Eq. (3.7) |
| Ridge | Ridge estimator | β̂_n^Ridge | Eq. (4.2) |
| LASSO | LASSO estimator | β̂_n^LASSO | Eq. (4.3) |
| ALASSO | Adaptive LASSO estimator | β̂_n^ALASSO | Eq. (4.4) |
| SCAD | Smoothly clipped absolute deviation estimator | β̂_n^SCAD | Eq. (4.5) |
| MCP | Minimax concave penalty estimator | β̂_n^MCP | Eq. (4.7) |
| ENET | Elastic Net estimator | β̂_n^ENET | Eq. (4.8) |
| MNET | MNET estimator | β̂_n^MNET | Eq. (4.9) |
4.7 MNET

Huang et al. (2010) introduced the MNET estimator, which uses the MCP penalty instead of the $L_1$-penalty in (4.8):
$$ \hat{\beta}_n^{\text{MNET}} = \arg\min_\beta \left\{ \sum_{i=1}^n \Big( y_i - \sum_{j=1}^p X_{ij}\beta_j \Big)^2 + \sum_{j=1}^p \rho(|\beta_j|; \lambda_1) + \lambda_2 \sum_{j=1}^p \beta_j^2 \right\}. \qquad (4.9) $$
Similar to the ENET of Zou and Hastie (2005), the MNET also tends to select or drop highly correlated predictors together. However, unlike the ENET, the MNET is selection consistent and equal to the oracle ridge estimator with high probability under reasonable conditions. Many symbols have been used so far to represent the different estimators; for convenience, Table 1 reviews all of the estimators and their notation.
5 Asymptotic properties

In this section, we derive the asymptotic distributional characteristics of the proposed Liu-type R-estimators. For this purpose, assume that, for a given R-estimator $\hat{\beta}_\psi^{*}$ of β,
$$ \lim_{n\to\infty} P\left( \sqrt{n}\,(\hat{\beta}_\psi^{*} - \beta) \le x \right) = G_p(x) $$
exists. Then, following Saleh (2006, Ch. 7), we define the asymptotic distributional risk (ADR) associated with the quadratic loss function by
$$ R(\hat{\beta}_\psi^{*}; \beta) = \int_{\mathbb{R}^p} x^\top x \, dG_p(x) = \operatorname{tr}\big[ \mathbf{M}(\hat{\beta}_\psi^{*}; \beta) \big], $$
where $\mathbf{M}(\hat{\beta}_\psi^{*}; \beta)$ is the MSE-matrix of $\hat{\beta}_\psi^{*}$, defined as
$$ \mathbf{M}(\hat{\beta}_\psi^{*}; \beta) = \int_{\mathbb{R}^p} x x^\top \, dG_p(x). $$
It is proved in Saleh (2006) that under the fixed alternative $H_A$ in (1.2), the asymptotic distributional properties of the preliminary test and shrinkage estimators are the same as those of the unrestricted estimator. Hence, in order to evaluate the asymptotic risk, we consider the following sequence of local alternatives:
$$ K_n : \mathbf{H}\beta = h + \frac{\gamma}{\sqrt{n}}, \quad \gamma = (\gamma_1, \ldots, \gamma_q)^\top \neq 0. \qquad (5.1) $$
By making use of the techniques in Saleh (2006, Ch. 7) and Theorem 2.1 of Arashi et al. (2014), we have the following result, stated without proof.

Theorem 1 Under the local alternatives $K_n$ given by (5.1) and the assumptions (A1)–(A4),
$$ \sqrt{n}\,\Big( (\tilde{\beta}_\psi(d) - \beta)^\top,\; (\hat{\beta}_\psi(d) - \beta)^\top,\; (\tilde{\beta}_\psi(d) - \hat{\beta}_\psi(d))^\top \Big)^\top $$
has an asymptotic 3p-variate normal distribution with mean $(b_1^\top, b_2^\top, (b_1 - b_2)^\top)^\top$ and variance-covariance matrix $\tau_\psi^2 \mathbf{V}$, where
$$ b_1 = -(1-d)(\mathbf{C} + \mathbf{I}_p)^{-1}\beta, \qquad b_2 = b_1 - \mathbf{F}_d\,\eta, \qquad \eta = \mathbf{C}^{-1}\mathbf{H}^\top\mathbf{G}^{-1}\gamma, $$
$$ \mathbf{V} = \begin{pmatrix} \mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d & \mathbf{F}_d(\mathbf{C}^{-1} - \mathbf{A})\mathbf{F}_d & \mathbf{F}_d\mathbf{A}\mathbf{F}_d \\ \mathbf{F}_d(\mathbf{C}^{-1} - \mathbf{A})\mathbf{F}_d & \mathbf{F}_d(\mathbf{C}^{-1} - \mathbf{A})\mathbf{F}_d & \mathbf{0} \\ \mathbf{F}_d\mathbf{A}\mathbf{F}_d & \mathbf{0} & \mathbf{F}_d\mathbf{A}\mathbf{F}_d \end{pmatrix}, $$
and $\mathbf{A} = \mathbf{C}^{-1}\mathbf{H}^\top\mathbf{G}^{-1}\mathbf{H}\mathbf{C}^{-1}$.
After some algebra, using Theorem 2.2 of Arashi et al. (2014) and the utilities in Saleh (2006), we have the following result for the ADR of the proposed estimators.

Theorem 2 Under the assumptions (A1)–(A4), the ADRs of the ULRE, RLRE, PTLRE, SSLRE and PRSLRE are given by
$$ R(\tilde{\beta}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) + (1-d)^2 \beta^\top(\mathbf{C}+\mathbf{I}_p)^{-2}\beta, $$
$$ R(\hat{\beta}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) - \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d) + \eta^\top\mathbf{F}_d\mathbf{F}_d\eta + 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta + (1-d)^2 \beta^\top(\mathbf{C}+\mathbf{I}_p)^{-2}\beta, $$
$$ R(\hat{\beta}^{PT}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) - \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\,\mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + \eta^\top\mathbf{F}_d\mathbf{F}_d\eta\, Z(\alpha, \Delta^2) + 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta\,\mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + (1-d)^2 \beta^\top(\mathbf{C}+\mathbf{I}_p)^{-2}\beta, $$
$$ R(\hat{\beta}^{S}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) - \tau_\psi^2(q-2)\operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\,X(\Delta^2) + (q-2)\,\eta^\top\mathbf{F}_d\mathbf{F}_d\eta\, Y(\Delta^2) + 2(q-2)(1-d)\,\eta^\top\mathbf{F}_d[\mathbf{C}+\mathbf{I}_p]^{-1}\beta\, E\big(\chi^{-2}_{q+2}(\Delta^2)\big) + (1-d)^2 \beta^\top[\mathbf{C}+\mathbf{I}_p]^{-2}\beta, $$
and
$$ R(\hat{\beta}^{S+}_\psi(d); \beta) = R(\hat{\beta}^{S}_\psi(d); \beta) - \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\, E\big[ (1-(q-2)\chi^{-2}_{q+2}(\Delta^2))^2 I(\chi^2_{q+2}(\Delta^2) \le q-2) \big] + \eta^\top\mathbf{F}_d\mathbf{F}_d\eta\, E\big[ (1-(q-4)\chi^{-2}_{q+4}(\Delta^2))^2 I(\chi^2_{q+4}(\Delta^2) \le q-4) \big] - 2\,\eta^\top\mathbf{F}_d\mathbf{F}_d\eta\, E\big[ ((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1) I(\chi^2_{q+2}(\Delta^2) \le q-2) \big] - 2(1-d)\,\eta^\top\mathbf{F}_d[\mathbf{C}+\mathbf{I}_p]^{-1}\beta\, E\big[ ((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1) I(\chi^2_{q+2}(\Delta^2) \le q-2) \big], $$
where Δ² is the departure parameter given by
$$ \Delta^2 = \tau_\psi^{-2}(\mathbf{H}\beta - h)^\top\mathbf{G}^{-1}(\mathbf{H}\beta - h), $$
$$ E\big(\chi^{-2}_{q+2}(\Delta^2)\big) = E_P\Big[\frac{1}{q+2P}\Big], \qquad E\big(\chi^{-4}_{q+2}(\Delta^2)\big) = E_P\Big[\frac{1}{(q+2P)(q+2P-2)}\Big], $$
$$ E\big(\chi^{-2}_{q+4}(\Delta^2)\big) = E_P\Big[\frac{1}{q+2P+2}\Big], \qquad E\big(\chi^{-4}_{q+4}(\Delta^2)\big) = E_P\Big[\frac{1}{(q+2P)(q+2P+2)}\Big], $$
$E_P$ stands for expectation with respect to a Poisson random variable P with mean $\frac{1}{2}\Delta^2$, $\mathcal{H}_\nu(\cdot; \Delta^2)$ is the c.d.f. of the non-central chi-squared distribution with ν d.f. and non-centrality parameter Δ², and
$$ X(\Delta^2) = 2E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - (q-2)E\big(\chi^{-4}_{q+2}(\Delta^2)\big), $$
$$ Y(\Delta^2) = 2E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - 2E\big(\chi^{-2}_{q+4}(\Delta^2)\big) + (q-2)E\big(\chi^{-4}_{q+4}(\Delta^2)\big), $$
$$ Z(\alpha, \Delta^2) = 2\mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - \mathcal{H}_{q+4}(\chi^2_{q,\alpha}; \Delta^2). $$
5.1 ADR analysis

In this section, we compare the performance of the proposed estimators. By the spectral decomposition, C can be factorized as $\mathbf{C} = \Gamma\Lambda\Gamma^\top$ with $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$ are the eigenvalues of C. It is easy to see that the eigenvalues of $\mathbf{F}_d = (\mathbf{C}+\mathbf{I}_p)^{-1}(\mathbf{C}+d\mathbf{I}_p)$ and $(\mathbf{C}+\mathbf{I}_p)$ are $\big(\frac{\lambda_1+d}{\lambda_1+1}, \frac{\lambda_2+d}{\lambda_2+1}, \ldots, \frac{\lambda_p+d}{\lambda_p+1}\big)$ and $(\lambda_1+1, \lambda_2+1, \ldots, \lambda_p+1)$, respectively. Then we obtain the following identities:
$$ \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) = \sum_{i=1}^p \frac{(\lambda_i+d)^2}{\lambda_i(\lambda_i+1)^2}, \qquad \beta^\top(\mathbf{C}+\mathbf{I}_p)^{-2}\beta = \sum_{i=1}^p \frac{\theta_i^2}{(\lambda_i+1)^2}, \quad \theta = \Gamma^\top\beta, $$
$$ \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d) = \sum_{i=1}^p \frac{h^*_{ii}(\lambda_i+d)^2}{(\lambda_i+1)^2}, $$
where $h^*_{ii} \ge 0$ is the ith diagonal element of the matrix $\mathbf{H}^* = \Gamma^\top\mathbf{A}\Gamma$,
$$ \eta^\top\mathbf{F}_d\mathbf{F}_d\eta = \sum_{i=1}^p \frac{\eta^{*2}_i(\lambda_i+d)^2}{(\lambda_i+1)^2}, $$
where $\eta^*_i$ is the ith element of the vector $\eta^* = \Gamma^\top\eta$, and similarly
$$ \eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta = \sum_{i=1}^p \frac{\theta_i\eta^*_i(\lambda_i+d)}{(\lambda_i+1)^2}. $$
The main focus of this part is to compare the proposed estimators with each other. Comparisons with the ordinary rank estimators can be carried out similarly using the results of Arashi et al. (2014) and Saleh (2006). However, in order to study the benefit of the Liu-type rank estimators over their counterparts, we also compare the performance of the Liu-type estimators with the ordinary rank estimators.

5.2 Comparison between SSLRE and PTLRE

In this case the risk difference $R(\hat{\beta}^{S}_\psi(d); \beta) - R(\hat{\beta}^{PT}_\psi(d); \beta)$ will be non-positive if
$$ \eta^\top\mathbf{F}_d\mathbf{F}_d\eta \ge \frac{f_1(\Delta^2, d, \alpha)}{Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2)}, \qquad (5.2) $$
where
$$ f_1(\Delta^2, d, \alpha) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\big[ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2) \big] + 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta\,\big[ (q-2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \big]. $$
Since the departure parameter Δ² > 0, we assume that the numerator and the denominator of (5.2) have the same sign. Then $\hat{\beta}^{S}_\psi(d)$ dominates $\hat{\beta}^{PT}_\psi(d)$ when
$$ \Delta^2 \le \Delta_1^2(\Delta^2, d, \alpha) = \frac{f_1(\Delta^2, d, \alpha)}{\mathrm{Ch}_{\max}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big]\big\{ Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2) \big\}}, $$
and $\hat{\beta}^{PT}_\psi(d)$ dominates $\hat{\beta}^{S}_\psi(d)$ when
$$ \Delta^2 > \Delta_2^2(\Delta^2, d, \alpha) = \frac{f_1(\Delta^2, d, \alpha)}{\mathrm{Ch}_{\min}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big]\big\{ Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2) \big\}}, $$
where $\mathrm{Ch}_{\min}(\mathbf{A})$ and $\mathrm{Ch}_{\max}(\mathbf{A})$ denote the minimum and maximum characteristic roots of the matrix $\mathbf{A}$.

Now, we consider the risk difference $R(\hat{\beta}^{S}_\psi(d); \beta) - R(\hat{\beta}^{PT}_\psi(d); \beta)$ as a function of the eigenvalues and define
$$ d_1(\Delta^2, \alpha) = \frac{f_2(\Delta^2, \alpha)}{g_1(\Delta^2, \alpha)}, $$
where
$$ f_2(\Delta^2, \alpha) = \max_i \lambda_i \Big[ \eta^{*2}_i \big\{ Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2) \big\} - \tau_\psi^2 h^*_{ii}\big\{ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2) \big\} + 2\theta_i\eta^*_i\big\{ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) \big\} \Big] $$
and
$$ g_1(\Delta^2, \alpha) = \min_i \Big[ \tau_\psi^2 h^*_{ii}\big\{ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2) \big\} - \eta^{*2}_i\big\{ Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2) \big\} + 2\theta_i\eta^*_i\big\{ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) \big\} \Big]. $$
Suppose d > 0; then we have the following.
(1) If $g_1(\Delta^2, \alpha) > 0$ and $f_2(\Delta^2, \alpha) > 0$, then for each positive d with $d < d_1(\Delta^2, \alpha)$, $\hat{\beta}^{S}_\psi(d)$ has a smaller risk than $\hat{\beta}^{PT}_\psi(d)$.
(2) If $g_1(\Delta^2, \alpha) < 0$ and $f_2(\Delta^2, \alpha) < 0$, then for each positive d with $d > d_1(\Delta^2, \alpha)$, $\hat{\beta}^{S}_\psi(d)$ has a smaller risk than $\hat{\beta}^{PT}_\psi(d)$.
Under $H_o$, the risk difference reduces to
$$ \tau_\psi^2 \sum_{i=1}^p \frac{h^*_{ii}(\lambda_i+d)^2}{(\lambda_i+1)^2}\big\{ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; 0) - (q-2)X(0) \big\}, $$
where $\mathcal{H}_{q+2}(\chi^2_{q,\alpha}; 0) = P(\chi^2_{q+2} \le \chi^2_{q,\alpha})$ and $\mathcal{H}_q(\chi^2_{q,\alpha}; 0) = 1 - \alpha$. Thus, the risk of $\hat{\beta}^{S}_\psi(d)$ is smaller than that of $\hat{\beta}^{PT}_\psi(d)$ when the critical value $\chi^2_{q,\alpha}$ satisfies
$$ \chi^2_{q,\alpha} \le \frac{q+2}{q}\,\mathcal{H}^{-1}_{q+2}\big( (q-2)X(0) \big); $$
otherwise, the risk of $\hat{\beta}^{PT}_\psi(d)$ is smaller than that of $\hat{\beta}^{S}_\psi(d)$.

Remark 1 For α = 1 we obtain the comparison between the SSLRE and the ULRE, and for α = 0 the comparison between the SSLRE and the RLRE.

5.3 Comparison between PRSLRE and PTLRE

Case 1. Under the null hypothesis $H_o: \mathbf{H}\beta = h$.
The risk difference is
$$ R(\hat{\beta}^{S+}_\psi(d); \beta) - R(\hat{\beta}^{PT}_\psi(d); \beta) = \tau_\psi^2 \sum_{i=1}^p \frac{h^*_{ii}(\lambda_i+d)^2}{(\lambda_i+1)^2} \Big[ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; 0) - (q-2)X(0) - E\big( (1-(q-2)\chi^{-2}_{q+2}(0))^2 I(\chi^2_{q+2}(0) \le q-2) \big) \Big] \ge 0 $$
for all α satisfying the condition
$$ \chi^2_{q,\alpha} \ge \mathcal{H}^{-1}_{q+2}\Big( (q-2)X(0) + E\big[ (1-(q-2)\chi^{-2}_{q+2}(0))^2 I(\chi^2_{q+2}(0) \le q-2) \big] \Big). \qquad (5.3) $$
Thus, the risk of the PTLRE is smaller than that of the PRSLRE when the critical value $\chi^2_{q,\alpha}$ satisfies the relation (5.3), whereas the risk of the PRSLRE is smaller than that of the PTLRE when $\chi^2_{q,\alpha}$ satisfies the opposite relation to (5.3).

Case 2. When the null hypothesis does not hold, the risk difference $R(\hat{\beta}^{PT}_\psi(d); \beta) - R(\hat{\beta}^{S+}_\psi(d); \beta)$ will be non-positive when
$$ \eta^\top\mathbf{F}_d\mathbf{F}_d\eta \ge \frac{f_3(\Delta^2, d, \alpha)}{g_2(\Delta^2, \alpha)}, \qquad (5.4) $$
where
$$ f_3(\Delta^2, d, \alpha) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\big[ \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2) + a_1 \big] - 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta\,\big[ (q-2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - a_3 \big], $$
$$ g_2(\Delta^2, \alpha) = Z(\alpha, \Delta^2) - (q-2)Y(\Delta^2) + a_2 + 2a_3. $$
Also,
$$ a_1 = E\big[ \big(1-(q-2)\chi^{-2}_{q+2}(\Delta^2)\big)^2 I\big(\chi^2_{q+2}(\Delta^2) \le q-2\big) \big], $$
$$ a_2 = E\big[ \big(1-(q-4)\chi^{-2}_{q+4}(\Delta^2)\big)^2 I\big(\chi^2_{q+4}(\Delta^2) \le q-4\big) \big], $$
$$ a_3 = E\big[ \big((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1\big) I\big(\chi^2_{q+2}(\Delta^2) \le q-2\big) \big]. $$
Since Δ² > 0, assume that the numerator and denominator of (5.4) have the same sign. Then $\hat{\beta}^{S+}_\psi(d)$ dominates $\hat{\beta}^{PT}_\psi(d)$ when
$$ \Delta^2 \le \Delta_3^2(\Delta^2, d, \alpha) = \frac{f_3(\Delta^2, d, \alpha)}{\mathrm{Ch}_{\max}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big] \times g_2(\Delta^2, \alpha)}, $$
and $\hat{\beta}^{PT}_\psi(d)$ dominates $\hat{\beta}^{S+}_\psi(d)$ when
$$ \Delta^2 > \Delta_4^2(\Delta^2, d, \alpha) = \frac{f_3(\Delta^2, d, \alpha)}{\mathrm{Ch}_{\min}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big] \times g_2(\Delta^2, \alpha)}. $$

Now, we consider the risk difference of the PRSLRE and the PTLRE as a function of the eigenvalues and define
$$ d_2(\Delta^2, \alpha) = \frac{f_4(\Delta^2, \alpha)}{g_3(\Delta^2, \alpha)}, $$
where
$$ f_4(\Delta^2, \alpha) = \max_i \big[ \lambda_i h^*_{ii} P_1 + \lambda_i \eta^{*2}_i P_2 + \theta_i\eta^*_i P_3 \big] $$
and
$$ g_3(\Delta^2, \alpha) = \min_i \big[ \theta_i^2 + \lambda_i\theta_i\eta^*_i P_3 - \lambda_i h^*_{ii} P_1 - \lambda_i \eta^{*2}_i P_2 \big]. $$
Also,
$$ P_1 = \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2) + a_1, $$
$$ P_2 = (q-2)Y(\Delta^2) - a_2 - 2a_3 - Z(\alpha, \Delta^2), $$
$$ P_3 = (q-2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - a_3 - \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2)X(\Delta^2). $$
Suppose d > 0; then we have the following statements.
(1) If $g_3(\Delta^2, \alpha) > 0$ and $f_4(\Delta^2, \alpha) > 0$, then for each positive d with $d > d_2(\Delta^2, \alpha)$, $\hat{\beta}^{S+}_\psi(d)$ has a smaller risk than $\hat{\beta}^{PT}_\psi(d)$.
(2) If $g_3(\Delta^2, \alpha) < 0$ and $f_4(\Delta^2, \alpha) < 0$, then for each positive d with $d < d_2(\Delta^2, \alpha)$, $\hat{\beta}^{S+}_\psi(d)$ has a smaller risk than $\hat{\beta}^{PT}_\psi(d)$.

Remark 2 For α = 0, we obtain the superiority condition of $\hat{\beta}^{S+}_\psi(d)$ over $\hat{\beta}_\psi(d)$, and for α = 1, we obtain the superiority condition of $\hat{\beta}^{S+}_\psi(d)$ over $\tilde{\beta}_\psi(d)$.
5.4 Comparison of PRSLRE and SSLRE

In this case, we have
$$ R(\hat{\beta}^{S+}_\psi(d); \beta) - R(\hat{\beta}^{S}_\psi(d); \beta) = -\tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\, E\big[ (1-(q-2)\chi^{-2}_{q+2}(\Delta^2))^2 I(\chi^2_{q+2}(\Delta^2) \le q-2) \big] + \tau_\psi^2\,(\eta^\top\mathbf{F}_d\mathbf{F}_d\eta)\, E\big[ (1-(q-2)\chi^{-2}_{q+4}(\Delta^2))^2 I(\chi^2_{q+4}(\Delta^2) \le q-2) \big] - 2(\eta^\top\mathbf{F}_d\mathbf{F}_d\eta)\, E\big[ ((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1) I(\chi^2_{q+2}(\Delta^2) \le q-2) \big] - 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta\, E\big[ ((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1) I(\chi^2_{q+2}(\Delta^2) \le q-2) \big]. \qquad (5.5) $$

Case 1: Suppose $\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta$ is positive. Then the right-hand side of (5.5) is negative, since the expectation of a positive random variable is positive. Thus, for all Δ² and d,
$$ R(\hat{\beta}^{S+}_\psi(d); \beta) \le R(\hat{\beta}^{S}_\psi(d); \beta). $$
Therefore, under this condition the PRSLRE not only confirms the inadmissibility of the SSLRE but also provides a simple superior estimator for ill-conditioned data.

Case 2: Suppose $\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta$ is negative. Then the difference in (5.5) will be positive when
$$ \eta^\top\mathbf{F}_d\mathbf{F}_d\eta \ge \frac{f_5(\Delta^2, d)}{g_4(\Delta^2)}, \qquad (5.6) $$
where
$$ f_5(\Delta^2, d) = 2(1-d)\,\eta^\top\mathbf{F}_d(\mathbf{C}+\mathbf{I}_p)^{-1}\beta\, a_3 - \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{A}\mathbf{F}_d)\, a_1, \qquad g_4(\Delta^2) = -(a_2 + 2a_3). $$
Since Δ² > 0, assume that the numerator and denominator of (5.6) have the same sign. Then $\hat{\beta}^{S+}_\psi(d)$ dominates $\hat{\beta}^{S}_\psi(d)$ when
$$ \Delta^2 \le \Delta_5^2(\Delta^2, d) = \frac{f_5(\Delta^2, d)}{\mathrm{Ch}_{\max}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big] \times g_4(\Delta^2)}, $$
and $\hat{\beta}^{S}_\psi(d)$ dominates $\hat{\beta}^{S+}_\psi(d)$ when
$$ \Delta^2 > \Delta_6^2(\Delta^2, d) = \frac{f_5(\Delta^2, d)}{\mathrm{Ch}_{\min}\big[(\mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big] \times g_4(\Delta^2)}. $$
5.5 Comparison of Liu-type rank estimators with their counterparts

To save space, we only give the results when the null hypothesis does not hold, which is the more general case, and we only compare the SSLRE with the SSRE. For more details and similar comparisons, we refer to Arashi et al. (2014). The risk difference between the SSRE and the SSLRE is
$$ R(\hat{\beta}^{S}_\psi; \beta) - R(\hat{\beta}^{S}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{C}^{-1} - \mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) - \tau_\psi^2(q-2)\operatorname{tr}(\mathbf{A} - \mathbf{F}_d\mathbf{A}\mathbf{F}_d)X(\Delta^2) + (q-2)\,\eta^\top(\mathbf{I} - \mathbf{F}_d\mathbf{F}_d)\eta\, Y(\Delta^2) + 2(q-2)(1-d)\,\eta^\top\mathbf{F}_d[\mathbf{C}+\mathbf{I}_p]^{-1}\beta\, E\big(\chi^{-2}_{q+2}(\Delta^2)\big) + (1-d)^2 \beta^\top[\mathbf{C}+\mathbf{I}_p]^{-2}\beta. $$
The above difference will be greater than or equal to 0 when
$$ \eta^\top(\mathbf{I} - \mathbf{F}_d\mathbf{F}_d)\eta \ge \frac{h_1(\Delta^2, d)}{Y(\Delta^2)}, \qquad (5.7) $$
where
$$ h_1(\Delta^2, d) = -\tau_\psi^2 \operatorname{tr}(\mathbf{C}^{-1} - \mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) + \tau_\psi^2(q-2)\operatorname{tr}(\mathbf{A} - \mathbf{F}_d\mathbf{A}\mathbf{F}_d)X(\Delta^2) - 2(q-2)(1-d)\,\eta^\top\mathbf{F}_d[\mathbf{C}+\mathbf{I}_p]^{-1}\beta\, E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - (1-d)^2 \beta^\top[\mathbf{C}+\mathbf{I}_p]^{-2}\beta. $$
Since Δ² > 0, we assume that the numerator of (5.7) is positive. Then, the SSLRE will dominate the SSRE whenever $\Delta^2 \le \Delta_7^2(\Delta^2, d)$, where
$$ \Delta_7^2(\Delta^2, d) = \frac{h_1(\Delta^2, d)}{(q-2)\,\mathrm{Ch}_{\max}\big[(\mathbf{I} - \mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big]\, Y(\Delta^2)}. $$
However, the SSRE dominates the SSLRE when
$$ \Delta^2 > \Delta_8^2(\Delta^2, d) = \frac{h_1(\Delta^2, d)}{(q-2)\,\mathrm{Ch}_{\min}\big[(\mathbf{I} - \mathbf{F}_d\mathbf{F}_d)\mathbf{C}^{-1}\big]\, Y(\Delta^2)}. $$
At the second stage, we compare these two estimators based on the biasing parameter d. For our purpose, we have
$$ R(\hat{\beta}^{S}_\psi(d); \beta) = \tau_\psi^2 \sum_{i=1}^p \frac{(\lambda_i+d)^2}{\lambda_i(\lambda_i+1)^2} - \tau_\psi^2(q-2)X(\Delta^2) \sum_{i=1}^p \frac{h^*_{ii}(\lambda_i+d)^2}{(\lambda_i+1)^2} + (q-2)Y(\Delta^2) \sum_{i=1}^p \frac{\eta^{*2}_i(\lambda_i+d)^2}{(\lambda_i+1)^2} + 2(q-2)(1-d)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) \sum_{i=1}^p \frac{\theta_i\eta^*_i(\lambda_i+d)}{(\lambda_i+1)^2} + (1-d)^2 \sum_{i=1}^p \frac{\theta_i^2}{(\lambda_i+1)^2}. \qquad (5.8) $$
Now, differentiating (5.8) with respect to the biasing parameter d, we get
$$ \frac{\partial R(\hat{\beta}^{S}_\psi(d); \beta)}{\partial d} = 2(q-2)\tau_\psi^2 \sum_{i=1}^p \frac{1}{\lambda_i(\lambda_i+1)^2} \Big[ \frac{\lambda_i+d}{q-2} - \lambda_i h^*_{ii}(\lambda_i+d)X(\Delta^2) + \tau_\psi^{-2}\lambda_i\eta^{*2}_i(\lambda_i+d)Y(\Delta^2) + \tau_\psi^{-2}\lambda_i\theta_i\eta^*_i(1-\lambda_i-2d)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) - (q-2)^{-1}\tau_\psi^{-2}\lambda_i(1-d)\theta_i^2 \Big] = \tau_\psi^2 \sum_{i=1}^p \frac{2}{(\lambda_i+1)^2}\big[ h_2(\Delta^2, \alpha)\, d - h_3(\Delta^2, \alpha) \big], $$
where
$$ h_2(\Delta^2, \alpha) = \min_i \Big[ 1 - \lambda_i h^*_{ii} X(\Delta^2) + \tau_\psi^{-2}\lambda_i\eta^{*2}_i Y(\Delta^2) - 2\tau_\psi^{-2}\lambda_i\theta_i\eta^*_i E\big(\chi^{-2}_{q+2}(\Delta^2)\big) + (q-2)^{-1}\tau_\psi^{-2}\lambda_i\theta_i^2 \Big] $$
and
$$ h_3(\Delta^2, \alpha) = \max_i \lambda_i \Big[ \lambda_i h^*_{ii} X(\Delta^2) - 1 - \tau_\psi^{-2}\lambda_i\eta^{*2}_i Y(\Delta^2) - \tau_\psi^{-2}\theta_i\eta^*_i(1-\lambda_i)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) + (q-2)^{-1}\tau_\psi^{-2}\theta_i^2 \Big]. $$
Now we define
$$ d_3(\Delta^2, \alpha) = \frac{h_2(\Delta^2, \alpha)}{h_3(\Delta^2, \alpha)}. $$
Suppose d > 0; then we have the following.
1. If $h_i(\Delta^2, \alpha) > 0$, i = 2, 3, then for each positive d with $d > d_3(\Delta^2, \alpha)$, $\hat{\beta}^{S}_\psi(d)$ has a smaller risk than $\hat{\beta}^{S}_\psi$.
2. If $h_i(\Delta^2, \alpha) < 0$, i = 2, 3, then for each positive d with $d < d_3(\Delta^2, \alpha)$, $\hat{\beta}^{S}_\psi(d)$ has a smaller risk than $\hat{\beta}^{S}_\psi$.

5.6 Estimation of d

Using (3.1), the risk function of the ULRE $\tilde{\beta}_\psi(d)$ can be expressed as
$$ f(d) = R(\tilde{\beta}_\psi(d); \beta) = \tau_\psi^2 \operatorname{tr}(\mathbf{F}_d\mathbf{C}^{-1}\mathbf{F}_d) + (1-d)^2 \beta^\top(\mathbf{C}+\mathbf{I}_p)^{-2}\beta = \tau_\psi^2 \sum_{i=1}^p \frac{(\lambda_i+d)^2}{\lambda_i(\lambda_i+1)^2} + (1-d)^2 \sum_{i=1}^p \frac{\theta_i^2}{(\lambda_i+1)^2} = \delta_1(d) + \delta_2(d). \qquad (5.9) $$
For the Liu estimator, one wants a value of d for which the decrease in variance ($\delta_1(d)$) exceeds the increase in squared bias ($\delta_2(d)$). To show that such a value less than 1 exists, so that $R(\tilde{\beta}_\psi(d); \beta) < R(\tilde{\beta}_\psi; \beta)$, we differentiate (5.9) with respect to d:
$$ f'(d) = 2\tau_\psi^2 \sum_{i=1}^p \frac{\lambda_i+d}{\lambda_i(\lambda_i+1)^2} - 2(1-d)\sum_{i=1}^p \frac{\theta_i^2}{(\lambda_i+1)^2}. \qquad (5.10) $$
For d = 1 in (5.10), we obtain
$$ f'(1) = 2\tau_\psi^2 \sum_{i=1}^p \frac{1}{\lambda_i(\lambda_i+1)} > 0, \quad \text{since } \lambda_i > 0. $$
Therefore, there exists a value of d (0 < d < 1) such that $R(\tilde{\beta}_\psi(d); \beta) \le R(\tilde{\beta}_\psi; \beta)$. Setting (5.10) to zero, the risk of $\tilde{\beta}_\psi(d)$ is minimized at
$$ d = \frac{\sum_{i=1}^p (\theta_i^2 - \tau_\psi^2)/(\lambda_i+1)^2}{\sum_{i=1}^p (\tau_\psi^2 + \lambda_i\theta_i^2)/\big[\lambda_i(\lambda_i+1)^2\big]}, $$
where $\theta_i$ is the ith component of the vector $\theta = \Gamma^\top\beta$ defined in Sect. 5.1. Replacing $\theta_i^2$ and $\tau_\psi^2$ by their corresponding unbiased estimates $\hat{\theta}_i^2 - \hat{\tau}_\psi^2/\lambda_i$ and $\hat{\tau}_\psi^2$, we get the optimum value of d as
$$ d_{\mathrm{opt}} = 1 - \hat{\tau}_\psi^2 \left\{ \frac{\sum_{i=1}^p 1/\big[\lambda_i(\lambda_i+1)\big]}{\sum_{i=1}^p \hat{\theta}_i^2/(\lambda_i+1)^2} \right\}. $$
See Akdeniz and Ozturk (2005), among others, for more on the estimation of d.
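In practice, $d_{\mathrm{opt}}$ can be computed directly from the spectral decomposition of $\mathbf{C}_n$; a small sketch (ours), where `beta_hat` is any initial estimate (e.g., the R-estimate) and `tau2_hat` an estimate of $\tau_\psi^2$:

```python
import numpy as np

def d_opt(X, beta_hat, tau2_hat):
    """Optimum Liu biasing parameter d_opt from Sect. 5.6 (illustrative).

    Uses the spectral decomposition C = Gamma Lambda Gamma' of C_n = X'X/n
    and theta_hat = Gamma' beta_hat."""
    n = X.shape[0]
    C = X.T @ X / n
    lam, Gamma = np.linalg.eigh(C)       # eigenvalues and eigenvectors of C
    theta2 = (Gamma.T @ beta_hat) ** 2
    num = np.sum(1.0 / (lam * (lam + 1.0)))
    den = np.sum(theta2 / (lam + 1.0) ** 2)
    return 1.0 - tau2_hat * num / den
```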
6 Numerical illustrations

6.1 Graphical representation

In this section, we display the graphs of the risk functions of the proposed estimators. Since the relationships among them are complicated, we assume that $\mathbf{C} = \mathbf{I}_p$, $\mathbf{H}^\top\mathbf{H} = \mathbf{H}\mathbf{H}^\top = \mathbf{I}$, $\beta = \mathbf{1}_p$ and h = 0. The ADR formulas in Theorem 2 then reduce to
$$ R(\tilde{\beta}_\psi(d); \beta) = \frac{p}{4}\big[ \tau_\psi^2(1+d)^2 + (1-d)^2 \big], $$
$$ R(\hat{\beta}_\psi(d); \beta) = \tau_\psi^2\Delta^2, $$
$$ R(\hat{\beta}^{PT}_\psi(d); \beta) = \tau_\psi^2\frac{p}{4}(1+d)^2\big[ 1 - \mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \big] + \frac{1}{4}\tau_\psi^2\Delta^2\big[ (1+d)^2 Z(\alpha, \Delta^2) + 2(1-d^2)\mathcal{H}_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \big] + \frac{p}{4}(1-d)^2, $$
$$ R(\hat{\beta}^{S}_\psi(d); \beta) = \tau_\psi^2\frac{p}{4}(1+d)^2\big[ 1 - (q-2)X(\Delta^2) \big] + \frac{1}{4}\tau_\psi^2\Delta^2\big[ (q-2)(1+d)^2 Y(\Delta^2) + 2(q-2)(1-d^2)E\big(\chi^{-2}_{q+2}(\Delta^2)\big) \big] + \frac{p}{4}(1-d)^2, $$
$$ R(\hat{\beta}^{S+}_\psi(d); \beta) = R(\hat{\beta}^{S}_\psi(d); \beta) - \frac{1}{4}\tau_\psi^2(1+d)^2\Big\{ p\,E\big[ (1-(q-2)\chi^{-2}_{q+2}(\Delta^2))^2 I(\chi^2_{q+2}(\Delta^2) \le q-2) \big] + \Delta^2 E\big[ (1-(q-4)\chi^{-2}_{q+4}(\Delta^2))^2 I(\chi^2_{q+4}(\Delta^2) \le q-4) \big] \Big\} - \tau_\psi^2\Delta^2(1+d)\, E\big[ ((q-2)\chi^{-2}_{q+2}(\Delta^2) - 1) I(\chi^2_{q+2}(\Delta^2) \le q-2) \big]. $$
Let $\gamma = (\gamma, \ldots, \gamma)^\top$ be a q-vector chosen so that $\gamma^\top\gamma = \tau_\psi^2\Delta^2$; e.g., if q = 3, $\tau_\psi^2 = 1$ and Δ² = 5, then $\gamma = \sqrt{5/3} \approx 1.3$. Figures 1, 2, 3 and 4 show the risks of the improved Liu-type rank-based estimators versus the biasing parameter d for different values of α, $\tau_\psi$, Δ² and q, respectively. Similarly, Figs. 5, 6, 7 and 8 plot the risks against Δ² for fixed d. The graphs support our findings.
Fig. 1 Risk functions of the estimators for p = 5, q = 3, Δ² = 5, τ_ψ = 1 and different α; the left plot is the original output, the right plot is zoomed in. (Curves: ULRE, RLRE, PTLRE for α = 0.05, 0.15, 0.25, SSLRE, PRSLRE; axes: risk versus d.)
16
τ2ψ = 2
16
τ2ψ = 3
16
Estimators ULRE RLRE 12
12
PTLRE
12
SELRE
Risk
Risk
Risk
PRLRE
8
8
8
4
4
4
0.00
0.25
0.50
d
0.75
1.00
0.00
0.25
0.50
0.75
1.00
d
0.00
0.25
0.50
0.75
1.00
d
Fig. 2 Risk functions of the estimator for p = 5, q = 3, 2 = 5, α = 0.05 and different τψ
6.2 Simulation

We conduct Monte Carlo simulation experiments to study the performance of the proposed estimators under various practical settings. In order to generate the response variables, we use
$$ y_i = X_i^\top\beta + \tau_\psi \epsilon_i, \quad i = 1, \ldots, n, $$
where $X_i$ has a p-variate normal distribution. The correlation between the jth and kth components of $X_i$ equals $0.5^{|j-k|}$, and the $\epsilon_i$ are i.i.d. from one of the following distributions: (i) standard normal, (ii) Laplace, (iii) Cauchy, (iv) the t-distribution with 5 d.f. ($t_5$), or (v) the following bivariate mixture distribution:
Fig. 3 Risk functions of the estimators for p = 5, q = 3, τ_ψ = 1, α = 0.05 and different Δ² (panels: Δ² = 0.01, 1, 5; axes: risk versus d)
Fig. 4 Risk functions of the estimators for p = 12, Δ² = 5, τ_ψ = 1, α = 0.05 and different q (panels: q = 3, 5, 10; axes: risk versus d)
$$ F_\epsilon(t, \gamma) = (1-\gamma)\Phi(t) + \gamma\left[ \frac{1}{\pi}\arctan(t) + \frac{1}{2} \right], $$
where Φ(t) and the expression in square brackets denote the standard normal and the standard Cauchy distributions, respectively. The proportion γ is often useful for verifying the effect of outliers, and small values of γ lead to a contaminated normal distribution; for example, γ = 0.1 indicates 10% outliers. To find the R-estimator in Eq. (3.1), we consider the Wilcoxon score function $\psi(u) = \sqrt{12}\,(u - 1/2)$.
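A sketch of this simulation design (ours, with illustrative parameter choices matching Case 1 below):

```python
import numpy as np

def mixture_errors(n, gamma, rng):
    """Draw errors from F(t, gamma) = (1-gamma)*Phi(t) + gamma*Cauchy(t)."""
    from_cauchy = rng.random(n) < gamma
    eps = rng.normal(size=n)
    eps[from_cauchy] = rng.standard_cauchy(size=from_cauchy.sum())
    return eps

rng = np.random.default_rng(3)
n, p, q = 60, 12, 3
# Covariate correlation 0.5^{|j-k|}
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
beta = np.concatenate([np.ones(p - q), np.zeros(q)])  # Delta* = 0 case
y = X @ beta + mixture_errors(n, gamma=0.5, rng=rng)  # tau_psi = 1
```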
Fig. 5 Risk functions of the estimators versus Δ² for p = 5, q = 3, d = 0.5, τ_ψ = 1 and different α; the left plot is the original output, the right plot is zoomed in
τ2ψ = 2
20
τ2ψ = 3
20
20
15
15
Estimators ULRE RLRE PTLRE
15
SELRE
10
5
Risk
Risk
Risk
PRLRE
10
5
0
5
0 0.0
2.5
5.0
7.5
10.0
10
0 0.0
2.5
Δ2
5.0
Δ2
7.5
10.0
0.0
2.5
5.0
7.5
10.0
Δ2
Fig. 6 Risk functions of the estimator for p = 5, q = 3, d = 0.5, α = 0.05 and different τψ
Here, we consider two cases.

Case 1: In order to investigate the behaviour of the estimators, we define $\Delta^* = \|\beta - \beta_0\|$, where $\beta_0 = (\mathbf{1}_{p-q}^\top, \mathbf{0}_q^\top)^\top$ and ‖·‖ is the Euclidean norm. If Δ* = 0, we use $\beta = (\mathbf{1}_{p-q}^\top, \mathbf{0}_q^\top)^\top$ to generate the response, while for Δ* > 0 we use $\beta = (\mathbf{1}_{p-q}^\top, \Delta^*, \mathbf{0}_{q-1}^\top)^\top$, say with Δ* = 2. Increasing Δ* thus increases the degree of violation of the null hypothesis. We take n = 60, p = 12, q = 3, 5, 7 and α = 0.01, 0.05, 0.10, 0.25. Furthermore, we consider both τ_ψ = 1, 3, and the errors are taken only from $F_\epsilon(t, 0.5)$. In this case we investigate the performance of the suggested estimators for different values of Δ*.
Fig. 7 Risk functions of the estimators versus Δ² for p = 5, q = 3, τ_ψ = 1, α = 0.05 and different d (panels: d = 0.01, 0.1, 0.5)
Fig. 8 Risk functions of the estimators versus Δ² for p = 12, d = 0.5, τ_ψ = 1, α = 0.05 and different q (panels: q = 3, 5, 7)

The performance of an arbitrary estimator $\hat{\beta}^{*}$ was evaluated using the model error (ME) criterion, defined by
$$ \mathrm{ME}(\hat{\beta}^{*}) = \big\| \mathbf{X}\beta_0 - \mathbf{X}\hat{\beta}^{*} \big\|^2. $$
For each Monte Carlo setting, with 2500 replications, we compare the estimators with the proposed ULRE using the median of relative model errors (MRME), where the relative model error (RME) is defined as
$$ \mathrm{RME}(\hat{\beta}^{*}; \tilde{\beta}_\psi(d)) = \frac{\mathrm{ME}(\hat{\beta}^{*})}{\mathrm{ME}(\tilde{\beta}_\psi(d))}. $$
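Computed, for instance, as in the following sketch (ours):

```python
import numpy as np

def model_error(X, beta0, beta_hat):
    """ME(beta_hat) = || X beta0 - X beta_hat ||^2."""
    r = X @ (beta0 - beta_hat)
    return r @ r

def mrme(me_candidate, me_ulre):
    """Median of relative model errors over Monte Carlo replications."""
    return np.median(np.asarray(me_candidate) / np.asarray(me_ulre))
```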
Fig. 9 The MRMEs of the suggested estimators versus Δ* for τ_ψ = 1 (rows: q = 3, 5, 7); left plots are the original output, right plots are zoomed in. (Curves: RLRE, PTLRE for α = 0.01, 0.05, 0.1, 0.25, SSLRE, PRSLRE.)
If the MRME of an estimator is less than one, it is superior to $\tilde{\beta}_\psi(d)$. For the error model (v), the bivariate mixture distribution, Figs. 9 and 10 plot the MRME of the suggested estimators as a function of Δ* for n = 60, p = 12, γ = 0.5, for τ_ψ = 1 and τ_ψ = 3, respectively. Based on Figs. 9 and 10, we draw the following conclusions. (1) As Δ* grows, the MRME of the RLRE increases, i.e., its performance worsens. This is expected: near Δ* = 0 this estimator exploits correct prior information about the parameters, whereas as Δ* increases the validity of that information becomes questionable. (2) Larger α decreases the improvement of the PTLRE. (3) The dominance relationships among the suggested estimators are:
Fig. 10 The MRMEs of the suggested estimators versus Δ* for τ_ψ = 3 (rows: q = 3, 5, 7); left plots are the original output, right plots are zoomed in
(3-1) Small degree of violation, Δ* ≈ 0: RLRE ≻ PTLRE ≻ PRSLRE ≻ SSLRE.
(3-2) Large degree of violation, Δ* ≫ 0: PRSLRE ≻ SSLRE ≻ PTLRE ≻ RLRE.
Here ≻ stands for domination, i.e., $\hat{\beta}_1 \succ (\succeq)\, \hat{\beta}_2$ means $\mathrm{ME}(\hat{\beta}_1) < (\le)\, \mathrm{ME}(\hat{\beta}_2)$. (4) As the number of zero coefficients q increases, the performance of all suggested estimators improves considerably in both cases.
Case 2: Similar to Johnson and Peng (2008), the regression coefficients are set to $\beta = (3, 1.5, 0, 0, 2, 0, 0, 0)^\top$. Knowing a priori which coefficients are zero, the restriction matrix is
$$ \mathbf{H} = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad h = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \qquad (6.1) $$
By using (6.1), we define the RLRE accordingly and construct the suggested estimators. In this case, we take n = 60, 100, p = 8, q = 5, α = 0.01, 0.05, 0.10, 0.25, τ_ψ = 1, 3, and the errors follow one of the standard normal, $F_\epsilon(t, 0.75)$, $F_\epsilon(t, 0.5)$, $F_\epsilon(t, 0.25)$, Cauchy, $t_5$ or Laplace distributions. Also, we compare the listed estimators with the LSE, ridge, R-estimate, LASSO, ALASSO, SCAD, MCP, ENET and MNET. Finally, the performance of an estimator $\hat{\beta}^{*}$ is evaluated by the median ME (MME). To illustrate the behavior of the suggested estimators in this case, we report their MMEs for γ = 0.5 and τ_ψ = 1, 3, for n = 60 and n = 100, in Tables 2 and 3, respectively.
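In code, the restriction (6.1) simply selects the coefficients known to be zero; a sketch (ours):

```python
import numpy as np

beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
zero_idx = np.flatnonzero(beta == 0.0)  # positions 3, 4, 6, 7, 8 (1-based)
H = np.eye(len(beta))[zero_idx]         # each row picks one zero coefficient
h = np.zeros(len(zero_idx))             # restriction H beta = h = 0
```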
6.3 Real data

In this study, we consider the diabetes data. The dataset consists of 10 covariates: AGE, SEX, body mass index (BMI), blood pressure (BP), and six blood serum measurements, S1, S2, S3, S4, S5 and S6, for each of n = 442 patients. The response variable is a quantitative measure of disease progression after 1 year of follow-up. The objective of this study is to examine the disease progression associated with important prognostic variables. Note that the predictors are scaled to have mean zero and unit variance before the analysis.

First, we verify the characteristics of this dataset. The Bonferroni outlier test flags the 57th observation as an outlier; the left plot of Fig. 11 confirms this. The distribution of the residuals, depicted on the right of Fig. 11, shows that they follow a normal model. To verify multicollinearity, the VIF values are reported in Table 4; the values show that there is multicollinearity among the variables S1–S5. Hence, this dataset is suitably chosen for our purpose.

The performance of a method is assessed by bootstrapped prediction error criteria; the average prediction errors are calculated via five-fold cross-validation for each bootstrap replicate. In order to construct the suggested methods, we adopt a two-step approach, since prior information is not available unless one knows which predictors significantly explain the response. In the first step, a candidate sub-model is selected with the help of stepwise or subset selection procedures and model selection criteria; here, we used forward selection to pick the important covariates. It shows that the variables AGE, S3 and S6 may be ignored and
Table 2 The MMEs based on 50 replications of the listed estimators for n = 60 (rows: ULRE, RLRE, PTLRE at several α, SSLRE, PRSLRE, LSE, R-estimate, ridge, LASSO, ALASSO, ENET, MNET, SCAD and MCP; columns: errors from the normal, F_ε(t, 0.75), F_ε(t, 0.5), F_ε(t, 0.25), Cauchy, t₅ and Laplace distributions, for τ_ψ = 1 and τ_ψ = 3). The numbers in parentheses are the corresponding standard errors estimated by using the bootstrap with B = 500 re-samplings on the 50 MEs.
0.037 (0.006)
0.072 (0.008)
0.079 (0.007)
0.151 (0.010)
0.042 (0.005)
0.019 (0.003)
0.042 (0.005)
PRSLRE
LSE
R-estimate
Ridge
LASSO
ALASSO
ENET
0.702 (0.065)
0.209 (0.022)
0.209 (0.020)
0.219 (0.025)
0.240 (0.034)
0.321 (0.074)
ULRE
RLRE
PTLRE(α = 0.01)
PTLRE(α = 0.05)
PTLRE(α = 0.10)
PTLRE(α = 0.25)
0.017 (0.004)
0.044 (0.006)
SSLRE
MCP
0.038 (0.010)
PTLRE(α = 0.25)
0.012 (0.002)
0.027 (0.004)
PTLRE(α = 0.10)
0.017 (0.004)
0.026 (0.004)
PTLRE(α = 0.05)
SCAD
0.025 (0.004)
PTLRE(α = 0.01)
MNET
0.080 (0.007)
0.025 (0.004)
RLRE
1
Normal
ULRE
Methods
τψ
0.383 (0.076)
0.316 (0.052)
0.297 (0.052)
0.259 (0.046)
0.256 (0.043)
0.779 (0.065)
0.225 (0.080)
0.231 (0.070)
0.199 (0.086)
0.391 (0.118)
0.255 (0.101)
0.396 (0.117)
0.659 (0.164)
0.101 (0.008)
0.701 (0.173)
0.057 (0.006)
0.060 (0.007)
0.070 (0.011)
0.047 (0.01)
0.035 (0.007)
0.033 (0.006)
0.033 (0.006)
0.102 (0.008)
F (t, 0.75)
0.357 (0.090)
0.283 (0.042)
0.259 (0.036)
0.261 (0.038)
0.261 (0.035)
0.870 (0.090)
1.119 (0.400)
1.147 (0.399)
0.982 (0.371)
1.348 (0.371)
1.227 (0.388)
1.405 (0.409)
1.915 (0.429)
0.129 (0.015)
2.392 (0.779)
0.067 (0.009)
0.068 (0.010)
0.077 (0.024)
0.054 (0.014)
0.042 (0.007)
0.039 (0.006)
0.036 (0.005)
0.128 (0.014)
F (t, 0.5)
Table 3 The MMEs based on 50 replications of listed estimator for n = 100 ∗
0.763 (0.256)
0.512 (0.169)
0.394 (0.058)
0.375 (0.035)
0.374 (0.034)
1.496 (0.183)
2.439 (0.501)
2.412 (0.531)
2.119 (0.507)
2.478 (0.572)
2.351 (0.611)
2.514 (0.568)
2.920 (0.703)
0.174 (0.020)
4.091 (1.447)
0.078 (0.012)
0.108 (0.012)
0.074 (0.012)
0.061 (0.012)
0.055 (0.009)
0.053 (0.008)
0.053 (0.008)
0.167 (0.022)
F (t, 0.25)
1.308 (0.198)
1.108 (0.167)
1.043 (0.188)
0.919 (0.193)
0.926 (0.187)
2.584 (0.274)
5.094 (1.439)
4.812 (1.351)
4.068 (1.156)
4.386 (1.048)
4.921 (1.444)
4.760 (1.238)
5.084 (1.127)
0.266 (0.035)
9.109 (3.037)
0.119 (0.018)
0.125 (0.018)
0.106 (0.022)
0.090 (0.014)
0.091 (0.014)
0.089 (0.012)
0.089 (0.012)
0.262 (0.030)
Cauchy
0.640 (0.160)
0.316 (0.055)
0.293 (0.046)
0.283 (0.043)
0.283 (0.040)
0.965 (0.081)
0.038 (0.010)
0.038 (0.010)
0.023 (0.006)
0.076 (0.010)
0.044 (0.004)
0.077 (0.010)
0.199 (0.020)
0.111 (0.017)
0.141 (0.013)
0.058 (0.007)
0.063 (0.008)
0.059 (0.010)
0.048 (0.009)
0.044 (0.008)
0.042 (0.007)
0.037 (0.006)
0.110 (0.015)
t5
0.427 (0.138)
0.309 (0.083)
0.279 (0.052)
0.269 (0.044)
0.258 (0.042)
0.940 (0.068)
0.047 (0.006)
0.048 (0.006)
0.037 (0.009)
0.094 (0.009)
0.051 (0.009)
0.094 (0.010)
0.208 (0.013)
0.097 (0.010)
0.140 (0.021)
0.048 (0.011)
0.050 (0.009)
0.051 (0.014)
0.034 (0.009)
0.031 (0.008)
0.031 (0.008)
0.030 (0.007)
0.095 (0.009)
Laplace
Rank-based Liu regression
123
123
0.360 (0.046)
0.332 (0.041)
0.655 (0.051)
0.719 (0.077)
0.579 (0.024)
0.367 (0.026)
0.221 (0.020)
0.367 (0.027)
0.157 (0.025)
0.206 (0.028)
0.208 (0.026)
SSLRE
PRSLRE
LSE
R-estimate
Ridge
LASSO
ALASSO
ENET
MNET
SCAD
MCP
Normal
Methods
2.501 (0.879)
2.435 (0.721)
2.088 (0.668)
2.363 (0.604)
2.440 (0.857)
2.429 (0.721)
2.945 (0.752)
0.791 (0.071)
4.804 (1.747)
0.406 (0.061)
0.448 (0.058)
F (t, 0.75)
11.17 (1.868)
11.01 (2.023)
7.797 (1.857)
7.953 (1.765)
10.71 (1.986)
10.64 (2.087)
9.131 (2.092)
0.894 (0.085)
49.06 (17.96)
0.380 (0.050)
0.460 (0.081)
F (t, 0.5)
13.47 (1.364)
12.48 (1.926)
8.817 (1.685)
8.929 (1.839)
13.31 (1.357)
12.47 (2.005)
10.26 (1.639)
1.541 (0.196)
66.65 (33.76)
0.741 (0.147)
0.902 (0.193)
F (t, 0.25)
13.65 (1.661)
13.17 (1.688)
10.02 (1.787)
10.33 (1.603)
13.42 (1.854)
13.07 (1.767)
12.26 (1.261)
2.642 (0.288)
78.36 (25.13)
1.319 (0.192)
1.479 (0.185)
Cauchy
0.512 (0.148) 0.521 (0.143)
0.457 (0.107)
0.320 (0.081)
0.676 (0.068)
0.478 (0.116)
0.695 (0.066)
1.017 (0.097)
0.964 (0.064)
1.441 (0.109)
0.430 (0.086)
0.491 (0.091)
Laplace
0.438 (0.096)
0.384 (0.085)
0.768 (0.060)
0.530 (0.117)
0.777 (0.054)
1.095 (0.073)
0.983 (0.074)
1.303 (0.109)
0.481 (0.095)
0.549 (0.102)
t5
∗ The numbers in parentheses are the corresponding standard errors estimated by using the bootstrap with B = 500 re-samplings on the 50 MEs
τψ
Table 3 continued
M. Arashi et al.
Fig. 11 Left: the qq-plot for the diabetes dataset; right: the distribution of the standardized residuals

Table 4 The VIF values of the diabetes dataset

| Variable | AGE | SEX | BMI | BP | S1 | S2 | S3 | S4 | S5 | S6 |
|---|---|---|---|---|---|---|---|---|---|---|
| VIF | 1.22 | 1.28 | 1.51 | 1.46 | 59.20 | 39.19 | 15.40 | 8.89 | 10.07 | 1.48 |
the other variables can be used as a sub-model. Hence, in the second step, the suggested methods are constructed using this candidate sub-model. We take the regression parameters selected by forward selection, $\beta_{\mathrm{true}} = (0, -11.3, 25.2, 15.9, -29.3, 16.8, 0, 6.4, 33.6, 0)^\top$, as the true parameter in the bootstrap method.

In the following, K-fold cross-validation is used to obtain an estimate of the prediction errors of the model. In a K-fold cross-validation, the dataset is randomly divided into K subsets of roughly equal size. One subset, $(\mathbf{X}^{\mathrm{test}}, y^{\mathrm{test}})$, is left aside as the test set, while the remaining K − 1 subsets, called the training set, are used to fit the model; the resulting estimator is $\hat{\beta}^{\mathrm{train}}$. The fitted model is then used to predict the responses of the test set. Finally, prediction errors (PE) are obtained as the squared deviations of the observed and predicted values in the test set, i.e.,
$$ \mathrm{PE}_k = \big\| \mathbf{X}_k^{\mathrm{test}}\beta_{\mathrm{true}} - \hat{y}_k^{\mathrm{test}} \big\|^2, \quad k = 1, 2, \ldots, K, $$
where $\hat{y}_k^{\mathrm{test}} = \mathbf{X}_k^{\mathrm{test}}\hat{\beta}_k^{\mathrm{train}}$. The process is repeated for all K subsets and the prediction errors are combined. To account for the random variation of the cross-validation, the process is reiterated N times and the median prediction error (MPE) is estimated, given by
Table 5 Estimations of the diabetes data: coefficient estimates for AGE, SEX, BMI, BP and S1–S6, together with the MPE and RMPE, for the ULRE, RLRE, PTLRE (α = 0.01, 0.05, 0.10, 0.25), SSLRE, PRSLRE, LSE, R-estimate, ridge, LASSO, ALASSO, ENET, MNET, SCAD and MCP. The numbers in parentheses are the corresponding standard errors.
$$ \mathrm{MPE} = \operatorname{median}\left\{ \frac{1}{K}\sum_{k=1}^{K}\mathrm{PE}_k^{1}, \ldots, \frac{1}{K}\sum_{k=1}^{K}\mathrm{PE}_k^{N} \right\}, $$
where $\mathrm{PE}_k^{i}$ is the prediction error of the kth test set in the ith iteration. The performance of an arbitrary estimator $\hat{\beta}^{*}$ with respect to the full-model estimator $\tilde{\beta}_\psi(d)$ is measured by the relative MPE (RMPE), defined as
$$ \mathrm{RMPE}(\hat{\beta}^{*}; \tilde{\beta}_\psi(d)) = \frac{\mathrm{MPE}(\hat{\beta}^{*})}{\mathrm{MPE}(\tilde{\beta}_\psi(d))}. $$
If the value of RMPE is less than 1, then $\hat{\beta}^{*}$ performs better than $\tilde{\beta}_\psi(d)$. Our results are based on N = 500 case-resampled bootstrap samples. In Table 5, we report the estimates and RMPE of each method. Based on this table, the restricted Liu-type R-estimator performs better than the other estimators. In comparison with the penalized estimators, the positive-rule Stein-type shrinkage Liu-type R-estimator has a smaller median prediction error. Among the penalized estimators, the adaptive LASSO is the best, with the MNET estimator in second position.
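A sketch of the cross-validated prediction-error computation described above (ours; `fit` stands for any of the estimation methods, passed as a function):

```python
import numpy as np

def cv_prediction_errors(X, y, beta_true, fit, K=5, rng=None):
    """Five-fold CV prediction errors PE_k = ||X_k beta_true - X_k beta_hat||^2,
    following Sect. 6.3; `fit` maps (X_train, y_train) to coefficients."""
    rng = rng or np.random.default_rng()
    folds = np.array_split(rng.permutation(len(y)), K)
    pe = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        b_hat = fit(X[train], y[train])
        pe.append(np.sum((X[test] @ beta_true - X[test] @ b_hat) ** 2))
    return np.array(pe)

def mpe(pe_per_replicate):
    """Median over the N bootstrap replicates of the mean fold PE."""
    return np.median([pe.mean() for pe in pe_per_replicate])
```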
7 Conclusions

In this study, we proposed Liu-type estimation using a rank-based score test for situations in which the assumptions of LSE estimation are not valid. Furthermore, we defined preliminary test and Stein-type estimation in the context of Liu regression and investigated the asymptotic properties of the listed estimators. We conducted a Monte Carlo simulation to investigate the behavior of the proposed estimators when a selected sub-model may or may not be the true model. Moreover, we compared the proposed estimators with the LSE, the R-estimate and some penalty estimators that are well-known in the literature, for errors following the normal, Cauchy, t₅, Laplace and bivariate mixture distributions. Finally, a real data example on diabetes was analyzed; the results of the data analysis are very consistent with the theoretical and numerical analyses. In conclusion, we suggest using Liu-type rank-based pretest and shrinkage estimators when the design matrix is ill-conditioned, the errors are far from normal, or the data contain outliers.

Acknowledgements We would like to thank two anonymous referees for their valuable and constructive comments, which significantly improved the presentation of the paper. The first author Mohammad Arashi's work is based on research supported in part by the National Research Foundation of South Africa (Grant No. 109214). The third author S. Ejaz Ahmed is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
References

Ahmed SE (2014) Penalty, shrinkage and pretest strategies: variable selection and estimation. Springer, New York
Akdeniz F, Akdeniz Duran E (2010) Liu-type estimator in semiparametric regression models. J Stat Comput Simul 80(8):853–871
Akdeniz F, Ozturk F (2005) The distribution of stochastic shrinkage biasing parameters of the Liu type estimator. Appl Math Comput 163(1):29–38
Akdeniz Duran E, Akdeniz F (2012) Efficiency of the modified jackknifed Liu-type estimator. Stat Pap 53(2):265–280
Akdeniz Duran E, Akdeniz F, Hu H (2011) Efficiency of a Liu-type estimator in semiparametric regression models. J Comput Appl Math 235(5):1418–1428
Arashi M, Kibria BMG, Norouzirad M, Nadarajah S (2014) Improved preliminary test and Stein-rule Liu estimators for the ill-conditioned elliptical linear regression model. J Multivar Anal 124:53–74
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Frank I, Friedman J (1993) A statistical view of some chemometrics regression tools (with discussion). Technometrics 35:109–148
Hettmansperger TP, McKean JW (1998) Robust nonparametric statistical methods. Arnold, London
Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation in non-orthogonal problems. Technometrics 12(3):55–67
Huang J, Breheny P, Ma S, Zhang CH (2010) The MNET method for variable selection (unpublished technical report)
Johnson BA, Peng L (2008) Rank-based variable selection. J Nonparametr Stat 20(3):241–252
Kibria BMG (2004) Performance of the shrinkage preliminary test ridge regression estimators based on the conflicting of W, LR and LM tests. J Stat Comput Simul 74(11):703–810
Kibria BMG (2012) On some Liu and ridge type estimators and their properties under the ill-conditioned Gaussian linear regression model. J Stat Comput Simul 82(1):1–17
Liu K (1993) A new class of biased estimate in linear regression. Commun Stat Theory Methods 22(2):393–402
Malthouse EC (1999) Shrinkage estimation and direct marketing scoring models. J Interact Mark 13(4):10–23
Puri ML, Sen PK (1986) A note on asymptotic distribution free tests for sub hypotheses in multiple linear regression. Ann Stat 1:553–556
Roozbeh M, Arashi M (2013) Feasible ridge estimator in partially linear models. J Multivar Anal 116:35–44
Saleh AKMdE (2006) Theory of preliminary test and Stein-type estimation with applications. Wiley, New York
Saleh AKMdE, Kibria BMG (2011) On some ridge regression estimators: a nonparametric approach. J Nonparametr Stat 23(3):819–851
Saleh AKMdE, Shiraishi T (1989) On some R and M-estimation of regression parameters under restriction. J Jpn Stat Soc 19(2):129–137
Sengupta D, Jammalamadaka SR (2003) Linear models: an integrated approach. World Scientific Publishing Company, Singapore
Stein C (1956) Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proc Third Berkeley Symp 1:197–206
Tabatabaey SMM, Saleh AKMdE, Kibria BMG (2004) Estimation strategies for parameters of the linear regression model with spherically symmetric distributions. J Stat Res 38(1):13–31
Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B Stat Methodol 58(1):267–288
Xu J, Yang H (2012) On the Stein-type Liu estimator and positive-rule Stein-type Liu estimator in multiple linear regression models. Commun Stat Theory Methods 41(5):791–808
Yüzbaşı B, Ahmed SE, Güngör M (2017a) Improved penalty strategies in linear regression models. REVSTAT Stat J 15(2):251–276
Yüzbaşı B, Asar Y, Şık MŞ, Demiralp A (2017b) Improving estimations in quantile regression model with autoregressive errors. Therm Sci. https://doi.org/10.2298/TSCI170612275Y
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942
Zou H (2006) The adaptive LASSO and its oracle properties. J Am Stat Assoc 101(476):1418–1429
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67(2):301–320