Gene Network Construction Tool Kit @ QBRC

Neighborhood Selection

Neighborhood selection (NS) solves a separate lasso problem for each node, with tuning parameter $\lambda_i(\alpha)$, and identifies edges from the nonzero estimated regression coefficients. The NS method is asymptotically consistent in identifying the neighborhood of each node when the neighborhood stability condition is satisfied.

Specifically, for each node $i \in V = \{1,2,\ldots,p\}$, NS solves the lasso problem $$\hat{\beta}^{i,\lambda} = \underset{\beta \in \mathbb{R}^p:\ \beta_i = 0}{\operatorname{argmin}}\ \frac{1}{2}\left\| X_i - X\beta \right\|_2^2 + \lambda\left\| \beta \right\|_1,$$ where $X_i$ is the $i$th column of the data matrix $X$, $\left\| x \right\|_2^2 = \sum_{j=1}^{p} x_j^2$, and $\left\| x \right\|_1 = \sum_{j=1}^{p} \left| x_j \right|$ for $x \in \mathbb{R}^p$. With the estimate $\hat{\beta}^{i,\lambda}$, NS identifies the neighborhood of node $i$ as $N_i(\lambda) = \{ k \mid \hat{\beta}_k^{i,\lambda} \neq 0 \}$, which defines the edge set $E_i^{\lambda} = \{ (i, j) \mid j \in N_i(\lambda) \}$. The tuning parameter $\lambda_i(\alpha)$ for the $i$th node is chosen as $$\lambda_i(\alpha) = \left\| X_i \right\|_2 \tilde{\Phi}^{-1}\left(\frac{\alpha}{2p^2}\right),$$ where $\tilde{\Phi} = 1 - \Phi$ and $\Phi$ is the distribution function of the standard normal distribution. With this choice of $\lambda_i(\alpha)$ for $i = 1,2,\ldots,p$, the probability of falsely identifying edges in the network is bounded by the level $\alpha$. We implement NS with the R package CDLasso provided by the authors.
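The procedure described above can be illustrated with a minimal Python sketch. It is not the CDLasso implementation used by this tool. It is a plain coordinate-descent solver written only to show the per-node lasso, the tuning parameter formula, and the nonzero-coefficient rule. Several choices are assumptions that the text does not state: the columns are standardized to mean 0 and squared norm $n$, the per-node edge sets are combined by taking their union, and all function names (`tuning_parameter`, `lasso_for_node`, `neighborhood_selection`) and the toy data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def standardize(col):
    """Center a column and scale it so its squared norm equals n (assumption)."""
    n = len(col)
    mean = sum(col) / n
    centered = [v - mean for v in col]
    scale = math.sqrt(sum(v * v for v in centered) / n)
    return [v / scale for v in centered]


def tuning_parameter(x_i, alpha, p):
    """lambda_i(alpha) = ||X_i||_2 * Phi_tilde^{-1}(alpha / (2 p^2))."""
    norm_xi = math.sqrt(sum(v * v for v in x_i))
    # Phi_tilde = 1 - Phi, so its inverse at q equals -Phi^{-1}(q).
    return norm_xi * -NormalDist().inv_cdf(alpha / (2 * p * p))


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_for_node(cols, i, lam, n_sweeps=200, tol=1e-10):
    """Coordinate descent for 0.5*||X_i - X beta||^2 + lam*||beta||_1 with beta_i = 0."""
    p = len(cols)
    beta = [0.0] * p
    resid = list(cols[i])  # residual X_i - X beta, starting from beta = 0
    for _ in range(n_sweeps):
        max_change = 0.0
        for k in range(p):
            if k == i:
                continue  # enforce the constraint beta_i = 0
            xk = cols[k]
            sq = sum(v * v for v in xk)
            # Inner product of X_k with the residual, adding X_k's own term back.
            rho = sum(a * r for a, r in zip(xk, resid)) + sq * beta[k]
            new = soft_threshold(rho, lam) / sq
            delta = new - beta[k]
            if delta != 0.0:
                resid = [r - delta * a for r, a in zip(resid, xk)]
                beta[k] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def neighborhood_selection(cols, alpha=0.05):
    """Return the union of the per-node edge sets as sorted (i, j) pairs, i < j."""
    p = len(cols)
    edges = set()
    for i in range(p):
        lam = tuning_parameter(cols[i], alpha, p)
        beta = lasso_for_node(cols, i, lam)
        for k, b in enumerate(beta):
            if b != 0.0:  # k belongs to the neighborhood N_i(lambda)
                edges.add((min(i, k), max(i, k)))
    return sorted(edges)


# Toy data: gene 1 closely tracks gene 0; genes 2 and 3 are independent noise.
rng = random.Random(0)
n = 100
g0 = [rng.gauss(0, 1) for _ in range(n)]
g1 = [a + 0.2 * rng.gauss(0, 1) for a in g0]
g2 = [rng.gauss(0, 1) for _ in range(n)]
g3 = [rng.gauss(0, 1) for _ in range(n)]
cols = [standardize(c) for c in (g0, g1, g2, g3)]

edges = neighborhood_selection(cols, alpha=0.05)
print(edges)
```

On this toy data the strongly dependent pair (0, 1) is selected. Whether the independent genes pick up a spurious edge depends on the random draw and on $\alpha$.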

References:
1. Yu, Donghyeon, Johan Lim, Xinlei Wang, Faming Liang, and Guanghua Xiao. "Enhanced construction of gene regulatory networks using hub gene information." BMC Bioinformatics 18, no. 1 (2017): 186.
2. Meinshausen, Nicolai, and Peter Bühlmann. "High-dimensional graphs and variable selection with the lasso." The Annals of Statistics 34, no. 3 (2006): 1436-1462.


Note:
Adjust the $\alpha$ level ($\alpha > 0$) to control the false positive rate. A larger $\alpha$ yields more estimated edges, but with lower confidence. If you are unsure which value to choose, use the default.
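To see why a larger $\alpha$ produces more edges, the short Python sketch below evaluates the quantile factor $\tilde{\Phi}^{-1}(\alpha / (2p^2))$ from the tuning parameter formula for a few values of $\alpha$. The network size `p = 100`, the $\alpha$ values, and the helper name `quantile_factor` are illustrative assumptions, not defaults of this tool.

```python
from statistics import NormalDist


def quantile_factor(alpha, p):
    """Phi_tilde^{-1}(alpha / (2 p^2)), where Phi_tilde = 1 - Phi."""
    # The upper-tail quantile at q equals minus the lower-tail quantile at q.
    return -NormalDist().inv_cdf(alpha / (2 * p * p))


p = 100  # hypothetical number of genes
alphas = (0.01, 0.05, 0.2)
factors = [quantile_factor(a, p) for a in alphas]

for a, f in zip(alphas, factors):
    print(f"alpha = {a}: lambda_i(alpha) = {f:.3f} * ||X_i||_2")
```

The factor decreases as $\alpha$ increases, so the lasso penalty becomes weaker and more regression coefficients remain nonzero.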

Data & parameters (Required)

Gene expression data:

Example

Alpha:


User information (Optional)

Name:

Organization:

Email:

You will be notified through e-mail when you submit your job.