GeNeCK: Gene Network construction tool kit

Neighborhood Selection

Neighborhood selection (NS) solves a separate lasso problem for each node, with tuning parameter $\lambda_i(\alpha)$, and identifies edges from the nonzero estimated regression coefficients. The NS method is asymptotically consistent in identifying the neighborhood of each node when the neighborhood stability condition is satisfied.

To be specific, for each node $i \in V = \{1,2,...,p\}$, NS solves the following lasso problem $$\hat{\beta}^{i,\lambda} = \underset{\beta \in \mathbb{R}^p: \beta_i = 0}{\operatorname{argmin}} \frac{1}{2}\left \|X_i - X\beta\right \|^2_2 + \lambda\left \|\beta\right \|_1,$$ where $X$ is the data matrix, $X_i$ is its $i$th column, and $\left \| x \right \|_{2}^{2} = \sum_{k=1}^{m}x_{k}^{2}$ and $\left \| x \right \|_{1} = \sum_{k=1}^{m}\left | x_{k} \right |$ for $x \in \mathbb{R}^m$. With the estimate $\hat{\beta}^{i,\lambda}$, NS identifies the neighborhood of the node $i$ as $N_i(\lambda) = \{ k | \hat{\beta}_{k}^{i,\lambda} \neq 0 \}$, which defines an edge set $E_{i}^{\lambda} = \left \{ \left ( i, j\right ) | j \in N_{i}\left ( \lambda\right )\right \}$. The tuning parameter $\lambda_i(\alpha)$ for the $i$th node is chosen as $$\lambda_i(\alpha) = \left \| X_{i} \right \|_{2}\tilde{\Phi}^{-1}\left(\frac{\alpha}{2p^{2}}\right),$$ where $\tilde{\Phi} = 1 - \Phi$ and $\Phi$ is the distribution function of the standard normal distribution. With this choice of $\lambda_i(\alpha)$ for $i=1,2,...,p$, the probability of falsely identifying edges in the network is bounded by the level $\alpha$. We implement NS with the R package CDLasso provided by the authors.
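The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not the CDLasso implementation that GeNeCK uses: it assumes a simple cyclic coordinate descent solver, stores the data column-wise, and all function names (`ns_lambda`, `lasso_for_node`, `neighborhood_selection`) and the toy data are hypothetical. It returns the per-node neighborhoods $N_i$ only and makes no assumption about how the per-node edge sets are merged into a final network.

```python
import math
from statistics import NormalDist


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def ns_lambda(x_i, alpha, p):
    """lambda_i(alpha) = ||X_i||_2 * Phi_tilde^{-1}(alpha / (2 p^2))."""
    norm = math.sqrt(sum(v * v for v in x_i))
    # Phi_tilde = 1 - Phi, so Phi_tilde^{-1}(q) equals Phi^{-1}(1 - q).
    return norm * NormalDist().inv_cdf(1.0 - alpha / (2.0 * p * p))


def lasso_for_node(cols, i, lam, n_sweeps=100):
    """Minimise 0.5 * ||X_i - X beta||_2^2 + lam * ||beta||_1 with beta_i = 0."""
    p, n = len(cols), len(cols[i])
    beta = [0.0] * p
    resid = list(cols[i])  # residual X_i - X beta, starting from beta = 0
    sq = [sum(v * v for v in c) for c in cols]  # squared column norms
    for _ in range(n_sweeps):
        for k in range(p):
            if k == i or sq[k] == 0.0:
                continue  # beta_i stays 0; all-zero columns are skipped
            # Inner product of column k with the residual that excludes beta_k.
            rho = sum(cols[k][t] * resid[t] for t in range(n)) + sq[k] * beta[k]
            new = soft_threshold(rho, lam) / sq[k]
            if new != beta[k]:
                diff = new - beta[k]
                for t in range(n):
                    resid[t] -= diff * cols[k][t]
                beta[k] = new
    return beta


def neighborhood_selection(cols, alpha):
    """Return the neighborhood N_i of every node i as a list of index sets."""
    p = len(cols)
    result = []
    for i in range(p):
        lam = ns_lambda(cols[i], alpha, p)
        beta = lasso_for_node(cols, i, lam)
        result.append({k for k, b in enumerate(beta) if b != 0.0})
    return result


# Toy data (hypothetical): 16 samples, 3 genes, one list per gene.
base = [1.0, -1.0] * 8
shift = ([1.0] * 4 + [-1.0] * 4) * 2
gene0 = base
gene1 = [b + 0.1 * s for b, s in zip(base, shift)]  # nearly a copy of gene0
gene2 = [1.0, 1.0, -1.0, -1.0] * 4                  # orthogonal to gene0 and gene1
cols = [gene0, gene1, gene2]
print(neighborhood_selection(cols, alpha=0.05))
```

In this toy example the first two genes are almost identical and the third is orthogonal to both, so only the first two should appear in each other's neighborhood. A real analysis would use a tuned solver such as the R package named above.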

References:
1. Yu, Donghyeon, Johan Lim, Xinlei Wang, Faming Liang, and Guanghua Xiao. "Enhanced construction of gene regulatory networks using hub gene information." BMC Bioinformatics 18, no. 1 (2017): 186.
2. Meinshausen, Nicolai, and Peter Bühlmann. "High-dimensional graphs and variable selection with the lasso." The Annals of Statistics 34, no. 3 (2006): 1436-1462.

Note:
Change the $\alpha$ level $(\alpha > 0)$ to control the false positive rate. A larger $\alpha$ yields more estimated edges, but with lower confidence. If you are unsure which value to choose, use the default.
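To see why a larger $\alpha$ produces more edges, it can help to look at the $\alpha$-dependent factor $\tilde{\Phi}^{-1}(\alpha / (2p^2))$ of the tuning parameter: it shrinks as $\alpha$ grows, so the lasso penalty is weaker and more coefficients stay nonzero. The sketch below is illustrative only; the function name `threshold_quantile`, the gene count `p = 100`, and the listed $\alpha$ values are assumptions, not GeNeCK defaults.

```python
from statistics import NormalDist


def threshold_quantile(alpha, p):
    """Return Phi_tilde^{-1}(alpha / (2 p^2)), where Phi_tilde = 1 - Phi."""
    tail = alpha / (2.0 * p * p)
    if not 0.0 < tail < 1.0:
        raise ValueError("alpha / (2 p^2) must lie strictly between 0 and 1")
    # Phi_tilde^{-1}(q) equals Phi^{-1}(1 - q) for the standard normal.
    return NormalDist().inv_cdf(1.0 - tail)


p = 100  # hypothetical number of genes
for alpha in (0.01, 0.05, 0.2):
    # The factor decreases as alpha increases, so lambda_i(alpha) decreases too.
    print(alpha, round(threshold_quantile(alpha, p), 3))
```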

Data & parameters (Required)

Gene expression data: Only CSV files are accepted, and the maximum size is 12MB. The first row is used as the gene names. The remaining rows must be numeric. Each sample (row) must contain the same number of columns (genes) as the first row. Each sample will be scaled to have mean 0 and standard deviation 1.
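The file requirements above can be checked locally before uploading. The following is a rough sketch under stated assumptions, not the server's validation code: the function names are hypothetical, 12MB is taken to mean 12 * 1024 * 1024 bytes, blank lines are ignored, and the scaling follows the wording literally (per sample row) using the sample standard deviation.

```python
import csv
import io
import statistics

MAX_BYTES = 12 * 1024 * 1024  # assumed interpretation of the 12MB limit


def load_expression_csv(text):
    """Parse CSV text into (gene_names, samples) and enforce the requirements."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("file exceeds the 12MB limit")
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) < 2:
        raise ValueError("need a header row and at least one sample row")
    genes, samples = rows[0], []
    for idx, row in enumerate(rows[1:], start=1):
        if len(row) != len(genes):
            raise ValueError(f"sample {idx}: expected {len(genes)} columns")
        try:
            samples.append([float(v) for v in row])
        except ValueError:
            raise ValueError(f"sample {idx}: non-numeric value") from None
    return genes, samples


def scale_sample(values):
    """Scale one sample to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(values)
    centered = [v - mean for v in values]
    if len(values) < 2:
        return centered  # standard deviation is undefined for a single value
    sd = statistics.stdev(values)
    if sd == 0.0:
        return centered  # constant sample: only centering is possible
    return [c / sd for c in centered]


demo = "g1,g2,g3\n1,2,3\n4,6,8\n"  # hypothetical toy file
genes, samples = load_expression_csv(demo)
scaled = [scale_sample(s) for s in samples]
print(genes, scaled)
```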
Alpha:


User information (optional)

Name:
Organization:
Email:
	You will be notified through e-mail when you submit your job.



Gene Network Construction Tool Kit @ QBRC
