SiGN-Proc Manual

Introduction

SiGN-Proc is a command-line (CUI) tool for processing gene network files. It can convert file formats, extract subnetworks, color specified nodes, and so on.

In SiGN-Proc, you process networks by specifying filters. Many filters are available: for example, one reads a network from a file, and another extracts a subnetwork. You can specify multiple filters, in which case the network processed by one filter is passed to the next filter.

SiGN-Proc for Linux x86-64 is available on the Download page of SiGN-BN.

Note: This document is currently under development.

Synopsis

Direct execution on the computation node (interactive job) of the HGC supercomputer system

~tamada/sign/signproc [ options ] [ filters... ]

Execution as a Grid Engine job on the HGC supercomputer system

qsub [ GE options ] ~tamada/sign/signbn-hcbs.sh --bin signproc [ options ] [ filters ]

GE options

See the manual for SiGN-BN.

Options

--help

Show help message.

Example

Here are some examples. Specify the following as filters above when you execute SiGN-Proc.

File format conversion

--read type=sgn3,file=network.sgn3 --output type=csml,file=network.csml

This example converts the network file network.sgn3 written in the SGN3 format into the CSML format file network.csml. See File Formats for details of the available file formats.

Subnetwork extraction

--read type=csml,file=network.csml --subnet node=IL6,dist=1 --output type=csml,file=subnet.csml

This extracts IL6 and its parents and children from network.csml.

Filters

A filter is specified by --filter_name followed by its arguments in key=value style, joined by commas. White space may be inserted after each comma.

Execute "signproc --help filter" for the full list of available filters, and "signproc --help filter,filter_name" for the detailed description of filter_name.
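The filter syntax described above can be sketched as a small Python helper that assembles the command-line tokens for a chain of filters. The function names format_filter and build_chain are hypothetical and not part of SiGN-Proc; the sketch only joins key=value pairs with commas and does not handle value-less arguments such as dynamic.

```python
def format_filter(name, **args):
    """Return the command-line tokens for one filter: ['--name', 'k1=v1,k2=v2']."""
    joined = ",".join(f"{key}={value}" for key, value in args.items())
    return [f"--{name}", joined] if joined else [f"--{name}"]


def build_chain(*filters):
    """Flatten several (name, args) pairs into one token list, keeping their order."""
    tokens = []
    for name, args in filters:
        tokens.extend(format_filter(name, **args))
    return tokens


# Rebuild the subnetwork extraction example from the Example section.
chain = build_chain(
    ("read", {"type": "csml", "file": "network.csml"}),
    ("subnet", {"node": "IL6", "dist": 1}),
    ("output", {"type": "csml", "file": "subnet.csml"}),
)
print(" ".join(chain))
```

The printed line can be appended to the signproc command shown in the Synopsis. The order of the pairs matters, because each filter receives the network produced by the previous one.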


Edge Property filter (--edgeprop)

This filter removes all edges except those whose property, specified by name, satisfies the condition given by op and value.

name=property_name

type=property_type

Property type, such as int, double, or string.

op= { eq | ge | gt | le | lt }

Operator. The values correspond to equal, greater than or equal, greater than, less than or equal, and less than, respectively.

value= value

String representation of the value to compare against.

noprop= { stop | remove | ignore }

What to do if the property is not found in an edge.

Examples:

--edgeprop name=BS.Prob,type=double,op=gt,value=0.5

This removes edges whose bootstrap probabilities are less than or equal to 0.5.

--edgeprop name=up/down,type=string,op=eq,value=up

This keeps only the edges that are estimated as up-regulated ones.
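The behavior of the edge property filter can be illustrated with the following Python sketch. It is not the SiGN-Proc implementation. The edge representation (parent, child, property dictionary), the function name edgeprop, and the converter table are hypothetical. The handling of noprop is an assumption, since the manual does not spell it out: stop raises an error, remove drops the edge, and ignore keeps the edge untouched.

```python
import operator

# The five operators listed for the op argument.
OPS = {
    "eq": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
    "le": operator.le,
    "lt": operator.lt,
}

# Hypothetical converters for three of the property types.
CASTS = {"int": int, "double": float, "string": str}


def edgeprop(edges, name, ptype, op, value, noprop="stop"):
    """Keep the edges whose property `name` satisfies `op value`.

    The default noprop="stop" is chosen for this sketch only.
    """
    cast = CASTS[ptype]
    compare = OPS[op]
    target = cast(value)
    kept = []
    for parent, child, props in edges:
        if name not in props:
            if noprop == "stop":
                raise KeyError(f"edge {parent}->{child} has no property {name!r}")
            if noprop == "ignore":
                kept.append((parent, child, props))  # assumption: keep as-is
            continue  # noprop == "remove" (assumption): drop the edge
        if compare(cast(props[name]), target):
            kept.append((parent, child, props))
    return kept


edges = [
    ("A", "B", {"BS.Prob": "0.8"}),
    ("B", "C", {"BS.Prob": "0.5"}),
    ("C", "D", {"BS.Prob": "0.2"}),
]
kept = edgeprop(edges, "BS.Prob", "double", "gt", "0.5")
print([(p, c) for p, c, _ in kept])
```

With op=gt and value=0.5, the edge whose probability is exactly 0.5 is removed, which matches the first example above.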


Read filter (--read)

This filter reads a network, data frame, or dataset from a file, and then passes it to the next filter.

type= { edf | frame | network_format }

Type of a file to read. See File Formats for the network file format.

file= file_name


Subnet filter (--subnet)

The subnet filter extracts a subnetwork based on node names given by a file or by a list of node names.

file=file_name

Name of a file containing a list of node names. Each row in the file corresponds to a node name. The file can be tab-separated; in that case, the column position can be specified by the col option below.

col=n

Column position of the node names in the node list file. The first (left-most) column is 1 (default).

node=node1:node2:...

Instead of specifying a list of nodes by a file, you can give them directly by this option. If this is specified, the file option is ignored.

dist=n

Distance from the nodes in the above list within which nodes and edges are included. Nodes adjacent to a specified node are at distance 1 (default).

type=induce

If this is specified, only edges connecting nodes that exist in the list are returned.
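The distance-based extraction can be pictured as a breadth-first search that follows edges in both directions, so that parents and children are both reached. The Python sketch below is an illustration under assumptions, not the actual implementation: the function name subnet and the edge representation are hypothetical, an edge is kept when at least one of its endpoints lies closer than dist to a listed node, and type=induce is not modeled.

```python
from collections import deque


def subnet(edges, seeds, dist=1):
    """Return (nodes, edges) within `dist` steps of the seed nodes."""
    # Undirected adjacency, so that both parents and children are reached.
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    # Multi-source breadth-first search; every seed starts at distance 0.
    distance = {node: 0 for node in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == dist:
            continue  # do not expand beyond the requested distance
        for other in neighbors.get(node, ()):
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)

    # Assumption: keep an edge if one endpoint is closer than `dist` to a seed.
    kept_edges = [
        (p, c)
        for p, c in edges
        if min(distance.get(p, dist), distance.get(c, dist)) < dist
    ]
    return set(distance), kept_edges


edges = [("A", "IL6"), ("IL6", "B"), ("B", "C"), ("D", "A")]
nodes, kept = subnet(edges, ["IL6"], dist=1)
print(sorted(nodes), kept)
```

In this toy network, dist=1 returns IL6 together with its parent A and its child B, while dist=2 would also reach C and D.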

Output filter (--output)

Outputs a network to a file. This filter passes the received network to the next filter without modifying or processing it. Therefore, by using multiple output filters, you can write a network to files in various formats.

file=file_name

Output file name. The path can be included in file_name.

type=network_format

File format of the output. See File Formats for the available network file formats. By default, csml is assumed.

args=\{key1=value1,...\}

Arguments of the file format.

BS filter (--bs)

The BS filter compiles bootstrapped networks into a single network. It expects the file names to be consecutively numbered with a fixed prefix: for example, by default, file_name_prefix.000001, file_name_prefix.000002, ..., file_name_prefix.001000. Files that are not found are simply ignored. Edges whose bootstrap probabilities are greater than the threshold are included in the output network.

prefix=file_name_prefix

The prefix of the file names to process. This filter expects that the file names are in the form of file_name_prefix.000000 where 000000 is a 6-digit bootstrap ID.

dynamic

If specified, the bootstrapped networks to process are expected to be dynamic models.

bg=n

The beginning index (ID) of the file names to process. By default, n = 1.

ed=n

The ending index (ID) of the file names to process. By default, n = 1000.

th=threshold

The threshold of the bootstrap probability. Edges whose bootstrap probabilities are greater than the threshold will be included in the resultant network. By default, threshold = 0.05.

ver=n

Version. 1 and 2 are available. Do not mind details ;-). By default, n = 2.
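As a rough illustration of what the BS filter does, the Python sketch below generates the expected file names and applies the threshold to edge frequencies. It is a sketch under assumptions, not the actual implementation: the function names are hypothetical, each bootstrapped network is represented as a set of (parent, child) pairs, and the bootstrap probability of an edge is assumed to be the fraction of the processed networks that contain it.

```python
from collections import Counter


def bootstrap_file_names(prefix, begin=1, end=1000):
    """File names with a 6-digit bootstrap ID, e.g. prefix.000001."""
    return [f"{prefix}.{i:06d}" for i in range(begin, end + 1)]


def compile_bootstrap(networks, th=0.05):
    """Return {edge: probability} for edges whose probability exceeds th."""
    counts = Counter(edge for net in networks for edge in set(net))
    total = len(networks)
    return {
        edge: n / total
        for edge, n in counts.items()
        if n / total > th  # strictly greater than the threshold
    }


print(bootstrap_file_names("file_name_prefix", 1, 3))

networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "D")},
    {("A", "B")},
]
print(compile_bootstrap(networks, th=0.25))
```

Because the comparison is strict, the edges that appear in exactly one of the four networks (probability 0.25) are excluded when the threshold is 0.25.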

Node Color filter (--nodecolor)

file=file_name

A file listing, one per line, the names of the nodes to color. The column position of the node names in a line can be specified by the col argument below.

col=n

color=r:g:b [:a ]

Color of the nodes in RGB. Integer values from 0 to 255 are accepted for r (red), g (green), and b (blue). The alpha value (a) can optionally be specified; if it is omitted, a=255 is assumed.
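The color argument format can be illustrated with a small Python parser. The function name parse_color is hypothetical, and applying the 0 to 255 range to the alpha value as well is an assumption of this sketch.

```python
def parse_color(text):
    """Parse 'r:g:b' or 'r:g:b:a' into an (r, g, b, a) tuple of integers."""
    parts = [int(part) for part in text.split(":")]
    if len(parts) == 3:
        parts.append(255)  # alpha defaults to 255 when omitted
    if len(parts) != 4:
        raise ValueError(f"expected r:g:b or r:g:b:a, got {text!r}")
    for value in parts:
        if not 0 <= value <= 255:  # range check on alpha is an assumption
            raise ValueError(f"component out of range 0-255: {value}")
    return tuple(parts)


print(parse_color("255:0:0"))
print(parse_color("0:128:255:64"))
```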

Comp filter (--comp)

The Comp filter compares two network structures. It does not change the network. Instead, it compares the network with another one specified by the arguments and then prints the comparison result. It regards the network given by the arguments as the true network and counts the numbers of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) edges in the network passed by SiGN-Proc.

The TP edges are those that exist in both networks. The FP edges are those that exist only in the network passed by SiGN-Proc. The FN edges are those that exist only in the true network given by the arguments.

The TP, FP, FN edges can be saved as separate network files by specifying the tp, fp, and fn arguments.
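The TP, FP, and FN definitions above correspond to simple set operations on edge sets, as the following Python sketch shows. The function name compare_networks is hypothetical, and treating each edge as an ordered (parent, child) pair is an assumption. TN is left out because the manual does not define how absent edges are counted.

```python
def compare_networks(estimated, true):
    """Split edges into TP, FP, and FN sets relative to the true network."""
    estimated, true = set(estimated), set(true)
    tp = estimated & true   # edges in both networks
    fp = estimated - true   # edges only in the estimated (passed) network
    fn = true - estimated   # edges only in the true network
    return tp, fp, fn


estimated = [("A", "B"), ("B", "C"), ("C", "D")]
true = [("A", "B"), ("C", "B"), ("C", "D")]
tp, fp, fn = compare_networks(estimated, true)
print(len(tp), len(fp), len(fn))
```

In this sketch, an edge with reversed direction (B to C versus C to B) counts once as an FP edge and once as an FN edge.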

file=file_name

The file name of the true network to read and compare with.

type=file_type

The type of the file format to read. See File Formats for the available file formats.

args=\{key1=value1,...\}

Arguments of the file format.

tp=file_name

If specified, TP edges, i.e., edges that exist in both networks, are saved as a separate network in a file file_name.

tptype=file_type

File type of the TP network.

tpargs=\{key1=value1,...\}

Arguments of the file format for the TP network.

fp=file_name

If specified, FP edges, i.e., edges that exist only in the network passed by SiGN-Proc, are saved as a separate network in a file file_name.

fptype=file_type

File type of the FP network.

fpargs=\{key1=value1,...\}

Arguments of the file format for the FP network.

fn=file_name

If specified, FN edges, i.e., edges that exist only in the network given by the file argument above, are saved as a separate network in a file file_name.

fntype=file_type

File type of the FN network.

fnargs=\{key1=value1,...\}

Arguments of the file format for the FN network.

Score filter (--score)

The score filter calculates the network score. It does not change the network structure at all; it prints the calculated score to the standard output.

data=file_name

The file name of the input data matrix.

score_args=\{key1=value1,...\}

Arguments for the score function.

dynamic

Specifies that the dynamic model is used.

mem=memory_size

Specifies the memory size in MiB for the score calculation.

Change History

ver. 0.22.7 (2021-03-29 Mon)

ver. 0.22.6 (2020-10-08 Thu)

ver. 0.22.5 (2019-07-12 Fri)

ver. 0.21.1 (2018-02-07 Wed)

ver. 0.19.0 (2014-08-21 Thu)

ver. 0.16.0 (2014-01-21 Tue)


Copyright © 2012-2021
SiGN Project members.
All Rights Reserved.
Contact: Yoshinori Tamada <tamada ATMARK ytlab.jp>