Command Line Interface
CLEP commands.
clep
Run clep.
clep [OPTIONS] COMMAND [ARGS]...
classify
Perform machine-learning classification.
clep classify [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --out <out>
Required. Path to the output folder
- --model <model>
Required. Choose a classification model
- Options
logistic_regression | elastic_net | svm | random_forest | gradient_boost
- --optimizer <optimizer>
Required. Optimizer used for the classifier.
- Options
grid_search | random_search | bayesian_search
- --cv <cv>
Number of cross validation steps
- Default
5
- -m, --metrics <metrics>
Metrics to evaluate during cross-validation (comma-separated)
- Options
explained_variance | r2 | max_error | neg_median_absolute_error | neg_mean_absolute_error | neg_mean_squared_error | neg_mean_squared_log_error | neg_root_mean_squared_error | neg_mean_poisson_deviance | neg_mean_gamma_deviance | accuracy | roc_auc | roc_auc_ovr | roc_auc_ovo | roc_auc_ovr_weighted | roc_auc_ovo_weighted | balanced_accuracy | average_precision | neg_log_loss | neg_brier_score | adjusted_rand_score | homogeneity_score | completeness_score | v_measure_score | mutual_info_score | adjusted_mutual_info_score | normalized_mutual_info_score | fowlkes_mallows_score | precision | precision_macro | precision_micro | precision_samples | precision_weighted | recall | recall_macro | recall_micro | recall_samples | recall_weighted | f1 | f1_macro | f1_micro | f1_samples | f1_weighted | jaccard | jaccard_macro | jaccard_micro | jaccard_samples | jaccard_weighted
- --randomize
Randomize sample labels to test the stability and effectiveness of the machine learning algorithm
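As an illustration of how the classify options combine, the following Python sketch assembles a clep classify invocation as an argument list and checks the model and optimizer against the choices listed above. The helper name build_classify_command and the file paths are hypothetical, and the sketch only prints the command instead of running it.

```python
import shlex

# Choices copied from the option lists above.
MODELS = {"logistic_regression", "elastic_net", "svm", "random_forest", "gradient_boost"}
OPTIMIZERS = {"grid_search", "random_search", "bayesian_search"}


def build_classify_command(data, out, model, optimizer, cv=5, metrics=(), randomize=False):
    """Return the argv list for a `clep classify` call (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if optimizer not in OPTIMIZERS:
        raise ValueError(f"unknown optimizer: {optimizer}")
    argv = ["clep", "classify", "--data", data, "--out", out,
            "--model", model, "--optimizer", optimizer, "--cv", str(cv)]
    if metrics:
        # --metrics takes a single comma-separated value.
        argv += ["--metrics", ",".join(metrics)]
    if randomize:
        argv.append("--randomize")
    return argv


# Placeholder paths; replace them with real files before running the command.
cmd = build_classify_command("data.tsv", "results/", "elastic_net", "grid_search",
                             metrics=["roc_auc", "accuracy"])
print(shlex.join(cmd))
```

Building the command as a list keeps each option and its value separate, so the same list could be passed to subprocess.run once clep is installed.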
embedding
List the available vectorization methods.
clep embedding [OPTIONS] COMMAND [ARGS]...
evaluate
Evaluate the embeddings.
clep embedding evaluate [OPTIONS]
Options
- --data <data>
Required. Path to a set of binned files
- --label <label>
Required. Label for the set of binned files
generate-network
Generate a network for the given data.
clep embedding generate-network [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --out <out>
Required. Path to the output folder
- --method <method>
The method used to generate the network
- Default
interaction_network
- Options
pathway_overlap | interaction_network | interaction_network_overlap
- --kg <kg>
Path to the knowledge graph file in TSV format, used when the Interaction Network method is chosen
- --gmt <gmt>
Path to the GMT file, used when the Pathway Overlap method is chosen
- --network_folder <network_folder>
Path to the folder containing all the knowledge graph files, used when the Interaction Network Overlap method is chosen
- --intersect_thr <intersect_thr>
Threshold for creating edges in the Pathway Overlap method
- Default
0.1
- -rs, --ret_summary
Flag to indicate if the edge summary for patients must be created.
- Default
False
- --jaccard_thr <jaccard_thr>
Threshold for creating edges in the Interaction Network Overlap method
- Default
0.1
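The option descriptions above tie each --method value to one input option and, for two of the methods, to one threshold option. The Python sketch below encodes that pairing and builds the matching argument list. It is a reading of this reference only: the helper name build_generate_network_command and the file names are hypothetical, and the command is not executed.

```python
# For each --method value: (input option, threshold option or None),
# as described in the option list above.
METHOD_OPTIONS = {
    "interaction_network": ("--kg", None),
    "pathway_overlap": ("--gmt", "--intersect_thr"),
    "interaction_network_overlap": ("--network_folder", "--jaccard_thr"),
}


def build_generate_network_command(data, out, method, source, threshold=None):
    """Return the argv list for `clep embedding generate-network` (hypothetical helper)."""
    if method not in METHOD_OPTIONS:
        raise ValueError(f"unknown method: {method}")
    source_option, threshold_option = METHOD_OPTIONS[method]
    argv = ["clep", "embedding", "generate-network",
            "--data", data, "--out", out,
            "--method", method, source_option, source]
    if threshold is not None:
        if threshold_option is None:
            raise ValueError(f"no threshold option is documented for {method}")
        argv += [threshold_option, str(threshold)]
    return argv


# Placeholder file names, for illustration only.
network_cmd = build_generate_network_command(
    "scores.tsv", "network/", "pathway_overlap", "pathways.gmt", threshold=0.2)
print(" ".join(network_cmd))
```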
kge
Perform knowledge graph embedding.
clep embedding kge [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --design <design>
Required. Path to tab-separated experiment design file
- --out <out>
Required. Path to the output folder
- --all_nodes
Use this flag to return all nodes (not just patients)
- Default
False
- -m, --model_config <model_config>
Required. The configuration file, in JSON format, for the model used for knowledge graph embedding
- --train_size <train_size>
Size of the training data for the knowledge graph embedding model
- Default
0.8
- --validation_size <validation_size>
Size of the validation data for the knowledge graph embedding model
- Default
0.1
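With the defaults above, --train_size and --validation_size add up to 0.9. The Python sketch below shows one way such values could translate into split sizes. It rests on two assumptions that this reference does not confirm: that both values are fractions of the data, and that whatever remains is held out as a test set. The function name split_sizes is hypothetical.

```python
def split_sizes(n_items, train_size=0.8, validation_size=0.1):
    """Split n_items into (train, validation, test) counts.

    Assumption: the sizes are fractions and the remainder forms the test set.
    """
    if not 0 < train_size < 1:
        raise ValueError("train_size must be between 0 and 1")
    if not 0 <= validation_size < 1:
        raise ValueError("validation_size must be between 0 and 1")
    if train_size + validation_size > 1:
        raise ValueError("train_size + validation_size must not exceed 1")
    n_train = round(n_items * train_size)
    n_validation = round(n_items * validation_size)
    # Whatever is left after training and validation is treated as test data.
    n_test = n_items - n_train - n_validation
    return n_train, n_validation, n_test


sizes = split_sizes(1000)
print(sizes)
```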
sample-scoring
List the available single sample scoring methods.
clep sample-scoring [OPTIONS] COMMAND [ARGS]...
limma
Limma-based Single Sample Scoring
clep sample-scoring limma [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --design <design>
Required. Path to tab-separated experiment design file
- --out <out>
Required. Path to the output folder
- --alpha <alpha>
Family-wise error rate
- Default
0.05
- --method <method>
Method used for testing and adjustment of p-values
- Default
fdr_bh
- --control <control>
Annotated value for the control samples (must start with a letter)
- Default
Control
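The default --method value, fdr_bh, is the name commonly used for the Benjamini-Hochberg procedure. The pure-Python sketch below shows what that adjustment does to a list of p-values, assuming the name carries this usual meaning here. It is not CLEP's implementation, and the function name benjamini_hochberg and the example p-values are illustrative only.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return (adjusted p-values, rejection flags) in the input order."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    rejected = [p <= alpha for p in adjusted]
    return adjusted, rejected


# Made-up p-values, for illustration only.
adjusted, rejected = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted, rejected)
```

In this example only the first p-value stays below the default alpha of 0.05 after adjustment.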
radical-search
Radical Searching-based Single Sample Scoring
clep sample-scoring radical-search [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --design <design>
Required. Path to tab-separated experiment design file
- --out <out>
Required. Path to the output folder
- --control <control>
Annotated value for the control samples (must start with a letter)
- Default
Control
- --threshold <threshold>
Percentage of samples considered as ‘extreme’ on either side of the distribution
- Default
2.5
- -rs, --ret_summary
Flag to indicate if the edge summary for patients must be created.
- Default
False
- -cb, --control_based
Run Radical Searching where the scoring is based on the control population instead of the entire dataset
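To make the --threshold and --control_based options concrete, the Python sketch below marks a value as extreme when it falls within the lowest or highest threshold percent of a reference distribution. The reference would be the entire dataset by default, or only the control samples with --control_based. The -1/0/1 encoding, the tie handling, and the function name radical_scores are assumptions made for illustration and are not taken from CLEP.

```python
def radical_scores(values, reference, threshold=2.5):
    """Score each value as -1 (low extreme), 1 (high extreme) or 0.

    `reference` is the distribution used for comparison: all samples, or
    only the controls when scoring is control-based (assumed encoding).
    """
    n = len(reference)
    cutoff = threshold / 100
    scores = []
    for value in values:
        # Fraction of the reference at or below / at or above this value.
        at_or_below = sum(r <= value for r in reference) / n
        at_or_above = sum(r >= value for r in reference) / n
        if at_or_below <= cutoff:
            scores.append(-1)
        elif at_or_above <= cutoff:
            scores.append(1)
        else:
            scores.append(0)
    return scores


# Made-up reference distribution: the integers 1 to 100.
reference = list(range(1, 101))
example_scores = radical_scores([1, 2, 3, 50, 98, 99, 100], reference)
print(example_scores)
```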
ssgsea
ssGSEA-based Single Sample Scoring
clep sample-scoring ssgsea [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --design <design>
Required. Path to tab-separated experiment design file
- --out <out>
Required. Path to the output folder
- --gs <gs>
Required. Path to the .gmt geneset file
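The --gs option expects a .gmt geneset file. In the widely used GMT layout, each line holds a gene set name, a description, and then the member genes, all separated by tabs. The Python sketch below reads that layout into a dictionary, which can be handy for checking a file before passing it to clep. The function name parse_gmt and the example gene sets are hypothetical, and how CLEP itself reads the file is not described here.

```python
def parse_gmt(lines):
    """Map each gene set name to its list of genes.

    Expects tab-separated lines: name, description, gene1, gene2, ...
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank lines and lines without any genes.
            continue
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


# Made-up content; with a real file, pass the open file object instead.
example_lines = [
    "PATHWAY_A\texample description\tTP53\tBRCA1\tEGFR\n",
    "\n",
    "PATHWAY_B\tanother description\tMYC\tKRAS\n",
]
gene_sets = parse_gmt(example_lines)
print(gene_sets)
```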
z-score
Z-Score-based Single Sample Scoring
clep sample-scoring z-score [OPTIONS]
Options
- --data <data>
Required. Path to tab-separated gene expression data file
- --design <design>
Required. Path to tab-separated experiment design file
- --out <out>
Required. Path to the output folder
- --control <control>
Annotated value for the control samples (must start with a letter)
- Default
Control
- --threshold <threshold>
Threshold for choosing patients that are ‘extreme’ w.r.t. the controls. If the z_score of a gene is greater than this threshold, the gene is considered either up- or down-regulated.
- Default
2.0
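As a rough illustration of the --threshold option, the Python sketch below computes a gene's z-score for each patient relative to the control samples and compares its magnitude with the threshold. The 1/-1/0 encoding, the use of the sample standard deviation, and the function name z_score_calls are assumptions made for this sketch, not details taken from CLEP.

```python
import statistics


def z_score_calls(patient_values, control_values, threshold=2.0):
    """Return 1 (up), -1 (down) or 0 per patient value for a single gene.

    z-scores are computed against the control mean and sample standard
    deviation (assumed convention).
    """
    control_mean = statistics.mean(control_values)
    control_sd = statistics.stdev(control_values)
    calls = []
    for value in patient_values:
        z = (value - control_mean) / control_sd
        if z > threshold:
            calls.append(1)
        elif z < -threshold:
            calls.append(-1)
        else:
            calls.append(0)
    return calls


# Made-up expression values for one gene.
controls = [10, 12, 11, 9, 13]
calls = z_score_calls([15, 11.5, 7], controls)
print(calls)
```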