Command Line Interface
CLEP commands.
clep
Run clep.
clep [OPTIONS] COMMAND [ARGS]...
classify
Perform machine-learning classification.
clep classify [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --out <out>: Required. Path to the output folder.
- --model <model>: Required. Classification model to use.
  - Options: logistic_regression|elastic_net|svm|random_forest|gradient_boost
- --optimizer <optimizer>: Required. Optimizer used for the classifier.
  - Options: grid_search|random_search|bayesian_search
- --cv <cv>: Number of cross-validation folds.
  - Default: 5
- -m, --metrics <metrics>: Metrics to evaluate during cross-validation (comma-separated).
  - Options: explained_variance|r2|max_error|neg_median_absolute_error|neg_mean_absolute_error|neg_mean_squared_error|neg_mean_squared_log_error|neg_root_mean_squared_error|neg_mean_poisson_deviance|neg_mean_gamma_deviance|accuracy|roc_auc|roc_auc_ovr|roc_auc_ovo|roc_auc_ovr_weighted|roc_auc_ovo_weighted|balanced_accuracy|average_precision|neg_log_loss|neg_brier_score|adjusted_rand_score|homogeneity_score|completeness_score|v_measure_score|mutual_info_score|adjusted_mutual_info_score|normalized_mutual_info_score|fowlkes_mallows_score|precision|precision_macro|precision_micro|precision_samples|precision_weighted|recall|recall_macro|recall_micro|recall_samples|recall_weighted|f1|f1_macro|f1_micro|f1_samples|f1_weighted|jaccard|jaccard_macro|jaccard_micro|jaccard_samples|jaccard_weighted
- --randomize: Randomize sample labels to test the stability and effectiveness of the machine learning algorithm.
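As a rough illustration of how the classify options fit together, the sketch below assembles an argument list for `clep classify` in Python without running it. The helper name `build_classify_command` and the file paths are hypothetical; the model and optimizer choices are copied from the option list, and joining metrics with commas follows the "comma separated" wording of the `--metrics` description.

```python
import shlex

# Choices copied from the classify option list.
MODELS = {"logistic_regression", "elastic_net", "svm", "random_forest", "gradient_boost"}
OPTIMIZERS = {"grid_search", "random_search", "bayesian_search"}


def build_classify_command(data, out, model, optimizer, cv=5, metrics=None, randomize=False):
    """Assemble the argument list for `clep classify` (hypothetical helper, does not run it)."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if optimizer not in OPTIMIZERS:
        raise ValueError(f"unknown optimizer: {optimizer}")
    argv = ["clep", "classify", "--data", data, "--out", out,
            "--model", model, "--optimizer", optimizer, "--cv", str(cv)]
    if metrics:
        # Assumption: several metrics are passed as one comma-separated value.
        argv += ["--metrics", ",".join(metrics)]
    if randomize:
        argv.append("--randomize")
    return argv


# Hypothetical paths, for illustration only.
command = build_classify_command(
    data="scores.tsv", out="results", model="elastic_net",
    optimizer="grid_search", metrics=["roc_auc", "accuracy"],
)
print(shlex.join(command))
```

The resulting list could be handed to `subprocess.run(command, check=True)` on a machine where `clep` is installed.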
embedding
List the available vectorization methods.
clep embedding [OPTIONS] COMMAND [ARGS]...
evaluate
Evaluate the embeddings.
clep embedding evaluate [OPTIONS]
Options
- --data <data>: Required. Path to a set of binned files.
- --label <label>: Required. Label for the set of binned files.
generate-network
Generate a network for the given data.
clep embedding generate-network [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --out <out>: Required. Path to the output folder.
- --method <method>: Method used to generate the network.
  - Default: interaction_network
  - Options: pathway_overlap|interaction_network|interaction_network_overlap
- --kg <kg>: Path to the knowledge graph file in TSV format, used when the Interaction Network method is chosen.
- --gmt <gmt>: Path to the GMT file, used when the Pathway Overlap method is chosen.
- --network_folder <network_folder>: Path to the folder containing all the knowledge graph files, used when the Interaction Network Overlap method is chosen.
- --intersect_thr <intersect_thr>: Threshold for creating edges in the Pathway Overlap method.
  - Default: 0.1
- -rs, --ret_summary: Flag indicating whether the edge summary for patients should be created.
  - Default: False
- --jaccard_thr <jaccard_thr>: Threshold for creating edges in the Interaction Network Overlap method.
  - Default: 0.1
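To give a feel for what an overlap threshold such as `--jaccard_thr` might control, here is a minimal sketch that links two sets whenever their Jaccard index reaches the threshold. This is an assumption made for illustration: the page does not say which similarity measure CLEP computes or whether the comparison is inclusive, and the function names and pathway data are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_edges(gene_sets, threshold=0.1):
    """Return (name1, name2, score) for each pair scoring at least `threshold`."""
    edges = []
    for n1, n2 in combinations(sorted(gene_sets), 2):
        score = jaccard(gene_sets[n1], gene_sets[n2])
        # Assumption: the threshold is inclusive.
        if score >= threshold:
            edges.append((n1, n2, score))
    return edges


# Made-up pathways and gene identifiers.
pathways = {
    "pathway_a": {"G1", "G2", "G3"},
    "pathway_b": {"G2", "G3", "G4"},
    "pathway_c": {"G9"},
}
edges = overlap_edges(pathways, threshold=0.1)
print(edges)
```

With these toy sets only `pathway_a` and `pathway_b` share genes, so they are the only pair that can be connected.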
kge
Perform knowledge graph embedding.
clep embedding kge [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --design <design>: Required. Path to the tab-separated experiment design file.
- --out <out>: Required. Path to the output folder.
- --all_nodes: Use this flag to return all nodes (not just patients).
  - Default: False
- -m, --model_config <model_config>: Required. Configuration file, in JSON format, for the model used for knowledge graph embedding.
- --train_size <train_size>: Size of the training data for the knowledge graph embedding model.
  - Default: 0.8
- --validation_size <validation_size>: Size of the validation data for the knowledge graph embedding model.
  - Default: 0.1
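The defaults of 0.8 and 0.1 read like fractions of the data. Under that reading, and under the further assumption (not stated on this page) that whatever remains is held out for testing, the split sizes could be worked out as in the sketch below. `split_counts` is a hypothetical helper, not part of CLEP.

```python
def split_counts(n_items, train_size=0.8, validation_size=0.1):
    """Turn fractional sizes into train/validation/test counts.

    Assumption: the remainder after training and validation is the test set.
    """
    if not 0 < train_size + validation_size <= 1:
        raise ValueError("train_size + validation_size must be in (0, 1]")
    n_train = round(n_items * train_size)
    n_validation = round(n_items * validation_size)
    n_test = n_items - n_train - n_validation
    return n_train, n_validation, n_test


counts = split_counts(1000)
print(counts)
```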
sample-scoring
List the available single sample scoring methods.
clep sample-scoring [OPTIONS] COMMAND [ARGS]...
limma
Limma-based Single Sample Scoring
clep sample-scoring limma [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --design <design>: Required. Path to the tab-separated experiment design file.
- --out <out>: Required. Path to the output folder.
- --alpha <alpha>: Family-wise error rate.
  - Default: 0.05
- --method <method>: Method used for testing and adjustment of p-values.
  - Default: fdr_bh
- --control <control>: Annotated value for the control samples (must start with a letter).
  - Default: Control
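The default method name `fdr_bh` matches the identifier that statsmodels' `multipletests` uses for the Benjamini-Hochberg procedure; whether CLEP delegates to that function is an assumption here. The sketch below is a small pure-Python version of the Benjamini-Hochberg adjustment on made-up p-values, intended only to show what `--alpha` and `--method` jointly decide.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


alpha = 0.05  # same value as the documented default
p_values = [0.01, 0.04, 0.03, 0.20]  # made-up p-values
adjusted = benjamini_hochberg(p_values)
significant = [q <= alpha for q in adjusted]
print(adjusted, significant)
```

In this toy case only the smallest p-value survives the correction, even though three raw p-values fall below 0.05.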
radical-search
Radical Searching based Single Sample Scoring
clep sample-scoring radical-search [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --design <design>: Required. Path to the tab-separated experiment design file.
- --out <out>: Required. Path to the output folder.
- --control <control>: Annotated value for the control samples (must start with a letter).
  - Default: Control
- --threshold <threshold>: Percentage of samples considered ‘extreme’ on either side of the distribution.
  - Default: 2.5
- -rs, --ret_summary: Flag indicating whether the edge summary for patients should be created.
  - Default: False
- -cb, --control_based: Run Radical Searching where the scoring is based on the control population instead of the entire dataset.
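One way to read `--threshold` is as a percentile cut on each tail: with the default of 2.5, values below the 2.5th percentile or above the 97.5th percentile count as extreme. The sketch below follows that reading on made-up values. The -1/0/+1 scoring, the interpolation rule, and the function names are assumptions for illustration; passing a `reference` list mimics the idea behind `--control_based`.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def radical_scores(values, reference=None, threshold=2.5):
    """Score each value -1, 0 or +1 against the tails of the reference values."""
    # Assumption: without a reference, the whole dataset defines the tails.
    reference = values if reference is None else reference
    low = percentile(reference, threshold)
    high = percentile(reference, 100 - threshold)
    return [-1 if v < low else 1 if v > high else 0 for v in values]


expression = [float(v) for v in range(1, 101)]  # made-up values 1..100
scores = radical_scores(expression)
print(scores.count(-1), scores.count(0), scores.count(1))
```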
ssgsea
ssGSEA based Single Sample Scoring
clep sample-scoring ssgsea [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --design <design>: Required. Path to the tab-separated experiment design file.
- --out <out>: Required. Path to the output folder.
- --gs <gs>: Required. Path to the .gmt gene set file.
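For readers unfamiliar with the .gmt layout expected by `--gs`: in the commonly used GMT convention each line holds a gene set name, a description, and then the member genes, all separated by tabs. The sketch below parses that layout from an in-memory example. It is not CLEP's own reader, and the set names and genes are made up.

```python
import io


def parse_gmt(handle):
    """Map each gene set name to its set of genes (name, description, genes...)."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Made-up two-line GMT content; a real file would be opened with open(path).
example = io.StringIO(
    "set_a\tdescription a\tG1\tG2\n"
    "set_b\tdescription b\tG2\tG3\tG4\n"
)
gene_sets = parse_gmt(example)
print(gene_sets)
```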
z-score
Z-Score based Single Sample Scoring
clep sample-scoring z-score [OPTIONS]
Options
- --data <data>: Required. Path to the tab-separated gene expression data file.
- --design <design>: Required. Path to the tab-separated experiment design file.
- --out <out>: Required. Path to the output folder.
- --control <control>: Annotated value for the control samples (must start with a letter).
  - Default: Control
- --threshold <threshold>: Threshold for choosing patients that are ‘extreme’ with respect to the controls. If the magnitude of a gene's z-score exceeds this threshold, the gene is considered either up- or down-regulated.
  - Default: 2.0
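The thresholding described for `--threshold` can be sketched as follows: each patient value is converted to a z-score using the mean and standard deviation of the control samples, then called up-regulated (+1), down-regulated (-1), or unchanged (0). The use of the sample standard deviation, the strict comparison, and the function name are assumptions; the expression values are made up.

```python
from statistics import mean, stdev


def z_score_calls(values, control_values, threshold=2.0):
    """Call each value +1, -1 or 0 from its z-score relative to the controls."""
    mu = mean(control_values)
    sigma = stdev(control_values)  # assumption: sample standard deviation
    calls = []
    for value in values:
        z = (value - mu) / sigma
        calls.append(1 if z > threshold else -1 if z < -threshold else 0)
    return calls


controls = [9.0, 10.0, 11.0, 10.0, 10.0]  # made-up control expression values
patients = [10.5, 13.0, 7.0]              # made-up patient expression values
calls = z_score_calls(patients, controls)
print(calls)
```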