Metadata
All objects
All metadata objects support the following fields:
* Mandatory
Projects
The following metadata fields are recognized by different interfaces for Projects:
Project Features
Metadata with additional information and features about the project:
Field | Supported values | Description |
---|---|---|
comments | String | Free-form comments |
description | String | Free-form description |
name* | Name‡ |
* Mandatory
Project System Metadata
Metadata entries automatically set by MiGA:
Field | Supported values | Description |
---|---|---|
datasets* | Array of String | List of datasets in the project |
type* | String |
* Mandatory
‡ By default the base name of the project path
Project Flags
Metadata entries that trigger specific behaviors in MiGA:
Field | Supported values | Description |
---|---|---|
ref_project | Path | Project with reference taxonomy {1} |
db_proj_dir | Path | Directory containing database projects {1} {2} |
tax_pvalue | Float [0,1] | Max p-value to transfer taxonomy (def: 0.1) |
haai_p | String | hAAI engine {3} (def: fastaai) |
aai_p | String | AAI engine {3} (def: diamond) |
ani_p | String | ANI engine {3} (def: fastani) |
max_try | Integer | Max number of task attempts (def: 10) |
aai_save_rbm | Boolean | Should RBMs be saved for OGS analysis? |
ogs_identity | Float [0,100] | Min RBM identity for OGS (def: 80) |
clean_ogs | Boolean | If false, keeps ABC (clades only) |
run_clades | Boolean | Should clades be estimated from distances? |
gsp_ani | Float [0,100] | ANI limit to propose gsp clades (def: 95) |
gsp_aai | Float [0,100] | AAI limit to propose gsp clades (def: 90) |
gsp_metric | String | Metric to propose clades: |
ess_coll | String | Collection of essential genes to use {4} |
min_qual | Float (or 'no') | Min. genome quality (or no filter; def: 25) |
distances_checkpoint | Integer | Comparisons before storing data (def: 10) |
{1} This path can be either absolute or relative to the project's path.
{2} This is the location of the databases used by db_project. If not set, it is assumed to be the parent folder of the current project.
{3} Supported values: blast, blat, diamond (only for hAAI and AAI), fastani (only for ANI), no (only for hAAI), and fastaai (only for hAAI).
{4} One of: dupont_2012 (default), or lee_2019.
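The engine and collection restrictions in notes {3} and {4} can be expressed as a small lookup table. The following is a minimal sketch, not MiGA's own validation code: the function name `check_project_flags` and the message format are hypothetical, and only the allowed values and defaults are taken from the table and notes above.

```python
# Illustrative only: these tables mirror the footnotes above, not MiGA's source code.
SUPPORTED_ENGINES = {
    "haai_p": {"blast", "blat", "diamond", "fastaai", "no"},
    "aai_p": {"blast", "blat", "diamond"},
    "ani_p": {"blast", "blat", "fastani"},
}
DEFAULT_ENGINES = {"haai_p": "fastaai", "aai_p": "diamond", "ani_p": "fastani"}
ESSENTIAL_COLLECTIONS = {"dupont_2012", "lee_2019"}


def check_project_flags(metadata):
    """Return a list of problems found in the engine and ess_coll flags.

    `metadata` is a plain dict standing in for project metadata (hypothetical).
    Unset flags fall back to the defaults listed in the table.
    """
    problems = []
    for flag, allowed in SUPPORTED_ENGINES.items():
        value = metadata.get(flag, DEFAULT_ENGINES[flag])
        if value not in allowed:
            problems.append(f"{flag}: unsupported engine {value!r}")
    ess_coll = metadata.get("ess_coll", "dupont_2012")
    if ess_coll not in ESSENTIAL_COLLECTIONS:
        problems.append(f"ess_coll: unknown collection {ess_coll!r}")
    return problems
```

For instance, setting `ani_p` to `diamond` would be reported, since diamond is listed only for hAAI and AAI.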
Project Hooks
Additionally, hooks can be defined for projects as arrays of arrays containing the action name and the arguments (if any). For example, one can define:
or
Supported events:
on_create(): When created
on_load(): When loaded
on_save(): When saved
on_add_dataset(object): When a dataset is added, with name object
on_unlink_dataset(object): When dataset with name object is unlinked
on_result_ready(object): When any result is ready, with key object
on_result_ready_{result}(): When result is ready
on_processing_ready(): When processing is complete
Supported hooks:
run_lambda(lambda, args...)
run_cmd(cmd)
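To illustrate the "array of arrays" layout, the sketch below builds a hypothetical project metadata dictionary and serializes it as JSON. It assumes that hooks are stored under metadata keys named after the events, which this page does not state explicitly. The project name, the shell commands, and the helper `hooks_for` are all made up for the example.

```python
import json

# Hypothetical project metadata. Event and hook names come from the lists above;
# storing hooks under keys named after the event is an assumption.
project_metadata = {
    "name": "example_project",
    "on_save": [
        ["run_cmd", "echo 'project saved'"],
    ],
    "on_add_dataset": [
        ["run_cmd", "echo 'dataset added'"],
        ["run_cmd", "date"],
    ],
}


def hooks_for(metadata, event):
    """Return (action, arguments) pairs registered for an event, in order."""
    return [(hook[0], hook[1:]) for hook in metadata.get(event, [])]


# Each event maps to an array of hooks; each hook is an array whose first
# element is the action name, followed by its arguments (if any).
serialized = json.dumps(project_metadata, indent=2)
```

Here `hooks_for(project_metadata, "on_add_dataset")` yields two `run_cmd` hooks in the order they were defined, and an event with no hooks yields an empty list.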
Datasets
The following metadata fields are recognized by different interfaces for Datasets:
Dataset Features
Metadata with additional information and features about the dataset:
Field | Supported values | Description |
---|---|---|
tax | MiGA::Taxonomy | Taxonomy of the dataset |
quality | String | Description of genome quality |
trna_count | Integer | Number of tRNA elements detected |
trna_aa | Integer | Number of distinct AA with tRNA elements |
dprotologue | String | Taxonumber in the Digital Protologue DB |
ncbi_tax_id | String | Linking ID(s) {1} for NCBI Taxonomy |
ncbi_nuccore | String | Linking ID(s) {1} for NCBI Nucleotide |
ncbi_asm | String | Linking ID(s) {1} for NCBI Assembly |
ebi_embl | String | Linking ID(s) {1} for EBI EMBL |
ebi_ena | String | Linking ID(s) {1} for EBI ENA |
web_assembly | String | URL to download assembly |
web_assembly_gz | String | URL to download gzipped assembly |
see_also | String | Link(s) {1} in the format text:url |
is_type | Boolean | If it is type material |
is_ref_type | Boolean | If it is reference material {2} |
type_rel | String | Relationship to type material |
suspect | Array(String) | Flags indicating a suspect dataset |
{1} Multiple values can be provided, separated by commas or colons.
{2} This is not a valid type, but it represents the closest available dataset to material that is unavailable and unlikely to ever become available. See also Federhen, 2015, NAR
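Note {1} implies that a single linking-ID field may hold several identifiers. A minimal sketch of splitting such a value follows. The function name `split_linking_ids` and the sample identifiers are hypothetical, and this is not necessarily how MiGA parses these fields. The `see_also` field is deliberately left out, because its text:url format contains colons itself and this page does not say how those are told apart from separators.

```python
import re


def split_linking_ids(value):
    """Split a linking-ID string on commas or colons, dropping empty pieces."""
    return [part.strip() for part in re.split(r"[,:]", value) if part.strip()]


# Placeholder identifiers, not real accessions.
ids = split_linking_ids("ID_A, ID_B:ID_C")
```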
Dataset System Metadata
Metadata entries automatically set by MiGA:
Field | Supported values | Description |
---|---|---|
type* | String | |
ref | Boolean | |
inactive | Boolean | If auto-processing should stop |
metadata_only | Boolean | Dataset with metadata but without input data |
status | String | Proc. status: complete, incomplete, inactive |
_step | String | For internal control of processing |
_try_ | Integer | For internal control of processing |
| String | Deprecated |
* Mandatory
Dataset Flags
Metadata entries that trigger specific behaviors in MiGA:
Field | Supported values | Description |
---|---|---|
run_ | Boolean | Forces running or not |
db_project | Path | Project to use as database |
dist_req | Array of String | Run distances against these datasets* |
* When searching for the best-matching datasets, include these datasets even if they are not visited using the medoid tree.
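The db_project flag relies on the project-level db_proj_dir flag described under Project Flags: that path may be absolute or relative to the project's path, and defaults to the parent folder of the current project. The sketch below encodes only that rule, assuming POSIX-style paths. The function name `resolve_db_proj_dir` and the example paths are hypothetical, and how MiGA then locates a specific db_project inside that directory is not covered here.

```python
from pathlib import Path


def resolve_db_proj_dir(project_path, metadata):
    """Return the directory assumed to hold database projects.

    `metadata` is a plain dict standing in for project metadata (hypothetical).
    """
    project_path = Path(project_path)
    value = metadata.get("db_proj_dir")
    if value is None:
        # Not set: assume the parent folder of the current project.
        return project_path.parent
    candidate = Path(value)
    # Absolute paths are used as given; relative ones are taken from the project path.
    return candidate if candidate.is_absolute() else project_path / candidate
```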
Dataset Hooks
Additionally, hooks can be defined for datasets as arrays of arrays containing the action name and the arguments. See above (project hooks) for examples.
Supported events:
on_load(): When loaded
on_save(): When saved
on_remove(): When removed
on_inactivate(): When inactivated
on_activate(): When activated
on_result_ready(object): When any result is ready, with key object
on_result_ready_{result}(): When result is ready
on_preprocessing_ready(): When preprocessing is complete
Supported hooks:
run_lambda(lambda, args...)
clear_run_counts()
run_cmd(cmd)
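Because dataset events and hooks form closed lists, a hook definition can be checked before it is stored. The following is a rough sketch under stated assumptions, not part of MiGA: the function name `invalid_hook_entries` is hypothetical, and the pattern used for `on_result_ready_{result}` accepts any word-like result name because valid result keys are not listed on this page.

```python
import re

# Names copied from the dataset lists above.
DATASET_EVENTS = {
    "on_load", "on_save", "on_remove", "on_inactivate", "on_activate",
    "on_result_ready", "on_preprocessing_ready",
}
DATASET_HOOKS = {"run_lambda", "clear_run_counts", "run_cmd"}
# Assumption: any word-like suffix is accepted as a result name.
RESULT_EVENT = re.compile(r"^on_result_ready_\w+$")


def invalid_hook_entries(hooks):
    """Return (event, entry) pairs that do not match the supported lists.

    `hooks` maps event names to arrays of [action, args...] arrays.
    An unknown event is reported once, as (event, None).
    """
    problems = []
    for event, entries in hooks.items():
        if event not in DATASET_EVENTS and not RESULT_EVENT.match(event):
            problems.append((event, None))
            continue
        for entry in entries:
            if not entry or entry[0] not in DATASET_HOOKS:
                problems.append((event, entry))
    return problems
```

For example, `on_create` is reported for a dataset, since that event is listed only for projects.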