Metadata
All metadata objects support the following fields:
Field | Supported values | Description |
---|---|---|
* Mandatory
The following metadata fields are recognized by different interfaces for Projects:
Metadata with additional information and features about the project:
Field | Supported values | Description |
---|---|---|
* Mandatory
Metadata entries automatically set by MiGA:
* Mandatory
‡ By default the base name of the project path
Metadata entries that trigger specific behaviors in MiGA:
{1} This path can either be absolute or relative to the project's path
{2} This is the location of the databases used by db_project. If not set, it is assumed to be the parent folder of the current project
{3} Supported values: `blast`, `blat`, `diamond` (only for hAAI and AAI), `fastani` (only for ANI), `no` (only for hAAI), and `fastaai` (only for hAAI)
{4} One of: `dupont_2012` (default), or `lee_2019`
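
The defaults and engine restrictions documented for these fields can be summarized in a short Ruby sketch. It is not MiGA's own code: `DEFAULTS`, `ENGINES`, and `effective_settings` are hypothetical names, filled in only with the values stated on this page. Note {3} attaches no restriction to `blast` and `blat`, so the sketch assumes they are accepted for all three engine fields.

```ruby
# Hypothetical constants, not part of MiGA: defaults as documented here.
DEFAULTS = {
  'tax_pvalue' => 0.1, 'haai_p' => 'fastaai', 'aai_p' => 'diamond',
  'ani_p' => 'fastani', 'max_try' => 10, 'ogs_identity' => 80,
  'gsp_ani' => 95, 'gsp_aai' => 90, 'gsp_metric' => 'ani',
  'ess_coll' => 'dupont_2012', 'min_qual' => 25,
  'distances_checkpoint' => 10
}.freeze

# Engines per field, following note {3}. Assumption: blast and blat carry
# no restriction in that note, so they are treated as valid everywhere.
ENGINES = {
  'haai_p' => %w[blast blat diamond no fastaai],
  'aai_p'  => %w[blast blat diamond],
  'ani_p'  => %w[blast blat fastani]
}.freeze

# Merge user metadata over the defaults and reject unsupported engines.
def effective_settings(metadata)
  settings = DEFAULTS.merge(metadata)
  ENGINES.each do |field, allowed|
    value = settings[field]
    next if allowed.include?(value)
    raise ArgumentError, "#{field}: unsupported engine #{value.inspect}"
  end
  settings
end

settings = effective_settings({ 'aai_p' => 'blast', 'max_try' => 3 })
puts settings['aai_p']
puts settings['ani_p']
```

Keeping the allowed engines in one table makes the per-field restrictions easy to audit against note {3}.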
Additionally, hooks can be defined for projects as arrays of arrays containing the action name and the arguments (if any). For example, one can define:
or
Supported events:

- `on_create()`: When created
- `on_load()`: When loaded
- `on_save()`: When saved
- `on_add_dataset(object)`: When a dataset is added, with name `object`
- `on_unlink_dataset(object)`: When the dataset with name `object` is unlinked
- `on_result_ready(object)`: When any result is ready, with key `object`
- `on_result_ready_{result}()`: When the result `{result}` is ready
- `on_processing_ready()`: When processing is complete

Supported hooks:

- `run_lambda(lambda, args...)`
- `run_cmd(cmd)`
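
Because hooks are arrays of arrays holding the action name followed by its arguments, the structure can be sketched in plain Ruby. The example is only illustrative: the `HOOKS` definitions, the `HANDLERS` table, and the `fire` dispatcher are hypothetical stand-ins, not MiGA's implementation. In particular, `run_cmd` here only reports the command instead of executing it, and passing the event arguments to the lambda after the hook arguments is an assumption.

```ruby
# Hypothetical hook definitions: event name => array of [action, *args].
HOOKS = {
  'on_create' => [['run_cmd', 'echo project created']],
  'on_add_dataset' => [
    ['run_cmd', 'echo dataset added'],
    ['run_lambda', ->(name) { "saw #{name}" }]
  ]
}.freeze

# Illustrative handlers (assumption): run_cmd returns a description of the
# command, run_lambda calls the lambda with hook args, then event args.
HANDLERS = {
  'run_cmd' => ->(hook_args, _event_args) { "would run: #{hook_args[0]}" },
  'run_lambda' => lambda do |hook_args, event_args|
    hook_args[0].call(*hook_args[1..], *event_args)
  end
}.freeze

# Run every hook registered for an event and collect the results.
def fire(hooks, event, *event_args)
  (hooks[event] || []).map do |action, *hook_args|
    HANDLERS.fetch(action).call(hook_args, event_args)
  end
end

puts fire(HOOKS, 'on_add_dataset', 'genome_1')
```

Events without registered hooks simply yield an empty list, so callers do not need to check for their presence first.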
The following metadata fields are recognized by different interfaces for Datasets:
Metadata with additional information and features about the dataset:
{1} Multiple values can be provided separated by commas or colons
{2} This is not a valid type, but it represents the closest available dataset to material that is unavailable and unlikely to ever become available. See also Federhen, 2015, NAR
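
Note {1} means that a single string can carry several linking IDs. A minimal Ruby sketch of how a client could split such a value follows; it assumes nothing about MiGA's own parsing, `split_ids` is a hypothetical helper, and the accessions are made-up placeholders.

```ruby
# Hypothetical helper: split a multi-valued metadata string on the two
# separators named in note {1} (commas or colons).
def split_ids(value)
  value.to_s.split(/[,:]/).map(&:strip).reject(&:empty?)
end

# Placeholder accessions, for illustration only.
metadata = { 'ncbi_nuccore' => 'ABC00001,ABC00002:ABC00003' }
p split_ids(metadata['ncbi_nuccore'])
```

This naive split is not suitable for `see_also`, whose entries are themselves written as text:url and therefore contain colons.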
Metadata entries automatically set by MiGA:
* Mandatory
Metadata entries that trigger specific behaviors in MiGA:
{1} By default, it uses its own project as database. The path can be absolute or relative to the parent folder of the project
{2} When searching best-matching datasets, include these datasets even if they are not visited using the medoid tree
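
Note {1} can be read as a small path-resolution rule. The Ruby sketch below illustrates that reading with standard-library calls only; `resolve_db_project` is a hypothetical helper, the paths are invented, and MiGA's actual resolution may differ.

```ruby
# Hypothetical helper: pick the database project for a dataset.
# - db_project not set: the dataset's own project is the database
# - absolute path: used as given
# - relative path: resolved against the parent folder of the project
def resolve_db_project(project_path, db_project = nil)
  return project_path if db_project.nil? || db_project.empty?

  File.expand_path(db_project, File.dirname(project_path))
end

puts resolve_db_project('/data/miga/my_project')
puts resolve_db_project('/data/miga/my_project', 'ref_db')
puts resolve_db_project('/data/miga/my_project', '/shared/ref_db')
```

`File.expand_path` ignores its second argument when the first is already absolute, which is why one call covers both the absolute and the relative case.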
Additionally, hooks can be defined for datasets as arrays of arrays containing the action name and the arguments. See above (project hooks) for examples.
Supported events:

- `on_load()`: When loaded
- `on_save()`: When saved
- `on_remove()`: When removed
- `on_inactivate()`: When inactivated
- `on_activate()`: When activated
- `on_result_ready(object)`: When any result is ready, with key `object`
- `on_result_ready_{result}()`: When the result `{result}` is ready
- `on_preprocessing_ready()`: When preprocessing is complete

Supported hooks:

- `run_lambda(lambda, args...)`
- `clear_run_counts()`
- `run_cmd(cmd)`
Fields supported by all metadata objects:

Field | Supported values | Description |
---|---|---|
created* | | Date of creation |
updated* | | Date of last update |

Projects, additional information and features:

Field | Supported values | Description |
---|---|---|
comments | String | Free-form comments |
description | String | Free-form description |

Projects, entries automatically set by MiGA:

Field | Supported values | Description |
---|---|---|
name* | | Name‡ |
datasets* | Array of String | List of datasets in the project |
type* | String | |

Projects, entries that trigger specific behaviors in MiGA:

Field | Supported values | Description |
---|---|---|
ref_project | Path | Project with reference taxonomy {1} |
db_proj_dir | Path | Directory containing database projects {1} {2} |
tax_pvalue | Float [0,1] | Max p-value to transfer taxonomy (def: 0.1) |
haai_p | String | hAAI engine {3} (def: fastaai) |
aai_p | String | AAI engine {3} (def: diamond) |
ani_p | String | ANI engine {3} (def: fastani) |
max_try | Integer | Max number of task attempts (def: 10) |
aai_save_rbm | Boolean | Should RBMs be saved for OGS analysis? |
ogs_identity | Float [0,100] | Min RBM identity for OGS (def: 80) |
clean_ogs | Boolean | If false, keeps ABC (clades only) |
run_clades | Boolean | Should clades be estimated from distances? |
gsp_ani | Float [0,100] | ANI limit to propose gsp clades (def: 95) |
gsp_aai | Float [0,100] | AAI limit to propose gsp clades (def: 90) |
gsp_metric | String | Metric to propose clades: `ani` (def), `aai` |
ess_coll | String | Collection of essential genes to use {4} |
min_qual | Float (or 'no') | Min. genome quality (or no filter; def: 25) |
distances_checkpoint | Integer | Comparisons before storing data (def: 10) |

Datasets, additional information and features:

Field | Supported values | Description |
---|---|---|
tax | MiGA::Taxonomy | Taxonomy of the dataset |
quality | String | Description of genome quality |
trna_count | Integer | Number of tRNA elements detected |
trna_aa | Integer | Number of distinct AA with tRNA elements |
dprotologue | String | Taxonumber in the Digital Protologue DB |
ncbi_tax_id | String | Linking ID(s) {1} for NCBI Taxonomy |
ncbi_nuccore | String | Linking ID(s) {1} for NCBI Nucleotide |
ncbi_asm | String | Linking ID(s) {1} for NCBI Assembly |
ebi_embl | String | Linking ID(s) {1} for EBI EMBL |
ebi_ena | String | Linking ID(s) {1} for EBI ENA |
web_assembly | String | URL to download assembly |
web_assembly_gz | String | URL to download gzipped assembly |
see_also | String | Link(s) {1} in the format text:url |
is_type | Boolean | If it is type material |
is_ref_type | Boolean | If it is reference material {2} |
type_rel | String | Relationship to type material |
suspect | Array(String) | Flags indicating a suspect dataset |

Datasets, entries automatically set by MiGA:

Field | Supported values | Description |
---|---|---|
type* | String | |
ref | Boolean | |
inactive | Boolean | If auto-processing should stop |
metadata_only | Boolean | Dataset with metadata but without input data |
status | String | Proc. status: complete, incomplete, inactive |
_step | String | For internal control of processing |
_try_step | Integer | For internal control of processing |
user | String | Deprecated |

Datasets, entries that trigger specific behaviors in MiGA:

Field | Supported values | Description |
---|---|---|
run_step | Boolean | Forces running or not step |
db_project | Path | Project to use as database {1} |
dist_req | Array of String | Run distances against these datasets {2} |