Metadata
All metadata objects support the following fields:
* Mandatory
The following metadata fields are recognized by different interfaces for Projects:
Metadata with additional information and features about the project:
* Mandatory
Metadata entries automatically set by MiGA:
* Mandatory
‡ By default the base name of the project path
Metadata entries that trigger specific behaviors in MiGA:
° By default: blast+. Other supported values: blast, blat, diamond (except for ANI), fastani (only for ANI), and no (only for hAAI). If using diamond and/or fastani, the corresponding software must be installed. Important: These defaults will change in v1.0 to: blast+ for hAAI, diamond for AAI, and fastani for ANI.
+ One of: dupont_2012 (default), or lee_2019
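The tool restrictions in the footnote above can be checked before a project's metadata is written. The following is a minimal sketch and not part of MiGA: the helper name `check_search_tools` and the plain-dictionary representation of the metadata are assumptions made for illustration. Only the field names and the allowed values come from the text above.

```python
# Hypothetical helper, not part of MiGA: checks aai_p / haai_p / ani_p values
# against the restrictions listed in the footnote above.
COMMON_TOOLS = {"blast+", "blast", "blat"}

ALLOWED_TOOLS = {
    "aai_p": COMMON_TOOLS | {"diamond"},
    "haai_p": COMMON_TOOLS | {"diamond", "no"},  # "no" is only valid for hAAI
    "ani_p": COMMON_TOOLS | {"fastani"},         # diamond is not valid for ANI
}


def check_search_tools(metadata):
    """Return a list of error messages; an empty list means all values are allowed."""
    errors = []
    for field, allowed in ALLOWED_TOOLS.items():
        value = metadata.get(field, "blast+")  # documented current default
        if value not in allowed:
            errors.append(f"{field}: unsupported value {value!r}")
    return errors


if __name__ == "__main__":
    print(check_search_tools({"aai_p": "diamond", "ani_p": "fastani"}))
    print(check_search_tools({"ani_p": "diamond"}))
```

A check of this kind only validates the names. It does not verify that diamond or fastani are actually installed.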
Additionally, hooks can be defined for projects as arrays of arrays, where each inner array contains the action name followed by its arguments (if any).
Supported events:
on_create(): When created
on_load(): When loaded
on_save(): When saved
on_add_dataset(object): When a dataset is added, with name object
on_unlink_dataset(object): When the dataset with name object is unlinked
on_result_ready(object): When any result is ready, with key object
on_result_ready_{result}(): When the result {result} is ready
on_processing_ready(): When processing is complete
Supported hooks:
run_lambda(lambda, args...)
run_cmd(cmd)
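To make the array-of-arrays structure described above concrete, here is a small sketch written as plain Python data. It assumes that each event name is used as a metadata key. The command strings and the `hooks.log` file name are hypothetical placeholders, and `iter_hooks` is an illustrative helper rather than a MiGA function.

```python
import json

# Illustrative hook definitions only: the command strings and file name are
# hypothetical placeholders. Each event maps to an array of hooks, and each
# hook is an array holding the action name followed by its arguments.
project_hooks = {
    "on_create": [
        ["run_cmd", "echo 'project created' >> hooks.log"],
    ],
    "on_processing_ready": [
        ["run_cmd", "echo 'processing complete' >> hooks.log"],
    ],
}


def iter_hooks(hooks, event):
    """Yield (action, args) pairs registered for one event."""
    for action, *args in hooks.get(event, []):
        yield action, args


if __name__ == "__main__":
    # The same structure serialized as JSON
    print(json.dumps(project_hooks, indent=2))
    for action, args in iter_hooks(project_hooks, "on_create"):
        print(action, args)
```

Keeping the action name as the first element of each inner array lets one event trigger several hooks in order, each with its own argument list.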
The following metadata fields are recognized by different interfaces for Datasets:
Metadata with additional information and features about the dataset:
‡ Multiple values can be provided separated by commas or colons
° This is not a valid type, but it represents the closest available dataset to material that is unavailable and unlikely to ever become available. See also Federhen, 2015, NAR
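Fields marked with ‡ accept several values in one string, separated by commas or colons. The sketch below shows one way a consumer of the metadata might split such a string. The function name `split_values` and the sample identifiers are hypothetical, and this is not necessarily how MiGA parses these fields internally.

```python
import re


def split_values(value):
    """Split a metadata string on commas or colons, dropping empty pieces."""
    return [part.strip() for part in re.split(r"[,:]", value) if part.strip()]


if __name__ == "__main__":
    # Placeholder identifiers, not real accessions
    print(split_values("ID_A,ID_B:ID_C"))
```

This naive split would also break on the colons inside `see_also` entries (format text:url), so it is intended only for the linking ID fields.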
Metadata entries automatically set by MiGA:
* Mandatory
Metadata entries that trigger specific behaviors in MiGA:
* When searching best-matching datasets, include these datasets even if they are not visited using the medoid tree
Additionally, hooks can be defined for datasets as arrays of arrays containing the action name and the arguments. See above (project hooks) for examples.
Supported events:
on_load(): When loaded
on_save(): When saved
on_remove(): When removed
on_inactivate(): When inactivated
on_activate(): When activated
on_result_ready(object): When any result is ready, with key object
on_result_ready_{result}(): When the result {result} is ready
on_preprocessing_ready(): When preprocessing is complete
Supported hooks:
run_lambda(lambda, args...)
clear_run_counts()
run_cmd(cmd)
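Because datasets and projects support different events, a hook definition can be checked against the dataset lists above before it is saved. The following is a hedged sketch: `check_dataset_hooks` is a hypothetical helper, the dictionary layout (event name as key) is an assumption, and the result name `distances` used in the example is only illustrative.

```python
# Event and action names copied from the dataset lists above
DATASET_EVENTS = {
    "on_load", "on_save", "on_remove", "on_inactivate", "on_activate",
    "on_result_ready", "on_preprocessing_ready",
}
DATASET_ACTIONS = {"run_lambda", "clear_run_counts", "run_cmd"}
RESULT_PREFIX = "on_result_ready_"  # matches on_result_ready_{result}


def is_dataset_event(name):
    """True for a listed event or for on_result_ready_ plus a result name."""
    return name in DATASET_EVENTS or (
        name.startswith(RESULT_PREFIX) and len(name) > len(RESULT_PREFIX)
    )


def check_dataset_hooks(hooks):
    """Return a list of problems found in a dataset hook definition."""
    problems = []
    for event, entries in hooks.items():
        if not is_dataset_event(event):
            problems.append(f"unknown event: {event}")
            continue
        for entry in entries:
            if not entry or entry[0] not in DATASET_ACTIONS:
                problems.append(f"{event}: unknown action in {entry!r}")
    return problems


if __name__ == "__main__":
    print(check_dataset_hooks({"on_save": [["clear_run_counts"]]}))
    print(check_dataset_hooks({"on_create": [["run_cmd", "true"]]}))
```

In this sketch `on_create` is reported as unknown because it appears only in the project event list, not in the dataset one.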
Fields supported by all metadata objects:

| Field | Supported values | Description |
| --- | --- | --- |
| created* | | Date of creation |
| updated* | | Date of last update |
Projects, additional information and features:

| Field | Supported values | Description |
| --- | --- | --- |
| comments | String | Free-form comments |
| description | String | Free-form description |
| name* | | Name‡ |
Projects, entries automatically set by MiGA:

| Field | Supported values | Description |
| --- | --- | --- |
| datasets* | Array of String | List of datasets in the project |
| type* | String | |
Projects, entries that trigger specific behaviors in MiGA:

| Field | Supported values | Description |
| --- | --- | --- |
| ref_project | Path | Project with reference taxonomy |
| db_proj_dir | Path | Directory containing database projects |
| tax_pvalue | Float [0,1] | Max p-value to transfer taxonomy (def: 0.05) |
| aai_p | String | Value of aai.rb -p° on AAI (def: blast+) |
| haai_p | String | Value of aai.rb -p° on hAAI (def: blast+) |
| ani_p | String | Value of ani.rb -p° on ANI (def: blast+) |
| max_try | Integer | Max number of task attempts (def: 10) |
| aai_save_rbm | Boolean | Should RBMs be saved for OGS analysis? |
| ogs_identity | Float [0,100] | Min RBM identity for OGS (def: 80) |
| clean_ogs | Boolean | If false, keeps ABC (clades only) |
| run_clades | Boolean | Should clades be estimated from distances? |
| gsp_ani | Float [0,100] | ANI limit to propose gsp clades (def: 95) |
| gsp_aai | Float [0,100] | AAI limit to propose gsp clades (def: 90) |
| gsp_metric | String | Metric to propose clades: ani (def) or aai |
| ess_coll | String | Collection of essential genes to use+ |
| min_qual | Float (or 'no') | Min. genome quality (or no filter; def: 25) |
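Several of the project fields above carry a documented default and a closed numeric range. As a hedged illustration of how those defaults and ranges interact, the sketch below merges user-provided values over the defaults and rejects out-of-range numbers. The function `effective_options` and the dictionary layout are assumptions for illustration and do not reflect MiGA's internal code. Only a subset of fields is shown.

```python
# Defaults and ranges taken from the field descriptions above (subset only)
DEFAULTS = {
    "tax_pvalue": 0.05,
    "max_try": 10,
    "ogs_identity": 80.0,
    "min_qual": 25.0,  # may also be the string "no" to disable the filter
}
RANGES = {
    "tax_pvalue": (0.0, 1.0),
    "ogs_identity": (0.0, 100.0),
}


def effective_options(metadata):
    """Merge provided values over the defaults and validate numeric ranges."""
    options = dict(DEFAULTS)
    options.update({k: v for k, v in metadata.items() if k in DEFAULTS})
    for field, (low, high) in RANGES.items():
        if not low <= options[field] <= high:
            raise ValueError(f"{field} must be in [{low}, {high}]")
    return options


if __name__ == "__main__":
    print(effective_options({"tax_pvalue": 0.01, "min_qual": "no"}))
```

Fields absent from the metadata fall back to their defaults, so an empty dictionary yields the documented default behavior.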
Datasets, additional information and features:

| Field | Supported values | Description |
| --- | --- | --- |
| tax | MiGA::Taxonomy | Taxonomy of the dataset |
| quality | String | Description of genome quality |
| dprotologue | String | Taxonumber in the Digital Protologue DB |
| ncbi_tax_id | String | Linking ID(s)‡ for NCBI Taxonomy |
| ncbi_nuccore | String | Linking ID(s)‡ for NCBI Nucleotide |
| ncbi_asm | String | Linking ID(s)‡ for NCBI Assembly |
| ebi_embl | String | Linking ID(s)‡ for EBI EMBL |
| ebi_ena | String | Linking ID(s)‡ for EBI ENA |
| web_assembly | String | URL to download assembly |
| web_assembly_gz | String | URL to download gzipped assembly |
| see_also | String | Link(s)‡ in the format text:url |
| is_type | Boolean | If it is type material |
| is_ref_type | Boolean | If it is reference material° |
| type_rel | String | Relationship to type material |
| suspect | Array(String) | Flags indicating a suspect dataset |
Datasets, entries automatically set by MiGA:

| Field | Supported values | Description |
| --- | --- | --- |
| type* | String | |
| ref | Boolean | |
| inactive | Boolean | If auto-processing should stop |
| metadata_only | Boolean | Dataset with metadata but without input data |
| status | String | Proc. status: complete, incomplete, inactive |
| _step | String | For internal control of processing |
| _try_step | Integer | For internal control of processing |
| user | String | Deprecated |
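Datasets, entries that trigger specific behaviors in MiGA:

| Field | Supported values | Description |
| --- | --- | --- |
| run_step | Boolean | Forces running (or not running) the step |
| db_project | Path | Project to use as database |
| dist_req | Array of String | Run distances against these datasets* |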