-
Notifications
You must be signed in to change notification settings - Fork 9
Description
I'm trying to use the VariantAnnotation package to annotate a VCF file from nf-core/sarek (UCSC style) using AnnotationHub EnsDb objects (Ensembl style), where I encountered the following issues:
- The VariantAnnotation package does not accept
EnsDbobjects. I raised this (Allow 'ANY' txdb, eg. EnsDb objects Bioconductor/VariantAnnotation#74), but it will probably not get fixed; I worked around this by providing my own S4 method in my code - The
seqlevelsStyles()do not match
For point (2), I can change the EnsDb style to UCSC:
ens106 = AnnotationHub::AnnotationHub()[["AH100643"]]
seqlevelsStyle(ens106) = "UCSC"and then either load UCSC-style genome or change the Ensembl-style genome to UCSC:
asm = BSgenome.Hsapiens.NCBI.GRCh38::BSgenome.Hsapiens.NCBI.GRCh38
seqlevelsStyle(asm) = "UCSC"
asm = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38However, this does not work, because the genome() of the EnsDb object is GRCh38, while the one of the assembly is hg38, raising an assertion error in VariantAnnotation. So this needs an additional line changing the internal state of the S4 genome object (we can't change the EnsDb object):
genome(ens106)[] = "hg38"
# Error: unable to find an inherited method for function 'seqinfo<-' for signature 'x = "EnsDb"'
asm@seqinfo@genome[] = "GRCh38" # works, but is messing with internals
# this was previously the default, but they explicitly changed itI'm not sure what a good solution is here. It seems to be that a check if they genomes are identical (as performed by VariantAnnotation) is reasonable. I'm raising this issue more to document it rather than suggesting a change in ensembldb.