Skip to content

Clarify fasta header naming with uc to biom #3

@dridk

Description

@dridk

I trying to do a simple test , but I don't understand how fasta header are proccess.
For exemple, I have One sample test.fa with the following reads :

>A_sample1
AGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACA
>A_sample2
ATGGTCGTATATATATGGTCGTATATATATGGTCGTATATATATGGTCGTATATATATGGTCGTATATATATGGTCGTATATAT
>A_sample3
ATGGTCGTGTCGTGTCGTGTCGTATATATATCGGTCGTGTCGTGTCGTGTCGTGTCGTATGTCGTGTCGTGTCGTGTCGTATAT
>A_sample4
ATGGTCGTGTCGTGTCGTGTCGTATATATATCGGTCGTGTCGTGTCGTGTCGTGTCGTATGTCGTGTCGTGTCGTGTCGTATAT
>A_sample5
ATACGTGTATGATATGCGGTGTAATACGTGTATGATATGCGGTGTAATACGTGTATGATATGCGGTGTAATACGTGTATGATAT
>A_sample6
ATACGTGTATGATATGCGGTGTAATACGTGTATGATATGCGGTGTAATACGTGTATGATATGCGGTGTAATACGTGTATGATAT
>A_sample7
AGAACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACA
>A_sample8
AGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACA
>A_sample9
AGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACAAGATACA
>A_sample10
ATGGTCGTGTCGTGTCGTGTCGTATATATATCGGTCGTGTCGTGTCGTGTCGTGTCGTATGTCGTGTCGTGTCGTGTCGTATAT
>A_sample11
ATGGTCGTGTCGTGTCGTGTCGTATATATATCGGTCGTGTCGTGTCGTGTCGTGTCGTATGTCGTGTCGTGTCGTGTCGTATAT
>A_sample12
ATGGTCGTGTCGTGTCGTGTCGTATATATATCGGTCGTGTCGTGTCGTGTCGTGTCGTATGTCGTGTCGTGTCGT

I cluster them using :

vsearch --cluster_fast test.fa --id 0.97 --centroids centroids.fa --sizeout --uc test.uc --relabel_sha1 --relabel_keep

Now I want to convert them to biom using your script :

python create_otu_table_from_uc_file.py -i test.uc -o test.biom

I get the following error :

Error in uc file formating. Check for spaces in sample IDs and to make sure there is a semicolon after sample IDs.
First line with issue:
S       0       84      *       *       *       *       *       A1      *
100.0%
Writing table...

I thinks fasta header should keep a rule, but I don't know how... Could you make me a simple exemple to make me understand ?
Thanks

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions