BacQuerya

Flexibly search for isolate names (ENA, SRA), species, metadata (country, serotype), or any combination of these.

BacQuerya is currently 'enhanced' for the following species, for which it provides additional linked, searchable gene and sequence data:

  • Streptococcus pneumoniae

Isolate search
  • Searching

    Search through a flexible index of samples and their metadata. Queries can include:
    • An accession ID associated with an isolate in external databases (e.g. NCBI BioSample, ENA numbers, other sample name, GCF number).
    • A species (e.g. Streptococcus pneumoniae or E coli).
    • A country name (e.g. Nepal).
    • A strain or serotype name.

    You can search as you would with a search engine: misspellings are tolerated and terms can be combined (e.g. 'streptococcus pnuemoniae nepal 23F').

    Results are returned ordered by match to your query. Within this, we try to return 'high quality' samples such as reference sequences or those with uncontaminated assemblies at the top of the results.
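As a rough illustration of how a misspelled, multi-term query can still be matched, the sketch below maps each query token to its closest term in a small vocabulary using Python's standard difflib module. The vocabulary, the `normalise_query` helper and the 0.8 cutoff are assumptions made for this example; BacQuerya's own index may well work differently.

```python
import difflib

# Hypothetical vocabulary of indexed terms; a real index would be far larger.
VOCABULARY = ["streptococcus", "pneumoniae", "nepal", "23f", "escherichia", "coli"]


def normalise_query(query, vocabulary=VOCABULARY, cutoff=0.8):
    """Map each query token to its closest vocabulary term, if one is close enough.

    Tokens with no sufficiently similar term are kept unchanged.
    """
    corrected = []
    for token in query.lower().split():
        matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


print(normalise_query("streptococcus pnuemoniae nepal 23F"))
```

Each token is handled independently, which is why terms can be combined in any order; a lower cutoff tolerates more typos at the cost of more spurious matches.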

  • Filtering

    You can filter the results directly using the provided toggles; search again after adjusting them. Select 'exact matches' to remove any 'fuzzy' matches to your query.

  • Downloading sequences

    You can get the download links for results by clicking 'Download all sequences'. For more than 100 sequences, the links will be sent by email or made available after a short wait.

    This is a work in progress, and we will add more functionality to get data out of BacQuerya in the near future.

  • Isolate Overview

    Clicking on an individual search result will open an isolate overview page, summarising available metadata for that isolate. This includes: the species, accession IDs linked to external databases, download links for assemblies or read sets if available, metadata retrieved from the NCBI BioSample database, and additional metadata extracted from other information sources. A JSON file with this metadata can be downloaded.

    For enhanced species, the page also shows:
    • Assembly statistics, and histograms of these in the species.
    • Contamination, as measured by mash screen.
    • A searchable list of genes (see below).

Genes (enhanced species only)
  • Searching

    Search through a flexible index of clusters of orthologous genes. Queries can include:
    • A gene name or alias.
    • Annotated gene function.

    Gene clusters have been defined using Panaroo.

  • Gene Overview

    Clicking on a result will open a gene overview page, summarising metadata for the gene of interest. The 'Names/Aliases' field displays all publicly seen gene identifiers for this gene and the 'Description(s)' all publicly seen functional annotations.

    Population-level information includes gene count and frequency, and a sequence alignment viewer (rendered as an image). The alignment can be scaled to give an overview of the amount and position of variation, and can be restricted to SNP sites only. It currently includes one sample from each strain (as defined by PopPUNK).

    An inverse lookup table of isolates with this gene in this species is shown below, similar to the top level isolate results.
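To make the idea of restricting an alignment to SNP sites concrete, here is a minimal sketch that keeps only the columns where the aligned sequences differ. The function names are hypothetical, and ignoring gaps ('-') and ambiguous bases ('N') when deciding whether a column varies is an assumption of this example, not a description of how the BacQuerya viewer defines a SNP site.

```python
def snp_columns(alignment):
    """Return the indices of columns where the aligned sequences differ.

    Gaps ('-') and ambiguous bases ('N') are ignored when comparing,
    which is an assumption made for this sketch.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    variable = []
    for i in range(length):
        bases = {row[i].upper() for row in alignment} - {"-", "N"}
        if len(bases) > 1:
            variable.append(i)
    return variable


def snp_only(alignment):
    """Return the alignment reduced to its variable columns."""
    columns = snp_columns(alignment)
    return ["".join(row[i] for i in columns) for row in alignment]


# Toy alignment, made up for illustration.
toy = ["ACGTA", "ACCTA", "AC-TT"]
print(snp_columns(toy))
print(snp_only(toy))
```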

Sequence (enhanced species only)
  • Searching

    Genes can also be searched using a nucleotide sequence query. Query sequences must be at least 31 bp long, as the index was built with 31-mers.

    Sequences are queried against a COBS index by exact k-mer matching. Results are ranked in descending order by the proportion of k-mers in the query that match the sequence of the indexed gene. Each result links to the corresponding gene overview page.
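The ranking described above can be sketched in a few lines. This example uses plain Python sets in place of a COBS index, counts each distinct k-mer once, and does not consider reverse complements; all of these are simplifying assumptions, and the function names are hypothetical.

```python
K = 31  # k-mer length stated in the text


def kmers(sequence, k=K):
    """Return the set of distinct k-mers in a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def matching_proportion(query, gene, k=K):
    """Proportion of the query's distinct k-mers that occur exactly in the gene."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        raise ValueError(f"query must be at least {k} bp long")
    return len(query_kmers & kmers(gene, k)) / len(query_kmers)


def rank_genes(query, genes, k=K):
    """Rank genes (a name -> sequence mapping) by descending matching proportion."""
    scores = {name: matching_proportion(query, seq, k) for name, seq in genes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A query shorter than k contains no k-mers at all, which is why the minimum query length equals the k-mer size. For quick experiments on toy sequences, a smaller k can be passed, for example `rank_genes("ACGTA", {"a": "ACGTA", "b": "ACGTT"}, k=3)`.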

Study
  • Searching

    Studies can be searched by selecting the 'Study' tab and searching for a title, author, DOI or study topic. At present this is an interface to PubMed search, so it has the same features and returns the same results.

  • Study Overview

    Clicking on a search result will load the metadata for that study, retrieved using the CrossRef API.
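For readers curious what such a lookup might involve, the sketch below builds the URL of the public CrossRef REST endpoint for a single work and pulls the title and author names out of a response. Whether BacQuerya uses this particular endpoint is an assumption; the helper names, the placeholder DOI and the trimmed sample record are all made up for illustration, and no network request is made.

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_work_url(doi):
    """Build the CrossRef REST API URL for one DOI (slashes are kept as-is)."""
    return CROSSREF_WORKS + quote(doi)


def summarise_work(response):
    """Extract the title and author names from a decoded CrossRef JSON response."""
    message = response["message"]
    titles = message.get("title", [])
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    return {"title": titles[0] if titles else None, "authors": authors}


# Hand-written, heavily trimmed record in the shape CrossRef returns; not real data.
sample = {
    "status": "ok",
    "message": {
        "DOI": "10.1000/example",  # placeholder DOI
        "title": ["An example study"],
        "author": [{"given": "A.", "family": "Researcher"}],
    },
}

print(crossref_work_url("10.1000/example"))
print(summarise_work(sample))
```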

  • Submitting Supplementary Data (in development)

    BacQuerya will eventually be expanded to link studies and the isolates they contain (and vice versa). This is not yet automated, but if you'd like to help, you can submit lists of isolate accessions for returned studies.

Authors and Contributors

Dr John Lees, Daniel Anderson, and Bruhad Dave
