OpenBio provides access to major biological and chemical databases through natural language queries.

Protein Data Bank (PDB)

Purpose: 3D protein structures
Available queries:
  • Text search by protein names or descriptions
  • Search by resolution, method, organism
  • Sequence similarity search
  • Structure similarity search
  • Get details for specific PDB IDs
  • Extract sequences from structures
Example: “Find structures of human hemoglobin”

UniProt

Purpose: Protein sequences and functional annotations
Available queries:
  • Search by protein name, gene name, or keywords
  • Search by protein function or activity
  • Find proteins with specific domains
  • Filter by organism
  • Get detailed feature annotations
Example: “Find proteins with kinase activity in humans”

AlphaFold DB

Purpose: AlphaFold protein structure predictions and annotations
Available queries:
  • Get AlphaFold predictions for UniProt accessions
  • View structure quality metrics (pLDDT scores)
  • Access structure file URLs (PDB, CIF formats)
  • Get structure summaries with coverage information
  • Retrieve AlphaMissense (AM) annotations
  • View predicted aligned error (PAE) data
Features:
  • Quality metrics: Global confidence scores and per-residue pLDDT distributions
  • Structure files: Direct links to download PDB, CIF, and PAE files
  • Coverage information: Sequence coverage and confidence scores
  • Annotations: Missense variant predictions and functional annotations
Example queries:
  • “Get AlphaFold structure for P01308”
  • “Show me AlphaFold predictions for human insulin”
  • “What’s the AlphaFold structure quality for P12345?”
  • “Get AlphaFold annotations for P01308”
AlphaFold DB contains millions of predicted protein structures, each identified by a UniProt accession number. If you do not know the accession, search UniProt first to find it, then query AlphaFold DB for the structure.
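
The two-step lookup described above (find the accession in UniProt, then ask AlphaFold DB for the prediction) can be sketched against the public REST endpoints of those services. This is a minimal illustration and not a description of how OpenBio works internally: the helper names are hypothetical, and the URL patterns are assumptions based on the UniProt and AlphaFold DB public APIs. The code only builds URLs and makes no network calls.

```python
from urllib.parse import quote, urlencode

# Assumed public endpoints; neither is named in this document.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"
ALPHAFOLD_PREDICTION = "https://alphafold.ebi.ac.uk/api/prediction"


def uniprot_search_url(protein: str, organism_id: int, size: int = 5) -> str:
    """Step 1: build a UniProt search URL to look up candidate accessions."""
    query = f"{protein} AND organism_id:{organism_id} AND reviewed:true"
    params = {
        "query": query,
        "fields": "accession,protein_name",
        "format": "json",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


def alphafold_prediction_url(accession: str) -> str:
    """Step 2: build the AlphaFold DB URL for a known UniProt accession."""
    return f"{ALPHAFOLD_PREDICTION}/{quote(accession.strip().upper())}"


if __name__ == "__main__":
    # 9606 is the NCBI taxonomy ID for human.
    print(uniprot_search_url("insulin", 9606))
    print(alphafold_prediction_url("P01308"))
```

Keeping the two steps as separate functions mirrors the advice above: the accession is the only value passed from the UniProt step to the AlphaFold DB step.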

ChEMBL

Purpose: Bioactive drug-like compounds and bioactivity data
Available queries:
  • Search compounds by name, structure, or properties
  • Search by biological target
  • Filter by bioactivity (IC50, Ki, EC50)
  • Filter by molecular properties
  • Filter by clinical development stage
Example: “Find kinase inhibitors with IC50 < 100nM”
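
A bioactivity filter such as "IC50 < 100nM" amounts to converting reported values to one unit and applying a threshold. The sketch below shows that logic on hand-written records. The `Activity` type, its field names, and the sample values are hypothetical illustrations; they are not ChEMBL's schema and not real measurements.

```python
from dataclasses import dataclass

# Conversion factors to nanomolar for common concentration units.
TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "mM": 1e6}


@dataclass
class Activity:
    compound: str
    kind: str     # measurement type, e.g. "IC50", "Ki", "EC50"
    value: float
    unit: str     # one of the keys of TO_NM


def potent(activities, kind="IC50", max_nm=100.0):
    """Return activities of the given kind whose value is strictly below max_nm (in nM)."""
    hits = []
    for a in activities:
        if a.kind != kind or a.unit not in TO_NM:
            continue  # skip other measurement types and unknown units
        if a.value * TO_NM[a.unit] < max_nm:
            hits.append(a)
    return hits


if __name__ == "__main__":
    records = [
        Activity("cmpd-A", "IC50", 12.0, "nM"),
        Activity("cmpd-B", "IC50", 0.5, "uM"),   # 500 nM, above the threshold
        Activity("cmpd-C", "Ki", 3.0, "nM"),     # different measurement type
        Activity("cmpd-D", "IC50", 0.05, "uM"),  # 50 nM
    ]
    print([a.compound for a in potent(records)])  # ['cmpd-A', 'cmpd-D']
```

Normalizing units before comparing matters because the same potency may be reported as 0.05 uM or 50 nM.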

PubChem

Purpose: Chemical compounds and substances
Available queries:
  • Find compounds by name or identifier
  • Find structurally similar compounds
  • Find compounds containing specific substructures
  • Calculate molecular properties
  • Access bioactivity assay data
Example: “Show me the structure of aspirin”
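
For readers who want to see what a name-based compound lookup looks like outside a natural language interface, here is a minimal sketch that builds PubChem PUG REST URLs. The helper names are hypothetical, the URL layout is an assumption based on PubChem's public PUG REST service, and nothing here describes OpenBio's internals. No network request is made.

```python
from urllib.parse import quote

# Assumed base URL of PubChem's public PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a URL that requests selected computed properties for a compound name."""
    encoded = quote(name, safe="")
    return f"{PUG_REST}/compound/name/{encoded}/property/{','.join(properties)}/JSON"


def compound_image_url(name):
    """Build a URL that requests a 2D structure image for a compound name."""
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/PNG"


if __name__ == "__main__":
    print(compound_property_url("aspirin"))
    print(compound_image_url("aspirin"))
```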

PubMed / Literature

Purpose: Scientific literature database
Available queries:
  • Search papers by keywords, topics, or research areas
  • Find papers by specific authors
  • Filter by journal or publication date
  • Filter by article type (review, clinical trial, etc.)
  • Find citing papers or references
Example: “Find papers about CRISPR-Cas9 from 2024”
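
As an illustration of how keyword, date, and article-type filters combine into one literature query, the sketch below builds an NCBI E-utilities `esearch` URL. The function name and its parameters are hypothetical; the endpoint and the `[pdat]` and `[pt]` field tags are assumptions based on PubMed's public search syntax, not something this document specifies. The code makes no network calls.

```python
from urllib.parse import urlencode

# Assumed public endpoint of NCBI E-utilities.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(keywords, year=None, article_type=None, retmax=20):
    """Combine keywords with optional year and article-type filters into one search URL."""
    terms = [keywords]
    if year is not None:
        terms.append(f"{year}[pdat]")         # publication date filter
    if article_type:
        terms.append(f"{article_type}[pt]")   # publication type filter, e.g. "review"
    params = {
        "db": "pubmed",
        "term": " AND ".join(terms),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(pubmed_search_url("CRISPR-Cas9", year=2024))
    print(pubmed_search_url("CRISPR-Cas9", year=2024, article_type="review"))
```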

Other Databases

PDBe: Enhanced PDB data with additional annotations
  • Search with enhanced metadata and ligand interactions
MyGene: Gene annotation and information
  • Search genes by name or symbol
  • Get genomic location and cross-references
MyDisease: Disease annotations and associations
  • Search diseases and associated genes/variants
MyVariant: Genetic variant annotation
  • Search variants and get clinical interpretations
  • Get population frequency data
MyTaxon: Taxonomic information
  • Search organisms and get taxonomic lineage
MyChemInfo: Chemical annotation and drug information
  • Get drug annotations and chemical properties
arXiv / bioRxiv: Preprint servers
  • Search preprints that have not yet been peer reviewed
Web Search: General scientific web search
  • Search scientific websites and documentation

Query Tips

  • Use specific identifiers (PDB: 1ABC, UniProt: P12345, PubMed: PMID:12345678)
  • Always specify the organism (e.g., “human insulin”, not “insulin”)
  • Start broad and narrow down with filters
  • OpenBio can query multiple databases in one conversation
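
The first tip above relies on identifiers having recognizable shapes. The following sketch classifies a token as a PDB ID, UniProt accession, or PubMed ID using regular expressions. The `classify` helper is hypothetical and only illustrates the idea: the PDB pattern covers the classic four-character form, the UniProt pattern follows the accession format published by UniProt, and the PubMed pattern assumes the `PMID:` prefix used in the tip.

```python
import re

PATTERNS = {
    # Classic PDB ID: one digit followed by three alphanumeric characters.
    "pdb": re.compile(r"[0-9][A-Za-z0-9]{3}"),
    # UniProt accession: 6 or 10 characters in the documented format.
    "uniprot": re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
    # PubMed ID written with an explicit prefix, as in the tip above.
    "pubmed": re.compile(r"PMID:\s*[0-9]+"),
}


def classify(identifier):
    """Return 'pdb', 'uniprot', or 'pubmed' for a recognized identifier, else None."""
    token = identifier.strip()
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(token):
            return name
    return None


if __name__ == "__main__":
    for token in ["1ABC", "P12345", "PMID:12345678", "insulin"]:
        print(token, "->", classify(token))
```

Requiring the `PMID:` prefix is a deliberate choice in this sketch: a bare four-digit number would otherwise be ambiguous between a PubMed ID and a PDB ID.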