Lakehouse Tenants

A registry of all research programs and datasets onboarded to the KBase BER Data Lakehouse.

Registered Tenants: 23
Total Databases: 63

AI ALE

ANL
Database for AI-driven adaptive laboratory evolution (ALE) of ADP1. Holds observed mutations cross-referenced with conditions, passages, and strains. Key mutations are linked to specific phenotypes, such as growth with a novel DAHP biosynthesis enzyme and growth on a range of methoxylated aromatic compounds.
Databases (1)
  • aiale_dataset1

Arkin Lab

Leverages quantitative measurements, precision genetics, and model-driven experimentation to predict, control, and design biological function in the context of these webs.
Databases (1)
  • arkinlab_microbeatlas

BERVO data

BioEPIC data described using the Biological and Environmental Research Variable Ontology.
Databases (3)
  • bervodata_chess
  • bervodata_fao_soils
  • bervodata_hwsd2

BRaVE BREAD

ANL
Bioenergy crop–pathogen interactions, including sorghum anthracnose (Colletotrichum sublineola), to support predictive modeling of pathogenicity and biocontrol.
Databases (0)

No databases registered yet.

ENIGMA

Create predictive models of the impacts of microbial communities on critical processes within ecosystems.
Databases (2)
  • enigma_coral
  • enigma_genome_depot_enigma

ESE

Developing a distributed platform that will help map the molecular determinants of microbial functions back to genomes, accelerating scalable predictive biodesign approaches.
Databases (1)
  • ese_ganymede

Global Users

Shared
Shared datasets and reference genomes accessible to all K-BERDL users.
Databases (18)
  • globalusers_aisynbio_test_1
  • globalusers_carbon_source_phenotypes
  • globalusers_demo_shared
  • globalusers_demo_test
  • globalusers_demo_test_1
  • globalusers_demo_test_2
  • globalusers_gapmind_pathways
  • globalusers_genomes_test1
  • globalusers_kepangenome_parquet_1
  • globalusers_nmdc_core_test
  • globalusers_nmdc_core_test2
  • globalusers_nmdc_core_test3
  • globalusers_nmdc_core_test4
  • globalusers_nmdc_flattened_biosamples
  • globalusers_ontology2
  • globalusers_ontology_source_2
  • globalusers_phenotype_ontology_1
  • globalusers_phenotype_parquet_1

ideas

Databases (1)
  • ideas_dataset1

KBase

Community-driven research platform for systems biology.
Databases (9)
  • kbase_genomes
  • kbase_ke_pangenome
  • kbase_msd_biochemistry
  • kbase_ontology_source
  • kbase_phenotype
  • kbase_uniprot
  • kbase_uniref100
  • kbase_uniref50
  • kbase_uniref90

KBase Knowledge Engine Science

LBNL
Provides experimental phenotypes, protein structures, and microbial ecology data for linking genotype to function.
Databases (10)
  • kescience_alphafold
  • kescience_bacdive
  • kescience_fitnessbrowser
  • kescience_interpro
  • kescience_mgnify
  • kescience_paperblast
  • kescience_pdb
  • kescience_pubmed
  • kescience_test_mika
  • kescience_webofmicrobes

LAMBDA

By unifying structural biology data across BER-supported imaging resources, LAMBDA will transform how researchers discover, integrate, and analyze multimodal datasets. This will accelerate AI-driven insights into protein conformations, the relationship between genes and their expression, biological function and dynamics, and cells and their environment.
Databases (0)

No databases registered yet.

Microbial Discovery Forge

LBNL
AI co-scientist and research observatory for K-BERDL-scale microbial discovery — powered by reusable skills, shared memory, and scalable data.
Databases (0)

No databases registered yet.

Microbial Ecosystems Lab

Use microbiome knowledge to better manage ecosystem function.
Databases (0)

No databases registered yet.

National Energy Technology Laboratory

Currently contains the NETL Produced Water DNA database.
Databases (1)
  • netl_pw_dna

National Microbiome Data Collaborative

Enabling microbiome science by connecting data, people, and ideas.
Databases (4)
  • nmdc_arkin
  • nmdc_flattened_biosamples
  • nmdc_func_annot_freshwater_rivers
  • nmdc_ncbi_biosamples

OPAL

LBNL, ANL, ORNL, PNNL (opal-doe.org)
The Orchestrated Platform for Autonomous Laboratories (OPAL) is a multi-laboratory DOE project to turn biological discovery into a self-driving process — combining AI, robotics, and automated experimentation to accelerate breakthroughs across biology, biotechnology, and energy science.
Databases (0)

No databases registered yet.

Phage Foundry

Interdisciplinary capability building a high-throughput platform for rapid design and development of countermeasures to combat emerging pathogens.
Databases (7)
  • phagefoundry_acinetobacter_genome_browser
  • phagefoundry_ecoliphages_genomedepot
  • phagefoundry_ecoliphagesgenomedepot
  • phagefoundry_klebsiella_genome_browser_genomedepot
  • phagefoundry_paeruginosa_genome_browser
  • phagefoundry_pviridiflava_genome_browser
  • phagefoundry_strain_modelling

Planet Microbe

Enabling the discovery and integration of oceanographic 'omics, environmental, and physicochemical data layers.
Databases (2)
  • planetmicrobe_planetmicrobe
  • planetmicrobe_planetmicrobe_raw

Plant Microbe Interfaces

Focuses on revealing the genomic bases underpinning the selection of symbiotic plant-microbe partnerships, assessing how the physical and chemical environment structures the host plant's microbiome, and determining how microbial community composition and host genetics combine to respond to environmental challenges.
Databases (0)

No databases registered yet.

Soil Microbiome Science

Advancing our understanding of how soil microbial communities respond to—and affect—changing environmental conditions.
Databases (0)

No databases registered yet.

PROTECT

UC Berkeley
Pro/Prebiotic Regulation for Optimized Treatment and Eradication of Clinical Threats (PROTECT) is pioneering microbiome engineering to design probiotic communities that prevent lung infections by pathogens.
Databases (2)
  • protect_genomedepot
  • protect_integration

Reference Data

KBase Team
Collection of reference data that are available to all users and regularly updated by the KBase team.
Databases (0)

No databases registered yet.

United States Geological Survey

Sharing produced water and river geochemistry data for cross-tenant analysis.
Databases (1)
  • usgs_produced_waters