
Variant Data Manipulation

Glow provides tools to extract, transform, and load (ETL) genomic variant data into Spark DataFrames, where the data can be manipulated, filtered, quality-controlled, and converted between file formats.

  • Read and Write VCF, Plink, and BGEN with Spark
    • VCF
    • BGEN
    • PLINK
  • Read Genome Annotations (GFF3) as a Spark DataFrame
    • Schema
  • Create a Genomics Delta Lake
    • VCF to Delta Lake table notebook
  • Variant Quality Control
    • Notebook
  • Sample Quality Control
    • Computing user-defined sample QC metrics
  • Liftover
    • Create a liftOver cluster
    • Coordinate liftOver
    • Variant liftOver
  • Variant Normalization
    • normalize_variants Transformer
    • Usage
    • Options
    • normalize_variant Function
  • Split Multiallelic Variants
    • Usage
  • Merging Variant Datasets
    • Aggregating INFO fields
    • Joint genotyping
  • Utility Functions
    • Struct transformations
    • Spark ML transformations
    • Variant data transformations

© Copyright 2019, Glow Authors Revision 796c98e6.
