Glow Top-Level Functions
glow.glow.register(session: pyspark.sql.session.SparkSession)

Register SQL extensions and py4j converters for a Spark session.

Parameters:
- session – Spark session

Example:

>>> import glow
>>> glow.register(spark)
glow.glow.transform(operation: str, df: pyspark.sql.dataframe.DataFrame, arg_map: Dict[str, Any] = None, **kwargs) → pyspark.sql.dataframe.DataFrame

Apply a named transformation to a DataFrame of genomic data. All parameters apart from the input data and its schema are provided through the case-insensitive options map.

There are no bounds on what a transformer may do. For instance, it is legal for a transformer to materialize the input DataFrame.

Parameters:
- operation – Name of the operation to perform
- df – The input DataFrame
- arg_map – A string -> any map of arguments
- kwargs – Named arguments. If arg_map is not specified, transformer arguments are pulled from these keyword arguments.

Example:

>>> df = spark.read.format('vcf').load('test-data/1kg_sample.vcf')
>>> piped_df = glow.transform(
...     'pipe',
...     df,
...     cmd=["cat"],
...     input_formatter='vcf',
...     output_formatter='vcf',
...     in_vcf_header='infer')

Returns:
The transformed DataFrame
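To illustrate the option-handling behavior described above — arg_map taking precedence, keyword arguments used as a fallback, and option names matched case-insensitively — here is a minimal, self-contained sketch. The `resolve_options` helper is hypothetical (it is not part of Glow's API) and only models the documented semantics:

```python
# Hypothetical sketch of transformer option resolution (NOT Glow's actual
# implementation): prefer arg_map when it is given, otherwise fall back to
# keyword arguments, and normalize option names so lookups are
# case-insensitive.
from typing import Any, Dict, Optional


def resolve_options(arg_map: Optional[Dict[str, Any]] = None,
                    **kwargs: Any) -> Dict[str, Any]:
    # arg_map takes precedence; kwargs are consulted only when it is absent
    options = arg_map if arg_map is not None else kwargs
    # lowercase the keys so 'INPUT_FORMATTER' and 'input_formatter' agree
    return {key.lower(): value for key, value in options.items()}


# Passing options as keyword arguments or as an explicit map yields the
# same resolved options for a transformer like 'pipe':
opts_from_kwargs = resolve_options(cmd=["cat"], INPUT_FORMATTER='vcf')
opts_from_map = resolve_options({'cmd': ["cat"], 'input_formatter': 'vcf'})
assert opts_from_kwargs == opts_from_map
```

This mirrors why `glow.transform('pipe', df, cmd=["cat"], ...)` and `glow.transform('pipe', df, {'cmd': ["cat"], ...})` are interchangeable in the example above.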