Lib
Async Tools
Date Tools
- bamboo.lib.datetools.parse_timestamp_query(query, schema)
Interpret date column queries as JSON.
- bamboo.lib.datetools.recognize_dates(df, schema=None)
Convert data columns to datetimes.
Check if object columns in a dataframe can be parsed as dates. If yes, rewrite column with values parsed as dates.
If schema is passed, convert each column whose type in the schema is datetime.
Parameters: - df – The DataFrame to convert columns in.
- schema – Schema to define columns of type datetime.
Returns: A DataFrame with column values converted to datetime types.
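The two conversion paths described above can be sketched without pandas. This is an illustration under stated assumptions, not bamboo's implementation: a plain dict of lists stands in for the DataFrame, the schema shape (`{"column": {"simpletype": "datetime"}}`) is assumed, only ISO 8601 strings are recognized, and `recognize_dates_sketch` is a hypothetical name.

```python
from datetime import datetime


def _parse_or_none(value):
    """Return a datetime for an ISO 8601 string, or None if it does not parse."""
    if not isinstance(value, str):
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def recognize_dates_sketch(columns, schema=None):
    """Convert date-like columns; `columns` maps column name -> list of values."""
    converted = dict(columns)
    if schema is not None:
        # Schema path: consider only columns the (assumed) schema marks as datetime.
        candidates = [name for name, info in schema.items()
                      if info.get("simpletype") == "datetime" and name in columns]
    else:
        # No schema: consider every non-empty column.
        candidates = [name for name, values in columns.items() if values]
    for name in candidates:
        parsed = [_parse_or_none(value) for value in columns[name]]
        # Rewrite the column only when every value parsed as a date.
        if all(item is not None for item in parsed):
            converted[name] = parsed
    return converted
```

Calling `recognize_dates_sketch({"when": ["2013-01-01"], "name": ["a"]})` would rewrite `when` and leave `name` untouched, since `"a"` does not parse as a date.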
JSON Tools
Mongo Utilities
- bamboo.lib.mongo._is_invalid_for_mongo(key)
Return whether the string is invalid for storage in MongoDB.
- bamboo.lib.mongo.df_mongo_decode(df, keep_mongo_keys=False)
Decode MongoDB reserved keys in this DataFrame.
- bamboo.lib.mongo.dump_mongo_json(obj)
Dump JSON using BSON conversion.
Parameters: obj – Data structure to dump as JSON.
Returns: JSON string of dumped obj.
- bamboo.lib.mongo.key_for_mongo(key)
Encode illegal MongoDB characters in string.
Base64-encode any characters in a string that cannot appear in MongoDB keys. This includes every ‘$’ and every ‘.’. A ‘$’ is supposed to be allowed anywhere except as the first character, but the current version of MongoDB does not allow any occurrence of ‘$’.
Parameters: key – The string to encode illegal characters in.
Returns: The string with illegal characters encoded.
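The encoding described above can be sketched as follows. This is a minimal illustration written for Python 3, not the library's source; `key_for_mongo_sketch` and `ILLEGAL_CHARACTERS` are hypothetical names, and the sketch assumes each illegal character is replaced in place by its own Base64 encoding.

```python
from base64 import b64encode

# Characters the description names as illegal in MongoDB keys.
ILLEGAL_CHARACTERS = ("$", ".")


def key_for_mongo_sketch(key):
    """Replace each illegal character in `key` with its Base64 encoding."""
    for char in ILLEGAL_CHARACTERS:
        # b64encode works on bytes, so round-trip through UTF-8 and ASCII.
        encoded = b64encode(char.encode("utf-8")).decode("ascii")
        key = key.replace(char, encoded)
    return key
```

The Base64 alphabet contains neither ‘$’ nor ‘.’, so replacing the characters one after the other cannot reintroduce an illegal character.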
- bamboo.lib.mongo.remove_mongo_reserved_keys(_dict)
Remove any keys reserved for MongoDB from _dict.
Check for MONGO_ID in the stored dictionary. If found, replace it with the unprefixed key; if not found, remove the reserved key from the dictionary.
Parameters: _dict – Dictionary to remove reserved keys from.
Returns: Dictionary with reserved keys removed.
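One possible reading of this behavior is sketched below. The constant values and the prefixing scheme are assumptions made for illustration (the documentation does not state them), and `remove_mongo_reserved_keys_sketch` is a hypothetical name: the sketch assumes a user-supplied `_id` was stored under a prefixed key so that it would not clash with MongoDB's own `_id`.

```python
MONGO_ID = "_id"  # assumed value of the MONGO_ID constant
RESERVED_PREFIX = "MONGO_RESERVED_KEY"  # hypothetical prefix, for illustration only


def remove_mongo_reserved_keys_sketch(_dict):
    """Restore a prefixed reserved key if present, otherwise drop the reserved key."""
    prefixed_key = RESERVED_PREFIX + MONGO_ID
    if prefixed_key in _dict:
        # The caller's own value was stored under the prefixed key: move it back.
        _dict[MONGO_ID] = _dict.pop(prefixed_key)
    else:
        # No caller-supplied value, so the remaining key belongs to MongoDB.
        _dict.pop(MONGO_ID, None)
    return _dict
```

Like the documented function, the sketch modifies the dictionary it is given and returns it.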
Readers
Schema
- class bamboo.lib.schema_builder.Schema
- _resluggable_column(column, labels_to_slugs, dframe)
Test if column should be slugged.
A column should be slugged if:
- The column is a key in labels_to_slugs, and
- The column is not a value in labels_to_slugs, or
- The column label is not equal to the column slug, and
- The slug is not in the dframe's columns.
Parameters: - column – The column to reslug.
- labels_to_slugs – The labels to slugs map (only build once).
- dframe – The DataFrame that column is in.
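The conditions above can be written as a single predicate. The grouping of "and" and "or" is an assumption, since the list does not make precedence explicit: this sketch reads it as "is a key AND (is not a value OR (label differs from slug AND slug is not already a column))". A plain list of column names stands in for the DataFrame's columns, and `resluggable_column_sketch` is a hypothetical name.

```python
def resluggable_column_sketch(column, labels_to_slugs, dframe_columns):
    """Return True if `column` should be slugged, under the assumed grouping."""
    if column not in labels_to_slugs:
        return False
    slug = labels_to_slugs[column]
    # The column name is not itself already used as a slug.
    is_not_a_slug = column not in labels_to_slugs.values()
    # The slug differs from the label and would not collide with a column.
    slug_is_new = slug != column and slug not in dframe_columns
    return is_not_a_slug or slug_is_new
```

For example, with `labels_to_slugs = {"First Name": "first_name", "age": "age"}`, the column `"First Name"` qualifies, while `"age"` does not because it already equals its slug.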
- labels_to_slugs
Build dict from column labels to slugs.
- rebuild(dframe, overwrite=False)
Rebuild a schema for a dframe.
Parameters: - dframe – The DataFrame whose schema to merge with the current schema.
- overwrite – If true replace schema, otherwise update.
- rename_map_for_dframe(dframe)
Return a map from dframe columns to slugs.
Parameters: dframe – The DataFrame to produce the map for.
- set_olap_type(column, olap_type)
Set the OLAP Type for this column of schema.
Only columns with an original OLAP Type of ‘measure’ can be modified. This includes columns with Simple Type integer, float, and datetime.
Parameters: - column – The column to set the OLAP Type for.
- olap_type – The OLAP Type to set. Must be ‘dimension’ or ‘measure’.
Raises: ArgumentError if trying to set the OLAP Type of a column whose OLAP Type was not originally a ‘measure’.
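The rule above might be enforced along these lines. This is a sketch under several assumptions: the schema is modeled as a plain dict with `simpletype` and `olap_type` entries, "originally a measure" is inferred from the Simple Type, rejecting an unknown `olap_type` is a choice made here rather than documented behavior, and the `ArgumentError` class is a local stand-in for the exception the documentation names.

```python
class ArgumentError(Exception):
    """Local stand-in for the ArgumentError named in the documentation."""


# Simple Types the documentation says have an original OLAP Type of 'measure'.
MEASURE_SIMPLETYPES = {"integer", "float", "datetime"}
VALID_OLAP_TYPES = {"dimension", "measure"}


def set_olap_type_sketch(schema, column, olap_type):
    """Set the OLAP Type of `column`, refusing columns that were never measures."""
    if olap_type not in VALID_OLAP_TYPES:
        # Assumption: an invalid value is rejected rather than stored.
        raise ArgumentError("olap_type must be 'dimension' or 'measure'")
    if schema[column]["simpletype"] not in MEASURE_SIMPLETYPES:
        raise ArgumentError(
            "cannot change the OLAP Type of column '%s'" % column)
    schema[column]["olap_type"] = olap_type
```

Under these assumptions, an `integer` column can be switched to `'dimension'` and back, while a `string` column raises `ArgumentError`.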
Schema Builder
- bamboo.lib.schema_builder._slugify_columns(column_names)
Convert list of strings into unique slugs.
Convert non-alphanumeric characters in column names into underscores and ensure that all column names are unique.
Parameters: column_names – A list of strings.
Returns: A list of slugified names with a one-to-one mapping to column_names.
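A minimal sketch of this slugification follows. The documentation does not say how duplicates are resolved, so appending underscores until the slug is unique is an assumption, as are the ASCII-only definition of "alphanumeric" and the name `slugify_columns_sketch`.

```python
import re

# Anything outside ASCII letters and digits is replaced (assumed definition).
NON_ALPHANUMERIC = re.compile(r"[^0-9A-Za-z]")


def slugify_columns_sketch(column_names):
    """Return one unique slug per input name, in the same order."""
    slugs = []
    seen = set()
    for name in column_names:
        slug = NON_ALPHANUMERIC.sub("_", name)
        # Assumed uniqueness strategy: pad with underscores on collision.
        while slug in seen:
            slug += "_"
        seen.add(slug)
        slugs.append(slug)
    return slugs
```

Because the output list preserves input order and length, `zip(column_names, slugs)` gives the one-to-one mapping the return value describes.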