Lib

Async Tools

bamboo.lib.async.call_async(function, *args, **kwargs)[source]

Call function with the given arguments, possibly asynchronously.

Parameters:
  • function – The function to call.
  • args – Arguments for the function.
  • kwargs – Keyword arguments for the function.

Date Tools

bamboo.lib.datetools.parse_timestamp_query(query, schema)[source]

Interpret date column queries as JSON.

bamboo.lib.datetools.recognize_dates(df, schema=None)[source]

Convert data columns to datetimes.

Check if object columns in a dataframe can be parsed as dates. If yes, rewrite column with values parsed as dates.

If schema is passed, convert a column to datetime when the schema lists that column as type datetime.

Parameters:
  • df – The DataFrame to convert columns in.
  • schema – Schema to define columns of type datetime.
Returns:

A DataFrame with column values converted to datetime types.

JSON Tools

exception bamboo.lib.jsontools.JSONError[source]

For errors while parsing JSON.

bamboo.lib.jsontools.df_to_json(df)[source]

Convert DataFrame to a list of dicts, then dump to JSON.

bamboo.lib.jsontools.df_to_jsondict(df)[source]

Return DataFrame as a list of dicts for each row.

bamboo.lib.jsontools.get_json_value(value)[source]

Parse JSON value based on type.

bamboo.lib.jsontools.series_to_jsondict(series)[source]

Convert a Series to a dictionary encodable as JSON.

Mongo Utilities

bamboo.lib.mongo._is_invalid_for_mongo(key)[source]

Return whether the string is invalid for storage in MongoDB.

bamboo.lib.mongo.df_mongo_decode(df, keep_mongo_keys=False)[source]

Decode MongoDB reserved keys in this DataFrame.

bamboo.lib.mongo.dict_for_mongo(_dict)[source]

Encode all keys in _dict for MongoDB.

bamboo.lib.mongo.dump_mongo_json(obj)[source]

Dump JSON using BSON conversion.

Parameters: obj – Data structure to dump as JSON.
Returns: JSON string of dumped obj.

bamboo.lib.mongo.key_for_mongo(key)[source]

Encode illegal MongoDB characters in string.

Base64-encode any characters in a string that cannot appear in MongoDB keys. This includes any ‘$’ and any ‘.’. A ‘$’ is supposed to be allowed anywhere except as the first character, but the current version of MongoDB does not allow any occurrence of ‘$’.

Parameters: key – The string to encode.
Returns: The string with illegal characters encoded.
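The encoding described above can be illustrated with a minimal, standard-library sketch. It is not bamboo's implementation: the name encode_key_sketch is hypothetical, and replacing each illegal character with its own Base64 encoding is an assumption about how the substitution might work.

```python
import base64

# The two characters the description above names as illegal in keys.
ILLEGAL_CHARS = ("$", ".")


def encode_key_sketch(key):
    """Replace each '$' and '.' in key with its Base64 encoding.

    Hypothetical helper for illustration only.
    """
    for char in ILLEGAL_CHARS:
        encoded = base64.b64encode(char.encode("utf-8")).decode("ascii")
        key = key.replace(char, encoded)
    return key


safe_key = encode_key_sketch("price.usd$")
```

Because the Base64 alphabet and its padding character contain neither ‘$’ nor ‘.’, the result never includes an illegal character.
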
bamboo.lib.mongo.remove_mongo_reserved_keys(_dict)[source]

Remove any keys reserved for MongoDB from _dict.

Check for MONGO_ID in the stored dictionary. If it is found, replace it with the unprefixed key; if it is not found, remove the reserved key from the dictionary.

Parameters: _dict – Dictionary to remove reserved keys from.
Returns: Dictionary with reserved keys removed.

bamboo.lib.mongo.reserve_encoded(string)[source]

Return encoding prefixed string.

bamboo.lib.mongo.value_for_mongo(value)[source]

Ensure value is a format acceptable for a MongoDB value.

Parameters:value – The value to encode.
Returns:The encoded value.

Readers

Schema

class bamboo.lib.schema_builder.Schema[source]
_resluggable_column(column, labels_to_slugs, dframe)[source]

Test if column should be slugged.

A column should be slugged if:
  1. The column is a key in labels_to_slugs, and
  2. The column is not a value in labels_to_slugs, or
    1. The column label is not equal to the column slug, and
    2. The slug is not in the dframe’s columns.
Parameters:
  • column – The column to reslug.
  • labels_to_slugs – The labels to slugs map (built only once).
  • dframe – The DataFrame that column is in.
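
The nested conditions above can be read as a single boolean expression. The sketch below shows one such reading, assuming condition 2 groups as "not a value, or (label differs from slug and slug unused)". The function name resluggable_sketch is hypothetical, and a plain list of column names stands in for the DataFrame's columns.

```python
def resluggable_sketch(column, labels_to_slugs, dframe_columns):
    """Return True if column should be slugged, per the conditions above.

    dframe_columns is a hypothetical stand-in for the DataFrame's columns.
    """
    # Condition 1: the column must be a key in the map.
    if column not in labels_to_slugs:
        return False
    slug = labels_to_slugs[column]
    # Condition 2, first branch: the column is not itself a slug.
    not_a_slug = column not in labels_to_slugs.values()
    # Condition 2, second branch: the slug differs and is not yet in use.
    differs_and_unused = slug != column and slug not in dframe_columns
    return not_a_slug or differs_and_unused


labels_to_slugs = {"Total Cost": "total_cost", "amount": "amount"}
columns = ["Total Cost", "amount"]
needs_slug = resluggable_sketch("Total Cost", labels_to_slugs, columns)
already_slugged = resluggable_sketch("amount", labels_to_slugs, columns)
```
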
labels_to_slugs

Build dict from column labels to slugs.

rebuild(dframe, overwrite=False)[source]

Rebuild a schema for a dframe.

Parameters:
  • dframe – The DataFrame whose schema to merge with the current schema.
  • overwrite – If true, replace the schema; otherwise, update it.
rename_map_for_dframe(dframe)[source]

Return a map from dframe columns to slugs.

Parameters:dframe – The DataFrame to produce the map for.
classmethod safe_init(arg)[source]

Make a schema, allowing arg to be None.

set_olap_type(column, olap_type)[source]

Set the OLAP Type for this column of schema.

Only columns with an original OLAP Type of ‘measure’ can be modified. This includes columns with Simple Type integer, float, and datetime.

Parameters:
  • column – The column to set the OLAP Type for.
  • olap_type – The OLAP Type to set. Must be ‘dimension’ or ‘measure’.
Raises:

ArgumentError if trying to set the OLAP Type of a column whose OLAP Type was not originally ‘measure’.

Schema Builder

bamboo.lib.schema_builder._slugify_columns(column_names)[source]

Convert list of strings into unique slugs.

Convert non-alphanumeric characters in column names into underscores and ensure that all column names are unique.

Parameters:column_names – A list of strings.
Returns:A list of slugified names with a one-to-one mapping to column_names.
bamboo.lib.schema_builder.filter_schema(schema)[source]

Remove columns that are not settable.

bamboo.lib.schema_builder.make_unique(name, reserved_names)[source]

Return a slug ensuring name is not in reserved_names.

Parameters:
  • name – The name to make unique.
  • reserved_names – A list of names that the returned name must not appear in.
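
As a rough illustration of how _slugify_columns and make_unique might fit together, here is a standard-library sketch. The names slugify_columns_sketch and make_unique_sketch are hypothetical, and resolving collisions by appending underscores is an assumption; the documentation above only states that the results must be unique.

```python
import re


def make_unique_sketch(name, reserved_names):
    """Append underscores until name is not in reserved_names (assumed strategy)."""
    while name in reserved_names:
        name += "_"
    return name


def slugify_columns_sketch(column_names):
    """Map each name to a unique slug, preserving input order."""
    slugs = []
    for column_name in column_names:
        # Replace every non-alphanumeric character with an underscore.
        slug = re.sub(r"[^0-9a-zA-Z]", "_", column_name)
        # Slugs produced so far act as the reserved names.
        slugs.append(make_unique_sketch(slug, slugs))
    return slugs


slugs = slugify_columns_sketch(["a b", "a-b", "c"])
```

Both "a b" and "a-b" reduce to the same slug, so the second one receives a trailing underscore. This keeps the mapping from column_names to slugs one-to-one.
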
bamboo.lib.schema_builder.schema_from_dframe(dframe, schema=None)[source]

Build schema from the DataFrame and a schema.

Parameters:
  • dframe – The DataFrame to build a schema for.
  • schema – Existing schema, optional.
Returns:

A dictionary schema.

Utilities

bamboo.lib.utils.combine_dicts(*dicts)[source]

Combine dicts with keys in later dicts taking precedence.
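
The precedence rule can be illustrated with a short sketch. The name combine_dicts_sketch is hypothetical and this is not necessarily how bamboo implements the function.

```python
def combine_dicts_sketch(*dicts):
    """Merge dicts left to right so that later values overwrite earlier ones."""
    combined = {}
    for current in dicts:
        combined.update(current)
    return combined


merged = combine_dicts_sketch({"a": 1, "b": 2}, {"b": 3})
```
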

bamboo.lib.utils.is_float_nan(num)[source]

Return True if num is a float and NaN.
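
A minimal sketch of such a check follows, using the hypothetical name is_float_nan_sketch. The isinstance test comes first because math.isnan raises TypeError for non-numeric input such as strings.

```python
import math


def is_float_nan_sketch(num):
    """Return True only when num is a float whose value is NaN."""
    return isinstance(num, float) and math.isnan(num)


nan_result = is_float_nan_sketch(float("nan"))
string_result = is_float_nan_sketch("nan")
```
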

bamboo.lib.utils.replace_keys(original, mapping)[source]

Recursively replace any keys in original with their values in mapping.

Parameters:
  • original – The dictionary to replace keys in.
  • mapping – A dict mapping keys to new keys.
Returns:

Original with keys replaced via mapping.
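
The recursive replacement might look like the following sketch. The name replace_keys_sketch is hypothetical. It assumes that only nested dicts are traversed, so dicts inside lists are left untouched, and it returns a new dict rather than modifying the input.

```python
def replace_keys_sketch(original, mapping):
    """Return a copy of original with keys renamed via mapping at every depth."""
    if not isinstance(original, dict):
        # Non-dict values are returned unchanged.
        return original
    return {
        mapping.get(key, key): replace_keys_sketch(value, mapping)
        for key, value in original.items()
    }


renamed = replace_keys_sketch({"a": {"b": 1}, "c": 2}, {"a": "x", "b": "y"})
```
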