Understanding Figures and Mappings

After the pipeline steps are complete, you have a DataFrame that contains the data you wish to visualize. However, you still have not told White Label Data how to visualize that data. Perhaps you want to create a bar chart or line chart, draw a map, or create some HTML representation of the data. To accomplish this, you need to build a figure that contains the data, layout, and any other attributes needed to fully draw your visualization.

What is a figure?

A figure is a set of data and attributes in the format expected by the visualization library. White Label Data includes built-in visualization libraries for Plotly, Mapbox, single value indicators, and filters. Each of these built-in libraries expects a specific figure structure. For example, a Plotly visualization that draws a line chart expects a data series with x and y attributes, each containing a list of values to plot. Likewise, a Mapbox visualization expects data in GeoJSON format in order to draw the map. A single value indicator is simpler and expects only a couple of values, such as value.
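
To make these shapes concrete, here is a minimal sketch of the three kinds of structure described above, written as Python dictionaries. The sample values and the exact key used for the indicator are assumptions for illustration, not structures taken from White Label Data itself.

```python
import json

# Plotly-style line chart figure: one trace with parallel x and y lists.
line_figure = {
    "data": [
        {"type": "scatter", "mode": "lines", "x": [1, 2, 3], "y": [10, 15, 13]}
    ],
    "layout": {"showlegend": False},
}

# GeoJSON FeatureCollection with a single point.
# GeoJSON coordinates are ordered longitude first, then latitude.
geojson_data = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-104.99, 39.74]},
            "properties": {"name": "Example point"},
        }
    ],
}

# Single value indicator: little more than a value (key name is an assumption).
indicator_figure = {"value": 42}

# Each structure serializes directly to the JSON a client library would read.
print(json.dumps(line_figure, indent=2))
```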

Each type of visualization places unique requirements on which attributes are needed. White Label Data is designed to accept any figure structure so that it can support any visualization library. If you are not an advanced user, you will likely make use of existing figures. We call these existing, reusable figures base types, and they provide templates for common charts such as bar charts, line charts, and maps. When using a figure from a base layer, all you need to do is map results from the pipeline DataFrame into the figure using simple rules.

If you want to customize your own visualizations, you will need to get a little messy with some JSON.

Defining a Figure

A figure is defined, in JSON format, in the <figure> section of a Layer file. Here is an example of a bar chart figure without any data from a DataFrame:

{
    "data": [
        {
            "type": "bar",
            "marker" : {
                "color" : "#8b4367"
            }
        }
    ],
    "layout" : {
        "showlegend": false,
        "plot_bgcolor" : "rgba(0,0,0,0)",
        "paper_bgcolor" : "rgba(0,0,0,0)",
        "font" : {
            "color" : "gray"
        },
        "xaxis" : {
            "tickfont" : {
                "size" : 10
            },
            "color" : "white"
        },
        "yaxis" : {
            "range" : [0, 650],
            "tickfont" : {
                "size" : 11
            },
            "color" : "white"
        },
        "margin" : {
            "l" : 50,
            "r" : 50,
            "t" : 0,
            "b" : 100
        }
    }
}

It contains a set of attributes about colors, margins, and font sizes, but no data values to create bars. More than likely, all you will need to do is modify existing figures to change styling settings. If you want to create these figures yourself, you can learn more about the structure of the above Plotly figure example in Create a Plotly Visualization.
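
Styling changes are normally made by editing the figure JSON directly. As a rough illustration of what such a change amounts to, the sketch below loads a trimmed copy of the bar chart figure and adjusts two styling settings. The helper name restyle and the new values are hypothetical.

```python
import copy
import json

# A trimmed version of the bar chart figure shown above.
base_figure = json.loads("""
{
    "data": [{"type": "bar", "marker": {"color": "#8b4367"}}],
    "layout": {"showlegend": false, "yaxis": {"range": [0, 650]}}
}
""")


def restyle(figure, bar_color, y_range):
    """Return a copy of the figure with a new bar color and y-axis range."""
    styled = copy.deepcopy(figure)  # leave the original figure untouched
    styled["data"][0]["marker"]["color"] = bar_color
    styled["layout"]["yaxis"]["range"] = list(y_range)
    return styled


styled_figure = restyle(base_figure, "#1f77b4", (0, 1000))
print(json.dumps(styled_figure, indent=4))
```

Note that the figure still contains no x or y values after restyling; those arrive only through mappings.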

What is a mapping?

A mapping is a rule that specifies how to connect data from your DataFrame to the figure. It typically specifies the name of one or more DataFrame columns and a location within the figure to place that data. Each visualization type (bar chart, etc.) places unique requirements on which attributes must be mapped, so to create a visualization that works, you need to know which attributes it requires and map each of them with a rule.

While this can seem complicated, it’s really just a matter of getting your data into the figure. Before you map your data into the figure, it contains only static information such as colors and layout. After the data is mapped in, it also contains the points, lines, and values needed to draw the visualization. Most importantly, the final mapped figure fully describes the visualization, and no other information is needed.

[Figure: Query Rendering]

You will find mapping rules in the <mapping> section of your Layer file. There are two types of mappings: rules and attributes.

Mapping Attributes

Mapping attributes are typically used in base layers to define the set of attributes that the visualization requires in order to work. For existing base layers, these attributes are already defined for you. This is where a chart base layer, such as a bar chart, indicates that it needs an xaxis and a yaxis column, and where a map base layer indicates that it needs latitude and longitude. For example, in a base layer, you might see:

{
    "attributes" : [
        {
            "name" : "latitude",
            "type" : "array",
            "figure_path" : ["data", 0, "lat"]
        },
        {
            "name" : "longitude",
            "type" : "array",
            "figure_path" : ["data", 0, "lon"]
        }
    ]
}

The above says there is an attribute called latitude that requires a rule (see below). Any layer that depends on this layer is responsible for creating a rule that maps an ‘array’, or column, to this attribute. The figure path tells White Label Data where to put the array once you map it in. In this case, it says that the data should go under the data section of the JSON figure, in the first series, using a property name of lat. We didn’t make up that location. It’s required by the visualization library that consumes the figure, in this case Plotly, which defines where latitude data must sit in its figures in order to be drawn.
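
The way a figure path is walked can be sketched in a few lines of Python. This is only an illustration of the idea, not White Label Data's own code: the function name set_at_path, the trace type, and the coordinates are all assumptions, and the sketch expects every step except the last to exist in the figure already.

```python
def set_at_path(figure, figure_path, value):
    """Walk figure_path through nested dicts and lists, then store value at the last step."""
    node = figure
    for step in figure_path[:-1]:
        node = node[step]  # a string indexes a dict, an integer indexes a list
    node[figure_path[-1]] = value
    return figure


# A figure with styling only: one empty series and no coordinates yet.
figure = {"data": [{"type": "scattermapbox"}], "layout": {}}

set_at_path(figure, ["data", 0, "lat"], [39.74, 40.01])
set_at_path(figure, ["data", 0, "lon"], [-104.99, -105.27])

print(figure["data"][0])
```

Here ["data", 0, "lat"] first selects the data property, then the first series in that list, and finally writes the array under lat.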

Mapping Rules

As a consumer of a base layer, your visualization is on the hook to map in all the required attributes defined for the visualization type. Using the above example, you would create a layer that is dependent on a base layer, and provides the following mapping rules:

{
    "rules" : [
        {
            "attribute" : "latitude",
            "column_name" : "latitude"
        },
        {
            "attribute" : "longitude",
            "column_name" : "longitude"
        }
    ]
}

These rules map a column name from a pipeline’s DataFrame to an attribute defined in the base layer. The base layer attributes, in turn, specify where to put the column, which is an array, in the final figure. When all the rules are processed for a given visualization, the figure is complete and can be sent to the client for rendering using JavaScript and HTML.
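
Putting attributes and rules together, the resolution described above might look like the following sketch. It is a simplified model under several assumptions: the function apply_rules is hypothetical, the DataFrame is stood in for by a plain dictionary of column lists, and only the array type is handled.

```python
import copy

# Attributes as a base layer might define them.
attributes = [
    {"name": "latitude", "type": "array", "figure_path": ["data", 0, "lat"]},
    {"name": "longitude", "type": "array", "figure_path": ["data", 0, "lon"]},
]

# Rules supplied by the layer that depends on the base layer.
rules = [
    {"attribute": "latitude", "column_name": "latitude"},
    {"attribute": "longitude", "column_name": "longitude"},
]

# Stand-in for the pipeline DataFrame: column name -> list of values.
columns = {"latitude": [39.74, 40.01], "longitude": [-104.99, -105.27]}


def apply_rules(figure, attributes, rules, columns):
    """Return a copy of figure with each ruled column placed at its attribute's figure_path."""
    by_name = {attribute["name"]: attribute for attribute in attributes}
    missing = set(by_name) - {rule["attribute"] for rule in rules}
    if missing:
        raise ValueError(f"no rule for required attributes: {sorted(missing)}")
    mapped = copy.deepcopy(figure)
    for rule in rules:
        path = by_name[rule["attribute"]]["figure_path"]
        node = mapped
        for step in path[:-1]:
            node = node[step]  # walk into the nested figure
        node[path[-1]] = list(columns[rule["column_name"]])
    return mapped


base_figure = {"data": [{"type": "scattermapbox"}], "layout": {}}
mapped_figure = apply_rules(base_figure, attributes, rules, columns)
print(mapped_figure["data"][0])
```

Raising an error when a required attribute has no rule mirrors the idea that the consuming layer is responsible for mapping every attribute the base layer defines.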

Mapping Options

Here are the mapping options. Each applies to attributes, to rules, or only to a particular mapping type, as specified:

| Option | When To Use | Description |
| --- | --- | --- |
| name | Attributes | The name of the attribute that will be used in mapping rules. |
| type | Attributes or Rules | The datatype of the data that is required. Choices are array, literal, list_of_arrays, single-value, alternating_array, geojson, dataframe_as_records, drilldown_url, and map_center. |
| figure_path | Attributes or Rules | A list of steps to walk to reach the location where the data should be mapped. Because JSON is a nested structure, each item walks one level deeper. For example, ["data", 0, "lat"] means that the code first looks for a property named “data” at the top level. It then looks for index 0, and within index 0 it maps data under a property called lat. |
| attribute_name | Rules | The name of the attribute you are mapping to. |
| dataframe | Rules | Optional. Specifies the name of the DataFrame to use. If no DataFrame is specified, the last one created by the pipeline is used. |
| column_name | Attributes or Rules | The name of the column to map from the DataFrame. Applies to array, single-value, and drilldown_url types. For drilldown_url, it will encode and map a Plotly-specific drilldown column. |
| column_name_list | Attributes or Rules | A list of column names to map. Used with the list_of_arrays type. For example, this is used to list the columns that should be included in a table. |
| odd_value | alternating_array | An alternating array is an array that contains alternating, hard-coded values. This is useful for creating a table with alternating row colors. Specifies the value for odd members of the array. |
| even_value | alternating_array | Specifies the value for even members of the array. |
| geometry_column_name | geojson | Specifies that this column contains a GeoJSON geometry. |
| point_lat_column_name | geojson | Specifies that this is a latitude column that should be used to create GeoJSON for plotting points. |
| point_lon_column_name | geojson | Specifies that this is a longitude column that should be used to create GeoJSON for plotting points. |
| property_column_name_list | geojson | A list of columns to include in the properties section when generating GeoJSON. |
| lat_column_name | map_center | Finds the average latitude from a column and maps it into a Plotly visualization. Useful for centering a map. |
| lon_column_name | map_center | Finds the average longitude from a column and maps it into a Plotly visualization. Useful for centering a map. |
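
Three of the mapping types above produce derived values rather than copying a column. The sketch below shows one plausible reading of each, using hypothetical helper names and made-up sample data. In particular, it assumes that "odd members" means the 1st, 3rd, and so on, which the options table does not state.

```python
def alternating_array(length, odd_value, even_value):
    """Build a list that alternates odd_value, even_value, odd_value, ...

    Assumption: members are counted from 1, so the first member is odd.
    """
    return [odd_value if i % 2 == 0 else even_value for i in range(length)]


def map_center(lats, lons):
    """Average the latitude and longitude columns to get a map center."""
    return {"lat": sum(lats) / len(lats), "lon": sum(lons) / len(lons)}


def points_to_geojson(lats, lons, properties):
    """Build a GeoJSON FeatureCollection of points from parallel columns.

    properties maps a property name to a column (list) of values.
    """
    features = []
    for i, (lat, lon) in enumerate(zip(lats, lons)):
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as longitude, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {name: values[i] for name, values in properties.items()},
        })
    return {"type": "FeatureCollection", "features": features}


row_colors = alternating_array(3, "white", "gray")
center = map_center([10.0, 20.0], [30.0, 50.0])
collection = points_to_geojson([10.0, 20.0], [30.0, 50.0], {"name": ["a", "b"]})
print(row_colors, center, len(collection["features"]))
```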

Note: The mapping type dataframe_as_records includes the entire DataFrame table in the figure. This is useful for HTML visualizations that want to walk the table directly and build the DOM in the HTML markup itself. See Create a Custom Visualization.
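
As a rough picture of what "records" means here, the sketch below turns a table into a list of row dictionaries. The helper as_records and the sample columns are hypothetical, and the DataFrame is again stood in for by a dictionary of column lists. With pandas, DataFrame.to_dict(orient="records") yields the same shape, though whether White Label Data uses that call is an assumption.

```python
def as_records(columns):
    """Convert column name -> list of values into a list of one dict per row."""
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Made-up sample table with two columns and two rows.
columns = {"name": ["a", "b"], "value": [1, 2]}
records = as_records(columns)
print(records)
```

An HTML visualization can then loop over the records and emit one DOM element per row.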