Examples
Data Aggregation

Data Aggregation

Overview

In the previous section (Analyze CSV File), we described an example of uploading the City of New York's publicly available Air Quality (opens in a new tab) dataset published by City of New York and described an example of performing a simple analysis.

Here is an example of converting the resulting Python cell into a table and performing an aggregation.

Preparation

As a preliminary preparation, select the Python cell created in Analyse CSV File and:

  1. press the '+' mark (Add cell) button and then Table to convert the Python execution results into a table. Python Cell

  2. The 'Create Table' modal is displayed. here enter the table slug and comments as follows and press the 'Create' button. Create Table

  3. We have now saved the actual data from the Python cell analysis as a table. Python Table

  4. To perform data aggregation on this table, create an SQL cell by pressing SQL from the button with the '+' symbol (Add cell) with a cell in the table selected. SQL Cell

Data aggregation

When the ">_ Open Editor" button is pressed with the SQL cell created in the preliminary preparation in the previous section selected, the following editing screen is displayed. As an example, data aggregation using SQL is performed here. Editor

Approach

The table 'manhattan_air_quality' contains the average nitrogen dioxide (NO2) concentration per year, as analyzed in the Python cell. This can be used to perform more specific data aggregation.

In this section, we will assess the variation in NO2 concentrations from year to year and build a query to gain further insight into the trend.

Query description in SQL cells

Having already done the basic pre-processing for the tabulation, we paste the following query into the edit window to calculate the annual change in NO2 concentrations (year-on-year). This will allow you to tabulate the transition of NO2 concentrations in the Manhattan area. Codeblock

This SQL query uses the LAG function to obtain the NO2 concentration of the previous year and calculate the difference with the current year. As a result, you can see the increase or decrease in NO2 concentration from year to year.

Run the query

After pasting the query into the edit screen, press the 'Run' button at the top right of the screen to execute it. The results of the run are displayed in a tabular format at the bottom of the edit screen and you can now enumerate the NO2 concentrations for each year, the previous year's concentrations and the annual changes. Run Query