Data Wrangling: Population and Air Quality Analysis
Analysis of Population and Air Quality
From August to December 2024, our team used Python data analysis tools to better understand how population density relates to air quality in the largest metropolitan areas of California and Texas. We drew on Air Quality Index (AQI) data from a real-time API service and on government agency data, also served through a real-time API ([https://www.airnow.gov/relationships/aor.html](https://www.airnow.gov/relationships/aor.html)), to study the urbanization-related impacts of air pollution. Data wrangling, visualization, and statistical analysis of these datasets yielded several key findings, chief among them that industrial and agricultural sources have a greater impact on overall air quality than population growth alone.
Ultimately, this analysis points to the growing importance of sustainable urbanization, in California in particular, and to the need for better access to urbanization data so that evidence-based research can be applied to urban planning in cities across the US.

Python Learning
February 4, 2025 at 8:03:53 AM
Analysis of Population and Air Quality
Between August and December 2024, our team carried out a data project exploring the relationship between air pollution and population density in major cities across California and Texas. Using Python, we combined air quality indicators with population figures to assess how urbanization affects air quality.
Project Scope
Objective
To characterize the relationship between population size, population density, and air quality parameters.
Data Sources
Air Quality Data: Gathered from a public API, covering 10 cities in California and Texas.
Population Statistics: Collected from various public-domain datasets.
Key Steps and Approach
Data Wrangling
Used pandas for data cleaning and manipulation.
Consolidated the datasets on air quality and population.
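The wrangling step above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names (`city`, `aqi`, `population`, `area_sq_km`), the placeholder city labels, and all values are assumptions made up for the example.

```python
import pandas as pd

# Placeholder inputs; every name and number here is illustrative, not project data.
aqi = pd.DataFrame({
    "city": ["City A", "city a ", "City B", "City C"],
    "aqi": [50, 70, None, 90],
})
population = pd.DataFrame({
    "city": ["City A", "City B", "City C"],
    "population": [1_000_000, 500_000, 2_000_000],
    "area_sq_km": [500.0, 250.0, 400.0],
})


def clean_city(names: pd.Series) -> pd.Series:
    """Normalize city labels so both tables share the same join key."""
    return names.str.strip().str.title()


aqi["city"] = clean_city(aqi["city"])
population["city"] = clean_city(population["city"])

# Drop readings with no AQI value, then average the rest per city.
mean_aqi = (
    aqi.dropna(subset=["aqi"])
       .groupby("city", as_index=False)["aqi"]
       .mean()
)

# Derive density, then consolidate both tables on the cleaned city name.
population["density"] = population["population"] / population["area_sq_km"]
merged = population.merge(mean_aqi, on="city", how="inner", validate="one_to_one")
```

An inner join keeps only cities present in both tables, so a city with no usable AQI readings drops out of the merged frame; a left join would keep it with a missing `aqi` value instead.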
Exploratory Data Analysis
Used Matplotlib and Seaborn for visualization.
Used NumPy for statistical computations.
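As a rough sketch of the statistical side of this step, the snippet below computes a Pearson correlation and a least-squares trend line with NumPy. The arrays are placeholder values chosen for illustration, not measurements from the project, and the variable names are assumptions.

```python
import numpy as np

# Placeholder values (people per sq km, mean AQI); not real measurements.
density = np.array([1000.0, 2000.0, 3000.0, 4000.0])
mean_aqi = np.array([40.0, 50.0, 45.0, 65.0])

# Pearson correlation coefficient between density and AQI.
r = np.corrcoef(density, mean_aqi)[0, 1]

# Degree-1 least-squares fit: AQI is approximated by slope * density + intercept.
slope, intercept = np.polyfit(density, mean_aqi, 1)

print(f"r = {r:.3f}, slope = {slope:.4f}, intercept = {intercept:.1f}")
```

With only about 10 cities, a coefficient like this is a descriptive summary rather than strong evidence, which fits the finding that factors other than density also matter.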
API Handling
Handled API requests through Requests.
Handled rate limits and optimised data retrieval operations.
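One common way to handle rate limits with Requests is to retry on HTTP 429 with exponential backoff, as sketched below. The helper name `fetch_json`, its parameters, and the retry policy are hypothetical; the source does not say which strategy the project used.

```python
import time


def fetch_json(url, params=None, session=None, max_retries=5,
               base_delay=1.0, sleep=time.sleep):
    """GET a JSON payload, backing off when the server answers HTTP 429.

    Hypothetical helper: `session` is anything with a Requests-style
    `.get()` method, and `sleep` can be swapped out for testing.
    """
    if session is None:
        import requests  # imported lazily so the helper can be exercised offline
        session = requests.Session()

    for attempt in range(max_retries):
        response = session.get(url, params=params, timeout=10)
        if response.status_code == 429:
            # Prefer the server's Retry-After hint when it is a number of
            # seconds; otherwise fall back to exponential backoff.
            try:
                delay = float(response.headers.get("Retry-After", ""))
            except ValueError:
                delay = base_delay * 2 ** attempt
            sleep(delay)
            continue
        response.raise_for_status()  # surface other HTTP errors immediately
        return response.json()

    raise RuntimeError(f"Still rate limited after {max_retries} attempts: {url}")
```

Reusing one `requests.Session` across calls keeps connections alive, which is a simple way to speed up retrieval when the same API is queried once per city.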
Geospatial Insights
Used GeoPandas to map variations in population density and AQI.
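A mapping step along these lines could look like the sketch below, which turns a table of cities into a GeoDataFrame of points and colors them by AQI. The coordinates, values, column names, and the `plot_aqi_map` helper are placeholders assumed for the example; they are not the project's data or code.

```python
import geopandas as gpd
import pandas as pd

# Placeholder rows; coordinates and values are illustrative, not real city data.
cities = pd.DataFrame({
    "city": ["City A", "City B", "City C"],
    "lon": [-118.0, -97.0, -95.0],
    "lat": [34.0, 32.0, 30.0],
    "density": [2000.0, 1500.0, 1200.0],
    "aqi": [60.0, 45.0, 55.0],
})

# Build point geometries from longitude/latitude in WGS 84 (EPSG:4326).
gdf = gpd.GeoDataFrame(
    cities,
    geometry=gpd.points_from_xy(cities["lon"], cities["lat"]),
    crs="EPSG:4326",
)


def plot_aqi_map(frame):
    """Hypothetical helper: color points by AQI and size them by density."""
    import matplotlib.pyplot as plt  # only needed when a figure is drawn

    fig, ax = plt.subplots(figsize=(8, 5))
    frame.plot(ax=ax, column="aqi", cmap="viridis", legend=True,
               markersize=frame["density"] / 20)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    ax.set_title("AQI by city (marker size scales with population density)")
    return ax
```

Calling `plot_aqi_map(gdf)` draws the figure; encoding AQI as color and density as marker size puts both variables on one map.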
Key Findings
Population density contributes to air quality, but other factors such as industry and agriculture also have a strong influence.
Some high-density cities had cleaner air, which may reflect effective environmental policies.
This study reinforced the importance of sustainable urban planning and industry regulation.
Impact & Insights
This project has shown how data science techniques can be used to deliver insights into environmental problems. Results suggest the importance of a multi-faceted approach, including the use of policies, technological implementation, and sustainable methods.