Finance Data Warehouse project

Course: Data Warehousing (ISM 6208)

Timeline: February 2026 - March 2026

Project Type: Coursework

Technologies Used:
SQL PostgreSQL
Project Description

The S&P 500 Sector Volatility Data Warehouse is a three-schema PostgreSQL dimensional model designed to analyze how macroeconomic conditions, political environment, and corporate events affect S&P 500 sector volatility over time. The architecture separates raw staging (src), star schema analytics (dw), and pre-computed summaries (agg) across two fact tables at different grains, stock-day and sector-day, supported by five dimension tables including a slowly changing stock dimension and conformed macro and political dimensions that join on year_month and date ranges to bridge the mismatch between daily stock and monthly FRED data. Extended with seven advanced analytic SQL queries, the project demonstrates the full breadth of PostgreSQL window functions including ROLLUP and CUBE for hierarchical sector aggregations, RANK and DENSE_RANK on fed funds rate data to expose rank gap behavior under ties, NTILE binning to classify sectors into stable volatility tiers, and a golden cross moving average strategy computing 50-day and 200-day rolling windows with insufficient data filtering for AAPL, JPM, and XOM. Three creative queries apply CORR across all sector pairs revealing unusually high post-crisis correlations that undermine diversification assumptions, LAG and LEAD to compare rate changes against volatility movements across time, and FIRST_VALUE and LAST_VALUE with unbounded window frames to surface ETFC's 865% penny stock recovery as the standout performer of the period.

Ember
Ember Your AI portfolio guide

Want me to explain this project in simpler terms?

Course Information
Data Warehousing

ISM 6208

This course provides a thorough foundation in designing and implementing enterprise data warehouses for business intelligence and analytics. We'll study fundamental architectures including star schemas, snowflake schemas, and dimensional modeling techniques that optimize data warehouses for analytical queries. The curriculum covers designing fact and dimension tables, managing slowly changing dimensions, and building ETL (Extract, Transform, Load) processes to integrate data from multiple source systems. We'll compare traditional data warehouse approaches with modern data lake architectures, learning the trade-offs and appropriate use cases for each. Significant emphasis is placed on data quality, governance, and metadata management, the foundational work that makes data warehouses valuable for decision-making. Topics also include OLAP cubes, data marts, and performance optimization for complex analytical workloads. The hands-on components involve actually designing warehouse schemas and implementing dimensional models, providing practical experience with the tools and techniques used in production environments. This course complements my other database and analytics coursework by focusing specifically on how to structure and manage data for business intelligence, reporting, and analytical applications at enterprise scale.

View All Courses