Nasdaq Verafin

Datalake Developer Intern

Contributed to an enterprise lakehouse platform that ingests, transforms, and validates high-volume financial data, turning it into standardized, linked, warehouse-ready datasets.

May 2024 - Aug 2024 & Jan 2025 - Aug 2025

18 Hebron Way, St. John's, NL, Canada

Lakehouse, Warehouse Prep, Spark, AWS

Chapter I

The Journey Begins

The internship is presented as a journey through a data lake: calm on the surface, but supported by layered systems, processing logic, and cloud infrastructure underneath.

Field Note

A field note from the archive

I worked on the warehouse data preparation layer of a regulated fintech lakehouse platform, with emphasis on source-domain transformation, schema standardization, cross-domain data splitting, object identifier linking, incremental/versioned outputs, and validation workflows.

Company

Nasdaq Verafin

Focus

Cloud data systems

Location

18 Hebron Way, St. John's, NL, Canada

Chapter II

The Work Behind the Magic

A grimoire-style archive of the main engineering themes from the internship: warehouse preparation, cross-domain splitting, validation, and cloud-based debugging.

Grimoire Page

Applied Spells

Processing and orchestration

Spell 01

Primary Record

Standardized, linked outputs

Warehouse Data Preparation

Converted transformed source-domain data into standardized warehouse-ready records with consistent schemas, record keys, and version/precombine fields.

Scala, Spark, Schema Mapping
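
A minimal sketch of the idea behind this spell, written in plain Python rather than the Scala/Spark code used on the job. The field names (`id`, `updated_at`, `amount`), the `WarehouseRecord` shape, and the `accounts` source name are all hypothetical and chosen only for illustration. It shows a source row being mapped onto one shared schema, given a record key, and deduplicated so that the row with the highest version (precombine) value wins.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class WarehouseRecord:
    """Hypothetical target schema shared by every source domain."""
    record_key: str    # stable identifier for the entity
    version: int       # precombine field: the larger value wins
    name: str
    amount_cents: int


def standardize(raw: dict, source: str) -> WarehouseRecord:
    """Map one source-domain row onto the shared target schema."""
    return WarehouseRecord(
        record_key=f"{source}:{raw['id']}",
        version=int(raw["updated_at"]),
        name=str(raw.get("name", "")).strip().upper(),
        # Decimal avoids float rounding when converting money to cents.
        amount_cents=int(Decimal(raw.get("amount", "0")) * 100),
    )


def precombine(records: Iterable[WarehouseRecord]) -> list[WarehouseRecord]:
    """Keep only the newest version of each record key."""
    latest: dict[str, WarehouseRecord] = {}
    for rec in records:
        current = latest.get(rec.record_key)
        if current is None or rec.version > current.version:
            latest[rec.record_key] = rec
    return sorted(latest.values(), key=lambda r: r.record_key)


raw_rows = [
    {"id": 1, "updated_at": 10, "name": " alice ", "amount": "12.50"},
    {"id": 1, "updated_at": 12, "name": "Alice", "amount": "13.00"},
    {"id": 2, "updated_at": 11, "name": "bob", "amount": "7.25"},
]
prepared = precombine(standardize(r, "accounts") for r in raw_rows)
```

Here the two rows for `id` 1 collapse into a single record, and the later `updated_at` value decides which one survives.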

Spell 02

Relationship Seal

Embedded records to target domains

Cross-Domain Splitting

Built and validated split logic that extracted embedded domain data, mapped it to the correct target schema, and preserved links back to the source records.

Normalization, Object IDs, Relationships
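
The splitting idea can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the platform's actual logic: the `transaction` and `party` domains, the field names, and the hash-based `object_id` helper are all hypothetical. The sketch pulls embedded party data out of a source record, maps it to a separate target shape, and keeps identifiers on both sides so the relationship can be followed in either direction.

```python
import hashlib


def object_id(domain: str, natural_key: str) -> str:
    """Derive a deterministic identifier from a domain and a natural key."""
    digest = hashlib.sha256(f"{domain}|{natural_key}".encode("utf-8")).hexdigest()
    return f"{domain}-{digest[:16]}"


def split_record(source: dict) -> tuple[dict, list[dict]]:
    """Split embedded party data out of a transaction record."""
    source_id = object_id("transaction", str(source["txn_id"]))

    parties = []
    for embedded in source.get("parties", []):
        parties.append({
            "object_id": object_id("party", embedded["party_ref"]),
            "full_name": embedded["name"].strip(),
            "role": embedded["role"],
            # Link back to the record this party was extracted from.
            "source_object_id": source_id,
        })

    # The parent keeps its own fields plus forward links to the split records.
    parent = {k: v for k, v in source.items() if k != "parties"}
    parent["object_id"] = source_id
    parent["party_ids"] = [p["object_id"] for p in parties]
    return parent, parties


txn = {
    "txn_id": 1001,
    "amount": "250.00",
    "parties": [
        {"party_ref": "P-1", "name": " Ada Lovelace ", "role": "sender"},
        {"party_ref": "P-2", "name": "Alan Turing", "role": "receiver"},
    ],
}
parent, parties = split_record(txn)
```

Because the identifiers are derived from stable keys, rerunning the split on the same input yields the same links, which keeps incremental outputs consistent.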

Grimoire Page

Protective Wards

Lake structure and reliability

Spell 03

Integrity Ward

Tests, fixtures, and notebook checks

Validation and Reconciliation

Checked schemas, counts, null patterns, duplicates, identifiers, relationships, dates, and lake-versus-database outputs to catch transformation issues early.

Unit Tests, Pipeline Tests, Jupyter, SQL
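
As a rough illustration of the checks listed above, here is a small reconciliation sketch in plain Python. The `reconcile` function, the `object_id` and `created_on` field names, and the sample rows are hypothetical. In practice checks like these would more likely run as SQL or Spark queries in a notebook, but the comparisons are the same: row counts, duplicate keys, null patterns, and keys present on one side only.

```python
from collections import Counter


def reconcile(lake_rows, db_rows, key="object_id",
              required=("object_id", "created_on")):
    """Compare lake output with database output and summarize differences."""
    lake_keys = Counter(r[key] for r in lake_rows if r.get(key) is not None)
    db_keys = {r[key] for r in db_rows if r.get(key) is not None}
    return {
        "lake_count": len(lake_rows),
        "db_count": len(db_rows),
        # Keys that appear more than once in the lake output.
        "duplicate_keys": sorted(k for k, n in lake_keys.items() if n > 1),
        "missing_in_db": sorted(set(lake_keys) - db_keys),
        "missing_in_lake": sorted(db_keys - set(lake_keys)),
        # Number of lake rows where each required field is null or absent.
        "null_counts": {
            field: sum(1 for r in lake_rows if r.get(field) is None)
            for field in required
        },
    }


lake = [
    {"object_id": "a", "created_on": "2025-01-01"},
    {"object_id": "b", "created_on": None},
    {"object_id": "b", "created_on": "2025-01-02"},
    {"object_id": "c", "created_on": "2025-01-03"},
]
db = [
    {"object_id": "a", "created_on": "2025-01-01"},
    {"object_id": "b", "created_on": "2025-01-02"},
    {"object_id": "d", "created_on": "2025-01-04"},
]
report = reconcile(lake, db)
```

A report like this makes it quick to see whether a transformation change introduced duplicates, dropped rows, or left required fields empty.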

Spell 04

Operations Codex

Distributed jobs and build pipelines

Cloud Debugging and CI/CD

Investigated failed builds and cloud pipeline runs by reading CI output, execution histories, Spark job details, and distributed processing logs.

AWS, EMR/Spark, CloudWatch, Jenkins

Chapter III

Arcane Systems

The technical stack is treated like a codex of disciplines: implementation languages, AWS infrastructure, Spark processing, lakehouse storage, validation tools, and CI/CD practices.

Discipline 01

Languages

4

Implementation and validation languages used across data engineering workflows.

Scala, Python, SQL, Java

Discipline 02

Cloud and Orchestration

5

AWS services used for object storage, orchestration, monitoring, metadata, and managed compute.

AWS S3, AWS Lambda, AWS EMR, Step Functions, CloudWatch

Discipline 03

Data Processing and Storage

4

Distributed processing and lakehouse-oriented storage patterns for large financial datasets.

Apache Spark, Apache Parquet, Hudi-style storage, AWS Glue

Discipline 04

Warehouse and Validation

4

Tools and practices used to confirm reliable warehouse-ready outputs.

PostgreSQL/RDS, Jupyter notebooks, Fixture tests, Lake-to-database reconciliation

Discipline 05

Engineering Workflow

5

Professional development practices used to ship and validate changes safely.

Git, Gradle, Jenkins, Pull requests, Code review

Chapter IV

Map of the Datalake

The workflow follows a south-to-north route across the map, turning heterogeneous financial records into standardized, linked, validated, warehouse-ready lakehouse datasets.

Frieren world map used as the datalake journey map

Chapter V

The Fellowship

The team structure behind the data platform group during my work term.

Org chart of the Nasdaq Verafin data platform team

Chapter VI

Memories from the Journey

A small visual archive of images from the work term.

Nasdaq Verafin datalake work snapshot

Project Snapshot

A visual record connected to the work term experience.

Work Term Memory

A personal image from the internship experience.

Nasdaq Verafin logo

Company Seal

The organization behind the experience.

Final Chapter

At Journey’s End

I had an excellent time working with the team! Looking back, one year of work passed by in the blink of an eye. I’m grateful for the opportunity to contribute and to learn so much along the way.