Subject 1. Steps in Executing a Data Analysis Project
Big data is defined by the 4Vs:
- Volume: the sheer amount of data.
- Variety: the array of available data sources.
- Velocity: the speed at which data accumulate.
- Veracity: the credibility and reliability of different data sources.
The main steps in building a traditional ML model (one based on structured data) are:
- conceptualization of the problem: state the problem, define objectives, identify useful data points, and conceptualize the model. This step serves as the blueprint for the rest of the project.
- data collection: search for and download the raw data from one or multiple sources.
- data preparation and wrangling: cleansing and organizing raw data into a consolidated format.
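- data exploration: exploratory data analysis, feature selection, and feature engineering.
- model training: selecting an appropriate ML method, evaluating its performance, and tuning the model.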
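The steps above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the data set, the field names (`ticker`, `pe`, `ret`), the helper functions, and the choice of a one-variable least-squares fit as the "model" are all assumptions made for the example, not part of the curriculum text.

```python
from statistics import mean

# Step 1. Conceptualization (hypothetical): explain stock return with the P/E ratio.
# Step 2. Data collection: a made-up in-memory stand-in for downloaded raw data.
raw_rows = [
    {"ticker": "AAA", "pe": "12.0", "ret": "0.08"},
    {"ticker": "BBB", "pe": "", "ret": "0.05"},      # missing value
    {"ticker": "AAA", "pe": "12.0", "ret": "0.08"},  # duplicate record
    {"ticker": "CCC", "pe": "20.0", "ret": "0.12"},
    {"ticker": "DDD", "pe": "16.0", "ret": "0.10"},
]


def prepare(rows):
    """Step 3. Preparation and wrangling: drop incomplete and duplicate
    rows, then convert the text fields to numbers."""
    seen, clean = set(), []
    for row in rows:
        if not all(row.values()):      # skip rows with an empty field
            continue
        key = tuple(row.values())
        if key in seen:                # skip exact duplicates
            continue
        seen.add(key)
        clean.append(
            {"ticker": row["ticker"], "pe": float(row["pe"]), "ret": float(row["ret"])}
        )
    return clean


def explore(rows):
    """Step 4. Exploration: simple summary statistics."""
    return {
        "n": len(rows),
        "mean_pe": mean(r["pe"] for r in rows),
        "mean_ret": mean(r["ret"] for r in rows),
    }


def train(rows):
    """Step 5. Model training: fit ret = a + b * pe by ordinary least squares."""
    x = [r["pe"] for r in rows]
    y = [r["ret"] for r in rows]
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    a = y_bar - b * x_bar
    return a, b


clean_rows = prepare(raw_rows)
summary = explore(clean_rows)
intercept, slope = train(clean_rows)
print(summary, intercept, slope)
```

In a real project each step would be far more involved; the point is only that each stage consumes the output of the previous one.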
For ML model building with text (unstructured data), the first four steps differ somewhat from those in the traditional process:
- text problem formulation
- text curation
- text preparation and wrangling
- text exploration
- model training
Note that the last step, model training, is the same for both.
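As a rough illustration of the text-specific steps, the sketch below curates two labeled sentences, cleans and tokenizes them, counts term frequencies, and builds the word-count vectors that a model-training step could consume. The sample sentences, labels, stop-word list, and function names are hypothetical, and the particular cleansing choices (lowercasing, stripping punctuation, dropping stop words) are assumptions chosen for brevity.

```python
import re
from collections import Counter

# Text curation: made-up sentences with hypothetical sentiment labels.
documents = [
    ("Profits ROSE sharply; the outlook is strong!", "positive"),
    ("Losses widened and the outlook is weak.", "negative"),
]

# A tiny illustrative stop-word list, not a standard one.
STOP_WORDS = {"the", "is", "and", "a", "of"}


def prepare_text(text):
    """Text preparation and wrangling: lowercase, strip anything that is
    not a letter or whitespace, tokenize, and drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def explore_text(token_lists):
    """Text exploration: term frequency across the whole corpus."""
    return Counter(tok for tokens in token_lists for tok in tokens)


def to_vectors(token_lists, vocabulary):
    """Turn each document into word counts over a fixed vocabulary,
    i.e. the numeric input a model-training step would use."""
    return [[tokens.count(word) for word in vocabulary] for tokens in token_lists]


token_lists = [prepare_text(text) for text, _ in documents]
term_counts = explore_text(token_lists)
vocabulary = sorted(term_counts)
vectors = to_vectors(token_lists, vocabulary)
print(term_counts.most_common(3))
print(vocabulary)
print(vectors)
```

From here on the process matches the traditional one: the numeric vectors and their labels are passed to whichever ML method is chosen for training.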