TDSP Data Science lifecycle for Artificial Intelligence / Data Science projects

Nikesh Pahuja
Oct 23, 2021

For my Artificial Intelligence / Data Science projects, I have found the TDSP Data Science lifecycle to be the most helpful and detailed.

Fig 1. TDSP Data Science lifecycle (Source: Microsoft)

Several data science lifecycles can be used to deliver Artificial Intelligence / Data Science projects, including CRISP-DM, KDD, and TDSP. At a high level, they have a lot in common. However, I have found the Team Data Science Process (TDSP) lifecycle from Microsoft to be the most detailed and helpful in my projects.

TDSP is an agile, iterative methodology for building and deploying predictive analytics solutions. It is designed to improve teamwork, covers the project from start to finish, and lets the team revisit stages as needed.

TDSP provides a lifecycle to structure the development of artificial intelligence / data science projects. The lifecycle is primarily intended for intelligent applications that deploy machine learning or artificial intelligence models for predictive analytics. Parts of it can also be applied to exploratory data science projects or improvised analytics.

The TDSP lifecycle includes the following five major stages of an Artificial Intelligence / Data Science project (Fig 1):

Business Understanding

This stage involves building an understanding of the business problem, stakeholder needs, and the use case. It may require reading through the knowledge base and project documents and holding meetings or interviews with the stakeholders.

Data Acquisition and Understanding

This stage covers identifying and acquiring the data sources and understanding the data pipeline, refresh frequency, environments, and data structure. It also includes data exploration and may include data cleaning, wrangling, and transformation.
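As a rough illustration of the exploration and cleaning steps in this stage, here is a minimal sketch that uses only the Python standard library. The dataset, the column names, and the helper functions (load_rows, explore, clean) are hypothetical and exist only for this example. TDSP does not prescribe any particular tooling.

```python
import csv
import io
import statistics

# Hypothetical raw extract; in a real project this would come from the
# data source identified during acquisition (file, database, API, ...).
RAW_CSV = """customer_id,age,monthly_spend
1,34,120.50
2,,80.00
3,29,not_available
4,41,95.25
4,41,95.25
"""

def load_rows(text):
    """Acquire: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def to_float(value):
    """Return value as a float, or None when it is missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def explore(rows, column):
    """Explore: count missing values and summarize the valid ones."""
    values = [to_float(row[column]) for row in rows]
    valid = [v for v in values if v is not None]
    return {
        "rows": len(rows),
        "missing": len(values) - len(valid),
        "mean": statistics.mean(valid) if valid else None,
    }

def clean(rows, numeric_columns):
    """Clean: drop exact duplicates and rows with unusable numeric fields."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        converted = {c: to_float(row[c]) for c in numeric_columns}
        if any(v is None for v in converted.values()):
            continue  # missing or malformed numeric value
        cleaned.append({**row, **converted})
    return cleaned

rows = load_rows(RAW_CSV)
print(explore(rows, "age"))
cleaned_rows = clean(rows, ["age", "monthly_spend"])
print(len(cleaned_rows))
```

Dropping incomplete rows is only one possible cleaning policy; depending on the use case, imputing the missing values may be a better choice.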

Modeling

The Modeling stage follows Data Acquisition and Understanding. It begins with feature engineering, which includes steps such as data transformation and feature selection. The next step is model training, which covers fitting the model and tuning its hyperparameters. The last step is model evaluation, which can use methods such as cross-validation.
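The three modeling steps can be sketched end to end in a few lines. The example below is a simplified illustration, not a method prescribed by TDSP: it assumes a synthetic two-class dataset, uses feature scaling as the transformation, a hand-written k-nearest-neighbors classifier as the model, and the number of neighbors k as the hyperparameter to tune. All function names are hypothetical. In practice a library such as scikit-learn would normally take care of these steps.

```python
import random
import statistics

def make_dataset(n=60, seed=0):
    """Build a hypothetical two-class dataset of (features, label) pairs."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        # The second feature is on a much larger scale to motivate scaling.
        features = [rng.gauss(center, 1.0), rng.gauss(center * 100, 100.0)]
        data.append((features, label))
    rng.shuffle(data)
    return data

def fit_scaler(rows):
    """Feature engineering: learn per-feature mean and standard deviation."""
    columns = list(zip(*[f for f, _ in rows]))
    return [(statistics.mean(c), statistics.pstdev(c) or 1.0) for c in columns]

def scale(features, scaler):
    """Standardize one feature vector with a previously fitted scaler."""
    return [(x - m) / s for x, (m, s) in zip(features, scaler)]

def knn_predict(train, features, k):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def cross_val_accuracy(data, k, folds=5):
    """Model evaluation: mean accuracy over `folds` held-out splits."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        scaler = fit_scaler(train)  # fit on the training split only
        train_scaled = [(scale(f, scaler), y) for f, y in train]
        correct = sum(
            knn_predict(train_scaled, scale(f, scaler), k) == y for f, y in test
        )
        scores.append(correct / len(test))
    return statistics.mean(scores)

def tune(data, candidates=(1, 3, 5, 7)):
    """Hyperparameter tuning: pick the k with the best cross-validated score."""
    results = {k: cross_val_accuracy(data, k) for k in candidates}
    best_k = max(results, key=results.get)
    return best_k, results

data = make_dataset()
best_k, results = tune(data)
print(best_k, results)
```

The scaler is fitted inside each fold, on the training split only, so that no information from the held-out rows leaks into the evaluation.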

Deployment

Model deployment covers deploying and serving the model, for example by hosting it in the cloud or integrating it into an application. It also includes monitoring the model's performance.

During the monitoring phase, if any defects or trouble reports are generated, the team might have to iterate through some or all of the above stages and steps in the lifecycle.
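One simple way to picture the monitoring step is a rolling accuracy check that signals when the team should iterate. The sketch below is an assumption-laden illustration: the AccuracyMonitor class, the baseline accuracy, the tolerance, and the window size are all hypothetical choices, and real deployments typically track more than one metric.

```python
from collections import deque

class AccuracyMonitor:
    """Track recent prediction outcomes and flag when accuracy degrades."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window=100):
        self.baseline = baseline_accuracy   # accuracy measured before deployment
        self.tolerance = tolerance          # acceptable drop below the baseline
        self.outcomes = deque(maxlen=window)  # keeps only the latest outcomes

    def record(self, predicted, actual):
        """Store whether a served prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def current_accuracy(self):
        """Accuracy over the current window, or None if nothing was recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_iteration(self):
        """True when accuracy fell more than `tolerance` below the baseline."""
        accuracy = self.current_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance

monitor = AccuracyMonitor(baseline_accuracy=0.90, tolerance=0.05, window=10)

# Nine correct predictions and one miss: still within tolerance.
for predicted, actual in [(1, 1)] * 9 + [(1, 0)]:
    monitor.record(predicted, actual)
print(monitor.current_accuracy(), monitor.needs_iteration())

# Three more misses push the oldest correct outcomes out of the window.
for _ in range(3):
    monitor.record(1, 0)
print(monitor.current_accuracy(), monitor.needs_iteration())
```

When needs_iteration() returns True, the team would go back to whichever earlier stage explains the drop, for example data acquisition if the input data has changed.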

Customer Acceptance

Once the defects have been fixed and the model is deployed to the customer's satisfaction, the project comes to an end. In some cases, models are improved continuously. After the first deployment, the team then iterates through the lifecycle above, either on an ongoing basis or until all project phases are complete.

For more information on TDSP and how to apply it to your specific project, see Microsoft's Team Data Science Process documentation.

