The Talent500 Blog
AI and Machine Learning: The impact on Data Science in 2023

30th November 2022 will go down in the history of technology as one of its most seismic days. OpenAI announced the launch of ChatGPT, an AI-driven natural language processing tool capable of holding human-like conversations. It is a state-of-the-art language model that generates responses to natural language prompts. It raced to 1 million users within 5 days of its launch, a pace no earlier consumer technology had come close to matching.

The days following the launch saw a significant impact on many professions and fields. The world’s attention was also drawn to Artificial Intelligence and Machine Learning, and to what more might be possible with them. But while the world was wondering, the fields of AI, ML and Data Science were themselves being changed by ChatGPT.

ChatGPT’s impact on AI, machine learning, and data science is an example of how different fields can drive each other’s development and create new opportunities for innovation.

What are Artificial Intelligence, Machine Learning and Data Science?

Artificial Intelligence refers to the development of intelligent machines that can perform tasks that typically require human intelligence, such as speech recognition, problem-solving, and decision-making. Machine learning is a subset of AI that involves training algorithms to learn patterns in data and make predictions based on new data inputs. Data science is a related, overlapping discipline that involves extracting insights from data using statistical and computational techniques.

AI, machine learning, and data science are closely related fields that intersect in many ways. Machine learning is a fundamental component of AI, as it enables algorithms to learn from data and make predictions based on new inputs. Data science is closely tied to both, as the insights it extracts from data can be used to inform decision-making by intelligent systems.

Together, AI, machine learning, and data science form a powerful combination for building intelligent systems that can learn from data and adapt to new situations. 

Impact Of ChatGPT on Data Science, AI and ML

With an understanding of what AI, ML and Data Science are, let’s now move on to how ChatGPT can be leveraged in certain scenarios to boost the productivity and improve the output of engineers.

ChatGPT For Dataset Ideas

Finding the right datasets for the ideas we wish to test is an important task. However, foraging the internet for datasets is a tedious manual task, one that ChatGPT can help us avoid. Here is a live demo in which I asked for dataset suggestions for an image classification problem.

ChatGPT For A/B Testing

A/B testing is a popular way to test products by comparing two versions of the same product to understand which one is better received by users. A well-designed A/B test can make the difference between the success and failure of a campaign. ChatGPT can be immensely helpful in designing effective A/B tests.

Here is ChatGPT helping us design a concrete A/B test strategy for a hypothetical ecommerce design change.
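
Once such a test has run, the two versions still have to be compared. As a minimal sketch (not taken from the ChatGPT response above), here is one common way to do that, a two-proportion z-test on conversion rates, using only the Python standard library. The function name and the visitor and conversion counts are hypothetical and chosen purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: version A (old design) vs version B (new design).
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be due to chance alone. The significance threshold and the sample size should be fixed before the test starts, not after looking at the results.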

ChatGPT To Generate Data

Not only can ChatGPT recommend datasets, but for smaller tests or quick scenarios it can also help generate data. Let’s ask ChatGPT to generate fake data on footballers along with some stats. Here is the result.
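
For comparison, a similar kind of fake table can be produced locally with a few lines of Python. This is only a sketch: the column names, value ranges and the `generate_players` helper are assumptions made for illustration, not the output ChatGPT returned.

```python
import random

POSITIONS = ["Goalkeeper", "Defender", "Midfielder", "Forward"]


def generate_players(n, seed=0):
    """Return n fake footballer records with made-up stats."""
    rng = random.Random(seed)  # seeded so the same data comes back every run
    players = []
    for i in range(1, n + 1):
        matches = rng.randint(10, 38)
        players.append({
            "name": f"Player {i}",
            "position": rng.choice(POSITIONS),
            "age": rng.randint(18, 36),
            "matches": matches,
            # Keep the stats loosely plausible: no more goals than matches.
            "goals": rng.randint(0, matches),
            "assists": rng.randint(0, 15),
        })
    return players


players = generate_players(5)
for player in players:
    print(player)
```

Seeding the generator keeps the fake data reproducible, which is useful when the same small dataset is reused across quick experiments.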

ChatGPT For Learning

If you paid close attention to the previous prompt, you will have noticed that I also told ChatGPT ‘I wish to implement an ML algorithm to predict the best player’. The response shows that, along with generating data, it can also give us an idea of which algorithm is best suited to a given problem. This is definitely a game changer, especially for beginners.

ChatGPT For Teaching 

Mentoring beginners is key to ensuring that the next generation of machine learning engineers and data scientists can effectively take over the mantle from their seniors. However, not every skilled data scientist is a skilled teacher. ChatGPT can come in handy in such scenarios. Here is what a prompt and its response look like.

ChatGPT For Analyzing Data

A common challenge that ML engineers, data scientists and data analysts face is a dataset that is too complex to understand easily. In the pre-ChatGPT era, the usual approach was to spend hours on Exploratory Data Analysis (EDA). Now we can quickly paste the data into ChatGPT and ask it to explain it to us. Be careful and responsible here: make sure you do not leak any sensitive data.

For the sake of explanation, here is the UFC Fighters dataset from Kaggle. For a person with no knowledge of UFC, it may be hard to relate to and understand. So, we can always ask ChatGPT for help.

Here is the initial response where it clearly explains what each column means and also its significance.

Going one step further, we can ask it to generate descriptive stats for the data. It is smart enough to tell us that not all columns are numerical, and it generates descriptive stats only where that is possible.
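
The same idea, summarising only the numerical columns, can be sketched in plain Python. The rows below are hypothetical stand-ins with invented column names, not the Kaggle UFC data, and `describe_numeric` is an illustrative helper.

```python
import statistics

# Hypothetical rows standing in for a fighters table (not the Kaggle data).
rows = [
    {"name": "Fighter A", "stance": "Orthodox", "height_cm": 180.0, "wins": 12},
    {"name": "Fighter B", "stance": "Southpaw", "height_cm": 175.0, "wins": 8},
    {"name": "Fighter C", "stance": "Orthodox", "height_cm": 190.0, "wins": 20},
    {"name": "Fighter D", "stance": "Switch", "height_cm": 183.0, "wins": 16},
]


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def describe_numeric(rows):
    """Return count, mean, std, min and max for the all-numeric columns."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if not all(is_number(v) for v in values):
            continue  # text columns such as name and stance are skipped
        summary[column] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "std": statistics.stdev(values),  # sample standard deviation
            "min": min(values),
            "max": max(values),
        }
    return summary


summary = describe_numeric(rows)
for column, stats in summary.items():
    print(column, stats)
```

In practice most people would reach for pandas here, where `DataFrame.describe()` summarises only the numeric columns by default. The standard-library version above just makes the column filtering explicit.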

ChatGPT For Communicating With Business Stakeholders

A key responsibility of data scientists is liaising with business stakeholders to communicate the insights revealed by data. However, using technical jargon in a business meeting is less than ideal and can often lead to gaps, as business people may not understand it.

ChatGPT can be effectively leveraged to bridge the communication gap between technical and business teams. Let’s continue with the example above and ask ChatGPT what kinds of data visualization could be drawn from the dataset to explain it to the business.

Taking this one step further, ChatGPT can easily provide code for various plots. Here it is showing me code to draw bar plots using Matplotlib for the UFC fighters dataset.
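
As a rough illustration of the kind of code involved (not the code ChatGPT produced), the sketch below draws a bar plot of category counts with Matplotlib. The stance values, the `plot_category_counts` helper and the output filename are assumptions made for the example.

```python
from collections import Counter

import matplotlib

matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical stance values; the real dataset's columns may differ.
stances = ["Orthodox", "Southpaw", "Orthodox", "Switch", "Orthodox", "Southpaw"]


def plot_category_counts(values, title, xlabel, path):
    """Save a bar plot of how often each category occurs and return the counts."""
    counts = Counter(values)
    # Most common category first, so the chart reads left to right.
    labels = sorted(counts, key=counts.get, reverse=True)
    heights = [counts[label] for label in labels]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(labels, heights)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Number of fighters")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return dict(counts)


counts = plot_category_counts(stances, "Fighters by stance", "Stance", "stance_counts.png")
print(counts)
```

A simple count-per-category chart like this is usually easier for a business audience to read than a table of raw numbers.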

ChatGPT For Explaining Code

Taking over a code base can be challenging, as every developer has their own style and way of thinking. Spending hours trying to figure out unfamiliar code is less than ideal, as that time could be put to better use.

For example, consider this implementation of Logistic Regression, a popular model used for classification and prediction. For someone new to it, understanding the code can be difficult, so we can ask ChatGPT to make the task a bit easier.
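
To give a sense of the kind of code a newcomer might be handed, here is a minimal from-scratch sketch of logistic regression trained with batch gradient descent. It is an illustrative stand-in, not the author's implementation; the toy data, function names and hyperparameters are all assumptions.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(X, y, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient over the whole dataset.
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return 0/1 class labels for each row of X."""
    return [
        int(sigmoid(sum(w * x for w, x in zip(weights, features)) + bias) >= threshold)
        for features in X
    ]


# Toy, already-centered feature (hypothetical): negative values are class 0.
X = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic_regression(X, y)
print(predict(X, weights, bias))
```

Pasting a snippet like this into ChatGPT and asking for a line-by-line explanation is the kind of use described here: the sigmoid, the gradient update and the threshold each map to a sentence or two of plain English.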

This is just the tip of the iceberg, covering only a few parts of the job of AI/ML engineers and data scientists that ChatGPT has changed. The positive impact it’s having is clear to see. Many tasks which in the pre-ChatGPT era would’ve taken hours can now be done in a matter of minutes, and the time freed up can be put to use solving business problems and doing more creative work.

Other tasks which can definitely be improved and automated with the help of ChatGPT are:

  1. Summarizing texts and long papers.
  2. Acting as a writing assistant that produces simple summaries of complex jargon and concepts.
  3. Translating code from one language to another, e.g. Python to R and vice versa, or Python to JavaScript and vice versa.
  4. Debugging to resolve errors, instead of having to go to Stack Overflow and other forums.

Limitations Of ChatGPT

Whilst it’s easy to rave about the advantages and impact of ChatGPT, it still has its limitations, and we should be careful when using it. Some of them are highlighted below.

Failure To Solve Complex Tasks

ChatGPT is good at simple tasks such as generating short code snippets or finding bugs in existing code, but it is not capable of complex coding tasks that require human intelligence.

Bias

ChatGPT, like other large language models, is trained largely on data available on the internet. As such, it may be prone to the biases present in that data, and its results should be treated with caution.

Data Security Concerns

When working for large corporations, the data is often private and confidential. This can include code bases and datasets, so exposing them to an AI model like ChatGPT is less than ideal. Many organizations are bound to have valid concerns about their data.

Out Of Distribution Sample

Language models, including ChatGPT, may produce incorrect or nonsensical responses when working with text that is dissimilar from what they saw during training. Such an input is referred to as an “out of distribution sample” and can result in poor performance.

Conclusion

It’s clear now that there are going to be two kinds of people: those who leverage AI smartly to become more productive, and those who let AI take over their jobs. We all definitely want to be in the first bracket. ChatGPT is a smart second brain and should be used in that capacity only. It is not yet a replacement for humans, be they data scientists or machine learning engineers.

It can be used effectively to take on tasks which are monotonous, repetitive and do not require human judgment. It also does really well at bridging gaps in communication between technical and business teams by acting as a mediator.

Another important point which is often overlooked is that ChatGPT serves as a single platform for idea generation, problem solving, debugging and brainstorming. There is no longer a need to jump between Google, Stack Overflow and other forums to search for answers; everything is available in a single place. The amount of time and mental bandwidth it saves is monumental.

In summary, ChatGPT has fast become a second brain and a digital intern. With responsible usage it can serve as a valuable asset in the repertoire of data scientists and machine learning engineers.

 

Jayadeep Karale

Hi, I am a Software Engineer with a passion for technology.
My specializations include Python, Machine Learning/AI, Data Visualization and Software Engineering. I am a tech educator helping people learn via Twitter, LinkedIn and YouTube.
