What is Data Science? What Do Data Scientists Do?
Data Science is the art of turning data into decisions by combining machine learning, software development, and traditional research to solve real-world problems.
I know, I know... But explain it practically and simply, so I know what Data Scientists DO!?
I hear you!
Look at this image and read the concepts in Data Science before you scroll further.
Now:
Let's start with our school memories...
Your old school wants to find out which students are likely to fail the final exams, so it can give them extra support early. But how can it find out?
Of course, it can ask the teachers. But teachers may have some favorite students, some students they dislike, and some assumptions of their own.
Guessing who is the better student and who is the worse one from such assumptions is a risky calculation. It cannot always be correct.
So, your school needs a Data Scientist, because a Data Scientist works with data, not with assumptions.
A Data Scientist who has never even met the students can still make sound calculations and give reliable results, because they trust the data, not their own assumptions.
So they called you, you came back to your old school, and your project starts.
Traditional Research:
First, you will gather all the student marks from class tests, mid-term tests, monthly tests, etc. from the teachers, for all students.
You will also take attendance records, assignment records, assignment marks, punctuality records and behavior records from the teachers.
You won't ask the teachers to tell you all this from memory. You will ask them for written records, for data credibility (quality).
You will copy all this information into an Excel file for further processing.
All of this is called Traditional Research. You are doing research about the students.
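To picture that one combined file in code, here is a minimal Python sketch. It assumes the written records have been typed in as two small CSV tables. The column names (`student_id`, `class_test`, `attendance_pct` and so on) and all the values are made up for illustration, not taken from a real school.

```python
import csv
import io

# Hypothetical written records, typed in as CSV text (made-up example values).
marks_csv = """student_id,class_test,mid_term,monthly_test
S001,78,82,75
S002,35,41,38
"""
attendance_csv = """student_id,attendance_pct,assignments_done
S001,96,10
S002,61,4
"""

def read_rows(text):
    """Read CSV text into a dict keyed by student_id."""
    return {row["student_id"]: row for row in csv.DictReader(io.StringIO(text))}

marks = read_rows(marks_csv)
attendance = read_rows(attendance_csv)

# Merge the two sources into one row per student, like one sheet in Excel.
combined = []
for student_id, mark_row in marks.items():
    row = dict(mark_row)
    row.update(attendance.get(student_id, {}))
    combined.append(row)

for row in combined:
    print(row)
```

Each student ends up as one row holding everything you know about them, which is the shape most analysis tools expect.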
But what if this data is not available? What if the teachers say they don't have proper data in hand for all students?
No worries. We have...
Software Development:
Wait!
This is not high-level coding and programming work. You are not creating Java or Android software here.
You are creating a software-like system, a small app-like system. That's it.
You will use Excel or another tool, such as school management software or a mobile app, to build a system that stores daily student data.
A Data Scientist does not have to build the full software by themselves.
They can take help from IT staff or software developers by explaining what they need.
This system collects and manages all student data, such as marks, attendance and behavior, every day.
Teachers can use this app to enter each student's punctuality, attendance and behavior. Students can also use it to give feedback on teaching methods, on hard subjects, and on any other issues they are facing.
Now you get the data directly from students and teachers through this app. This data is credible, because it comes to you directly from the people involved.
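If you are curious what such a small system stores, here is a rough Python sketch. The names `DailyRecord` and `StudentLog`, the fields, and the sample entries are all hypothetical, chosen only to show the idea. A real school app would keep this in a database or spreadsheet rather than in memory.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DailyRecord:
    """One teacher entry for one student on one day (hypothetical fields)."""
    student_id: str
    day: date
    present: bool
    on_time: bool
    behavior_note: str = ""

class StudentLog:
    """Keeps the daily entries and answers simple questions about them."""

    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def attendance_rate(self, student_id):
        """Share of recorded days the student was present, or None if no data."""
        rows = [r for r in self.records if r.student_id == student_id]
        if not rows:
            return None
        return sum(r.present for r in rows) / len(rows)

log = StudentLog()
log.add(DailyRecord("S001", date(2024, 6, 3), present=True, on_time=True))
log.add(DailyRecord("S001", date(2024, 6, 4), present=False, on_time=False,
                    behavior_note="absent without notice"))
log.add(DailyRecord("S002", date(2024, 6, 3), present=True, on_time=False))

print(log.attendance_rate("S001"))  # 0.5 (present on 1 of 2 recorded days)
```

The point is only that every day adds a few small rows, and after some months those rows become the data you need.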
Now...
Machine Learning:
A simple Machine Learning model is used to find which students are at risk of failing the exam, based on the data you have in hand.
How do you do it? Let's see:
1. You have the data of the students. Now you will split the data into 2 parts.
If there are 1000 students in your school, you will put 800 students' data in one part and 200 students' data in the other.
You got it, right? (80% vs 20%)
2. The 800 students' data (80%) you will use for training the model. How will you train the model? Don't worry, we will see that soon.
3. The remaining 200 students' data (20%) you will use for testing the model. How will you test the model? We will see that too.
You use 80% of the data to teach the Machine Learning model what the data looks like and how the students are performing.
You use the remaining 20% of the data to test whether the model can correctly predict results for students it has not seen.
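The 80/20 split above can be sketched in a few lines of Python. This is only an illustration: the student IDs are made up, and `split_train_test` is a hypothetical helper name, not a function from any library. The data is shuffled first so that the 200 test students are a random mix and not, say, only the last class on the list.

```python
import random

# Hypothetical student IDs standing in for the 1000 student records.
students = [f"S{n:04d}" for n in range(1, 1001)]

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows, then cut it into a training and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed: same split every run
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_train_test(students)
print(len(train), len(test))  # 800 200
```

No student appears in both parts. That matters, because testing on students the model has already seen would make it look better than it really is.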
Examples:
You learn cooking for 2 years. After 2 years, one day you buy food from a hotel and you also cook the same dish. Now you compare the taste and see whether your cooking matches the hotel food.
You exercise for years. After a few years, you compare yourself with models and actors to see if your physique matches theirs.
So, what is happening here? First you spend most of your time (the 80%) training yourself, then you spend a short time (the 20%) testing yourself. Right?
Training is important. Testing is equally important.
It is exactly the same method.
For this Machine Learning model training and testing, you will use Python. You can use libraries (ready-made code) to do these Machine Learning calculations.
The model will first learn from the 80% data and then be tested with the 20% data. This tells you how accurate its predictions are likely to be!
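Here is a small sketch of that train-then-test idea, written in plain Python so you can see every step. It is not what a real project would use: the student data is randomly generated (not real records), it assumes you have past records where the exam result is already known, and the "model" is a deliberately tiny one, a single learned cut-off on average marks. In practice you would more likely use a ready-made model from a library such as scikit-learn.

```python
import random

rng = random.Random(0)  # fixed seed so the made-up data is the same every run

def make_student():
    """Return (average_marks, failed_final) for one made-up student."""
    avg_marks = rng.uniform(20, 100)
    # Assumption for this sketch: lower marks mean a higher chance of failing,
    # with some noise so the pattern is not perfect.
    failed = (avg_marks + rng.gauss(0, 5)) < 50
    return avg_marks, failed

students = [make_student() for _ in range(1000)]
rng.shuffle(students)
train, test = students[:800], students[800:]

def accuracy(threshold, rows):
    """Share of rows where 'marks below threshold means fail' matches reality."""
    correct = sum((marks < threshold) == failed for marks, failed in rows)
    return correct / len(rows)

# Training: try each candidate cut-off and keep the one that fits the 800 best.
best_threshold = max(range(20, 101), key=lambda t: accuracy(t, train))

# Testing: check the learned cut-off on the 200 students the model never saw.
test_accuracy = accuracy(best_threshold, test)
print(f"learned cut-off: {best_threshold}, test accuracy: {test_accuracy:.2%}")
```

The number printed at the end is the accuracy on the 20% test data. That is the figure to look at before trusting the model with new students.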
Now, using these Machine Learning results, you can estimate how the students are likely to perform over the next few months, up to the final exam.
By using Machine Learning, you are not waiting to see who fails the exam. You are finding out early, so the school can help the student avoid failure.
You will use graphs and flow charts to see this future.
Machine Learning is not limited to Python. You can also do it in the R language, Excel, Google Sheets, etc. But Python is the most preferred here.
You need knowledge of your department / domain, computer technologies / coding experience, and mathematical concepts to be a better Data Scientist.
This is just a simple example from the education sector.
Data Scientists are highly needed in government, research, NASA, ISRO, defense organizations, the World Meteorological Organization, IT and customer service companies, automobile companies, aerospace companies, marine departments, public and private sector companies, etc.
Please ask me if you need any clarification. You can also share your feedback in the comments.