Mobile Price Prediction using ML Algorithm


In this tutorial, I implement mobile price prediction using several machine learning algorithms. It is a classification problem: the model assigns each phone to a price range from 0 to 3 (the ranges are described in the dataset section below) based on features such as front camera, touch screen, number of cores, clock speed, internal memory and battery capacity. I trained three classifiers on this task and then compared their accuracy in a graph.

About dataset:

Bob has started his own mobile company. He wants to give tough competition to big companies like Apple, Samsung, etc.

He does not know how to estimate the price of mobiles his company creates. In this competitive mobile phone market you cannot simply assume things. To solve this problem he collects sales data of mobile phones of various companies.

Bob wants to find out the relation between the features of a mobile phone (e.g. RAM, internal memory) and its selling price. But he is not so good at Machine Learning, so he needs your help to solve this problem.

In this problem you do not have to predict the actual price but a price range indicating how high the price is.

Features : 

  1. battery_power : Total energy a battery can store in one time measured in mAh
  2. blue : Has bluetooth or not
  3. clock_speed : speed at which microprocessor executes instructions
  4. dual_sim : Has dual sim support or not
  5. fc : Front Camera megapixels
  6. four_g : Has 4G or not
  7. int_memory : Internal Memory in Gigabytes
  8. m_dep : Mobile Depth in cm
  9. mobile_wt : Weight of mobile phone
  10. n_cores : Number of cores of processor
  11. pc : Primary Camera megapixels
  12. px_height : Pixel Resolution Height
  13. px_width : Pixel Resolution Width
  14. ram : Random Access Memory in Megabytes
  15. sc_h : Screen Height of mobile in cm
  16. sc_w : Screen Width of mobile in cm
  17. talk_time : longest time that a single battery charge will last when you are
  18. three_g : Has 3G or not
  19. touch_screen : Has touch screen or not
  20. wifi : Has wifi or not
  21. price_range : This is the target variable with values of 0(low cost), 1(medium cost), 2(high cost) and 3(very high cost).

Let’s start:

First of all, import all required libraries like pandas, matplotlib, etc. These libraries are used to load, preprocess and visualize the dataset.

Then load the training and testing datasets using the read_csv function of the pandas module and store them in two separate variables, train and test.
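A minimal sketch of the loading step. The helper name load_datasets and the file names in the commented call are my assumptions; use whatever paths your copy of the data has.

```python
import pandas as pd


def load_datasets(train_path, test_path):
    """Read the training and testing CSV files into two DataFrames."""
    train = pd.read_csv(train_path)
    test = pd.read_csv(test_path)
    return train, test


# Example call; "train.csv" and "test.csv" are assumed file names.
# train, test = load_datasets("train.csv", "test.csv")
```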

Now write code to display all the rows and columns.

Now display the top 5 rows of the training and testing datasets using the head function. By default, head displays the first 5 rows. We can change this by passing an integer value.
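Here is a small sketch of head on a stand-in DataFrame. The values are made up for illustration and are not rows from the real dataset.

```python
import pandas as pd

# Made-up stand-in for the training data (the real file has many more columns).
train = pd.DataFrame({
    "battery_power": [800, 1500, 1100, 1900, 650, 1750],
    "ram": [600, 1800, 2600, 3800, 900, 3100],
    "price_range": [0, 1, 2, 3, 0, 3],
})

first_five = train.head()    # default: first 5 rows
first_three = train.head(3)  # pass an integer to change the number of rows
print(first_five)
print(first_three)
```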

The test dataset contains an extra feature “id” which is of no use for prediction, so drop the “id” feature from the testing dataset using the drop method.
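Dropping the column could look like this. The frame below is a made-up stand-in with only three columns.

```python
import pandas as pd

# Made-up stand-in for the testing data, including the extra "id" column.
test = pd.DataFrame({"id": [1, 2, 3], "ram": [3400, 3900, 2400], "wifi": [0, 0, 1]})

# drop returns a new DataFrame without the listed columns.
test = test.drop(columns=["id"])
print(test.columns.tolist())
```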

Now plot our target feature “price_range”. As you can see in the graph, the classes 0, 1, 2 and 3 each have 500 rows, which means the dataset is balanced.
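One way to check the class balance is to count the rows per class and draw the counts as bars. This is a sketch on a small balanced stand-in, and plain matplotlib is my assumption here; the notebook may use a different plotting call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Balanced stand-in: 5 rows per class instead of the real 500.
train = pd.DataFrame({"price_range": [0, 1, 2, 3] * 5})

# Rows per class, ordered by class label.
counts = train["price_range"].value_counts().sort_index()
print(counts)

fig, ax = plt.subplots()
ax.bar([str(label) for label in counts.index], counts.values)
ax.set_xlabel("price_range")
ax.set_ylabel("number of rows")
```

If every bar has the same height, no class is under-represented and plain accuracy is a reasonable metric.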

Now check the shape of the data set. As you can see, the train dataset contains 2000 rows and 21 columns and the test dataset contains 1000 rows and 20 columns.
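The shape attribute returns a (rows, columns) tuple. The sketch below uses zero-filled stand-ins with the same dimensions as the two datasets.

```python
import numpy as np
import pandas as pd

# Zero-filled stand-ins that only mimic the dimensions of the real data.
train = pd.DataFrame(np.zeros((2000, 21)))
test = pd.DataFrame(np.zeros((1000, 20)))

print(train.shape, test.shape)  # (2000, 21) (1000, 20)
```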

Now check whether the training dataset contains null values using the isnull() and sum() methods.
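A short sketch of the null check. The stand-in frame has one missing value on purpose, so you can see what a non-zero count looks like.

```python
import numpy as np
import pandas as pd

# Made-up stand-in with one deliberately missing value in "ram".
train = pd.DataFrame({"ram": [512, np.nan, 2048], "wifi": [1, 0, 1]})

# isnull() marks missing cells as True; sum() counts them per column.
null_counts = train.isnull().sum()
print(null_counts)
print("any missing values:", bool(null_counts.any()))
```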

Now check the information of the dataset (column names, non-null counts and data types) using the info() method.

Now describe the training dataset using the describe() method, which shows summary statistics for each column.
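Both calls can be sketched on a made-up two-column frame:

```python
import pandas as pd

# Made-up stand-in with two numeric columns.
train = pd.DataFrame({"ram": [512, 1024, 2048, 4096], "n_cores": [2, 4, 4, 8]})

train.info()                # prints column names, non-null counts and dtypes
summary = train.describe()  # count, mean, std, min, quartiles and max per column
print(summary)
```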

Now check the correlation between the features. This is for information only: no features are dropped on the basis of correlation, and all of them are used to predict the price range of the mobile.
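A sketch of the correlation check on randomly generated stand-in data. In this stand-in, price_range is derived from ram by construction, so the strong correlation it shows is built in and says nothing about the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
ram = rng.integers(256, 4000, size=200)

# Random stand-in; price_range is binned from ram on purpose.
train = pd.DataFrame({
    "ram": ram,
    "battery_power": rng.integers(500, 2000, size=200),
    "price_range": np.digitize(ram, [1200, 2100, 3000]),
})

corr = train.corr()  # pairwise Pearson correlation between all columns
print(corr["price_range"].sort_values(ascending=False))
```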

Then check the dataset for outliers. None are present in this dataset.
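The text does not say how the outliers are checked, so the following is only one possible approach: the common 1.5 × IQR rule. The helper iqr_outlier_counts is a hypothetical name, and the demo values are made up.

```python
import pandas as pd


def iqr_outlier_counts(df, k=1.5):
    """Count values per column lying more than k * IQR outside the quartiles."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    mask = (df < q1 - k * iqr) | (df > q3 + k * iqr)
    return mask.sum()


# Made-up demo: "fc" contains one extreme value, "ram" contains none.
demo = pd.DataFrame({
    "fc": [0, 1, 2, 3, 4, 5, 40],
    "ram": [500, 900, 1300, 1700, 2100, 2500, 2900],
})
outlier_counts = iqr_outlier_counts(demo)
print(outlier_counts)
```

A box plot per feature shows the same information visually.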

Now split the dataset into the independent features (all columns except price_range) and the dependent feature (price_range).
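This split could be written as follows. The names X and y are my choice, and the values are made up.

```python
import pandas as pd

# Made-up stand-in for the training data.
train = pd.DataFrame({
    "battery_power": [800, 1500, 1100, 1900],
    "ram": [600, 1800, 2600, 3800],
    "price_range": [0, 1, 2, 3],
})

X = train.drop(columns=["price_range"])  # independent features
y = train["price_range"]                 # dependent feature (target)
print(X.shape, y.shape)  # (4, 2) (4,)
```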

Then split the dataset into training and testing sets using the train_test_split method, so that the model can be evaluated on data it has not seen during training.
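A sketch of the split on synthetic data from make_classification, which stands in for the mobile dataset. The 80/20 ratio, the random_state and the stratify argument are my assumptions, not values taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic 4-class stand-in: 2000 rows, 20 features.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# Hold out 20% for evaluation; stratify keeps the class proportions equal.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(X_train.shape, X_test.shape)  # (1600, 20) (400, 20)
```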

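Then apply standardization to the training and testing datasets. Standardization rescales each feature to have zero mean and unit variance, so that features measured on very different scales (such as ram and n_cores) contribute comparably. Rescaling values into the 0-1 range is min-max normalization, which is a different technique.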
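A minimal sketch with scikit-learn's StandardScaler on made-up numbers. The scaler is fitted on the training data only and then reused for the test data, so no information from the test set leaks into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values: two features on very different scales.
X_train = np.array([[500.0, 1.0], [1000.0, 2.0], [1500.0, 3.0]])
X_test = np.array([[1200.0, 2.5]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from train
X_test_scaled = scaler.transform(X_test)        # reuse the same mean and std

print(X_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(X_train_scaled.std(axis=0))   # approximately 1 for each feature
```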

Now import DecisionTreeClassifier from the sklearn library, create the classifier and train it on the X_train and Y_train datasets. Then predict on the X_test dataset and check the accuracy score. As you can see, the accuracy of the Decision Tree Classifier is approximately 83%.
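The sketch below shows the train, predict and score steps on synthetic stand-in data, so the accuracy it prints will not match the 83% from the real dataset. The split ratio and random_state values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic 4-class stand-in for the mobile dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

dtc = DecisionTreeClassifier(random_state=42)
dtc.fit(X_train, Y_train)                   # train
dtc_pred = dtc.predict(X_test)              # predict on held-out rows
dtc_acc = accuracy_score(Y_test, dtc_pred)  # fraction of correct predictions
print(f"Decision Tree accuracy: {dtc_acc:.3f}")
```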

Now import the Support Vector Classifier (SVC), create the classifier and train it on the X_train and Y_train datasets. Then predict on the X_test dataset and check the accuracy score. As you can see, the accuracy of the Support Vector Classifier is approximately 88%.
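The same steps for SVC, again on synthetic stand-in data, so the printed accuracy will differ from the 88% above. SVC is used with its default parameters here, which is an assumption about the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 4-class stand-in for the mobile dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

svc = SVC()  # default parameters (RBF kernel)
svc.fit(X_train, Y_train)
svc_pred = svc.predict(X_test)
svc_acc = accuracy_score(Y_test, svc_pred)
print(f"SVC accuracy: {svc_acc:.3f}")
```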

Now import LogisticRegression, create the classifier and train it on the X_train and Y_train datasets. Then predict on the X_test dataset and check the accuracy score. As you can see, the accuracy of Logistic Regression is approximately 95%.
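And the same for Logistic Regression on synthetic stand-in data, so do not expect 95% from this sketch. The max_iter value is my assumption; I raise it only to give the solver enough iterations to converge.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic 4-class stand-in for the mobile dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, Y_train)
lr_pred = lr.predict(X_test)
lr_acc = accuracy_score(Y_test, lr_pred)
print(f"Logistic Regression accuracy: {lr_acc:.3f}")
```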

Now visualize the accuracy scores using a matplotlib bar plot. The best-performing model is Logistic Regression, so use it to predict the price range for the separate test dataset.
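A sketch of the comparison plot, using the approximate accuracy scores reported above. The axis labels and title are my own choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Approximate accuracy scores reported in this tutorial.
accuracies = {
    "Decision Tree": 0.83,
    "SVC": 0.88,
    "Logistic Regression": 0.95,
}

fig, ax = plt.subplots()
bars = ax.bar(list(accuracies.keys()), list(accuracies.values()))
ax.set_ylim(0, 1)
ax.set_ylabel("accuracy")
ax.set_title("Model comparison")

# Pick the model with the highest accuracy.
best_model = max(accuracies, key=accuracies.get)
print(best_model)  # Logistic Regression
```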

Source Code 

  1. Go to my GitHub and fork or download the repo: Mobile Price Prediction
  2. Open the .ipynb file in Jupyter Notebook.
  3. Now you can run.

Thank you!

If you have any doubts, please let me know.
