Analyze and build a machine learning model to forecast EV car prices
Vehicles powered by electricity have existed since the beginning of the automobile industry; several of the early 19th-century automobiles were powered by electricity. However, their true potential was not realized until Toyota began mass-producing the Prius hybrid on a worldwide scale 20 years ago. Less than a decade later, Tesla unveiled the Roadster, its all-electric sports vehicle, and received a $465 million Department of Energy loan, allowing it to begin manufacturing its all-electric sedans. The loan has since been repaid, and Tesla is now valued at seven times as much as General Motors.
In this story, I’ll analyze and build a machine learning model to forecast vehicle prices. I’m going to utilize the EVs — One Electric Vehicle Dataset from Kaggle for this. This dataset was most recently updated in October of 2020.
What’s the most efficient electric car?
In truth, if you only make short trips, the maximum range may be completely meaningless, especially if you can charge up often. In such a scenario, efficient energy usage may be more appealing to you, since you may find an electric car that requires a bit more charging but saves you money.
What exactly does Wh/km mean?
Wh/km (Wh/mi in the United States) is the most widely used measure of EV efficiency. It is a quick, rough way to express the amount of energy (Wh) needed to drive a car one unit of distance (1 km or 1 mi), so a lower figure means the car uses less energy.
I selected electric vehicles rated above 250 Wh/km. As the graph above shows, seven cars exceed this figure. The Kia e-Soul 64 kWh is the most energy-efficient vehicle. This category also contains three variants of the Tesla Cybertruck.
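To make the unit concrete, here is a minimal sketch of how Wh/km relates to kilometers per kWh and to the energy needed for a trip. The function names and the example figures are my own illustrations, not values from the dataset.

```python
def wh_per_km_to_km_per_kwh(wh_per_km: float) -> float:
    """Convert energy consumption (Wh/km) into distance per kWh (km/kWh)."""
    if wh_per_km <= 0:
        raise ValueError("consumption must be positive")
    # 1 kWh = 1000 Wh, so distance per kWh is 1000 divided by Wh per km
    return 1000.0 / wh_per_km


def trip_energy_kwh(distance_km: float, wh_per_km: float) -> float:
    """Energy in kWh needed to drive distance_km at the given consumption."""
    return distance_km * wh_per_km / 1000.0


# Illustrative numbers only
print(wh_per_km_to_km_per_kwh(250))  # 4.0 km per kWh
print(trip_energy_kwh(100, 160))     # 16.0 kWh for a 100 km trip
```

The two units are reciprocals of each other (up to the factor of 1000), which is why a car with a higher Wh/km figure travels fewer kilometers on each kWh.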
Which electric car charges the quickest?
The batteries used in today’s EVs are made up of thousands of lithium-ion cells that can store and release energy thousands of times. Each of these cells is made up of two electrodes separated by a liquid electrolyte: a metal cathode and a graphite anode. During charging, lithium ions pass through the liquid from the cathode to the anode, filling gaps between the graphite layers like wooden blocks in a Jenga tower. How rapidly the battery charges is determined by the rate at which lithium ions travel from the cathode to the anode.
The graph above shows eight cars. I selected only cars with charging speeds above 800 km/h, that is, kilometers of range added per hour of charging. Dual-motor models, such as the Tesla Model 3 and Model Y, offer the fastest charging.
In addition, all of the top eight automobiles have Type 2 CCS connectors.
Which electric vehicle has the longest range?
When many people think about electric cars, the first word that comes to mind is ‘range.’ This is quickly followed by mental pictures of running out of juice on the M25 or being confronted with a bank of dead charging stations at a service station in the middle of nowhere.
The range of an electric vehicle (EV) on a single charge is one of the most significant hurdles to entry for many potential buyers, but it does not have to be a problem. The current generation of electric vehicles has enough range to suit most drivers’ everyday needs; many EV owners will just charge at home and rarely have to rely on public infrastructure.
Many cars have a range of more than 300 kilometers. In the picture above, all vehicles with a range of more than 300 kilometers are grouped by powertrain type.
There are seven vehicles with a range of more than 500 kilometers. However, take a look at the Tesla Roadster. It has an amazing range of 970 kilometers.
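The selections behind these charts all follow the same pattern: filter on a threshold, then group or sort. Below is a rough sketch of that pattern with pandas. The rows are made-up toy values, and the column names `Model`, `Range_Km` and `PowerTrain` are assumptions for illustration; check the real dataset's columns before reusing them.

```python
import pandas as pd

# Toy rows for illustration only; not taken from the Kaggle dataset.
cars = pd.DataFrame({
    "Model": ["A", "B", "C", "D", "E"],
    "Range_Km": [250, 320, 410, 520, 305],
    "PowerTrain": ["FWD", "RWD", "AWD", "AWD", "FWD"],
})

# Keep only cars whose range exceeds the threshold
long_range = cars[cars["Range_Km"] > 300]

# Count the remaining cars per powertrain type
counts = long_range.groupby("PowerTrain")["Model"].count()

# Pick the single car with the longest range
best = long_range.sort_values("Range_Km", ascending=False).iloc[0]

print(counts)
print(best["Model"], best["Range_Km"])
```

The same three steps work for the efficiency and charging-speed charts by swapping the column and the threshold.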
Now it’s time to build a model that predicts electric car prices. The model takes the following features as input: body type, plug type, acceleration time, efficiency, charging speed, range, seat count, and top speed.
First, import the required modules and datasets.
import pandas as pd
import numpy as np
data = pd.read_csv('ElectricCarData_Clean.csv')
data.info()
View some basic statistics, such as percentiles, mean, and standard deviation.
data.describe()
Remove any superfluous columns from the dataframe. We don’t need the brand and model columns here, since we will forecast prices based only on the vehicles’ characteristics.
data = data.drop(['Brand', 'Model'], axis=1)
Next, we transform the categorical variables into dummy/indicator variables. I’ll use the get_dummies() function for this.
data1 = pd.get_dummies(data, drop_first=True)
data1
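As a small illustration of what get_dummies() does with drop_first=True, here is a sketch on a toy frame. The column names and values are assumptions chosen to resemble the data, not rows from the dataset.

```python
import pandas as pd

# Toy frame for illustration; column names and values are assumptions
toy = pd.DataFrame({
    "Seats": [5, 4, 5],
    "PlugType": ["Type 2 CCS", "Type 2", "Type 2 CCS"],
})

# Numeric columns pass through; each categorical column becomes indicator columns.
# drop_first=True drops the first category so the indicators are not redundant.
encoded = pd.get_dummies(toy, drop_first=True)

print(list(encoded.columns))
print(encoded)
```

With two plug types, only one indicator column remains: a row that is not "Type 2 CCS" must be "Type 2", so the dropped column carries no extra information.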
corr() computes the pairwise correlation of all columns in the dataframe. Any NA values are excluded automatically. Non-numeric columns are not a concern here, because get_dummies() has already encoded the categorical columns as indicator variables.
data1.corr()
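The full correlation matrix can be hard to read, so one option is to look only at the column for the target and sort it. The sketch below uses a toy frame with hypothetical column names (`price`, `range`, `accel`) and perfectly linear values, purely to show the mechanics.

```python
import pandas as pd

# Toy data with hypothetical column names; the values are deliberately linear
toy = pd.DataFrame({
    "price": [10, 20, 30, 40],
    "range": [100, 200, 300, 400],
    "accel": [9, 7, 5, 3],
})

corr = toy.corr()

# Correlation of every column with the target, strongest positive first
price_corr = corr["price"].sort_values(ascending=False)
print(price_corr)
```

In this toy case `range` rises with `price` and `accel` falls with it, so they sit at opposite ends of the sorted list.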
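Next, we separate the features from the target variable.
# x holds every column except the first; y is the first column, used here as the target.
# Check data1.columns to confirm that the first column is the price.
x = data1.iloc[:, 1:]
y = data1.iloc[:, 0]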
ExtraTreesRegressor helps us to assess the relevance of features and, as a result, eliminate the less significant features. Extra Trees generates a huge number of unpruned decision trees from the training dataset. In the case of regression, predictions are produced by averaging the prediction of the decision trees, but in the case of classification, predictions are made by utilizing majority voting.
from sklearn.ensemble import ExtraTreesRegressor
model = ExtraTreesRegressor()
model.fit(x, y)
The fitted attribute feature_importances_ gives the impurity-based feature importances: for each feature, the normalized total impurity reduction it contributes, averaged over all trees. Higher values indicate more important features.
model.feature_importances_
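The raw array is easier to interpret once each value is paired with its column name. Here is a minimal sketch on synthetic data, where the feature names `signal` and `noise` are invented for the example and only `signal` actually drives the target.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic data: the target depends on "signal" and not on "noise"
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "signal": rng.normal(size=200),
    "noise": rng.normal(size=200),
})
y = 3.0 * X["signal"] + rng.normal(scale=0.1, size=200)

model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Pair each importance with its column name and sort, largest first
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Applied to the article's model, the same two lines would use `x.columns` as the index, which makes it straightforward to see which features could be dropped.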
Now it’s time to tune the hyperparameters. A hyperparameter is a parameter that controls the learning process and is set before training. Model parameters, by contrast, are learned during training.
n_estimators = [int(x) for x in np.linspace(start=100, stop=1200, num=12)]
# 1.0 uses all features; it replaces 'auto', which recent scikit-learn releases no longer accept
max_features = [1.0, 'sqrt']
max_depth = [int(x) for x in np.linspace(5, 30, num=6)]
min_samples_split = [2, 5, 10, 15, 100]
min_samples_leaf = [1, 2, 5, 10]
Build the search grid
grid = {'n_estimators': n_estimators, 'max_features': max_features, 'max_depth': max_depth, 'min_samples_split': min_samples_split, 'min_samples_leaf': min_samples_leaf}
print(grid)
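A grid like this grows quickly, because every value of one hyperparameter is combined with every value of the others. As a rough sketch, scikit-learn's ParameterGrid can count the combinations; the small grid below is a made-up example, not the article's grid.

```python
from sklearn.model_selection import ParameterGrid

# Small illustrative grid; the values are assumptions for the example
toy_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [5, 10],
    "min_samples_leaf": [1, 2, 5, 10],
}

# ParameterGrid enumerates the full Cartesian product of the lists
n_combinations = len(ParameterGrid(toy_grid))
print(n_combinations)  # 3 * 2 * 4 = 24
```

This is the motivation for a randomized search: instead of fitting a model for every combination, it tries only a fixed number of randomly sampled ones.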
Next, split the data into two parts: a training set and a test set.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0, test_size=0.2)
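To see what test_size=0.2 does in practice, here is a short sketch on a ten-row toy array. The data is synthetic and only there to show the resulting shapes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten synthetic rows with two features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# 20% of the rows go to the test set; random_state makes the shuffle repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

Fixing random_state means the same rows land in the test set on every run, which keeps later error figures comparable.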
Let’s train the model…
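from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
model = RandomForestRegressor()
# Try 10 random combinations from the grid, each scored with 5-fold cross-validation
regr = RandomizedSearchCV(estimator=model, param_distributions=grid, n_iter=10, scoring='neg_mean_squared_error', cv=5, verbose=2, random_state=42, n_jobs=1)
Fit the search on the training set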
regr.fit(x_train, y_train)
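After fitting, the search object exposes which combination won. The sketch below runs a deliberately tiny search on synthetic data to show the attributes involved; the grid values and data are assumptions for the example, and the name `search` plays the role of `regr` above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data: the target depends mostly on the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=120)

# Tiny grid so the example runs quickly
small_grid = {
    "n_estimators": [10, 20],
    "max_depth": [2, 4],
    "min_samples_leaf": [1, 5],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=small_grid,
    n_iter=4,                          # sample 4 of the 8 combinations
    scoring="neg_mean_squared_error",  # scores are negated MSE, so higher is better
    cv=3,
    random_state=42,
)
search.fit(X, y)

print(search.best_params_)         # the winning hyperparameter combination
print(-search.best_score_)         # its cross-validated MSE
best_model = search.best_estimator_  # refit on all the data passed to fit()
```

Because the scoring is the negated mean squared error, best_score_ is zero or negative; flipping the sign gives the cross-validated MSE.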
Generate predictions for the test set
y_pred = regr.predict(x_test)
y_pred
Let’s compute the root-mean-square error (RMSE), the square root of the mean of the squared errors. RMSE is a useful measure of accuracy, but because it is scale-dependent it should only be used to compare the prediction errors of different models or model configurations for a single variable, not across variables.
from sklearn.metrics import r2_score, mean_squared_error
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
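Scale dependence is easy to demonstrate by hand. In this sketch the `rmse` helper and the price figures are my own illustrations: the same predictions are scored once in euros and once in thousands of euros.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Square root of the mean of the squared errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Illustrative prices and predictions, not taken from the dataset
prices_eur = np.array([30000, 40000, 50000])
preds_eur = np.array([31000, 39000, 52000])

err_eur = rmse(prices_eur, preds_eur)                 # error in euros
err_keur = rmse(prices_eur / 1000, preds_eur / 1000)  # error in thousands of euros

print(err_eur, err_keur)
```

The predictions are equally good in both cases, yet the two numbers differ by a factor of 1000, which is why an RMSE only means something next to the units and scale of the target.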
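The RMSE was 0.688133946063998, expressed in the units of the target variable. Because RMSE is scale-dependent, this figure should be read against the typical magnitude of the target when judging how useful the model is for forecasting. With that in mind, our prediction model is now ready to forecast prices.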
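One way to put an RMSE in context is to compare it with a naive baseline that always predicts the training mean. The sketch below uses scikit-learn's DummyRegressor with small made-up arrays; the `y_pred` values stand in for a model's predictions and are hypothetical.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_squared_error

# Made-up targets and hypothetical model predictions, for illustration only
y_train = np.array([10.0, 12.0, 14.0, 16.0])
y_test = np.array([11.0, 15.0])
y_pred = np.array([11.5, 14.0])

# The baseline ignores the features and always predicts the training mean (13.0)
baseline = DummyRegressor(strategy="mean").fit(np.zeros((4, 1)), y_train)
y_base = baseline.predict(np.zeros((2, 1)))

rmse_model = np.sqrt(mean_squared_error(y_test, y_pred))
rmse_base = np.sqrt(mean_squared_error(y_test, y_base))

print(rmse_model, rmse_base)
```

If the model's RMSE is not clearly below the baseline's, the features are adding little beyond the average value of the target.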
Conclusion
So the remaining question is: do range and efficiency affect electric car prices? Looking at the graph below, we can see a relationship between price and range: the greater the range, the higher the price. The Hyundai Kona electric car, however, costs around 40,795 euros and has a range of 400 kilometers.
The major component of an electric vehicle is the battery pack, so it also has a large effect on the cost. Four types of energy storage are used in electric cars: lithium-ion batteries, nickel-metal hydride batteries, lead-acid batteries, and ultracapacitors.
So, in this article, we looked at how different characteristics affect the price of an electric car and developed a model to estimate that price based on body type, plug type, acceleration time, efficiency, charging speed, range, seat count, and top speed.