INTRODUCTION

A company wants to automate the loan eligibility process in real time, based on the details customers provide when filling in an online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others. To automate this process, the company has posed the problem of identifying the customer segments that are eligible for a loan, so that it can target those customers specifically.

It’s a classification problem. We have to predict whether a person is eligible for a loan, based on the information we have been given about that person.

Data Source

We build two classifiers for this data set: a decision tree and a random forest. Then we choose a suitable metric to compare the performance of these models. An appropriate metric in this case is precision. Suppose a customer is classified as ‘Yes’, that is, eligible for a loan, whereas in reality the person is not (a False Positive). In the other scenario, a person who is eligible for a loan is classified as not eligible (a False Negative). The former scenario is more dangerous from a bank’s perspective, and precision is the metric that penalizes false positives.

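The precision of a model is given by

Precision = TP / (TP + FP)

and the recall is given by

Recall = TP / (TP + FN)

where TP, FP and FN are the numbers of true positives, false positives and false negatives.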
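To make the two metrics concrete, here is a minimal sketch that computes them directly from the counts of true positives, false positives and false negatives. The function name `precision_recall` and the toy labels are made up for illustration; they are not taken from the loan data set.

```python
def precision_recall(y_true, y_pred, positive="Y"):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, invented for this example only.
y_true = ["Y", "Y", "N", "N", "Y"]
y_pred = ["Y", "N", "Y", "N", "Y"]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here there are 2 true positives, 1 false positive and 1 false negative, so both precision and recall come out at 2/3.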

Now, let us look at the data type of the variables.

Next, we check whether there are missing values in the data.
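Both checks can be sketched with pandas as follows. The small frame stands in for the real data, and the column names (`Gender`, `LoanAmount`, `Credit_History`, `Loan_Status`) are assumptions, since the actual file is not shown here.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the loan data; values and column names are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "LoanAmount": [120.0, np.nan, 66.0, 141.0],
    "Credit_History": [1.0, 0.0, np.nan, 1.0],
    "Loan_Status": ["Y", "N", "Y", "Y"],
})

dtypes = df.dtypes           # data type of each column
missing = df.isnull().sum()  # number of missing values per column

print(dtypes)
print(missing)
```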

We have missing values in both categorical and numerical variables. For categorical variables, including ‘Credit History’ and ‘Loan Amount Term’, we fill the missing values with the respective mode (the value with the highest frequency). For numerical variables we fill the missing values with the median, since these variables contain outliers and the median is less sensitive to them than the mean.
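A minimal sketch of this imputation, assuming pandas and hypothetical column names: mode for the columns treated as categorical, median for the numerical one. The toy values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the loan data; values and column names are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "Credit_History": [1.0, np.nan, 1.0, 0.0],
    "Loan_Amount_Term": [360.0, 360.0, np.nan, 180.0],
    "LoanAmount": [100.0, np.nan, 120.0, 700.0],
})

mode_cols = ["Gender", "Credit_History", "Loan_Amount_Term"]  # treated as categorical
median_cols = ["LoanAmount"]                                  # numerical, has outliers

for col in mode_cols:
    # mode() returns a Series; take the first (most frequent) value
    df[col] = df[col].fillna(df[col].mode()[0])

for col in median_cols:
    # median() ignores missing values by default
    df[col] = df[col].fillna(df[col].median())

print(df)
```

In the toy frame the outlier of 700 pulls the mean of `LoanAmount` up to about 307, while the median stays at 120, which is why the median is the safer fill value here.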

Most borrowers have a loan amount term of 360 months, that is, 30 years.

The following graphs show each categorical variable broken down by Loan Status.

Based on the graphs we can say that:

2. Married people have been more likely to get a loan.

3. People with 2 dependents have been most likely to get a loan, followed by people with no dependents, and lastly people with one dependent or 3 dependents.

4. The proportion of approved loans is greater for graduates than for non-graduates.

5. Loan Status is almost the same for self-employed and not self-employed people.

6. Loan approval has been highest for semiurban applicants, followed by urban and then rural applicants.

(The level of significance for all the tests is 0.05.)

Based on the tests performed above we conclude:

There is no significant difference in the proportion of loan approval between the Rural and Urban populations.

There is no significant difference in the proportion of loan approval based on the number of dependents.
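One way to run a comparison of this kind is a chi-square test of independence on a contingency table of group against loan outcome. This is only a sketch under assumptions: the text does not say which test was used, and the counts below are invented for illustration, not taken from the data. It uses `scipy.stats.chi2_contingency`.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up counts for illustration only; NOT the real loan data.
table = pd.DataFrame(
    {"approved": [60, 70], "rejected": [40, 30]},
    index=["Rural", "Urban"],
)

# Null hypothesis: the approval proportion is the same in both groups.
chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05  # level of significance used for all the tests
reject_null = p_value < alpha

print(f"chi2={chi2:.3f}, p-value={p_value:.3f}, reject null: {reject_null}")
```

If the p-value is at least 0.05 we fail to reject the null hypothesis, which is the "no significant difference" conclusion. The same call works for a table with more rows, for example one row per number of dependents.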

The following graphs show the distribution of Loan Amount across the categories of Loan Status, Education and Self Employed.

b. Loan amounts have been greater for graduates than for non-graduates.

c. Loan amounts are higher for self-employed people than for those who are not self-employed, and show more outliers.

Another interesting variable for our study is credit history. We can check how it affects Loan Status by turning Loan Status into a binary variable and then calculating its mean for each value of credit history.

People with a credit history are far more likely to be approved for a loan. This means that credit history will be an influential variable in our model.
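The credit history check can be sketched in pandas as below. The column names `Credit_History` and `Loan_Status`, the `Y`/`N` coding and the toy rows are assumptions made for this example.

```python
import pandas as pd

# Toy stand-in for the loan data; values and column names are assumptions.
df = pd.DataFrame({
    "Credit_History": [1, 1, 1, 0, 0, 1],
    "Loan_Status": ["Y", "Y", "N", "N", "N", "Y"],
})

# Turn Loan Status into a binary variable: 1 = approved, 0 = not approved.
df["Loan_Status_bin"] = (df["Loan_Status"] == "Y").astype(int)

# The mean of a 0/1 column is the share of approved loans in each group.
approval_rate = df.groupby("Credit_History")["Loan_Status_bin"].mean()
print(approval_rate)
```

Because the binary column only holds 0 and 1, its group mean can be read directly as the approval rate for each value of credit history.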

Before modelling we need to turn all the categorical variables into numbers.
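A minimal sketch of one way to do this, assuming pandas: one-hot encode the categorical features with `pd.get_dummies` and map the target to 0/1. One-hot encoding is a choice made for this example, not necessarily the encoding used in the original analysis, and the column names and rows are invented.

```python
import pandas as pd

# Toy stand-in for the loan data; values and column names are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Property_Area": ["Urban", "Rural", "Semiurban"],
    "LoanAmount": [120.0, 66.0, 141.0],
    "Loan_Status": ["Y", "N", "Y"],
})

# Target: 1 = loan approved, 0 = not approved.
y = df["Loan_Status"].map({"Y": 1, "N": 0})

# Features: one 0/1 column per category; drop_first avoids a redundant column.
X = pd.get_dummies(df.drop(columns="Loan_Status"), drop_first=True, dtype=int)

print(X)
print(y.tolist())
```

Tree-based models such as the decision tree and random forest also work with simple integer label encoding, so either scheme is reasonable here.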

Then we try out different models. To do so, we create a function that takes in a model.
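Such a function might look like the sketch below, assuming scikit-learn. The name `evaluate` is hypothetical, and the data comes from `make_classification` as a synthetic stand-in for the encoded loan data, so the printed scores say nothing about the real problem.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the given model and return its precision and recall on the test set."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
    }


# Synthetic stand-in for the encoded loan data (an assumption for this sketch).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
results = {
    name: evaluate(model, X_train, X_test, y_train, y_test)
    for name, model in models.items()
}

for name, scores in results.items():
    print(name, scores)
```

Passing the model in as an argument keeps the fitting and scoring steps identical for both classifiers, so their precision values can be compared on the same split.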
