Create an India credit risk(default) model, using the data provided in the spreadsheet raw-data.xlsx, and validate it on validation_data.xlsx. Please use the logistic regression framework to develop the credit default model.
Hints :
Data description – Please direct them to the video – Default Risk Prediction. After removing variables for multicollinearity, we should try to take at least one variable for creating the model from each of the 4 factors namely –
1) Profitability
2) Leverage
3) Liquidity
4) Company’s size
For Default Risk Estimation, all the variables are bifurcated in different buckets in the categories tab in raw_data file.
Creation of new variables – This is an important step in the project as the company which is the biggest in size, will also have bigger asset size, cash flows, etc. (Hint: We need to think in terms of ratios – Equity to asset ratio, debt to equity ratio, etc)
Dependent variable – We need to create a default variable which should take the value of 1 when net worth is negative & 0 when net worth is positive.
Validation Dataset – We need to build the model on the raw dataset and check the model performance measures on the validation dataset.
Please note the following:
- You have to submit 2 files :
- Business Reportnot exceeding 3000 words. In this, you need to submit all the answers to all the questions in a sequential manner. Your answer should include detailed explanations & inferences to all the questions. Your report should not be filled with codes.
- R code file: This is a must and will be used for reference.
Criteria Points
- Outlier Treatment – Outlier Treatment
- Missing Value Treatment
- New Variables Creation (One ration for profitability, leverage, liquidity and company’s size each )
- Check for multicollinearity
- Univariate & bivariate analysis
- Build Logistic Regression Model on most important variables
- Analyze coefficient & their signs
- Predict accuracy of model on dev and validation datasets
- Sort the data in descending order based on probability of default and then divide into 10 deciles based on probability & check how well the model has performed
Data Dictionary
Variable Name | Discreption |
Networth Next Year | Net worth of the customer in next year |
Total assets | Total assets of customer |
Net worth | Net worth of the customer of present year |
Total income | Total income of the customer |
Change in stock | difference between value of current stock and the value of stock in last trading day |
Total expenses | Total expense done by customer |
Profit after tax | Profit after tax deduction |
PBDITA | Profit before depreciation, income tax and amortization |
PBT | Profit before tax deduction |
Cash profit | Total Cash profit |
PBDITA as % of total income | PBDITA / Total income |
PBT as % of total income | PBT / Total income |
PAT as % of total income | PAT / Total income |
Cash profit as % of total income | Cash Profit / Total income |
PAT as % of net worth | PAT / Net worth |
Sales | Sales done by customer |
Income from financial services | Income from financial services |
Other income | Income from other sources |
Total capital | Total capital of the customer |
Reserves and funds | Total reserves and funds of the customer |
Deposits (accepted by commercial banks) | All blank values |
Borrowings | Total amount borrowed by customer |
Current liabilities & provisions | current liabilities of the customer |
Deferred tax liability | Future income tax customer will pay because of the current transaction |
Shareholders funds | Amount of equity in a company, which is belong to shareholder |
Cumulative retained profits | Total cumulative profit retained by customer |
Capital employed | Current asset minus current liabilities |
TOL/TNW | Total liabilities of the customer divided by Total net worth |
Total term liabilities / tangible net worth | Short + long term liabilities divided by tangible net worth |
Contingent liabilities / Net worth (%) | Contingent liabilities / Net worth |
Contingent liabilities | Liabilities because of uncertain events |
Net fixed assets | purchase price of all fixed assets |
Investments | Total invested amount |
Current assets | Assets that are expected to be converted to cash within a year |
Net working capital | Difference of current liabilities and current assets |
Quick ratio (times) | Total cash divided by current liabilities |
Current ratio (times) | Current assets divided by current liabilities |
Debt to equity ratio (times) | Total liabilities divided by its shareholder equity |
Cash to current liabilities (times) | Total liquid cash divided by current liabilities |
Cash to average cost of sales per day | Total cash divided by average cost of the sales |
Creditors turnover | Net credit purchase divided to average trade creditors |
Debtors turnover | Net credit sales divided by average accounts receivable |
Finished goods turnover | Annual sales divided by average inventory |
WIP turnover | The cost of goods sold for a period divided by the average inventory for that period |
Raw material turnover | Cost of goods sold is divided by the average inventory for the same period |
Shares outstanding | Number of issued shares minus the number of share held in the company |
Equity face value | cost of the equity at the time of issuing |
EPS | Net income divided by total number of outstanding share |
Adjusted EPS | Adjusted net earning divided by the weighted average number of common share outstanding on a diluted basis during the plan year |
Total liabilities | Sum of all type of liabilities |
PE on BSE | Company current stock price divided by its earning per share |