This is part 3 of a guest post by British Bondora investor ‘ParisinGOC’.
Investment Decisions using the Tree(s)
Using the Data
Using the output is as simple as looking at the visualisation to see how the Decision Tree splits down from the Root Node and comparing this with a Bondora loan application that I see as a potential target for investment (Illustration 2 and Illustration 3). At the end of the set of branches that I follow, dependent upon the data in the loan application, I end up at a “leaf node”, the end of the tree (Illustration 4). This node simply states how many previous loans match the one I am looking at, showing how many of those loans defaulted and how many did not.
I treat the Decision Tree as the first step in choosing whether to invest. If the performance record of previous loans like the one I am now considering suggests a default rate of 5% or less, I look further into the loan application.
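As a rough illustration of this first screening step, the check on a leaf node could be sketched in Python as below. This is only a sketch: the author reads the figures off the tree visualisation by hand, and the names `LeafNode`, `MAX_DEFAULT_RATE` and `passes_first_screen` are hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass

# Threshold taken from the text: a historical default rate of 5% or less.
MAX_DEFAULT_RATE = 0.05


@dataclass
class LeafNode:
    """Counts shown at a leaf: previous matching loans, split by outcome."""
    defaulted: int
    not_defaulted: int

    @property
    def total(self) -> int:
        return self.defaulted + self.not_defaulted

    @property
    def default_rate(self) -> float:
        if self.total == 0:
            raise ValueError("leaf node contains no previous loans")
        return self.defaulted / self.total


def passes_first_screen(leaf: LeafNode, max_rate: float = MAX_DEFAULT_RATE) -> bool:
    """True if the historical default rate is at or below the threshold."""
    return leaf.default_rate <= max_rate


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    leaf = LeafNode(defaulted=2, not_defaulted=58)
    print(f"{leaf.default_rate:.1%}", passes_first_screen(leaf))
```

Passing this screen only means the application is worth a closer look; the manual checks described in the next section still apply.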
Making the decision to invest
If the applicant already has borrowing of any sort (e.g. a credit card), I look at the name of the organisation providing that borrowing. If it is a major name, then the applicant has already passed a credit checking regime far more sophisticated than mine. If the applicant has previously borrowed from payday loan providers and is now seeking to consolidate their situation, I look further, but with a higher degree of caution.
The Decision Tree(s) suggest that about 75% of all loan applications I consider carry a higher than 5% risk, and so are rejected. I then reject about half of the remainder for other reasons, the biggest being a lack of consistency on the part of the applicant, either within or across applications: for example, a borrower who claims to be 36 with over 25 years’ work experience!
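The arithmetic behind these two rejection stages can be worked through in a few lines of Python. The percentages are the approximate ones quoted above; the function names are hypothetical, and the implied number of applications reviewed is an inference from those rough figures rather than something the author reports.

```python
def invested_fraction(first_reject: float = 0.75, second_reject: float = 0.5) -> float:
    """Share of considered applications that survive both rejection stages."""
    return (1 - first_reject) * (1 - second_reject)


def applications_needed(investments: float, fraction: float) -> float:
    """Applications that must be considered to end up with `investments` loans."""
    return investments / fraction


if __name__ == "__main__":
    fraction = invested_fraction()  # 0.25 * 0.5 = 0.125, i.e. about 1 in 8
    print(f"Invested share: {fraction:.1%}")
    # A given number of investments per day implies this many applications reviewed.
    for per_day in (4, 9):
        print(per_day, "investments ->", applications_needed(per_day, fraction), "applications")
```

On these figures roughly one application in eight ends in an investment.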
This means I generally invest in about 4 to 9 loan applications per day and this process is all done manually.
Preliminary results
I built the first Decision Tree in mid to late September 2014. I am confident that all investments into new loans that I have made since the beginning of October have used the Decision Trees as the primary tool for considering that loan.
I have invested in some loans with an indicated default rate of up to 7% where there have been very few previous loans that strictly match the application under consideration. In cases like this, I will move one level (or more) back up the tree and look at the performance of the larger group of loans and make the decision on that less precise, but statistically “better” data.
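The idea of stepping back up the tree when a leaf holds too few loans could be sketched as follows. This is a minimal sketch under stated assumptions: the `Node` structure, the `nearest_reliable_node` helper and the `min_loans` cut-off are all hypothetical, since the text does not say how few loans count as “very few” and the author makes this judgement by eye.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A tree node holding the loan counts that reached it."""
    defaulted: int
    total: int
    parent: Optional["Node"] = None

    @property
    def default_rate(self) -> float:
        return self.defaulted / self.total


def nearest_reliable_node(leaf: Node, min_loans: int) -> Node:
    """Walk from the leaf towards the root until a node has enough loans.

    `min_loans` is an assumed cut-off, not a figure from the article.
    If no ancestor is large enough, the root is returned.
    """
    node = leaf
    while node.total < min_loans and node.parent is not None:
        node = node.parent
    return node


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    root = Node(defaulted=40, total=1000)
    branch = Node(defaulted=6, total=120, parent=root)
    leaf = Node(defaulted=0, total=8, parent=branch)
    chosen = nearest_reliable_node(leaf, min_loans=50)
    print(chosen.total, f"{chosen.default_rate:.1%}")
```

The trade-off is the one described above: the parent node matches the application less precisely, but its default rate rests on more loans.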
The lack of volume in the Finnish, Spanish and particularly the Slovakian market means that the statistical validity of the trees is questionable. I am not a statistician, but I am aware of the fact that a sample that is too small will fail certain tests of confidence in the stated outcomes. I have set the parameters of the Decision Tree module in the Finnish and Spanish Trees to ensure that nodes (branches within the body of the Tree) have a minimum sample size of 50 loans, with the leaf nodes (those at the very edge of the Tree) having a minimum size of 20 loans. This generates Trees that have up to 7 levels of depth from root to leaf node in the busiest areas of the Trees.
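To see why small samples are a concern, one can put a confidence interval around the default rate observed at a node. The text does not say which tests of confidence are meant, so the Wilson score interval used below is an assumption, chosen as one standard method for a proportion; the loan counts in the example are made up.

```python
import math


def wilson_interval(defaults: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a default rate (z=1.96 is roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = defaults / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Same observed 5% default rate, very different certainty.
    for defaults, n in ((1, 20), (50, 1000)):
        low, high = wilson_interval(defaults, n)
        print(f"{defaults}/{n}: {low:.1%} to {high:.1%}")
```

A leaf of 20 loans with one default shows the same 5% rate as a node of 1,000 loans with 50 defaults, but the interval around the smaller sample is far wider, which is the reason for enforcing minimum node sizes.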
In terms of investment outcomes, it is far too early to tell whether my target of a 5% default rate is achievable using this methodology. Assuming that the first loan I made using the Decision Trees was on 1st October, that loan made its first payment on 16th October and, annoyingly, paid back the whole amount! The second loan made that day made its first payment on 5th November, 2 days late (a common occurrence for the first payment). So far, out of 390 loans I have invested in using this methodology since 1st October, only 44 have passed their first payment date.
It will be the end of December before the first investments made using this discipline can default (60+ days without payment) and I can begin to assess whether I am improving on my past performance.
Overdue and defaulted loans (numbers out of 100) in ParisinGOC’s portfolio
The chart (of loans in which I have invested, with higher numbered pages representing older loans) shows the progression from overdue to default. Whilst there are some loans that have failed to make their first payment by several days, early signs are not dramatic. Page 6 covers from the end of August, showing that it is far too early to see any defaults since starting to use the Decision Trees.
The received wisdom on the Bondora forum is that a loan 14 days late in paying has a 50% chance of defaulting. Bondora’s own statistics show that a loan is most likely to default in the first 6 months of scheduled payments.
Conclusion
The major comfort I take from the work so far is that at least I now know why I choose to invest in certain loans. I can point to a database of several thousand examples of similar loan applications and see the patterns of behaviour set down on the page in a way that is easy to see and understand.
The fact that “Past performance does not guarantee future results” (Bondora Investment Guide) is irrelevant for me. At least I now know why I am making the decisions I make, and my decisions are made in a consistent way that I will be able to review and modify with reference to solid processes and data. This is certainly not the case for my previous 15 months of investment activity, which has been arguably chaotic and at best inconsistent and subjective, as witnessed in the above table by my loan performance from loan 400 upwards.
EDIT July 2015: Read a progress review to see how the strategy worked out so far.
Hi!
I read your article and got interested in researching my own data.
Do you have a RapidMiner file that I could study to learn the program and how you did it?
I have about 20000€ in my Bondora account, plus some in two other services.
Could you share your research results on what kind of loans are best?
Have you used bonder scripts when looking for good loans on second market?
Maybe you can answer to my email
Hi,
I have forwarded your comment to the guest author of the article.