This post was written by Wei Hong Low.
Airbnb has grown 21% compared to August 2019, earning gross revenue of $4,308,726,681. More impressively, it has grown to 150 million users and currently has 6,370,563 property listings across 191 countries. So far, Airbnb has been very successful.
In Southeast Asia, Malaysia continues to be the fastest-growing country for Airbnb for the second year running. There are a total of 53,000 listings in Malaysia, which brought in an estimated RM3 billion in direct economic impact in 2018 alone. Knowing that Airbnb's business is growing fast in Malaysia, one question keeps looming over my head.
What is the best area to start an Airbnb business in Kuala Lumpur, the capital of Malaysia?
Before thinking about the above question, there is another question that immediately pops up in my mind. What am I thinking about? I do not own a house yet, why am I thinking about this question? However, I still decided to google whether there are other ways and here is what I found.
There are two ways to start an Airbnb business. One is to rent out one of your own properties; the other is to rent someone else’s property and list it on Airbnb. In other words, I do not need to own a property to start!
If I had to choose, I would go for the second option first. It is more flexible, since I could rent a property for a relatively short period of time instead of buying one. The other reason is that I can get a taste of what it is like to be an Airbnb host before buying a property just to start an Airbnb business.
In this article, I will be discussing how I managed to obtain the data for later analysis. Without further ado, let’s start!
Data Collection Journey
Before I started to crawl the Airbnb website, I came across this website, which almost changed my mind and pushed me toward another topic. In summary, Airbnb is not keen to share the data it has. The Amsterdam government tried to scrape the website weekly but appears to have given up.
Despite knowing that I would face a lot of issues, I still decided to try to crawl the website. Now, I am going to share some of the problems I faced while crawling it.
The first problem I faced was the dynamically changing XPath or CSS path. This means you can’t take the XPath from one example listing and assume that using it on every listing’s webpage will give you the same piece of information.
The stability of the crawler is another issue. Here is what I would suggest if you are scraping Airbnb: the most effective approach I found is to combine rotating IPs with slowing down the crawler.
After spending hours inspecting the network tab, I found a path through which you can access JSON that contains much neater data. That solved my first problem! To increase the stability of my crawler, I rotated IPs and slowed it down. Finally, I was able to crawl data at a much more stable pace.
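To make the idea of rotating IPs and slowing down more concrete, here is a minimal sketch in Python. It is not the crawler I actually used: the function name crawl_politely, the fetch callback, and the delay values are all assumptions for illustration, and you would plug in your own proxy list and request logic.

```python
import itertools
import random
import time


def crawl_politely(urls, proxies, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Fetch each URL through the next proxy in rotation, pausing between requests.

    `fetch(url, proxy)` is a hypothetical callback that performs the request;
    the delay bounds are illustrative assumptions, not values from the post.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    proxy_cycle = itertools.cycle(proxies)  # round-robin over the proxy pool
    results = {}
    for url in urls:
        proxy = next(proxy_cycle)
        try:
            results[url] = fetch(url, proxy)
        except Exception as exc:  # keep crawling even if one request fails
            results[url] = None
            print(f"failed {url} via {proxy}: {exc}")
        # Random pause so requests do not arrive at a fixed, bot-like rhythm.
        sleep(random.uniform(min_delay, max_delay))
    return results
```

In practice, fetch could wrap something like requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10) and return the parsed JSON. Passing fetch and sleep in as parameters also makes the loop easy to try out without touching the network.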
Data Cleaning Journey
After collecting the data, let’s do some data cleaning.
First, I obtained the highest price among listings that have at least one review. Let’s call this price Price A. Then I removed all the listings priced higher than Price A and treated them as outlier listings. In other words, outlier listings are listings that have a ridiculously high price but on which no one has ever left a review.
Duplicated listings were removed based on localized neighborhood, which left only around 5,590 listings. In addition, estimated property prices for the 10 areas with the most listings were collected from edgeprop, brickz and propsocial.
A good way to show the price distribution of Kuala Lumpur is to plot a choropleth. This map is segmented by the 11 federal constituencies of Kuala Lumpur. According to the figure, the northeast part of Kuala Lumpur has the highest average price, while the central part of Kuala Lumpur has fairly competitive prices.
If you are a budget traveller coming to Kuala Lumpur for a vacation, I guess you know which area to stay in now!
Now, let’s have a bird’s-eye view of the distribution of listings across Kuala Lumpur. As you can see, most of the Airbnb listings are concentrated in the city center of Kuala Lumpur. That’s why the price of listings is lower in the city area, as supply is much higher.
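The Price A rule can be sketched in a few lines of Python. This is only an illustration of the rule described above: the field names price and reviews and the sample numbers are made up, not taken from my dataset.

```python
def remove_price_outliers(listings):
    """Drop listings priced above Price A.

    Price A is the highest price among listings with at least one review.
    Each listing is a dict with hypothetical keys "price" and "reviews".
    """
    reviewed_prices = [item["price"] for item in listings if item["reviews"] >= 1]
    if not reviewed_prices:
        # Price A is undefined without any reviewed listing, so keep everything.
        return list(listings)
    price_a = max(reviewed_prices)
    return [item for item in listings if item["price"] <= price_a]


# Made-up sample data for illustration only.
sample = [
    {"id": 1, "price": 100, "reviews": 5},
    {"id": 2, "price": 300, "reviews": 1},
    {"id": 3, "price": 9000, "reviews": 0},  # above Price A (300), so it is dropped
    {"id": 4, "price": 250, "reviews": 0},   # no reviews, but not above Price A
]
cleaned = remove_price_outliers(sample)
```

Note that an unreviewed listing is kept as long as its price does not exceed Price A; only the unreviewed, extremely expensive ones are removed.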
Feel free to navigate the interactive graph to have a much deeper look into a specific area in Kuala Lumpur.
The figure above shows the top 20 most expensive neighborhoods. The triangle represents the number of listings while the bar represents the mean price of listings.
What insight could you get from this graph?
Taman Bukit Maluri, Taman Golden, and Segambut have relatively high prices and few listings. One of the reasons might be that there is not enough supply in those areas, and thus the price is higher. If you are considering starting an Airbnb business, you can research whether those areas have demand and whether the cost of renting or owning a house there is affordable.
Now, it is time to know the answer.
What is the best area to invest in Kuala Lumpur?
Let’s consider only the top 10 neighborhoods that have the most listings. Before I start to explain the calculation, let me explain some of the terms which I am going to use.
How to calculate the Return on Investment (ROI) in the housing market?
ROI (%) = (monthly rental * 12) / property value
Therefore, in our Airbnb case, I use
occupancy rate * fee per night * 30 to calculate the monthly rental. Based on the report by AirDNA, the average occupancy rate for Kuala Lumpur was 66% from 2018 to 2019. I also assume there are on average 30 days in each month.
Thus, the final formula looks something like this.
ROI (%) = (fee per night * 0.66 * 30 * 12)/ property value
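Here is a small worked example of the formula in Python. The nightly fee and property value below are made-up numbers, and I multiply by 100 at the end so that the result reads as a percentage, which is my own assumption about how to express ROI (%).

```python
def airbnb_roi_percent(fee_per_night, property_value, occupancy_rate=0.66, days_per_month=30):
    """Annual Airbnb income as a percentage of the property value."""
    if property_value <= 0:
        raise ValueError("property_value must be positive")
    # Expected monthly rental = occupancy rate * fee per night * 30.
    monthly_rental = fee_per_night * occupancy_rate * days_per_month
    # Multiply by 100 to express the ratio as a percentage (an assumption).
    return monthly_rental * 12 / property_value * 100


# Hypothetical listing: RM150 per night, property worth RM500,000.
roi = airbnb_roi_percent(fee_per_night=150, property_value=500_000)
```

With these numbers, the expected monthly rental is 150 * 0.66 * 30 = RM2,970, the yearly income is RM35,640, and the ROI comes out at roughly 7.1%.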
The other variable I am looking for is demand. However, to get the actual number of bookings, I would need to build a larger-scale web crawler. Therefore, in my case, I use the number of reviews as a proxy for demand.
The best areas to invest in are Kampung Baru and Chow Kit, which lie in the top right quadrant. In other words, these neighborhoods have both a high ROI (%) and high demand. This graph is for an investor who wants to invest in real estate to start an Airbnb business.
On the other hand, a person who can’t afford to buy a house but intends to start an Airbnb business will need to look at a slightly different metric. Instead of ROI (%), he or she should look at
airbnb expected monthly income / monthly rental fee
which is equivalent to
(fee per night * 0.66 * 30) / monthly rental fee. Here, I will define it as the revenue-to-cost ratio.
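The same metric can be written as a short Python function. Again, this is only a sketch under the assumptions above (66% occupancy, 30 days per month), and the nightly fee and monthly rent in the example are made-up numbers.

```python
def revenue_to_cost_ratio(fee_per_night, monthly_rental_fee, occupancy_rate=0.66, days_per_month=30):
    """Expected monthly Airbnb income divided by the monthly rent paid for the unit."""
    if monthly_rental_fee <= 0:
        raise ValueError("monthly_rental_fee must be positive")
    expected_monthly_income = fee_per_night * occupancy_rate * days_per_month
    return expected_monthly_income / monthly_rental_fee


# Hypothetical unit: listed at RM150 per night, rented for RM2,000 per month.
ratio = revenue_to_cost_ratio(fee_per_night=150, monthly_rental_fee=2_000)
```

Here the expected monthly income is RM2,970 against RM2,000 of rent, so the ratio is about 1.49. A ratio above 1 means the expected income covers the rent, although it ignores other costs such as utilities and cleaning.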
If you do not want to own a house to start an Airbnb business, Chow Kit and Kampung Baru are the places you could consider.
All the estimated rental fees and housing prices I collected are from free websites. Therefore, if you want more accurate numbers, you would need to subscribe to one of the paid services, for instance, brickz. Besides, all the data above ignore seasonality, as I only scraped data from 29 November 2019 to 2 December 2019.
Thank you so much for reading it until the end. I really appreciate it!
This mini-project took me quite some time, so if you like or prefer this kind of content, do let me know in the comments below.
Collecting the Airbnb data was very time-consuming as well. However, to thank you for your support this year, I would like to give it away for free!
The dataset I collected covers a total of 71 neighborhoods in Kuala Lumpur. If you are interested in getting the dataset to play around with it, feel free to check out this link!
This post is featured in Tech In Asia. Feel free to check it out over here!
See you in the next post!
About the Author
Low Wei Hong is a Data Scientist at Shopee. His experience mostly involves crawling websites, creating data pipelines, and implementing machine learning models to solve business problems.
He provides crawling services that can deliver the accurate, cleaned data you need. You can visit this website to view his portfolio and to contact him for crawling services.