Web Scraping and Price Analysis on Hepsiburada via Selenium and NumPy

kommradHomer
Feb 21, 2020


I was trying to decide which budget phone to buy and wanted the best price/value ratio, so I went back to an old tool of the trade: web scraping. I came up with a tool made of two Python scripts. It analyzes the prices of every product in a category or filter result, compares the different sellers' prices for the same product, and hopefully detects some cut-price sellers.

I’ve tuned the XPath parameters for hepsiburada.com, the biggest retailer in Turkey. The results have been promising so far. In the phone category, whenever I found a seller selling 20% cheaper than the “mean price”, that price was almost always far better than the best price on the other two biggest marketplaces, gittigidiyor.com and n11.com. Later, I generalized the code so that it takes any category or filter result URL as an input parameter and outputs the list of product URLs, ready to be fed to the analyzer script.
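The 20% rule can be sketched in a few lines. This is only an illustration, not the code from the repository: the seller names and prices are made up, `find_cut_price_sellers` is a hypothetical helper, and it uses the standard-library `statistics` module so it runs without extra installs (`numpy.mean` would give the same mean here).

```python
from statistics import mean


def find_cut_price_sellers(offers, threshold=0.20):
    """Return (seller, price, discount) for offers at least `threshold` below the mean."""
    avg = mean(offers.values())
    flagged = []
    for seller, price in offers.items():
        # Discount relative to the mean price across all sellers
        discount = (avg - price) / avg
        if discount >= threshold:
            flagged.append((seller, price, round(discount, 3)))
    return flagged


# Made-up prices for one phone across four hypothetical sellers
offers = {
    "seller_a": 2000.0,
    "seller_b": 2100.0,
    "seller_c": 1500.0,
    "seller_d": 2400.0,
}
print(find_cut_price_sellers(offers))  # [('seller_c', 1500.0, 0.25)]
```

With these numbers the mean is 2000, so only the seller at 1500 (25% below the mean) gets flagged.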

So, in its final form, the first script outputs a list of product URLs, and the second script outputs a CSV file with the mean and median prices of each product and the minimum price as a percentage of the mean and median.
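As a rough sketch of what one row of such a CSV could look like, assuming a list of seller prices has already been scraped for a product: the column names, the example URL, and the `summarize` and `to_csv` helpers are all hypothetical, and the actual layout produced by the repository's script may differ. The standard-library `statistics` module stands in for numpy here.

```python
import csv
import io
from statistics import mean, median


def summarize(url, prices):
    """Build one row: mean, median, min, and min as a percentage of mean and median."""
    avg, med, low = mean(prices), median(prices), min(prices)
    return [
        url,
        round(avg, 2),
        round(med, 2),
        low,
        round(100 * low / avg, 1),
        round(100 * low / med, 1),
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a (hypothetical) header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(
        ["url", "mean", "median", "min", "min_pct_of_mean", "min_pct_of_median"]
    )
    writer.writerows(rows)
    return buf.getvalue()


# Made-up prices for a single placeholder product URL
row = summarize("https://example.com/product-1", [1500.0, 2000.0, 2100.0, 2400.0])
print(to_csv([row]))
```

A low value in the last two columns means the cheapest seller sits well under the typical price, which is the signal worth checking by hand.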

If you would like to give it a try, you can find it on GitHub. You will need Python 3 itself, two Python 3 modules (selenium and numpy), and the ChromeDriver matching the Chrome browser you will be using, which can be found here. You will then need to change CHROME_DRIVER_PATH in the code to point to the driver’s path. I might make some more improvements for generalization and output formatting.

The bash commands below should help you try it out for yourself, given that you have downloaded the appropriate chromedriver and replaced the path accordingly:

sudo apt install python3
sudo apt install python3-pip
sudo pip3 install selenium
sudo pip3 install numpy
git clone git@github.com:kommradHomer/hepsiburada-price-analysis.git
cd hepsiburada-price-analysis
## BEFORE PROCEEDING YOU NEED TO DOWNLOAD CHROMEDRIVER and replace
## the CHROME_DRIVER_PATH variable in the code
### this one would analyze prices for Android OS Phones
python3 get-urls.py https://www.hepsiburada.com/android-telefonlar-c-60005201
python3 analyze-product-price.py output/my-recent-urls.out
### your CSV result should be in the output folder with a "-analyze.out" suffix
