Showing posts from 2017

Reducing system load on cache servers by using Bloom Filter

Intro

In this post, I want to share how a Bloom filter was used to reduce system load (CPU, RAM, disk operations) on our cache servers at CDNetworks.

How it all started?

While working at CDNetworks, I was contacted by a recruiter about applying to a Japanese company named Rakuten. It was an interesting challenge, so I tried. I had a Skype interview with a technical recruiter, and he asked me, "What is a Bloom filter?" I did not know. I failed the interview, but it taught me what a Bloom filter is.

A Bloom filter is a probabilistic data structure that answers "have I seen this key?" questions much like a HashMap does, but it is far more memory-efficient, at the cost of occasional false positives. If you hold a million URLs in a HashMap, it can reach up to 500 MB, whereas a Bloom filter can manage with 16 MB (more info here: http://ahikmat.blogspot.kr/2016/07/intro-bloom-filter-is-probabilistic.html).

In other words, a Bloom filter is a clown with a bag full of balls marked with random integer numbers. If you ask him whether some ball wit
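
To make the idea concrete, here is a minimal Bloom filter sketch in Python. It is only an illustration, not the code we ran at CDNetworks: the class name `BloomFilter`, the method names, and the choice of SHA-256 with double hashing are all assumptions made for this example. The sizing uses the textbook formulas for the number of bits and hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        ln2 = math.log(2)
        self.num_bits = max(
            1, math.ceil(-expected_items * math.log(false_positive_rate) / ln2**2)
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * ln2))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumption for this sketch): derive k positions
        # from two 64-bit slices of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: remember which URLs have been requested before.
seen = BloomFilter(expected_items=1_000_000, false_positive_rate=0.01)
seen.add("http://example.com/video.mp4")
if seen.might_contain("http://example.com/video.mp4"):
    pass  # probably seen before (a small chance of a false positive)
```

With these parameters the bit array takes roughly 1.2 MB. The exact footprint depends on the false positive rate you are willing to accept, so a stricter rate or a different implementation will use more memory.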