Introduction
In this post, I am going to walk through a simple project running on Amazon EMR. I am using the dataset “Baby Names from Social Security Card Applications In The US”, which holds data for 109 years (1910-2018). I transformed the data to make it compatible with this project and made it available …
If you have ever worked with an application in the Hadoop ecosystem, you have probably used, or at least heard of, one of these file formats: Apache Avro, Apache ORC, and Apache Parquet.
Introduction
The Hadoop Distributed File System (HDFS) is designed as a highly fault-tolerant file system that can be deployed on low-cost hardware. HDFS provides high-throughput access to application data and is suitable for applications that have large datasets. It is designed for batch processing rather than interactive use, and it provides a write-once-read-many access model. In …
Hello and welcome to my blog. My name is Amirali Shahinpour, and I work as a Software Engineer in Silicon Valley. I currently work for Nebbiolo Technology. Our product is fogOS, which lets me work in all the areas that interest me.
This blog is a collection of my thoughts and research about engineering and different technologies. My goal is to have a central place where I can collect my research and thoughts and share what I’ve learned and experienced at work or in my personal projects. I hope you find it useful.
Away from desk
When I’m not working, I like to spend time outdoors hiking, read books, and watch documentaries.