Data aggregation, big data analysis and visualization

Tech Stack: Python, Hadoop, Tableau, beautifulsoup4

  • Extracted and preprocessed data from three sources: Twitter, The New York Times, and Common Crawl.
  • Implemented MapReduce programs for word count and word co-occurrence in Hadoop.
  • Visualized and published the results (word cloud, etc.) in Tableau.
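
The word-count and co-occurrence jobs can be sketched locally in plain Python. This is a minimal illustration and not the project's actual code: the function names (`tokenize`, `map_word_count`, `map_cooccurrence`, `reduce_counts`, `run_local`) are hypothetical, co-occurrence is assumed to mean "two distinct words appear on the same input line", and the in-memory sort stands in for Hadoop's shuffle phase.

```python
import re
from itertools import groupby

# Assumed tokenization rule: lowercase runs of letters, digits, apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(line):
    """Split one input line into lowercase tokens."""
    return TOKEN_RE.findall(line.lower())


def map_word_count(line):
    """Mapper: emit (word, 1) for every token on the line."""
    for word in tokenize(line):
        yield word, 1


def map_cooccurrence(line):
    """Mapper ("pairs" approach): emit ((w1, w2), 1) for each pair of
    distinct words on the same line. The pair is sorted so that
    (a, b) and (b, a) reach the same reducer key."""
    words = tokenize(line)
    for i, left in enumerate(words):
        for right in words[i + 1:]:
            if left != right:
                yield tuple(sorted((left, right))), 1


def reduce_counts(key, values):
    """Reducer: sum all counts emitted for one key."""
    return key, sum(values)


def run_local(mapper, lines):
    """Simulate map -> shuffle/sort -> reduce in memory."""
    pairs = sorted(kv for line in lines for kv in mapper(line))
    grouped = groupby(pairs, key=lambda kv: kv[0])
    return dict(
        reduce_counts(key, (count for _, count in group))
        for key, group in grouped
    )


if __name__ == "__main__":
    sample = ["big data big", "data analysis"]
    print(run_local(map_word_count, sample))
    print(run_local(map_cooccurrence, sample))
```

On a real cluster the same mapper and reducer logic would typically read from standard input and write tab-separated key/value lines, for example under Hadoop Streaming, with the framework handling the sort between the two phases.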

See Project