Hi, I am

Jeffy Merin Jacob

Welcome to my page!

Hi Guys!

I am a proud Trojan, class of 2019, with a Master’s degree in Computer Science and a specialization in Data Science. I understand that data science requires sound knowledge of statistics and programming, which USC has done an excellent job of equipping me with, but I also believe creativity and risk-taking are key characteristics of a good data scientist.
I love exploring interesting data on the internet and spend most of my time prepping data because I am convinced that in order to do the “science”, you first need good data. Through my years at university, I have acquired skills in data cleansing, wrangling, visualization and machine learning.
I am a data-driven storyteller and I put my heart and hands into everything I do. Internships have challenged me to work hard, push boundaries and think outside the box, but most of all, they have taught me the importance of building relationships and helping others achieve the same American dream!

My Projects

ASL INTERPRETATION USING DEEP LEARNING

  • Designed a deep learning neural network in TensorFlow that classified each image into 1 of 29 classes, including the letters A-Z of the English alphabet in sign language. The network consists of 2 hidden layers and a softmax output layer that converts the output scores into class probabilities. The optimizer used was Stochastic Gradient Descent.
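As a rough illustration of the softmax step only (this is not the project's TensorFlow code), the sketch below turns a list of output scores into probabilities in plain Python. The function name `softmax` and the three sample scores are made up for the example.

```python
import math


def softmax(scores):
    """Convert a list of raw output scores into probabilities that sum to 1."""
    # Subtracting the largest score keeps exp() from overflowing
    # and does not change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes (the real network had 29 outputs).
probs = softmax([2.0, 1.0, 0.1])
predicted = probs.index(max(probs))  # index of the most likely class
```

The class with the highest score always ends up with the highest probability, so the predicted label is simply the index of the largest entry.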

RECOMMENDATION SYSTEM - CHARITY MATCHING

  • Created content-based and collaborative-filtering recommendation systems to recommend projects to donors in Oakland in order to maximize the number of donations.
  • Identified donors who had previously donated to a cause and recommended similar projects to them.

AI GAME PLAYING AGENT

  • Developed a game-playing agent that plays the adversarial board game Halma using the minimax algorithm with alpha-beta pruning. Each player starts with 19 game pieces clustered in diagonally opposite corners of the board.
  • To win the game, a player needs to transfer all of their pieces from their starting corner to the opposite corner, into the positions that were initially occupied by the opponent.
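For readers unfamiliar with the technique, here is a minimal sketch of minimax with alpha-beta pruning on a toy game tree. It is not the Halma agent itself: the nested-list tree, the leaf scores, and the function name `alphabeta` are all assumptions made for the example.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Return the minimax value of a game tree given as nested lists.

    A leaf is a number (a hypothetical evaluation score);
    an inner node is a list of child nodes.
    """
    if not isinstance(node, list):
        return node  # leaf: return its evaluation score

    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player would never allow this branch
        return best

    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return best


# Toy tree: the maximizer picks a branch, then the minimizer picks a leaf.
tree = [[3, 5], [2, 9], [0, 1]]
value = alphabeta(tree, maximizing=True)
```

In this toy tree the second and third branches are cut off after their first leaf, because the maximizer is already guaranteed a better score from the first branch.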

MACHINE LEARNING FOR DATA INFORMATICS

  • Improved Python code that created expert variables for a fraud detection algorithm. The initial code took 24 hours to process 100,000 records; the modified code took 20 minutes to process 97,000 records while also computing 4 additional variables.
  • Developed decision trees, Gaussian mixture models, and SVMs as part of coursework to develop a deeper understanding of the internal workings of various classifiers.

DATA SCIENCE

  • Performed exploratory data analysis to evaluate a model to ensure it generalizes to unseen examples.
  • Used techniques to clean, reformat, transform, and describe raw data; generate visualizations; remove outliers; perform simple data analysis; and generate interactive graphs using the Plotly library.
  • Performed linear and logistic regression, used K-means and hierarchical clustering, identified relationships between variables, and used other machine learning tools such as neural networks and Bayesian models.
    Education

    • Master's, CS (Data Science) at University of Southern California
      08/2017 – 02/2019

      Machine Learning - Data Mining - Artificial Intelligence - Database Systems - Geospatial Information Management - Information Retrieval and Web Search Engines - Web Technology - Analysis of Algorithms

    • BE, CSE at St. Joseph's College of Engineering, India
      07/2013 – 05/2017
    • High School, St. Dominics Anglo Indian Higher Secondary
      06/2001 – 05/2013
    Experience

    • USC Center for Artificial Intelligence in Society - Student Intern
      01/2019 – 08/2019

      • Developed a Python Django web application for a preference elicitation machine learning algorithm.
      • Performed feature selection (RFECV) on HMIS data to determine the number of features that gave the best accuracy with a logistic regression model.

    • USC Information Technology Services - Summer Research Intern
      05/2018 – 08/2018

      Team: Data Networks and Operations
      • Designed a dashboard using Python Dash and the perfSONAR toolkit to visualize packet movement and retransmits between BWCTL servers, thereby monitoring network performance.

    • Dowell Technologies - Intern
      05/2017 – 07/2017

      • Developed numerous simple to complex queries involving procedures, functions, and triggers for diverse business requirements.
      • Optimized queries using indexing strategies and changes to the database design.

    KEY SKILLS

    • Machine Learning - Python scikit-learn
    • Data Analysis - Python pandas, NumPy, Beautiful Soup
    • Data Visualization - Python seaborn, matplotlib, Tableau
    • Deep Learning - TensorFlow, Keras, TFLearn
    • Web Development - Python Django, Flask, Dash
    • Database - MySQL
    • Cloud Services - Google Cloud, AWS
    • Programming Languages - Python, C++
    • MS Excel - Pivot tables, Correlation
    • Statistics - Hypothesis Testing, Regression

    Certifications

    Connect

    Feel free to get in touch.