Hitesh Arora

I am a Research Master's student at the Robotics Institute, part of the School of Computer Science at Carnegie Mellon University (CMU). I am passionate about advancing the theoretical foundations of core machine learning algorithms and solving practical challenges in their application to real-world problems, particularly in the domains of robotics, healthcare and climate change.

I am currently being advised by Prof. Jeff Schneider on studying and designing sample-efficient deep reinforcement learning algorithms for end-to-end self-driving. I am also working with Prof. Asim Smailagic in the Engineering Research Accelerator to design explainable semi-supervised deep learning approaches for disease detection from medical images.

Before joining graduate school, I worked at Microsoft for 3 years, where I designed and shipped multiple hyper-scale distributed and analytics solutions currently being used by millions of cloud users. I graduated with a B.Tech in Computer Science and Engineering from IIT Guwahati in 2015. In the past, I had the incredible fortune to gain first-hand research experience by working with Prof. Tomaso Poggio and Prof. Ethan Meyers at the Center for Brains, Minds and Machines, MIT, Prof. Onur Mutlu at CMU (now at ETH Zurich), Prof. Scott Beatson at the University of Queensland, Australia, and Prof. Ashish Anand at IIT Guwahati.

I have also always been driven to solve social problems. Deeply concerned about North India's alarming pollution, I pioneered the Charvesting project with Dr. Brian Von Herzen and the Climate Foundation NGO to transform the practice of open rice-straw burning into cost-effective, clean conversion to biochar. I enjoy teaching and have served as a volunteer teacher for underprivileged students in India over the last 7 years.

Resume  /  LinkedIn  /  GitHub  /  Email


Research


Learning to Drive using Waypoints


Tanmay Agarwal∗, Hitesh Arora∗, Tanvir Parhar∗, Shubhankar Deshpande, Jeff Schneider
NeurIPS 2019 Workshop on Machine Learning for Autonomous Driving, 2019
paper / poster / media

We designed an architecture for a self-driving agent to learn control directly from semantically segmented images and waypoint input in order to drive in urban settings in the CARLA autonomous driving simulator. We used a convolutional autoencoder to extract a lower-dimensional embedding of the semantically segmented image and used the Proximal Policy Optimization (PPO) algorithm for policy learning, achieving significant performance improvements on the CARLA benchmark.


Semi-supervised learning for Diabetic Retinopathy


Advisor: Prof. Asim Smailagic
report

Built a semi-supervised deep learning pipeline for Diabetic Retinopathy (DR) detection from retinal fundus images to address the lack of labeled images in the medical domain. Designed a novel deep learning architecture that trains the autoencoder and classifier networks simultaneously to learn useful latent representations from unlabeled data. Extended the GradNorm algorithm to handle dynamic tuning of the gradient magnitudes of multiple losses in semi-supervised multi-task settings, where some of the losses may not be present for all data samples. Achieved a 2% improvement over a ResNet18 baseline on the Messidor DR dataset. We are currently working towards evaluating performance on larger datasets such as EyePACS and publishing our work.


Deep forest: Neural Network based reconstruction of the Lyman-α forest


Advisor: Prof. Rupert Croft

This project applies deep learning to predict characteristics of the intergalactic medium, such as the density of neutral hydrogen in dense regions of the universe that are not directly observable. Designed a CNN-based architecture to predict optical depth from noisy flux observations in simulated Lyman-α forest spectra and achieved promising results. We are working towards submitting this work to the journal Monthly Notices of the Royal Astronomical Society.


Decoding Neural Information from the Monkey Brain


Advisors: Prof. Tomaso Poggio, Prof. Ethan Meyers, Prof. Ashish Anand
paper

In summer 2014, I received the SN Bose Scholars Program award to pursue research at the Massachusetts Institute of Technology (MIT), USA. I worked in the Poggio Lab, part of the Center for Brains, Minds and Machines (CBMM), under the supervision of Prof. Ethan Meyers and Prof. Tomaso Poggio. There, I applied machine learning algorithms to decode monkeys’ brain data collected during experiments designed to study the specialized brain functions of spatial working memory and task representation. Specifically, my work involved identifying the relevant conditions/stimuli to define as classes (such as eye movement towards the target stimulus vs. towards a distractor stimulus), pre-processing and normalizing neural data to generate pseudo-population vectors, finding the optimal features and classification algorithm, and running cross-validation analysis to determine decoding accuracy. I extended this work in my bachelor's thesis with my advisor Prof. Ashish Anand at IIT Guwahati. We obtained promising results on the functions of specific brain regions and the nature of neural representation, with more than 90% decoding accuracy.


Research Internship, Carnegie Mellon University (CMU), USA, Summer 2013


Advisor: Prof. Onur Mutlu
paper

In summer 2013, I was selected for the Research Experience for Undergraduates Program at CMU, where I worked with Prof. Onur Mutlu on improving the state-of-the-art algorithm for DNA sequence mapping. In this work, we designed an optimal algorithm for hash-based mappers based on the idea of heterogeneous seeds and achieved a ~15x reduction in mapping cost, while increasing memory usage by only ~1.5% compared to state-of-the-art mappers like mrFAST. Unfortunately, our paper on this work was not accepted because similar research was published around the same time; the experience taught me about the humbling side of research.


Research Internship, The University of Queensland (UQ), Australia, Winter 2013


Advisor: Prof. Scott Beatson
paper

I worked with Prof. Scott Beatson to design an algorithm for classifying a bacterial sequence as either a chromosome or a plasmid, which forms an important part of antibiotics research. We initially developed rule-based classification methods based on the alignment distance from reference genomes. To improve the results further, we applied Hidden Markov Models, Support Vector Machines and neural networks, achieving accuracies of 67.7%, 82% and 87.6%, respectively.




Professional Experience

Having gained mainly academic research experience during my undergraduate studies, I also wanted to get some industry exposure before joining graduate school. I worked in the core Azure team at Microsoft for three years, where I learned to design and implement hyper-scale distributed and analytics systems.


Microsoft Azure Compute team


June 2015 - June 2018

Delivered various core compute platform functionalities to achieve availability and performance goals of five 9s (99.999%).
Some of the projects I shipped include:
a) Built platform-supported migration of IaaS resources from classic to Azure Resource Manager.
b) Designed and implemented automated health monitoring of Service Fabric infrastructure for internal Azure Diagnostics services.
c) Shipped the throttling service to safeguard Azure Geneva diagnostics cloud services from heavy users.
d) Designed a “Top Errors” dashboard that shows the trending errors in a service, enabling easier debugging.





Design and source code from Jon Barron's website