
Nidhish Shah

Machine Learning Engineer

About

Machine Learning Engineer and aspiring Research Scientist, driven by a passion for developing innovative AI solutions. With experience at Prosus Group, Microsoft, and Amazon, I've worked on impactful projects across various domains of AI.

My current focus is on the training and evaluation of LLMs, where I've achieved significant performance improvements and cost reductions. My research interests span Natural Language Processing, Computer Vision, and Reinforcement Learning, with publications at conferences like NeurIPS.

Blogs

I write about AI research and building split mechanical keyboards - two very different things that somehow both involve a lot of tinkering. Expect some thoughts on neural nets, split keyboards, and maybe a few other nerdy obsessions I pick up along the way.

Publications

Machine learning research publications ranging from LLMs and computer vision to medical AI. Peer-reviewed papers at major ML conferences, along with arXiv preprints showcasing some of my work.

StackEval: Benchmarking LLMs in Coding Assistance

We introduce two coding benchmarks - StackEval and StackUnseen - to evaluate language models' performance on real programming tasks, along with a comprehensive framework to assess how well LLMs can judge coding solutions.
