I implemented a post-RMSNorm Transformer architecture (GPT-2-like) and a Byte-Pair Encoding (BPE) tokenizer from scratch.
I improved training efficiency through custom fused Triton kernels, data parallelism, and sharded optimizer states.
I created a data pipeline that generates high-quality data from Common Crawl using custom n-gram quality filters, fastText language filters, and Gopher-inspired heuristics.
I instruction-finetuned and aligned the model through SFT and DPO to perform better on SST, MMLU, and GSM8K tasks.
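As a rough illustration of the DPO step mentioned above, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the project's actual training code: the function names `dpo_loss` and `softplus`, the argument names, and the default `beta=0.1` are all assumptions chosen for illustration. The inputs are summed log-probabilities of a chosen and a rejected response under the policy being trained and under a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls deviation from the reference
) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (chosen margin - rejected margin))."""
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-logits)


if __name__ == "__main__":
    # Policy identical to the reference: loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy favors the chosen response more than the reference does: lower loss.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss falls below log(2) when the policy raises the chosen response's likelihood relative to the reference more than it raises the rejected one, and rises above log(2) in the opposite case. In practice this would be computed over batches with a tensor library rather than on scalars.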
Glimpse is a new approach to online blind dating: users swipe on people's profiles, but once matched, they chat under a pseudonym. I was the lead engineer on a team of four engineers building the mobile app, which launched at over 50 universities.
Implemented a Tinder-like swipe interface with a custom chat interface for blind-dating.
Used serverless computing for scalability and cost efficiency.