Archie Sengupta

Hi,

I’m Archie, an AI Research Engineer working on ML, RL, DL, LLMs, and their theory.

I think the most important algorithm in RL has to be PPO. PPO in LLMs is like the cosmic fine-tuning of intelligence — balancing exploration and control to expand the frontiers of thought.

PPO clipped objective:

$L(\theta) = \mathbb{E}_{t} \left[ \min \left( \frac{\pi_\theta(a_t|s_t)}{\pi_{\theta_{\text{old}}}(a_t|s_t)} A_t, \text{clip}\left(\frac{\pi_\theta(a_t|s_t)}{\pi_{\theta_{\text{old}}}(a_t|s_t)}, 1-\epsilon, 1+\epsilon\right) A_t \right) \right]$
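
To make the formula concrete, here is a minimal pure-Python sketch of the clipped objective, averaged over a batch of timesteps. The function name `ppo_clip_objective`, the argument names, and the default `epsilon=0.2` are my own illustrative choices; it assumes log-probabilities and advantage estimates $A_t$ have already been computed elsewhere.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, epsilon=0.2):
    """Mean clipped surrogate objective L(theta) over a batch of timesteps.

    logp_new:   log pi_theta(a_t | s_t) for each timestep
    logp_old:   log pi_theta_old(a_t | s_t) for each timestep
    advantages: advantage estimates A_t
    epsilon:    clip range (0.2 is an assumed, commonly used value)
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_theta / pi_theta_old, computed from log-probs.
        ratio = math.exp(lp_new - lp_old)
        # clip(ratio, 1 - epsilon, 1 + epsilon)
        clipped = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Ratio 1.5 with a positive advantage is capped at 1 + epsilon = 1.2.
print(ppo_clip_objective([math.log(1.5)], [0.0], [1.0]))
# Ratio 1.5 with a negative advantage is not clipped, so the penalty stays.
print(ppo_clip_objective([math.log(1.5)], [0.0], [-1.0]))
```

In training this quantity is maximized (or its negative minimized) with respect to $\theta$. A real implementation would typically use a tensor library with autograd rather than Python floats.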

IMHO, the PPO algorithm is very important, and now so is GRPO.

Work

Atomicwork (Series A) - AI Research Engineer working on search

TurboML (Puch AI) - AI Engineer building Indic models & agents for India 🇮🇳

Luppa AI - Applied AI Engineer building AI agents for marketers

Arth AI (YC S21) - Applied AI Engineer building autonomous financial agents

Heva AI - AI researcher on Human Brain Cancer

Thinklink io - Golang Engineer - External Attack Vector

Xelp - NLP Engineer on natural language

Metafy - Research Engineer on zero-knowledge proofs (ZKP)

Codecrafters (YC S22) - Developer. You might know them from “Build your own X” (388k stars on GitHub).