Earlier this month, I was deep in conversation with the founder of an AI startup. Eight hours of calls behind me, dozens of founders interviewed, and everyone working on neural nets, vision transformers, token optimisation, RAG systems, or enterprise integrations. But when I asked what “proof of value” means for developers in AI, no one had a clear answer. One founder said, and I quote, “If you get an answer to that, please share.”
Proof of value used to be simple.
You solved a business problem. Shipped features. Scaled systems. Clear metrics. But AI scrambled everything. Now, founders look for weird signals. Have you built an MCP server? Touched agent architectures? Worked with ACPs? They're checking if you're up to date, not if you're good.
The number of companies hiring
for AI in India is still evolving. Most are in the early stage. The ones that exist care more about freshness than experience. Four-year-old AI work? Irrelevant. Two-year-old work? Ancient. This makes showing proof of value nearly impossible. But there's good news.
What all will we cover?
First, org structure
Every technology shift reshapes how companies are organised. Understanding how AI companies are set up tells you where opportunities live.
Second, your fitment
Frontend or backend, your path differs. We'll decode the core business problems companies actually face.
From there, learning
AI foundations and the skills that matter for FE, BE, and DevOps engineers.
Finally, the meat:
Picking projects that demonstrate real value
1/ Macro context (org structure)
Whenever a new technology shift comes,
it always shows up in how a team or an organisation is structured to deploy that technology. Go back to the industrial revolution and how a Tata was set up to make steel. Then the computing era, and how an IBM or an Apple hardware division was organised. Then the internet arrived, and to ship internet products fast we got FE, BE, and DevOps.
You cannot take an internet company's structure and use it to ship hardware. That setup won't work. And now, with AI, the structure is changing again.
AI org structures are nuanced.
I spent hours mapping job postings across hundreds of AI companies, from foundational labs like OpenAI, Anthropic, and DeepMind to application-layer companies like Cursor, Lovable, Bolt, and Replit. This is my understanding from that research. Let’s dive in.
Product engineering (Frontend)
You’ll build every surface a user touches. Think chat, voice, and multimodal token streams. You’ll craft prompt editors, playgrounds, and in-browser eval toggles, all held to a sub-300 ms p95 latency budget. The role is billed as a senior frontend engineer or full-stack core product engineer.
Model services (Backend)
You’ll write the code that speaks to checkpoints: retrieval and vector DB orchestration, agent runtimes, tool calling, and the control plane for model rollouts, plus developer tooling and an experiment harness. An example listing is an AI applied engineer focused on reliability at Lovable.
Platform (Infrastructure)
You’ll handle autoscaling, spot instance bidding, and cache layers, and build the observability stack for token cost and latency. These roles show up mostly at larger AI companies.
Data & evaluation (New)
You’ll own the data and evaluation loop: cleaning, deduplication, and versioning of real-world data, plus the pipelines that compare model outputs, collect human feedback, and catch regressions before users notice.
Reliability & safety (New)
You’ll own what happens when the model misbehaves: guardrail filters, spend caps, drift detection, and incident response when outputs go off the rails.
How does this change by stage of the company?
Early-stage companies
At Lovable, they hired a generalist product engineer when the team was still under fifteen. Once headcount hit roughly twenty-five, the first boundary appeared: product engineering split off from model services.
Mid-stage companies
Platform engineering emerges, driven primarily by the need for a dedicated team that owns models, evaluation, and safety.
Larger-ish companies
Here, the third split happens: safety becomes its own function, along with token optimisation for cost, spend caps, and guardrail filters. Someone needs to own incidents when the model starts speaking Latin.
If the company focuses on models instead of just the app layer, the org splits into training infra and inference infra. Baseten already lists roles for model performance, site reliability, and forward-deployed engineers. Research is only now spinning up as its own track.
2/ Where do you fit
This section gives you a high-level view of the problems you should be solving, based on the role you're in: frontend or backend.
1. Frontend engineers
FE devs at AI companies are like UX therapists for unpredictable machines. Your users expect chat interfaces that handle partial responses, voice inputs that feel natural, and error states for errors that shouldn't exist. The model returns malformed JSON. Hallucinates data structures.
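To make "the model returns malformed JSON" concrete, here's a minimal sketch of the defensive parsing these interfaces end up doing. The function name and repair heuristics are mine, not from any particular framework:

```python
import json
import re
from typing import Optional

def parse_model_json(raw: str) -> Optional[dict]:
    """Try to recover a dict from model output that should be JSON but often isn't."""
    # Happy path: the model actually returned valid JSON.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass

    # Common failure: JSON buried in prose or a markdown code fence.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        candidate = match.group(0)
        # Another common failure: trailing commas before } or ].
        candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            pass

    # Give up gracefully: the UI shows a retry state instead of crashing.
    return None

print(parse_model_json('Sure! Here is the data: {"score": 0.92, "label": "spam",}'))
# {'score': 0.92, 'label': 'spam'}
```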
The core JTBD (job to be done) is
to build trust in interfaces. Visualisations for attention weights, confidence scores, and reasoning chains. Making black boxes feel less black without overwhelming users. Evaluation interfaces everywhere: A/B testing model outputs, collecting human feedback, determining whether the new model actually beats the old one.
My opinion on this: you still own experiences. But the medium of experiencing products is changing. Voice is going to be the new frontend.
2. Backend engineers
Honestly? It’s a lot about data.
Your core job is to build the platform systems that handle observability, evaluation, safety, accuracy, and cost. The second part is building AI-native integrations with external data pipelines. And the last part is speed.
Monitoring becomes philosophy.
How do you alert on "the model got slightly dumber"? You build evaluation pipelines that run constantly. Detect drift before users notice. Implement guardrails for behaviours nobody predicted, because nobody could predict them. And lastly, latency (this one is specific to certain businesses).
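As a sketch of what that can look like in practice: a small eval job that re-runs a fixed "golden set" of prompts on a schedule and flags when quality drops against a baseline. The golden set, scorer, and threshold below are placeholders I've made up, not anyone's production pipeline:

```python
import statistics

# A fixed set of prompts with known-good expectations, re-run on a schedule.
GOLDEN_SET = [
    {"prompt": "Refund policy for damaged items?", "expected_keyword": "refund"},
    {"prompt": "How do I reset my password?", "expected_keyword": "reset"},
]

def score_response(response: str, expected_keyword: str) -> float:
    # Toy scorer: 1.0 if the expected keyword shows up, else 0.0.
    # Real pipelines use LLM-as-judge, embedding similarity, or task-specific checks.
    return 1.0 if expected_keyword.lower() in response.lower() else 0.0

def quality_dropped(call_model, baseline_mean: float, drop_threshold: float = 0.1) -> bool:
    # Returns True when the mean score has slipped enough to page someone.
    scores = [
        score_response(call_model(case["prompt"]), case["expected_keyword"])
        for case in GOLDEN_SET
    ]
    current_mean = statistics.mean(scores)
    print(f"baseline={baseline_mean:.2f} current={current_mean:.2f}")
    return (baseline_mean - current_mean) > drop_threshold

# Stubbed model call for illustration; in production this wraps your inference endpoint.
fake_model = lambda prompt: "You can request a refund or a reset from settings."
print(quality_dropped(fake_model, baseline_mean=0.95))  # False
```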
3/ Learning about AI
1. Start with foundations
Even though everyone says they don't matter.
A friend and I worked at an AI company about 10 years ago. He was the main AI person there. He's now at an OTT MNC, leading their AI work.
I asked him before writing this article
how important foundations are. "Everything happens at the application layer now," he said. "Foundations are not recommended, tbh." Then I asked how he debugs or solves a problem, and dug deeper: how did you know where to find that issue? "Oh, you have a gut feel for how these models fail and why."
The goal is not deep expertise.
When someone mentions "catastrophic forgetting" or "gradient explosion," you need enough background to engage.
Think of it as learning
a new neighbourhood. You don't memorise every address, but you know the main streets :)
I recommend
1/ Watch Karpathy's neural network series.
2/ Deeplearning.ai’s foundational courses
2. What skills should you learn?
a) Python proficiency is core.
This came up every single time. The way internet companies ran on JavaScript, AI runs on Python.
b) Some math classes will help
If you're applying to AI-first companies: I've reviewed their interview questions, and they definitely ask about things like loss plateaus and batch sizes. You need mathematical thinking, because model outputs are probability distributions pretending to be answers.
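A tiny illustration of that last point: a model doesn't return "the answer", it returns logits that softmax turns into a probability distribution, and temperature reshapes how confident that distribution looks. The tokens and numbers here are invented:

```python
import math

def softmax(logits, temperature=1.0):
    # Convert raw logits into a probability distribution over candidate tokens.
    scaled = [x / temperature for x in logits]
    m = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = {"Paris": 4.0, "Lyon": 2.0, "Berlin": 0.5}  # hypothetical next-token scores

for temp in (1.0, 0.2):
    probs = softmax(list(logits.values()), temperature=temp)
    print(f"temperature={temp}:", {t: round(p, 3) for t, p in zip(logits, probs)})

# Lower temperature sharpens the distribution. The "answer" was never a single fact,
# just the most probable token.
```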
4/ Building the proof
Let's start with how to pick a good project. Ask: what are the top business problems right now? These will not be sexy projects, but they're definitely worth building.
Data pipeline reliability
This wastes countless hours. Cleaning, deduplication, versioning, and privacy compliance. Build tools that handle messy real-world data. Show before-and-after metrics.
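A minimal sketch of what that looks like, with the before-and-after metric attached. The records and the near-duplicate rule are invented for illustration:

```python
import hashlib

def normalise(text: str) -> str:
    # Cheap normalisation so trivially different records hash the same.
    return " ".join(text.lower().split())

def dedupe(records: list[str]) -> list[str]:
    seen, kept = set(), []
    for record in records:
        key = hashlib.sha1(normalise(record).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept

raw = [
    "Return policy: 30 days",
    "return policy:   30 days",   # same record, different whitespace and case
    "Shipping takes 5-7 days",
]
clean = dedupe(raw)
print(f"before={len(raw)} after={len(clean)} "
      f"({1 - len(clean) / len(raw):.0%} duplicates removed)")
# before=3 after=2 (33% duplicates removed)
```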
Evaluation infrastructure
Everyone’s flying blind without the right tools. Most teams can’t even tell if their new model is better than the last. You’ll build frameworks that compare outputs, catch regressions, and surface subtle degradations. The goal is simple: make quality measurable.
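Here's a rough sketch of what "catch regressions" can mean: run the old and new models over the same eval set and flag every example where the candidate does meaningfully worse. The scorer, tolerance, and stub models are placeholders:

```python
def find_regressions(eval_set, old_model, new_model, scorer, tolerance=0.05):
    # Return the examples where the new model scores meaningfully worse than the old one.
    regressions = []
    for example in eval_set:
        old_score = scorer(old_model(example["input"]), example["reference"])
        new_score = scorer(new_model(example["input"]), example["reference"])
        if old_score - new_score > tolerance:
            regressions.append({**example, "old": old_score, "new": new_score})
    return regressions

# Stub models and a toy scorer for illustration; swap in real calls and real metrics.
scorer = lambda output, reference: 1.0 if reference in output else 0.0
old_model = lambda text: f"Answer: {text} (refund available)"
new_model = lambda text: f"Answer: {text}"

eval_set = [{"input": "damaged item", "reference": "refund"}]
print(find_regressions(eval_set, old_model, new_model, scorer))
# [{'input': 'damaged item', 'reference': 'refund', 'old': 1.0, 'new': 0.0}]
```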
Latency reduction
Model routing based on query complexity. Sub-300 ms responses for chat. Instant feel for completions. Progressive streaming that doesn't break. Show p95 improvements.
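A minimal sketch of complexity-based routing: cheap, fast models take the simple queries, and the expensive model only sees the hard ones. The heuristic and the model names are hypothetical:

```python
def estimate_complexity(query: str) -> float:
    # Crude heuristic: longer queries and "reasoning" words suggest a harder request.
    # Real routers often use a small classifier or an embedding model instead.
    hard_words = {"explain", "compare", "why", "analyse", "derive"}
    score = min(len(query.split()) / 50, 1.0)
    if any(word in query.lower() for word in hard_words):
        score += 0.5
    return min(score, 1.0)

def route(query: str) -> str:
    # Pick a model tier; the names are made up.
    return "large-slow-model" if estimate_complexity(query) > 0.4 else "small-fast-model"

print(route("hi"))
# small-fast-model
print(route("Explain why my fine-tuned model's loss plateaus after epoch 3"))
# large-slow-model
```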
Two hiring managers
told me the same thing: "I'm looking for freshness in the tech they're using. Don't show me your AI projects from five years back. They might be great, but the world has changed."
I asked why freshness matters so much. It's simple: they want to know that whoever they hire knows what's latest. That can be an MCP server you've built, small models running on phones, browsers, or embedded devices, or WebGPU implementations. Anything that shows you know what's happening in the AI space right now.
I asked, what would make you
say, "If this person has done ____, I would hire them."
1/ Open source contributions to AI projects
Pick projects that developers actually use. Not documentation fixes or typo corrections. Real features. Look at issues labelled "help wanted" on major repos. Start small but meaningful.
2/ Side projects
But you have to connect projects to business metrics. Increased model accuracy on production data by 12%. Caught 89% of quality regressions before deployment. Some kind of metric that the business cares about.
3/ Show adoption and impact.
1,000 developers use your library. Your tool saves 10 hours per week for ML engineers. Doing these 2 things on top of the projects you choose is a guaranteed hire.
What is low-quality work?
Paper reproductions
They read as homework. Everyone can follow a tutorial. Unless you beat the original paper's numbers or add novel optimisations, skip them.
Demo apps wrapping
OpenAI's API scream beginner. ChatGPT clones. Basic RAG demos. These flood every portfolio. They demonstrate API-reading skills, not problem-solving.
5/ Finding opportunities
Expert advice from Darshan Kabadi. He’s the lead scientist at Square. Previously, a data scientist at Spotify & LinkedIn.
When a hiring manager judges
proof of work, the lens changes depending on whether we are looking at a résumé or sitting in a final-round interview.
On a résumé, it is hard to stand out
without concrete, production-level examples. List the AI projects you shipped, tie each to a measurable result, and stay specific: “Built a tax-classification model that reduced manual tagging by 80%” or “Deployed a RAG-based chatbot now handling 70% of support tickets.” The more high-impact cases you show, the faster you reach the next step.
In an interview, depth beats breadth.
Expect to walk through one end-to-end project and defend every decision:
Data: why that source, how you cleaned it
Model: architecture, fine-tuning, evaluation
Platform: where it runs (Databricks, custom stack, etc.)
Efficiency: GPU choice, batch size, quantisation, precision
Impact: latency, cost per thousand calls, business lift
Interviewers push here because
inference is expensive, and sloppy choices burn cash. Be ready to explain how you balanced cost with accuracy: mixed precision versus full precision, dynamic batching versus real-time calls, and autoscaling versus fixed allocation.
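To show the kind of arithmetic they're probing for, here's a back-of-the-envelope comparison of cost per thousand calls with and without dynamic batching. Every number below (GPU price, throughput, batch size, overhead factor) is made up for illustration:

```python
def cost_per_1k_calls(gpu_hourly_usd: float, calls_per_second: float) -> float:
    # Cost attributed to 1,000 calls if the GPU is kept busy at this throughput.
    seconds_per_1k = 1000 / calls_per_second
    return gpu_hourly_usd * (seconds_per_1k / 3600)

GPU_HOURLY = 2.50  # hypothetical hourly price for one inference GPU

# Real-time, one request per forward pass: low latency, poor utilisation.
print(f"unbatched: ${cost_per_1k_calls(GPU_HOURLY, calls_per_second=5):.2f} per 1k calls")

# Dynamic batching: each forward pass serves 8 requests, so throughput rises,
# at the price of a little extra queueing latency per request.
print(f"batched:   ${cost_per_1k_calls(GPU_HOURLY, calls_per_second=5 * 8 * 0.8):.2f} per 1k calls")
# The 0.8 factor hedges for batching overhead; the point is the order-of-magnitude gap.
```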
That is the difference:
Breadth on paper, depth in person.
If you're working, start internally.
That manual process everyone hates? Automate it with AI. Measure time saved. Document adoption rates. Real impact on real teammates beats theoretical projects.
Between jobs?
Build in public. Pick problems you personally face. Create tools other developers need. Share progress on Twitter, LinkedIn, and GitHub. Let the community validate your direction.
Follow the pain.
Read Discord channels for AI frameworks. What do people complain about? Monitor GitHub issues. What stays open longest? Join ML Twitter. What problems repeat? Pain points are project ideas. And engage with communities: join the EleutherAI, LAION, and Hugging Face Discord servers. Participate in discussions. Share your experiments. Get feedback before building. Communities guide you toward real problems.
The founders I interviewed
weren't really looking for AI specialists. They wanted evidence of continuous adaptation. Can this person unlearn and relearn every quarter? Will they notice when the ground shifts again? Do they solve real problems or chase trendy demos?
So I'll leave you with the question
that haunted me through those eight hours of conversations. In a field where knowledge expires faster than milk, what does it mean to be senior anymore?
Curiosity. Intent. Empathy.
GrowthX is a private club of 4,200+ members from top companies — Apple, Jio, Zepto, Netflix, Microsoft, Zerodha & 2,000+ more. CXOs, VPs, ICs — all united by the drive to constantly surround themselves with the right people.
Every member gets 6 unfair advantages —
All as part of our new pricing plan.
4,200+ member online private community
In-person monthly events across India
$100K worth of credits from top brands
All self-paced learning programs
Live weekly learning with CXOs
Career Hub for interviews
12 months of access
GrowthX exists to give people who take their craft seriously the ecosystem they deserve. One that teaches, connects, and accelerates. One that opens doors.