AI for data analysts // Proof of work
What matters for analysts, scientists, and data folks in an AI world
Another post about "ChatGPT changed my analysis workflow."
Another screenshot of Claude writing complicated SQL. And you're sitting there with your 6 years of experience, wondering if your Tableau dashboards just became worthless.
Let me tell you something. Using ChatGPT to clean data doesn't make you AI-ready. Asking Claude to write your SQL queries doesn't either.
Everyone thinks that being an analyst in the AI age means learning more tools. Cursor for SQL. Julius AI for visualisation. Claude for analysis.
Wrong. It means understanding 3 things.
Which problems actually matter in an AI world?
Which metrics make sense for probabilistic systems?
Which insights drive decisions when the ground truth keeps shifting?
Tools are a commodity. Your judgment about what to measure isn't.
You're a bridge builder. Business has questions about AI systems. Data has answers hidden in token logs and confidence scores.
You build the bridge between "Why is our chatbot weird today?" and "The model's perplexity increased 23% after yesterday's update."
The core equation every analyst should know
Value = Uncertainty quantified × Decisions enabled × Risk mitigated × Trust built
Let's break this down.
Uncertainty quantified = Probabilistic thinking × Confidence intervals × Distribution analysis
Measuring not just what happened, but how confident we are that it happened. Understanding when 95% accuracy means 5% catastrophic failure.
Decisions enabled = Speed to insight × Actionability × Stakeholder alignment
From "the model is behaving strangely" to "increase the temperature parameter by 0.2." Connecting model behaviour to business outcomes.
Risk mitigated = Early detection × Impact assessment × Prevention mechanisms
Catching distribution drift before customers notice. Quantifying the cost of hallucinations in rupees, not percentages.
Trust built = Explainability × Consistency × Communication clarity
Making black boxes slightly less black. Translating ML-engineer speak to CEO speak.
Because these factors multiply, a zero anywhere zeroes the whole: an insight nobody trusts enables no decisions.
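To make "uncertainty quantified" concrete, here's a minimal sketch, plain Python with no external libraries, of putting a Wilson confidence interval around an eval accuracy number. The 475-out-of-500 eval run is made up for illustration.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives ~95% coverage)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - margin, centre + margin

# Hypothetical eval run: 475 correct answers out of 500 test prompts.
low, high = wilson_interval(475, 500)
print(f"Accuracy: 95.0%, 95% CI: [{low:.1%}, {high:.1%}]")
# "95% accurate" really means "somewhere between ~92.7% and ~96.6%",
# and the remaining ~5% may be where the catastrophic failures live.
```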
See the shift, folks? Most analysts optimise for the old world.
They optimise for:
Historical accuracy.
Pretty visualisations.
Statistical significance.
But the highest leverage is in:
Uncertainty management.
Real-time evaluation.
Trust quantification.
Two games, different proof.
Depending on whether you want to crack a role in internet-first vs AI-first companies, your problem statements will change. A lot.
1/ Internet-first companies
Swiggy, Paytm, PhonePe, Dunzo. These companies have familiar data problems. Conversion funnels. User retention. Revenue optimisation.
Their analysts need to show how AI amplifies existing metrics. Reduce cart abandonment using predictive models. Increase LTV with personalisation. Optimise delivery routes with reinforcement learning.
The math is familiar.
The tools are just more powerful.
2/ AI-first companies
Cursor, OpenAI, Bolt, Replit. These are different games entirely. The product is the model. No model, no company.
Their analysts need to measure things that don't have precedent. How do you measure conversation quality at scale? What's the right metric for multilingual performance? How do you catch model degradation before it ships?
The math is unfamiliar.
The tools don't exist yet.
You build them.
6 steps to build proof of work.
I’ve covered both AI-first and internet-first companies.
1/ Pick your targets
Pick any type of company.
Internet-first examples:
Zomato (food delivery)
Meesho (social commerce)
Groww (fintech)
AI-first examples:
Sarvam AI
Cursor
Replit
Krutrim
If you're unsure about the exact company, choose an industry instead.
2/ Decode their data problems
Not their job descriptions. Their actual problems. Let's take an internet-first example: Meesho.
Surface problem: Need an analyst to track GMV metrics
Real problem: Can't predict which resellers will scale
Data problem: No early signals for reseller success
Go deeper: Treating all resellers the same when they're fundamentally different
AI-first example: Sarvam AI
Surface problem: Need an analyst for model performance
Real problem: Can't measure quality across 22 Indian languages consistently
Data problem: No standardised evaluation framework for Indic languages
Go deeper: English benchmarks don't capture Hindi-English code-mixing
3/ Understand the landscape
For Internet-first:
Billions of events daily
Optimisation problems
Structured data pipelines
Known unknowns (We know what we don't know)
For AI-first:
Millions of model inferences
Evaluation problems
Unstructured outputs
Unknown unknowns (We don't know what we don't know)
4/ Build something real
Let's build an internet-first proof: a Groww user activation predictor.
The problem: 70% of users who download never complete KYC
The insight: The first 3 app interactions predict KYC completion with 85% accuracy
The solution: Real-time nudge system based on interaction patterns
What I would build:
First, identify 7 behavioural markers:
Time spent on education content
Number of stocks browsed
Mutual fund calculator usage
Price alert setup
Watchlist creation
Demo portfolio interaction
Referral screen views
Then, build a predictive model (gradient boosting). Lastly, create an intervention framework.
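Here's a minimal sketch of the modelling step with scikit-learn. The feature names mirror the 7 markers above, but the CSV, column names, and hyperparameters are assumptions for illustration; you'd fit this on real event logs.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# The 7 behavioural markers from the first app sessions
FEATURES = [
    "education_time_sec", "stocks_browsed", "mf_calculator_uses",
    "price_alerts_set", "watchlists_created",
    "demo_portfolio_actions", "referral_screen_views",
]

# Hypothetical export: one row per new user, with a binary label
# for whether they completed KYC within 7 days.
df = pd.read_csv("early_session_features.csv")  # assumed file
X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES], df["kyc_completed"], test_size=0.2, random_state=42
)

model = GradientBoostingClassifier(n_estimators=200, max_depth=3)
model.fit(X_train, y_train)

print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
# Feature importances tell you which nudge to build first.
print(sorted(zip(FEATURES, model.feature_importances_), key=lambda t: -t[1]))
```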
AI-first proof: a multilingual model evaluator
The problem: Can't measure whether Tamil responses are as good as Hindi ones
The insight: Fluency != Accuracy != Cultural appropriateness
The solution: Multi-dimensional scoring with native-speaker validation
What I'd build: Analyse 10k model outputs across 5 languages (use public or open-source data)
Create evaluation dimensions:
Linguistic accuracy (grammar, spelling)
Semantic preservation (meaning retained)
Cultural appropriateness (idioms, context)
Code-mixing handling (Hinglish, Tanglish)
Response latency by language
Then automate the scoring pipeline, integrate human-in-the-loop validation, and build a language parity dashboard.
Post your results if possible, e.g. "Identified a 40% quality gap in Tamil vs Hindi."
Tech stack:
LangChain for LLM orchestration
Custom evaluation prompts
PostgreSQL for score storage
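A minimal sketch of the scoring pipeline's skeleton in Python. The dimension weights and the `judge` callable (an LLM-as-judge prompt or a native-speaker rubric) are assumptions for illustration; the point is the weighted, multi-dimensional structure, not any specific scorer.

```python
from dataclasses import dataclass
from statistics import mean

# Weights per evaluation dimension - hypothetical, tune with native speakers.
DIMENSIONS = {
    "linguistic_accuracy": 0.25,
    "semantic_preservation": 0.30,
    "cultural_appropriateness": 0.25,
    "code_mixing_handling": 0.20,
}

@dataclass
class Scored:
    language: str
    scores: dict      # dimension -> score in [0, 1] from the judge/rubric
    composite: float

def score_output(language: str, prompt: str, response: str, judge) -> Scored:
    """judge(dimension, prompt, response) -> float in [0, 1]; placeholder."""
    scores = {d: judge(d, prompt, response) for d in DIMENSIONS}
    composite = sum(DIMENSIONS[d] * s for d, s in scores.items())
    return Scored(language, scores, composite)

def parity_gap(results: list[Scored], lang_a: str, lang_b: str) -> float:
    """Relative quality gap between two languages, e.g. Tamil vs Hindi."""
    avg = lambda lang: mean(r.composite for r in results if r.language == lang)
    return 1 - avg(lang_a) / avg(lang_b)
```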
5/ Critical patterns to show
Depth over breadth
Don't show 10 shallow analyses
Show one deep investigation
Connect technical metrics to business outcomes
Expert advice from Darshan Kabadi, lead scientist at Square, previously a data scientist at Spotify & LinkedIn.
When a hiring manager judges proof of work, the lens changes depending on whether we are looking at a résumé or sitting in a final-round interview.
On a résumé, it is hard to stand out without concrete, production-level examples. List the AI projects you shipped, tie each to a measurable result, and stay specific: "built a tax classification model that reduced manual tagging by 80%" or "deployed a RAG-based chatbot now handling 70% of support tickets." The more high-impact cases you show, the faster you reach the next step.
In an interview, depth beats breadth. Expect to walk through one end-to-end project and defend every decision:
Data: Why that source? How did you clean it?
Model: Architecture, fine-tuning, evaluation.
Platform: Where it runs (Databricks, custom stack, etc.).
Efficiency: GPU choice, batch size, quantisation, precision.
Impact: Latency, cost per thousand calls, business lift.
Interviewers push here because inference is expensive, and sloppy choices burn cash. Be ready to explain how you balanced cost with accuracy: mixed precision versus full precision, dynamic batching versus real-time calls, and autoscaling versus fixed allocation.
That is the difference: breadth on paper, depth in person.
P.S. If you liked this, do give Darshan a shoutout on LinkedIn.
His advice is actually a gold mine.
6/ Package your proof
Send it to the hiring manager or founder. Never HR at this stage.
Subject: Reduced model evaluation costs by 90% for [company]
Hi [name],
Spent last week diving into [specific problem].
What I found: [one line insight]
What I built: [one line solution]
Early results: [key metric + business impact]
Tested with [x] real model outputs across [y] languages.
[specific impressive outcome].
3-minute demo: [loom link]
Full analysis: [github link]
Interactive dashboard: [streamlit link]
Worth discussing how this scales?
[your name]
What doesn't work
Tool obsession
"I know LangChain, LlamaIndex, Weights & Biases, MLflow..." Great. What insights have you delivered? What decisions have you influenced? What money have you saved?
Listing tools is okay if you have less than 4 years of experience. Not beyond that.
Bonus: What companies ask in interviews
I looked at JDs and interview questions for new analyst roles, at both internet-first companies like Swiggy that are trying to get AI to make existing products better, and AI-first companies like Cursor or OpenAI. Let's understand what they're looking for.
1/ Internet-first companies
When they implement an AI model in an existing product, they want to understand the following:
Is this model worth the cost?
Who's using AI features?
Did AI drive this outcome?
Cost optimisation, i.e. where are we burning tokens?
Interview focus for internet-first companies.
I checked a few job descriptions and interview questions for analyst roles working on AI implementations. Here's what they're asking:
1/ SQL with JSON parsing
(Model outputs are JSON; see the sketch after this list)
2/ Probability Basics
(understand confidence)
3/ A/B test design for ML features
4/ Cost-benefit analysis
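For the first item, here's a minimal sketch of what "SQL with JSON parsing" looks like in practice, using SQLite's built-in JSON1 functions from Python so it runs anywhere (Postgres would use the -> / ->> operators instead). The table and payload shapes are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_calls (id INTEGER, output TEXT)")
# Hypothetical logged model outputs: JSON blobs with a confidence score.
conn.executemany(
    "INSERT INTO model_calls VALUES (?, ?)",
    [
        (1, '{"intent": "refund", "confidence": 0.91, "tokens": 212}'),
        (2, '{"intent": "refund", "confidence": 0.42, "tokens": 388}'),
        (3, '{"intent": "order_status", "confidence": 0.88, "tokens": 150}'),
    ],
)

# The kind of query an interviewer expects: unpack JSON, then aggregate.
rows = conn.execute("""
    SELECT json_extract(output, '$.intent')          AS intent,
           AVG(json_extract(output, '$.confidence')) AS avg_confidence,
           SUM(json_extract(output, '$.tokens'))     AS total_tokens
    FROM model_calls
    GROUP BY intent
    ORDER BY avg_confidence
""").fetchall()
print(rows)
```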
2 more interview questions
These were very interesting 👇
Attribution-related
(How will you know if it was AI or UX?)
Explaining ML
(Stakeholders don't understand ML; translation is critical)
2/ AI-first companies
What they need:
Evaluation frameworks
(Is our model good?)
Quality metrics
(define "good" for each use case)
Drift detection
(Catch degradation early; see the sketch after this list)
Competitive benchmarking
(How do we compare?)
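For the drift-detection item, a minimal sketch using SciPy's two-sample Kolmogorov-Smirnov test on the model's confidence scores, yesterday's window versus today's. The beta-distributed samples are simulated stand-ins for real logs.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Simulated confidence scores: yesterday's model vs today's (post-update).
baseline = rng.beta(8, 2, size=5000)  # stand-in for yesterday's logs
current = rng.beta(6, 3, size=5000)   # stand-in for today's logs

stat, p_value = ks_2samp(baseline, current)
print(f"KS statistic={stat:.3f}, p={p_value:.2e}")
if p_value < 0.01:
    print("Distribution shift detected - investigate before customers notice.")
```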
Interview focus for AI-first companies.
Python (live analysis)
Statistical methods for evaluation
Understanding of ML pipelines
They also probed people in rounds 2/3 on:
Ground truth is expensive
(human evaluation costs)
Metrics conflict
(accuracy vs latency vs cost)
Quality is subjective
(What's a good conversation?)
From the job descriptions I analysed:
OpenAI wants analysts who can "define north star metrics" and "design A/B tests" for products reaching millions. They care about "statistical rigour" and "communicating with executives."
Anthropic emphasises "empirical approaches" and "quantifying uncertainty." They want people who can "measure what doesn't exist yet."
Sarvam AI needs analysts who understand "multilingual evaluation" and can work with "sparse feedback loops."
The analyst role is being split into three
Type 1: AI system evaluators
They sit with ML engineers. Design evaluation frameworks. Measure model behaviour. Quantify uncertainty. Create trust metrics.
These analysts will thrive.
AI needs evaluation more than ever.
Type 2: Decision scientists++
They partner with product teams. Connect model metrics to business outcomes. Design experiments for probabilistic systems. Basically, analysts who understand AI deeply.
These analysts will evolve.
Traditional + AI skills = Lethal combination.
Type 3: Report generators
They pull numbers. Update dashboards. Create weekly reports. Answer ad hoc requests. Basically, human SQL interfaces.
These analysts will be automated.
AI will get better at writing SQL than most analysts.
Harsh? Look at the job postings.
"Must understand transformer architectures" for analyst roles. "Experience with LLM evaluation" required. "Statistical methods for probabilistic systems" is mandatory.
The writing is on the wall.
Or should I say, the tokens are in the context window? (Sorry for the dad joke)
Your move
In an internet-first company?
Stop using AI just to work faster. Find one growth-equation lever. Show how AI can improve it by 30%+. Connect it to revenue.
Want a role in an AI-first org?
Pick a measurement problem that doesn't have a solution yet. Build one. Even if crude. Show you can think in probabilities.
If you want to build bridges,
Find a company using AI badly. Show them what they're measuring wrong. Build the right metrics. Become indispensable.
Pick one,
Follow my steps
and build something that solves it.
Ship it this weekend.
It does not need to be perfect.
Trust me. Even a failed attempt at this puts you ahead of the 99% who never tried.
Because in 12 months,
"Proficient in SQL" will be table stakes. Everyone will have AI assistants for that.
But "built an evaluation framework for code-mixed language models"? "Reduced inference costs by 40% through smart sampling"? "Created an early warning system for model drift"?
That changes orbits.
When AI can analyse data in seconds, clean datasets in minutes, and generate reports instantly, what's left for analysts?
The same thing that was always most valuable.
Knowing what questions to ask about systems no one fully understands yet.
Note: If you love reading these articles in small case, like I do, click here.
Inside the GrowthX community.
If you have been following the newsletter, you might have come across the GrowthX community. It’s a private club of 4,200+ members from top companies — Apple, Jio, Zepto, Netflix, Microsoft, Zerodha & 2,000+ more. CXOs, VPs, ICs — all united by the drive to constantly surround themselves with the right people.
Every member gets 6 unfair advantages —
All as part of our new pricing plan.
4,200+ member online private community
In-person monthly events across India
$100K worth of credits from top brands
All self-paced learning programs
Live weekly learning with CXOs
Career Hub for interviews
12 months of access
GrowthX exists to give people who take their craft seriously the ecosystem they deserve. One that teaches, connects, and accelerates. One that opens doors.