# Risk Management Using RAG (Retrieval-Augmented Generation)

## Case Study: Adani Enterprises Limited (ADANIENT)

This document demonstrates how to build a contextual risk assessment system for Adani Enterprises Limited (ADANIENT) using Retrieval-Augmented Generation (RAG). While traditional risk models rely on quantitative metrics such as volatility and Value at Risk (VaR), critical risks often emerge from disclosures like credit rating changes, regulatory actions, contingent liabilities, and funding expansions. This project integrates structured financial disclosures and rating updates into a RAG pipeline to generate analytical, traceable, and context-driven risk assessments.

## Project Overview

| Component | Details |
| --- | --- |
| Target Company | Adani Enterprises Limited (ADANIENT) |
| Data Sources | CARE Ratings, ICRA Ratings, Annual Report FY2024-25 |
| Embedding Model | sentence-transformers/all-MiniLM-L6-v2 |
| Vector Store | FAISS (local) |
| LLM | LLaMA 3.1 8B via Groq API |
| Framework | LangChain |

## Objective

The objective of this project is to build a contextual risk assessment system for Adani Enterprises Limited by integrating structured financial disclosures and rating updates into a RAG pipeline that generates analytical, traceable, and context-driven risk assessments.

## Data Sources

**Credit Rating Documents (Last 12 Months):** Credit rating updates were sourced from stock exchange filings (NSE & BSE Corporate Announcements) under Regulation 30 disclosures. Only SEBI-registered agencies with active coverage were considered — CARE Ratings Limited and ICRA Limited — capturing rating upgrades, reaffirmations, facility enhancements, and fresh assignments.

**Annual Report (Latest Available):** The latest Integrated Annual Report (FY2024-25) was used, with targeted keyword searches across the Borrowings, Contingent Liabilities, Risk Management Framework, and Financial Risk Sensitivity sections.
## Manual Data Structuring

Relevant sections were manually extracted and reformatted into structured analytical text blocks — four independent files:

- `borrowings.txt` — leverage and debt structure
- `contingent_liabilities_and_commitments.txt` — off-balance-sheet risks
- `risk_management_framework.txt` — interest rate & FX sensitivity
- `credit_ratings.txt` — rating trajectory signals

**Why manual structuring?** Raw PDFs contain boilerplate accounting language, formatting noise, and redundant tables that reduce retrieval precision. Structured data → fewer chunks → sharper retrieval → stronger RAG.

### Auto Extraction — Using PyMuPDF

The snapshot below shows raw text extracted automatically from the Annual Report PDF using the `pymupdf` library in Python. Notice the dense, noisy output — boilerplate text, table artifacts, and fragmented financial figures all mixed together, making it unsuitable for clean semantic retrieval.

### Manual Extraction — Structured Risk Intelligence Format

The snapshot below shows the manually structured version of the same data, reformatted into clean analytical text blocks. Each section is clearly labelled (e.g., `[SECTION: LONG-TERM LEVERAGE RISK]`), figures are contextualized with year-on-year comparisons, and boilerplate is removed entirely. This structured format directly improves chunk quality and retrieval precision in the RAG pipeline.

## Step 1: Installing Dependencies

We install all required libraries for document loading, text splitting, embedding generation, vector storage, and LLM inference:

- LangChain (core + community)
- LangChain Text Splitters
- LangChain-HuggingFace + Sentence-Transformers
- FAISS (`faiss-cpu`)
- LangChain-Groq + Groq API

## Step 2: Load Structured Risk Intelligence Files

Instead of using raw PDFs, we prepared structured risk intelligence text files — one per risk category — to ensure clean, analytical content reaches the retrieval pipeline.
Each file is tagged with a `source_type` metadata label, enabling downstream filtering and traceability of retrieved chunks.

| File | Risk Category |
| --- | --- |
| current_noncurrent_borrowings.txt | Leverage Risk |
| contingent_liabilities_and_commitments.txt | Contingent Risk |
| risk_management_framework.txt | Financial Risk Framework |
| creditratings.txt | Credit Rating Risk |

## Step 3: Text Chunking

Large documents must be split into smaller overlapping chunks before embedding. The `RecursiveCharacterTextSplitter` splits text hierarchically — first by paragraphs, then sentences, then characters — to preserve semantic coherence.

Parameters chosen:

| Parameter | Value | Reason |
| --- | --- | --- |
| chunk_size | 900 | Balances context richness with embedding precision |
| chunk_overlap | 150 | Prevents loss of context at chunk boundaries |

Why chunking is required:

- LLMs have hard context window limits — documents must fit within them
- Smaller, focused chunks improve retrieval precision
- Overlapping chunks preserve contextual continuity across boundaries

## Step 4: Embedding Generation & Vector Storage

Each text chunk is converted into a high-dimensional numerical vector using the `all-MiniLM-L6-v2` embedding model. These vectors capture semantic meaning, enabling similarity-based retrieval instead of keyword matching.

Model: `sentence-transformers/all-MiniLM-L6-v2`

- Lightweight and fast — suited for local experimentation
- Strong performance on semantic similarity benchmarks
- 384-dimensional dense embeddings

Vector Store: FAISS (Facebook AI Similarity Search)

- Stores vectors locally on disk
- Enables fast approximate nearest-neighbour (ANN) retrieval
- No external database required

## Step 5: Load Large Language Model (Groq)

We use Groq's hosted LLaMA 3.1 8B Instant model for generation.

Why Groq?
- Extremely fast inference via dedicated LPU (Language Processing Unit) hardware
- Cost-effective for high-volume experimental runs
- Reliable reasoning performance suitable for structured financial analysis

Model Configuration:

| Parameter | Value |
| --- | --- |
| Model | llama-3.1-8b-instant |
| Temperature | 0 (deterministic outputs for analytical tasks) |

## Step 6: Configure Retriever

The retriever performs semantic search over the FAISS vector database to find the most relevant chunks for each query.

Parameter: `k = 15`

| Consideration | Reasoning |
| --- | --- |
| Coverage | Retrieves enough chunks to cover all four risk categories simultaneously |
| Completeness | Ensures leverage, liquidity, contingent, and rating signals are all represented |
| Prompt Size | Balances context richness without exceeding LLM context limits |

## Step 7: Prompt Engineering Strategy

Prompt design is the most critical determinant of output quality in RAG systems. The system prompt instructs the LLM to:

- Use only retrieved context — no hallucination from parametric memory
- Quote numerical figures for verifiability and traceability
- Compare year-on-year trends for dynamic risk signals
- Avoid generic corporate governance language — focus on inference
- Classify overall risk as Low / Moderate / High with explicit reasoning

The chain is assembled as an LCEL (LangChain Expression Language) pipeline: retriever → format_docs → prompt → llm → StrOutputParser

Code snippet for prompt engineering:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("""You are a professional equity risk analyst.
Use ONLY the provided context.
If data is missing, explicitly state that it is not available.

Context:
{context}

Question:
{question}

Instructions:
1. Analyse leverage risk using total borrowings, net debt, and gearing ratio.
2. Analyse liquidity risk using maturity profile and short-term obligations.
3. Analyse interest rate and currency sensitivity risks.
4. Analyse contingent liabilities and regulatory exposure.
5. Analyse credit rating trajectory and funding expansion signals.
6. Quote numerical figures where available.
7. Identify trend changes (increase/decrease year-on-year).
8. Classify overall risk as Low, Moderate, or High.
9. Provide reasoning before classification.
10. Avoid generic governance commentary.

Be analytical and inference-driven.""")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```

## Step 8: Execute Risk Assessment Query

We now test the RAG pipeline with the primary integrated risk assessment query. This query is designed to stress-test the system's ability to synthesise information across all four risk dimensions simultaneously:

> "Assess the overall risk profile of Adani Enterprises combining leverage, liquidity, contingent exposure and credit rating signals."

Model validation is performed by checking whether outputs:

- Reference actual numerical figures from the source documents
- Identify year-on-year trend changes
- Provide structured analytical reasoning before classification

Query:

```python
response = rag_chain.invoke(
    "Assess the overall risk profile of Adani Enterprises combining leverage, "
    "liquidity, contingent exposure and credit rating signals."
)
print(response)
```

### Output

**Overall Risk Profile Assessment**

Based on the provided data, I assess the overall risk profile of Adani Enterprises as **High**.

**Leverage Risk Analysis**

- Total borrowings have increased from ₹1,00,557.23 crore (FY24) to ₹1,26,603.70 crore (FY25), indicating a significant rise in leverage.
- Net debt has increased from ₹43,055.40 crore (FY24) to ₹69,273.95 crore (FY25), further exacerbating leverage risk.
- The gearing ratio has increased from 49% (FY24) to 55% (FY25), indicating a higher level of debt relative to equity.
**Liquidity Risk Analysis**

- The maturity profile shows a significant increase in short-term borrowings, with ₹11,883.69 crore due within 1 year and ₹35,793.75 crore due between 1 and 5 years.
- The increase in short-term debt obligations from ₹6,405.73 crore (FY24) to ₹11,635.15 crore (FY25) indicates a higher liquidity risk.

**Interest Rate and Currency Sensitivity Risks**

- The increase in variable-rate borrowings and interest rate sensitivity indicates a higher risk exposure to interest rate fluctuations.
- Foreign currency borrowings declined year-on-year, reducing foreign exchange exposure relative to the previous year, but the overall increase in long-term debt elevates leverage risk.

**Contingent Liabilities and Regulatory Exposure**

- Quantified tax and regulatory contingencies exceed ₹5,000 crore in aggregate, indicating a significant contingent liability risk.
- Capital commitments amount to ₹31,737.09 crore (FY2024: ₹29,696.41 crore), indicating significant future funding requirements.
- Export obligations exceeding ₹7,700 crore remain pending across subsidiaries, which may trigger duty recovery exposure if not fulfilled.

**Credit Rating Trajectory and Funding Expansion Signals**

- The long-term rating upgrade reflects an improved credit profile and stronger perceived debt-servicing capability.
- The stable outlook across both agencies suggests no near-term downgrade risk.
- However, the total rated facility amounts increased materially during the year, indicating increased leverage and funding expansion.

**Reasoning**

The overall risk profile of Adani Enterprises is classified as High due to the significant increase in leverage, liquidity risk, contingent liabilities, and regulatory exposure. While the credit rating trajectory suggests an improved credit profile, the funding expansion signals indicate increased leverage and risk exposure. The company's ability to manage these risks will be crucial in maintaining its credit rating and overall financial stability.
**Trend Changes**

- Leverage risk has increased year-on-year, with total borrowings and net debt rising significantly.
- Liquidity risk has increased year-on-year, with short-term debt obligations rising from ₹6,405.73 crore (FY24) to ₹11,635.15 crore (FY25).
- Contingent liabilities and regulatory exposure have increased year-on-year, with quantified tax and regulatory contingencies exceeding ₹5,000 crore in aggregate.

**Classification**

Overall risk profile: **High**

**Recommendations**

- The company should focus on reducing leverage and improving liquidity by managing short-term debt obligations.
- The company should prioritize managing contingent liabilities and regulatory exposure by resolving tax disputes and regulatory proceedings.
- The company should maintain a stable credit rating by continuing to improve its credit profile and debt-servicing capability.

## Conclusion

This project demonstrates how Retrieval-Augmented Generation can be applied to contextual equity risk management. By combining structured financial disclosures, rating trajectory signals, and funding expansion data, the system produces explainable and traceable risk assessments that go well beyond quantitative metrics alone.

## Key Takeaways

| Insight | Details |
| --- | --- |
| RAG complements quant models | Structured disclosures surface risks invisible in price/vol data |
| Data quality is paramount | Manual structuring → cleaner chunks → sharper retrieval |
| Prompt design drives quality | Instruction-level control shapes analytical depth and format |
| k=15 retriever coverage | Cross-category coverage is essential for integrated risk queries |
| LLaMA 3.1 via Groq | Cost-effective, fast, and reliable for structured financial reasoning |

## Extensions

This architecture can be extended to:

- Multi-stock screening — parallel RAG pipelines across a portfolio
- Sector-level risk monitoring — aggregated disclosures by industry
- Early warning systems — automated ingestion of real-time exchange filings
- Comparative peer analysis — cross-entity risk benchmarking
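The multi-stock screening extension can be sketched as a per-ticker chain factory run in parallel. This is a hypothetical illustration: `build_chain_for` is a stand-in for the Steps 2 to 7 pipeline assembly (loading, chunking, embedding into a dedicated FAISS index, and LCEL chain construction), and the returned callable stands in for `rag_chain.invoke`.

```python
# Hypothetical sketch of multi-stock screening: one RAG chain per ticker,
# each backed by that ticker's own vector store. build_chain_for is a
# stand-in for the full pipeline assembly described in Steps 2-7.
from concurrent.futures import ThreadPoolExecutor

def build_chain_for(ticker: str):
    # Stand-in: the real version would load the ticker's structured files,
    # embed them into a per-ticker FAISS index, and assemble the LCEL chain.
    def chain(question: str) -> str:
        return f"[{ticker}] assessment for: {question}"
    return chain

def screen_portfolio(tickers: list[str], question: str) -> dict[str, str]:
    chains = {t: build_chain_for(t) for t in tickers}
    # LLM calls are I/O-bound, so a thread pool is enough to parallelise them
    with ThreadPoolExecutor() as pool:
        futures = {t: pool.submit(chain, question) for t, chain in chains.items()}
        return {t: f.result() for t, f in futures.items()}
```

Keeping one vector store per ticker (rather than one shared index) preserves the traceability property of the single-company pipeline: every retrieved chunk is unambiguously attributable to one company's disclosures.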