SECURE_DATA_DUMP // PORT_443
TARGET_SYSTEM: CONVERSATIONAL-AI
JUNE 2026 – PRESENT

INTELLIGENT CRIME DATABASE CONVERSATIONAL AI

AI-Powered Multilingual RAG Search & Predictive Analytics

01 // PROJECT_SUMMARY

Led AI development for the KSP (Karnataka State Police) Crime Database, building LLM-powered conversational search, NLP intent classification, Retrieval-Augmented Generation (RAG) pipelines, and predictive analytics to support multilingual crime intelligence and investigations.

Python, NLTK, Tableau, LLMs, RAG, FAISS, NLP

SYSTEM_METRICS

HOST_STATUS: STABLE
ROLE_TYPE: SEC_ARCHITECT
REPOSITORIES: GITHUB_SRC

02 // STRIDE_THREAT_MODELING_LOGS

| THREAT_CATEGORY | EXPLOIT_VECTOR | MITIGATION_STRATEGY |
| --- | --- | --- |
| Prompt Injection (AI Specific) | Malicious users submit adversarial prompts to force the LLM to output classified investigation case details. | Implemented LLM guardrails (input and output validation) and system-level prompt isolation. |
| Data Leakage via Embeddings | Access to the FAISS vector database allows attackers to reconstruct original case documents. | Strict role-based access control (RBAC) on the API endpoints that serve embedding queries. |
| Denial of Service (DoS) via Token Consumption | Attackers submit extremely long queries to deplete API tokens and throttle system resources. | Rate limiting per user session and token limits on prompt length at the API gateway. |
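
The DoS mitigation in the last row can be illustrated with a minimal sketch. It assumes a sliding-window limiter keyed by session ID and a whitespace-based token count; the names `SessionRateLimiter`, `admit`, and `count_tokens`, and all three limits, are hypothetical and not taken from the project. A real gateway would use the model's own tokenizer and shared state rather than in-process memory.

```python
import time
from collections import defaultdict, deque

MAX_PROMPT_TOKENS = 512   # assumed cap on prompt length
MAX_REQUESTS = 5          # assumed requests allowed per window
WINDOW_SECONDS = 60.0     # assumed sliding-window size


def count_tokens(prompt):
    """Rough token count via whitespace splitting (stand-in for a real tokenizer)."""
    return len(prompt.split())


class SessionRateLimiter:
    """Sliding-window request limiter keyed by session ID."""

    def __init__(self, max_requests=MAX_REQUESTS, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # session_id -> request timestamps

    def allow(self, session_id):
        now = self.clock()
        hits = self._hits[session_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def admit(limiter, session_id, prompt):
    """Return (accepted, reason) for a request before it reaches the LLM."""
    # Check length first so oversized prompts do not consume rate-limit quota.
    if count_tokens(prompt) > MAX_PROMPT_TOKENS:
        return False, "prompt_too_long"
    if not limiter.allow(session_id):
        return False, "rate_limited"
    return True, "ok"
```

Injecting the clock keeps the limiter testable without sleeping; in production the default `time.monotonic` applies.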

03 // ARCHITECTURAL_SANDBOX_SCHEMAS

FILE_DUMP // DIAGRAM_NODES.LOG
  • Data Pipeline: CSV/JSON crime logs are ingested, parsed, cleaned, and normalized with NLTK.
  • Embedding Vector Database: Documents are chunked, embedded with text-embedding models, and stored in a local FAISS index.
  • RAG Core: Retrieved context, matched to case IDs, is combined with the user query in an LLM prompt to produce natural-language responses.
  • Analytics Dashboard: Integrated Tableau dashboards displaying predictive hotspots and monthly crime trends.
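
The chunk, embed, retrieve, and prompt flow described above can be sketched in pure Python. This is not the project's code: the bag-of-words `embed` function stands in for a real text-embedding model, the brute-force `TinyIndex` stands in for FAISS, and the case IDs and record texts are invented for illustration. The LLM call itself is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real text-embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def chunk(text, size=12):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class TinyIndex:
    """In-memory index with brute-force search; stands in for FAISS."""

    def __init__(self):
        self.entries = []  # (case_id, chunk_text, vector)

    def add(self, case_id, text):
        for piece in chunk(text):
            self.entries.append((case_id, piece, embed(piece)))

    def search(self, query, k=2):
        q = embed(query)
        scored = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(case_id, piece) for case_id, piece, _ in scored[:k]]


def build_prompt(query, hits):
    """Assemble a grounded prompt from retrieved chunks tagged with case IDs."""
    context = "\n".join(f"[{case_id}] {piece}" for case_id, piece in hits)
    return (
        "Answer using only the case records below. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


# Hypothetical records, for illustration only.
index = TinyIndex()
index.add("CASE-001", "Motorcycle theft reported near the bus stand at night")
index.add("CASE-002", "Online banking fraud involving a phishing link")
hits = index.search("motorcycle theft at night", k=1)
prompt = build_prompt("motorcycle theft at night", hits)
```

Keeping the case ID attached to every chunk lets the final answer cite the record it came from, which is what makes retrieval by case ID useful to an investigator.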

04 // ARCHITECTURE_LESSONS

  • LLMs are prone to hallucinating penal code sections (e.g., IPC/BNS sections). Grounding the model strictly in verified datasets is mandatory.
  • Handling multilingual search queries requires semantic embeddings that map local-language text (e.g., Kannada descriptions) to centralized crime schemas.
  • Querying massive crime databases at low latency requires deliberate performance optimization.
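
One way to enforce the grounding lesson is to check every penal code section the model cites against a verified lookup before the answer is shown. The sketch below assumes citations of the form "IPC 302" or "BNS Section 103"; the `VERIFIED_SECTIONS` table, the `guard` function, and the withheld-response message are hypothetical. A real system would load the lookup from the vetted dataset and handle more citation formats.

```python
import re

# Hypothetical verified lookup; a real one would be loaded from the vetted dataset.
VERIFIED_SECTIONS = {
    ("IPC", "302"): "Punishment for murder",
    ("BNS", "103"): "Punishment for murder",
}

# Matches "IPC 302", "ipc section 302", "BNS 103"; other phrasings
# (e.g. "Section 302 of the IPC") are not covered by this sketch.
CITATION = re.compile(r"\b(IPC|BNS)\s+(?:Section\s+)?(\d+[A-Z]?)\b", re.IGNORECASE)


def unverified_citations(answer):
    """Return cited (code, section) pairs that are absent from the verified lookup."""
    found = [(code.upper(), section.upper())
             for code, section in CITATION.findall(answer)]
    return [pair for pair in found if pair not in VERIFIED_SECTIONS]


def guard(answer):
    """Pass the answer through only if every cited section is verified."""
    bad = unverified_citations(answer)
    if bad:
        listed = ", ".join(f"{code} {section}" for code, section in bad)
        return f"Response withheld: unverified section reference(s): {listed}"
    return answer
```

Withholding the whole response is a deliberately conservative choice; a softer variant could strip or flag only the unverified citation.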

05 // TARGET_OUTCOMES

  • Optimized crime case search times by replacing keyword-based filtering with semantic, conversational search.
  • Presented predictive insights on crime hotspots using Tableau visualizations.