Senior Data Science Specialist & AI Researcher
Arnol P S
About Me
I'm a Senior Data Science Specialist with 5+ years of experience designing and deploying production-grade AI/ML systems across Computer Vision, NLP, and Generative AI.
I build multi-agent architectures, hybrid RAG pipelines fusing vector and graph databases, and real-time conversational AI platforms. My work spans the full stack, from semantic chunking and embedding pipelines to LLM guardrails and production observability.
I've led patent-pending research on biometric cattle identification using DINOv2 feature extraction and wavelet-based image enhancement, achieving 92% accuracy.
I received a Shout-Out award for my contributions to AI Management System documentation and preparedness for the ISO 42001:2023 surveillance audit.
Experience
Professional journey in AI/ML and Data Science
Senior Engineer - Data Science
Reflections Info Systems Pvt. Ltd.
Leading R&D of AI-powered enterprise solutions spanning multi-agent architectures, hybrid RAG systems, and real-time conversational AI.
- Built multi-agent log analytics platform with 4-agent event-driven pipeline processing 10K+ daily logs
- Designed hybrid RAG system fusing Qdrant vector search, Neo4j graph traversal, and Reciprocal Rank Fusion ranking
- Implemented 5-layer security pipeline with LlamaGuard content safety, PII sanitization, and prompt injection prevention
- Architected real-time WebSocket streaming with buffer-then-sanitize pattern for secure LLM output delivery
- Built Vision-Language Model pipelines for multi-format document extraction with provider failover
- Built agentic outreach platform with MCP server architecture, multi-phase LLM workflows, and template-driven document generation
- Contributed to ISO 42001:2023 AI Management System surveillance audit - received Shout-Out award for documentation quality
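The buffer-then-sanitize pattern named in the list above can be illustrated with a minimal sketch: streamed tokens accumulate in a buffer and are released only at a safe boundary, after passing through a sanitizer, so a sensitive span split across tokens is never sent half-redacted. Everything here is an assumption made for illustration (the function names, the flush rule of sentence punctuation followed by whitespace, and the email-only redaction); it is not the production implementation.

```python
import re
from typing import Iterable, Iterator

# Hypothetical sanitizer: redacts email addresses only. A real pipeline
# would chain several checks (PII, content safety, injection markers).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Assumed flush rule: sentence punctuation followed by whitespace.
BOUNDARY = re.compile(r"[.!?]\s$")


def sanitize(text: str) -> str:
    """Replace every email address in the text with a placeholder."""
    return EMAIL.sub("[REDACTED]", text)


def buffer_then_sanitize(tokens: Iterable[str]) -> Iterator[str]:
    """Hold streamed tokens until a boundary, then emit sanitized text."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Flush only at a boundary so a span split across several
        # tokens is always sanitized as a whole.
        if BOUNDARY.search(buffer):
            yield sanitize(buffer)
            buffer = ""
    if buffer:  # flush whatever remains when the stream ends
        yield sanitize(buffer)
```

Each yielded string would then be written to the client connection. The trade-off is a small added delay per sentence in exchange for never emitting unsanitized partial output.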
Senior Data Scientist - Consultant
Digital University Kerala
Led research initiatives in computer vision and NLP, including a patent-pending biometric identification system.
- Led patent-pending cattle identification research using DINOv2 (92% accuracy)
- Developed novel wavelet-based image enhancement pipeline
- Built semantic document search engine with vector databases
- Implemented LLM-based document Q&A systems using AWS Bedrock
Senior Software Engineer - AI/ML
Techversant Infotech
Developed and deployed AI/ML solutions for enterprise applications.
- Built RAG applications with memory for contextual conversations
- Developed face recognition systems with SOTA deep learning models
- Designed AI-powered proctoring tools using YOLO
Senior Engineer - Data Science
Digital University Kerala
Led development of ML-based search infrastructure and data processing systems.
- Engineered ETL pipelines for document extraction and Elasticsearch indexing
- Led team of 3 in developing semantic search infrastructure
- Created backend for 'Fun With AI' at Global Science Fest Kerala
Data Analyst
Digital University Kerala
Database optimization and analytics pipeline development.
- Created automated data pipelines reducing processing time
- Developed interactive data visualizations for reporting
Research Fellow
ICFOSS
NLP research for Malayalam language processing.
- Developed Morphological Analyzer for Malayalam
- Built sentiment analysis systems for Indian languages
- Conducted research on YouTube comment data
Research
Patent-pending innovations in computer vision and biometrics
Biometric Cattle Identification System
Computer vision system for individual cattle identification using muzzle patterns as biometric markers, analogous to human fingerprint recognition.
Problem
RFID-based livestock identification is susceptible to tampering and tag loss, and it requires time-consuming manual verification.
Solution
Non-invasive, tamper-proof biometric identification using deep learning and novel image enhancement techniques.
Technical Pipeline
Key Innovation
Novel wavelet-based ridge enhancement algorithm adapting fingerprint recognition techniques (BayesShrink denoising, biorthogonal wavelet contrast enhancement, morphological skeletonization) for biological pattern extraction.
Research conducted at Digital University Kerala, with a literature review of 27 academic references.
Dataset creation, model training, and end-to-end system architecture designed and implemented independently.
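As a rough illustration of the BayesShrink denoising step named under Key Innovation, the sketch below computes the standard BayesShrink threshold for one subband of wavelet detail coefficients and applies soft thresholding. It assumes the wavelet decomposition has already been done elsewhere, and it covers neither the biorthogonal contrast enhancement nor the skeletonization stage. The function names are hypothetical and this is not the patented pipeline.

```python
import math
from statistics import median


def estimate_noise_sigma(finest_detail: list[float]) -> float:
    """Robust noise estimate: median absolute coefficient / 0.6745."""
    return median(abs(d) for d in finest_detail) / 0.6745


def bayes_shrink_threshold(detail: list[float], sigma_noise: float) -> float:
    """BayesShrink threshold (noise variance / signal std) for one subband."""
    # Variance of the observed noisy coefficients, assumed zero-mean.
    var_observed = sum(d * d for d in detail) / len(detail)
    # Estimated standard deviation of the underlying clean signal.
    sigma_signal = math.sqrt(max(var_observed - sigma_noise ** 2, 0.0))
    if sigma_signal == 0.0:
        # Subband is judged to be pure noise: threshold everything away.
        return max(abs(d) for d in detail)
    return sigma_noise ** 2 / sigma_signal


def soft_threshold(detail: list[float], t: float) -> list[float]:
    """Shrink each coefficient toward zero by t, keeping its sign."""
    return [math.copysign(max(abs(d) - t, 0.0), d) for d in detail]
```

In the usual formulation the noise level is estimated once from the finest diagonal subband and reused for every other subband, which is why the threshold function takes `sigma_noise` as an argument.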
Industry Projects
Production-grade AI/ML solutions built at scale
Hybrid RAG Sales Intelligence
Proposal search assistant fusing vector search, graph database traversal, and Reciprocal Rank Fusion with 5-layer security.
- 3-path query routing: metadata (Cypher) / content (semantic) / general
- RRF ranking engine fusing 4 weighted sources (0.4/0.3/0.2/0.1)
- Semantic chunking with cosine similarity thresholding at 0.5
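A weighted form of Reciprocal Rank Fusion, as described in the list above, can be sketched in a few lines. The weights 0.4/0.3/0.2/0.1 come from the project description; the four source names, the document IDs, and the smoothing constant `k = 60` (a common default for RRF) are assumptions for illustration only.

```python
from collections import defaultdict


def weighted_rrf(
    rankings: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse ranked lists: each hit adds weight / (k + rank) to its document."""
    scores: dict[str, float] = defaultdict(float)
    for source, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weights[source] / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical source names and result lists.
weights = {"vector": 0.4, "graph": 0.3, "keyword": 0.2, "metadata": 0.1}
rankings = {
    "vector": ["p1", "p2", "p3"],
    "graph": ["p2", "p1"],
    "keyword": ["p3", "p2"],
    "metadata": ["p2"],
}
fused = weighted_rrf(rankings, weights)
```

Because the score depends only on rank positions, the sources do not need comparable raw scores, which is what makes this scheme convenient for combining vector similarity with graph traversal results.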
Document Verification
AI-powered document verification using RAG and multi-agent orchestration for regulatory compliance.
- Multi-agent architecture for extraction & verification
- Hybrid RAG with confidence-based classification
- 39 articles, 149+ compliance criteria verified
Vision-Based Document Extraction
Multi-modal AI pipeline for structured data extraction from financial documents using Vision-Language Models.
- Adapter pattern supporting 13+ document formats
- VLM fallback strategy with quality thresholds
- Multi-provider routing with priority-based failover
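Priority-based failover with a quality threshold, as listed above, might look roughly like the sketch below. The `Provider` type, the `(fields, quality)` return shape, and the 0.8 cutoff are all hypothetical; the source does not state the actual thresholds or provider interfaces.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Provider:
    name: str
    priority: int  # lower value is tried first
    # Hypothetical interface: returns (extracted fields, quality score).
    extract: Callable[[bytes], tuple[dict, float]]


def extract_with_failover(
    document: bytes,
    providers: list[Provider],
    min_quality: float = 0.8,  # assumed cutoff, not from the source
) -> tuple[str, dict]:
    """Try providers in priority order until one meets the quality bar."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.priority):
        try:
            fields, quality = provider.extract(document)
        except Exception as exc:  # provider outage, timeout, bad response
            errors.append(f"{provider.name}: {exc}")
            continue
        if quality >= min_quality:
            return provider.name, fields
        errors.append(f"{provider.name}: quality {quality:.2f} below threshold")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Treating a low-quality result the same way as an outage keeps the routing logic in one loop, and the collected error list records why each earlier provider was skipped.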
AI-Powered Log Analytics Platform
Enterprise log analytics with multi-agent orchestration, semantic error grouping, and anomaly detection processing 10K+ daily logs.
- 4-agent event-driven pipeline (Triage → Mapper → Analysis → Notifier)
- Semantic error grouping with 0.85 similarity threshold and anomaly scoring
- 5-workflow decision tree routing with 200ms–2000ms processing budgets
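Semantic error grouping with a similarity threshold can be approximated by a greedy pass over the log embeddings, sketched below. The 0.85 threshold is taken from the list above; the greedy strategy, the use of the first member as each group's representative, and the toy two-dimensional vectors are assumptions, and the embedding model itself is out of scope.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_errors(
    embeddings: dict[str, list[float]],
    threshold: float = 0.85,
) -> list[list[str]]:
    """Assign each error to the most similar group, or start a new one."""
    groups: list[tuple[list[float], list[str]]] = []  # (representative, members)
    for error_id, vector in embeddings.items():
        best, best_sim = None, threshold
        for group in groups:
            sim = cosine(vector, group[0])
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append((vector, [error_id]))
        else:
            best[1].append(error_id)
    return [members for _, members in groups]


# Toy embeddings standing in for real log-message vectors.
example = {
    "timeout-a": [1.0, 0.0],
    "timeout-b": [0.95, 0.1],
    "null-ptr": [0.0, 1.0],
}
grouped = group_errors(example)
```

A single pass like this is order-dependent, so a production system might periodically re-cluster or update each representative as members are added.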
AI-Powered Outreach Automation
Agentic LLM platform orchestrating multi-phase target discovery, data enrichment, fitness scoring, and branded document generation via MCP server architecture.
- 5-phase agentic workflow: discovery, enrichment, scoring, document generation, outreach
- Custom MCP server with 11 tools for API abstraction and response trimming
- Template-driven document pipeline (PPTX, DOCX, XLSX) with zero LLM hallucination in output
Agentic AI for Roadside Assistance
Conversational AI platform with real-time voice synthesis, sentiment analysis, and intelligent technician routing.
- Multi-model LLM pipeline for real-time analysis
- WebRTC-based live transcription with sub-256ms latency
- Real-time sentiment tracking and escalation
AI-Powered Debt Collection
Voice AI platform with multi-model analysis pipeline for automated loan recovery conversations.
- Sub-256ms voice synthesis latency
- Automated promise extraction from calls
- Stage-specific conversation strategies
Production RAG System
Semantic search with two-stage retrieval, cross-encoder reranking, and content guardrails for specialized domains.
- 94.7% query classification accuracy
- 6 configurable chunking strategies
- Bilingual support with real-time SSE streaming
Skills & Technologies
Tools and technologies I work with daily
Machine Learning & AI
Large Language Models
Computer Vision
NLP & RAG
Databases & Search
AI Safety & Observability
Backend & APIs
Voice & Conversational AI
Cloud & Infrastructure
Languages
Education & Credentials
Academic background and professional certifications
Education
Master of Science in Computer Science
Data Analytics
Indian Institute of Information Technology and Management - Kerala (IIITM-K)
Cochin University of Science and Technology
Bachelor of Science
Computer Science, Statistics and Mathematics
Kristu Jayanti College, Bengaluru
Certifications
Google Data Analytics Professional Certificate
Building Real-Time Video AI Applications
NVIDIA Deep Learning Institute
Getting Started with Deep Learning
NVIDIA Deep Learning Institute
Languages
Get In Touch
Interested in collaborating on AI/ML research or projects? I'm always open to discussing new opportunities.