Priyaranjan (“Priyan”) Pattnayak is a Senior Principal Data Scientist at Oracle Cloud, working on Generative AI, NLP, and AI robustness, and leading initiatives for enterprise-scale multilingual and multimodal systems.

My work focuses on building production-grade AI frameworks that power real-world business experiences, combining deep learning, tokenization research, document understanding, and conversational agents.

You can find my publications, blogs, and recorded talks here.

Selected Publications

  • IndicJR: A Benchmark For Jailbreak Robustness of Multilingual LLMs in South Asia
    Accepted at EACL 2026 [PDF]

  • A Diagnostic Framework for Auditing Reference-Free Vision-Language Metrics
    Accepted at AACL 2025 [PDF]

  • LLM-Guided Lifecycle-Aware Clustering of Multi-Turn Customer Support Conversations
    Accepted at AACL 2025 [PDF]

  • MVTamperBench: Evaluating Robustness of Vision-Language Models
    Accepted at ACL 2025 [PDF]

  • Hard Negative Mining for Domain-Specific Retrieval in Enterprise Systems
    Accepted at ACL 2025 [PDF]

  • Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
    Accepted at ACL 2025 [PDF]

  • SweEval: Do LLMs Really Swear? A Safety Benchmark for Testing Limits for Enterprise Use
    Proceedings of NAACL 2025 [PDF]

  • Hybrid AI for Responsive Multi-Turn Online Conversations with Novel Dynamic Routing and Feedback Adaptation
    Proceedings of NAACL 2025, KnowledgeNLP [PDF]

  • Clinical QA 2.0: Multi-Task Learning for Answer Extraction and Categorization
    Accepted at IEEE EIT 2025 [PDF]

  • Tokenization Matters: Improving Zero-Shot NER for Indic Languages
    Accepted at IEEE EIT 2025 [PDF]

  • LLM for Barcodes: Generating Diverse Synthetic Data for Identity Documents
    Proceedings of WCSC 2025 [PDF]

  • Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts
    IJERT 2024 [PDF]

  • Review of Reference Generation Methods in Large Language Models
    IJAI&ML 2025 [PDF]

  • Survey of Multimodal Model Datasets, Application Categories and Taxonomy
    Proceedings of CVR 2025 [PDF]

  • Joint Answer Extraction and Medical Categorization
    arXiv 2023 [PDF]

  • A Novel Method of Predicting Rainfall in Pacific NorthWest
    ResearchGate 2017 [PDF]

  • Monthly Auto Sales in US - Time Series Analysis using SARIMA
    ResearchGate 2017 [PDF]


Blogs & Talks


Service & Leadership

  • Reviewer: NAACL, ACL, EACL, AACL, IEEE Access, ICML, ICLR, WWW, and other venues
  • Mentor: Early-career data scientists at Oracle and open-source contributors
  • AAAI Undergraduate Consortium: Mentoring undergraduate students from Cornell University for NSF Research Grants
  • Contributor: Indic NLP expertise, tokenization, and robustness benchmarks (such as SweEval and MVTamperBench)

Experience

Oracle Cloud (2018–Present)

Senior Principal Data Scientist

  • Leading development of Generative AI and NLP capabilities across enterprise applications
  • Architecting multi-turn conversation systems, tokenization pipelines, and hybrid response engines
  • Spearheading research collaborations and patentable innovations in AI observability and dynamic routing
  • Focusing on Generative AI robustness and AI scheduling platforms for cloud support

T-Mobile/Procogia (2018)

Data Science & Analytics

  • Developed predictive analytics models for customer churn and segmentation
  • Built scalable reporting and ML infrastructure for internal business teams

TCS (2013–2017)

Data Science Consultant

  • Led cross-functional projects in financial analytics
  • Delivered machine learning workflows for forecasting and optimization
  • Received multiple On-the-Spot awards and a Best Developer award
  • Authored a Reference Architecture document for NoSQL databases for CoE Bangalore

Education

University of Washington, Seattle, US

Master of Science in Information Systems & Data Science 2018

  • Graduated with Honors (Top 5 in Class)

Odisha University of Technology & Research (OUTR, Formerly CET), Bhubaneswar, India

Electrical Engineering 2013
