

{
  "name": "Arun Gautham Soundarrajan",
  "occupation": "Data Scientist",
  "likes": ["badminton", "ben & jerrys"]
}
About Me
Hi, I’m Arun, a passionate Machine Learning Scientist and Data Scientist with over three years of experience turning data into actionable insights and innovative solutions. I specialize in building predictive models, designing scalable systems, and solving complex problems with data-driven approaches.
My journey began with a fascination for how technology and algorithms can shape the world, and since then, I’ve contributed to projects that optimize processes, enhance user experiences, and drive business value. Beyond work, I’m always exploring new tools and frameworks—currently diving into Rust and ESP32 to expand my expertise in embedded systems and IoT.
When I’m not coding, you’ll likely find me on the badminton court, perfecting my game, or brainstorming my next personal project. I thrive on challenges and am constantly seeking opportunities to grow, lead, and innovate in the tech space.
My Experience
Senior Data Scientist
Sep 2023 - Present
Data Scientist
Jan 2022 - Sep 2023
Generative AI Innovation: Developed Aviva’s first Generative AI-powered application, enabling claim handlers to quickly access critical information and reduce call hold times by up to 40%, benefiting over 14,000 policyholders.
Owned the full-stack development, including building back-end APIs, designing the front-end UI, prompt engineering, and overseeing deployment.
Knowledge Base: Leading the development of an internal knowledge base using vector databases and Large Language Models (LLMs) to structure and query data from over 7,000 web pages and PDFs, improving efficiency in answering customer queries and reducing SME involvement.
Hackathon: Spearheaded infrastructure setup, including custom APIs and a vector database, and guided participants in developing Generative AI use cases at Aviva's internal hackathon. Delivered multiple prototypes in 10 days, securing business funding to expand AI initiatives.
Infrastructure & Efficiency: Built scalable infrastructure and reusable code packages, which streamlined development processes across the team, reducing time-to-delivery for new use cases.
Enhancing tNPS: Partnered with stakeholders to identify key drivers of transactional Net Promoter Score (tNPS), developing data models that pinpointed areas for improvement. This provided valuable insights that helped guide strategic decisions for enhancing customer satisfaction.
Fraud Detection with NLP: Designed and implemented a data annotation pipeline to streamline entity extraction from medical reports, leveraging outputs from other processes. Used the annotated data to train a Longformer Transformer model, enhancing feature engineering and analysis for fraud detection in claims. The model’s success led to further investments in expanding entity extraction efforts.
Real-Time Claims Classification During Storms: Engineered a real-time email classification system to handle the surge in claims during Storm Eunice. Delivered a solution in 48 hours that supported claim handlers in efficiently managing the increased workload.
Automating Complaint Classification: Developed and deployed advanced complaints classification models for customer and claims operations, automating the manual theming process that would have required 2 full-time employees. The solution, which performs in minutes, is now available for self-service, freeing up valuable resources to focus on resolving customer issues.
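For a flavour of the knowledge-base work above: at its core, a vector store embeds documents and ranks them by similarity to a query. Below is a toy sketch in plain Python — the bigram-counting `embed()` is a hypothetical stand-in for a real embedding model (e.g. a sentence transformer), not the system built at Aviva.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Placeholder embedding: a sparse character-bigram count vector."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, top_k=2):
    """Rank documents by similarity to the query, as a vector DB would,
    before handing the top hits to an LLM for answer generation."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]

docs = [
    "How to make a home insurance claim",
    "Badminton club membership form",
    "Claims processing timelines and contacts",
]
print(retrieve("insurance claim help", docs, top_k=1))
```

In a production retrieval-augmented pipeline, the same shape holds: embeddings come from a learned model, the similarity search is delegated to a vector database, and the retrieved passages are stuffed into the LLM prompt.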
Suffering from skill issues
Python
SQL
Rust
Javascript
HTML/CSS
PyTorch
Scikit-learn
Huggingface
React / FastAPI / Django
Docker
Linux/Unix