🎓 Master’s student in Digital Engineering | Multimodal AI & Medical Imaging
🏢 Software Engineer @ Carl Zeiss Meditec AG | Formerly @ Tata Consultancy Services
🔬 Research interests: Multimodal learning, generative AI, speech & vision systems
I work on multimodal AI systems that combine vision and language, with a focus on building robust, interpretable, and deployable models.
My background spans medical image processing, generative models, and human–AI interaction, and I am currently transitioning toward speech recognition, speech synthesis, and multimodal communication systems for clinical and assistive applications.
- Multimodal machine learning (vision, text, speech, biosignals)
- Medical image processing & segmentation (AS-OCT, surgical imaging)
- Generative models (image generation, representation learning)
- Speech recognition & speech synthesis (text-to-speech)
- Explainable & interpretable AI for neural and clinical systems
- Human–AI interaction and evaluation
- Multimodal Image Generator — Master’s thesis on text–image semantic alignment
- Medical Image Segmentation Pipelines — De-warping & segmentation for clinical imaging
- Human–AI Image Generation Tool — Web-based system for co-creation and reflection
Languages: Python, C++, Java, MATLAB
Frameworks: PyTorch, MONAI, LangChain
Models: CNNs, U-Net, Transformers, GANs
Tools: Docker, GitHub, GitLab, CI/CD, HPC (SLURM)
- Image Generation with Argument–Aspect Fusion and Rank Prediction, CEUR Workshop Proceedings: https://ceur-ws.org/Vol-4038/paper_380.pdf
- Overview of Touché 2025: Argumentation Systems, CEUR Workshop Proceedings: https://ceur-ws.org/Vol-4038/paper_378.pdf
- Email: stsharatanand@gmail.com
- Website: sharat-anand.github.io
- LinkedIn: linkedin.com/sharat-anand
⭐ I use this GitHub profile to share research code, experiments, and reproducible systems.