
Organizations

@mbzuai-oryx

hanoonaR/README.md

Hi there 👋

I am a computer vision researcher focused on multimodal understanding and reasoning. I am currently pursuing my PhD at MBZUAI and have worked at Meta and Adobe as a Research Scientist Intern. My work centers on building more reliable, interpretable, and grounded multimodal models, and has led to publications at venues such as CVPR, ACL, and NeurIPS.

Pinned

  1. mbzuai-oryx/groundingLMM Public

    [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

    Python 951 54

  2. mbzuai-oryx/Video-CoM Public

    (🔥CVPR 2026) Video-CoM: Interactive Video Reasoning via Chain of Manipulations

    Python 20

  3. mbzuai-oryx/VideoMathQA Public

    VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos.

    23 1

  4. mbzuai-oryx/Video-ChatGPT Public

    [ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

    Python 1.5k 130

  5. mbzuai-oryx/LLaVA-pp Public

    🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

    Python 845 61

  6. object-centric-ovd Public

    [NeurIPS 2022] Official repository of the paper "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection".

    Jupyter Notebook 296 21