Rubèn Tito

Short bio

I have recently completed my Ph.D. on “Exploring the role of Text in Visual Question Answering on Natural Scenes and Documents”, which highlights the importance of text in natural scenes and uses VQA as a natural language interface to guide information extraction from document images. During this journey, I analyzed the needs and shortcomings of existing research and industry, pursuing the most interesting topics, such as integrating and exploiting new modalities in VQA and processing long sequences of tokens.

Moreover, I have always been a versatile person, ready to explore new challenges and domains, either because they were interesting or to help colleagues with their research. Thanks to this, although I started my research working on scene-text image retrieval and handwritten documents, I soon moved to the challenging, multimodal VQA task, which I went on to investigate throughout my Ph.D. During that time, I also worked on model calibration, federated learning, and privacy preservation with differential privacy.

Beyond my scientific profile, I love going to the mountains to ease my mind and admire the beauty of nature. I also enjoy participating in Artificial Intelligence panels open to the public in Barcelona, because one of my aims is to bring the progress and concerns of research centers to the public.

For all the details about my career, you can download my Curriculum Vitae.

📚 Publications

Here I list the most relevant publications. You can check out my Scholar profile for more information.

12 publications

631 citations

h-index: 9


Multi-page DocVQA

Authors: Rubèn Tito, D. Karatzas and E. Valveny

Journal: Pattern Recognition 2023

Privacy-Aware Document Visual Question Answering

Authors: Rubèn Tito, K. Nguyen, M. Tobaben, R. Kerkouche, ... E. Valveny, A. Honkela, M. Fritz and D. Karatzas

Conference: Soon!

Document Understanding Dataset and Evaluation (DUDE 😎)

Authors: J. Van Landeghem, Rubèn Tito, Ł. Borchmann, ... and T. Stanislawek

Conference: ICCV 2023

OCR-IDL

Authors: *A. Biten, *Rubèn Tito, L. Gomez, E. Valveny and D. Karatzas

Conference: ECCV 2022

InfographicsVQA

Authors: M. Mathew, V. Bagal, Rubèn Tito, D. Karatzas, E. Valveny and C.V. Jawahar

Conference: WACV 2022

Document Collection Visual Question Answering

Authors: Rubèn Tito, D. Karatzas and E. Valveny

Conference: ICDAR 2021

Single-Page Document Visual Question Answering

Authors: *M. Mathew, *Rubèn Tito, D. Karatzas and C.V. Jawahar

Conference: CVPR 2020

Scene Text Visual Question Answering

Authors: *A. Biten, *Rubèn Tito, *A. Mafla, L. Gomez, M. Rusiñol, E. Valveny and D. Karatzas

Conference: ICCV 2019

🎉 Beyond ML


TEACHING -- Barcelona Activa

🎵 Music

I founded Triple Trouble back in 2015 with my friends, and together we made two albums and had a lot of unforgettable experiences. Check out our music on Spotify.
If you want to listen to or discover more, check out my Linktree instead.

🙏🏽 Volunteering

I am one of the founders of Solo Tu Puoi Farlo ("only you can do it"): for years, our group has organized fundraisers, awareness campaigns, and food collections. You can find more information and our contact details on our page.

🌍 Traveling

Every time I have the chance, I try to explore new parts of the world: seeing it all is a hard journey to complete. In the meantime, I like to keep track of the places I have visited. Check out my map 🗺 here.

Challenges - Open panels (Makers Fair)

👨🏻‍💻 Develop

I had the pleasure of developing the website of a professional 3D graphic designer, both to showcase his work and to shorten the path to his clients. You can find the website 🔗 here.

📬 Keep in touch!