
Hi! I'm Marcos Treviso,
a Ph.D. student in the DeepSPIN project,
and a researcher at Instituto de Telecomunicações.

Take a look at my ongoing projects and at some of my publications.
You can read more about me here.
Contact me by email.

About Me

I am a first-year Ph.D. student within DeepSPIN, an ERC-funded research project coordinated by Prof. André Martins at IST / University of Lisbon. I am also a Machine Learning researcher at Instituto de Telecomunicações, with a focus on Natural Language Processing.

Recently, I worked for three months as a Research AI Intern at Unbabel. During this period I contributed to OpenKiwi, an open-source project for machine translation quality estimation, by implementing state-of-the-art deep learning methods in PyTorch.

I obtained my M.Sc. in Computer Science and Computational Mathematics at the University of São Paulo (USP), under the supervision of Prof. Sandra M. Aluísio. As part of my research, I developed a free and open-source tool called DeepBonDD. In addition, I received the award for best Master's dissertation from 2015 to 2018, granted by the Brazilian Society for Computing at PROPOR 2018.

I earned my bachelor's degree in Computer Science at the Federal University of Pampa. I started learning about ML and NLP for my final paper, under the supervision of Fabio N. Kepler. As a result, I developed a part-of-speech tagger based on deep neural networks.


The project I'm currently working on is my Ph.D. That's all I can say at the moment :-P. In addition, I'm still working with Sandra to improve our sentence segmentation tool. For more information, contact me by email. Take a look at my finished projects below:

DeepBonDD: MSc project

We proposed a recurrent convolutional neural network for sentence segmentation and disfluency detection, and evaluated it on narrative transcripts from neuropsychological tests. The source code and neural models are available on GitHub.
Funded by the Brazilian Ministry of Education (CNPq).
Researchers: Marcos Treviso and Sandra Aluísio.
More info

DeepTagger: BSc final paper

We built two deep neural networks for part-of-speech tagging and evaluated them on Brazilian Portuguese corpora. The tool is available on GitHub along with saved models.
Researchers: Marcos Treviso and Fabio Kepler.
More info

Undergraduate Research

We developed Python modules for ScriptLattes to perform data mining on researchers' curricula in the Lattes Platform.
Funded by the Rio Grande do Sul Research Foundation.
Researchers: Marcos Treviso and Fabio Kepler.
More info


Master's dissertation (in Brazilian Portuguese):

  • Segmentação de sentenças e detecção de disfluências em narrativas transcritas de testes neuropsicológicos. Universidade de São Paulo. 2017. [pdf] [bib]


  1. * Sentence segmentation and disfluency detection in narrative transcripts from neuropsychological tests. PROPOR - International Conference on the Computational Processing of Portuguese 2018. * 1st place Best Dissertation on Language Technology for Portuguese (from 2015 to 2018) [pdf] [bib]
  2. Sentence Segmentation in Narrative Transcripts from Neuropsychological Tests using Recurrent Convolutional Neural Networks. EACL - European Chapter of the Association for Computational Linguistics. 2017. [pdf] [bib]
  3. * Evaluating Word Embeddings for Sentence Boundary Detection in Speech Transcripts. STIL - Symposium in Information and Human Language Technology. 2017. * 2nd best paper [pdf] [bib]
  4. Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks. STIL - Symposium in Information and Human Language Technology. 2017. (4th author) [pdf] [bib]
  5. PELESent: Cross-domain polarity classification using distant supervision. BRACIS - Brazilian Conference on Intelligent Systems. 2017. (5th author) [pdf] [bib]
  6. Demo: DeepBonDD: a Deep neural approach to Boundary and Disfluency Detection. PROPOR - International Conference on the Computational Processing of Portuguese 2018.


  1. Detecting mild cognitive impairment in narratives in Brazilian Portuguese: first steps towards a fully automated system. Revista Letras de Hoje. 2018. [pdf] [bib]