Danilo S. Carvalho, Ph.D. Assistant Professor - Computer Programming / AI / NLP

Vision & Research Focus

My research deals with understanding and bridging the gap between the realm of human thought, in particular human language, and the realm of computer machinery. Bridging this gap is a key component of the next generation of intelligent systems, which will be able to automatically understand and process the meaning of information at scale.
I have been involved with Artificial Intelligence (AI) and Natural Language Processing (NLP) work for almost a decade, during which the field has seen several major advances in technology and practical applications, from the generation and organization of massive textual corpora to neural machine translation. Now, the application of AI in many areas hinges on the ability to explain the answers it provides. The analysis of healthcare information and of the propagation of misinformation on online social media both serve as challenging but necessary testing grounds for the design of explainable AI, which is where my current efforts lie.


Danilo Carvalho is an Assistant Professor at the Computer Science Department (DCC) of the Federal University of Rio de Janeiro (UFRJ). He is currently a postdoctoral research fellow at the Center for Technological Development in Health of the Oswaldo Cruz Foundation (FIOCRUZ), working on AI applied to innovation in healthcare.

Current Research

◈ Applied research in Natural Language Processing and Knowledge Representation, focused on the analysis of patent, bibliographical, and biotechnology databases.

◈ Technological development project on data analysis automation and strategic monitoring of healthcare innovation.

◈ Online social media analysis and media literacy.

Areas of Interest

◈ General

  • Computational Linguistics / Natural Language Processing
  • Artificial Intelligence
  • Data Science
  • Software Engineering

◈ Specific (summary)

  • Open Information Extraction
  • Semantic Representation
  • Patent / Bibliographical Databases
  • Language Models
  • Explainable AI

Current Teaching

2020/1 - Computer Programming II (Java) (BSc) [Portuguese] -- DCC - UFRJ


TDV: Word vector representation based on Wiktionary meanings

  • Morpheme to phrase representation
  • NLP features: multi-language, sense polarity, sense disambiguation by part of speech (POS)

EasyESA: Easy Semantic Approximation with Explicit Semantic Analysis

  • Provides concept vectors and a semantic relatedness measure
  • Query explanations can give insights on relatedness results

Graphia: Extraction of Structured Discourse Graphs from text

  • Resolves named entities to DBpedia entities and performs co-reference resolution
  • Serialization of discourse graphs as RDF