Category Archives: SemBib

Services for bibliographic analysis

Here I present the needs behind our approach to analyzing the production and publication of scientific documents (essentially articles) by Télécom ParisTech. This analysis is the goal of the SemBib project. The articles: Télécom ParisTech has a bibliographical … Continue reading

Posted in NLP, SemBib

Extract PDF text with Python

As part of our SemBib project to analyze the scientific production of Télécom ParisTech, I retrieve many PDF files. To analyze their content, I need the raw text. In addition, as indicated in the blog Services … Continue reading

Posted in NLP, SemBib, Tutorial

First contact with the tools of the Bibliographic Agency of Higher Education

As part of the SemBib project, I had to choose a unique identifier for each author. Following my usual strategy, I started by using identifiers defined in our own namespace, with our prefix. Thus, it was possible to produce results … Continue reading

Posted in Public data, Semantic tagging, SemBib

Unique Identifiers of Researchers versus Uniqueness of Identifiers of Researchers

As mentioned in the article “First contact with the tools of the ABES”, for the SemBib project, I started by using my own identifiers for the researchers. Then, I wanted to use identifiers coming from reference sources, starting with the … Continue reading

Posted in Semantic tagging, SemBib