CEA DAM structures its business language

By Hélène Jacquenet, document engineering consultant at ContentSide, and Isabelle Gaillard, project manager at the CEA DAM Central Archives Office

The archives office of CEA DAM (Military Applications Directorate) has launched a project to re-engineer its thesaurus. This project is being carried out using an efficient and personalised methodology that will facilitate exchanges between the teams and the dissemination of knowledge.

A look back at this co-construction process carried out jointly by Isabelle Gaillard, project manager at the CEA DAM Central Archives Office, and Hélène Jacquenet, document engineering consultant at ContentSide.

How can we understand each other when we don't speak the same language? The question does not concern translators alone. "We initiated a thesaurus re-engineering project a few years ago. The aim is for it to act as an interpreter between the organisation's different professions, particularly for researchers. It is an approach and a tool designed to facilitate and enrich exchanges," says Isabelle Gaillard.

"A mathematician and a physicist do not use the same language. The documentalist and their tool, the thesaurus, act as a bridge between them," explains Hélène Jacquenet.

The tool that existed before the project, a closed index of approximately 11,000 terms, was not entirely satisfactory. "Association and synonymy links facilitated searches, but the lack of a hierarchy quickly limited how far a search could rebound from one term to the next, and therefore its reach across the collection," says Isabelle Gaillard.
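The limitation described above can be illustrated with a minimal sketch. It assumes that a search "rebounds" by following links from one term to the next; the function name `expand` and every term in the two dictionaries are hypothetical examples, not entries from the CEA DAM thesaurus.

```python
from collections import deque


def expand(term, links):
    """Return every term reachable from `term` by following thesaurus links."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical flat index: only one association/synonymy link.
flat_links = {"laser": ["optical source"]}

# Hypothetical hierarchical version: the same link plus narrower terms.
hierarchical_links = {
    "laser": ["optical source", "gas laser", "solid-state laser"],
    "gas laser": ["CO2 laser"],
}

flat_reach = expand("laser", flat_links)
hierarchical_reach = expand("laser", hierarchical_links)
```

In this toy case the flat index stops after a single rebound, while the hierarchy lets the same query reach the narrower terms as well.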

At the start of the project, Isabelle Gaillard quickly realised the need to involve a viewpoint that was both specialised in documentation and thesauri, and external to the organisation: "The fact that Hélène came to frame this project with proven methods played an essential role in its success. You have to accept working blind and trust the method." This is a sensitive point: it guards against the temptation to describe knowledge according to the specifics of one profession and the vocabulary of the organisation.

A rigorous methodology...

While the first stage, cleaning up the data (standardising spellings, removing unnecessary words, etc.), posed no particular problems, the next stage required a dedicated methodology. "Existing standards and theories, such as ISO 25964, are a good basis, but above all you have to adapt to the context, especially when an index already exists," says Hélène Jacquenet.

With this in mind, the project provided an opportunity to develop a methodological guide "which was enriched as we went along. It is the project's backbone," says Isabelle Gaillard. The second stage consisted of defining 11 semantic categories, independent of any profession (properties, events, actions, materials, etc.), and assigning each term of the thesaurus to exactly one of them, "in a univocal manner", summarises Hélène Jacquenet.

The task relied first on sorting concrete entities from abstract ones, then on a flowchart of a dozen questions: "Can we locate it? Can we date it? Is it a unique product?" While many terms naturally found their place at the end of this process, others raised questions, "such as welding, for example, which refers to a material, a property and an action at the same time. In these rare cases, we looked at how the existing documents were already indexed. The collection decided," explains Hélène Jacquenet.
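As a rough illustration of this two-step sorting, the sketch below walks a very small decision chart and then settles an ambiguous term by looking at existing indexing. It is not the project's actual chart: the order of the questions, the mapping from answers to categories, the category names "places" and "products", and both function names are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str
    concrete: bool                # first sort: concrete or abstract entity
    locatable: bool = False       # "Can we locate it?"
    datable: bool = False         # "Can we date it?"
    unique_product: bool = False  # "Is it a unique product?"


def categorise(term):
    """Return exactly one category for a term (hypothetical mapping)."""
    if term.concrete:
        if term.locatable:
            return "places"       # hypothetical category name
        if term.unique_product:
            return "products"     # hypothetical category name
        return "materials"
    if term.datable:
        return "events"
    return "properties"


def resolve_with_collection(candidates, observed_uses):
    """Pick the candidate category most often seen in existing indexing."""
    counts = Counter(use for use in observed_uses if use in candidates)
    if not counts:
        raise ValueError("no existing indexing supports any candidate")
    return counts.most_common(1)[0][0]


# Hypothetical usage: a term that fits three categories at once.
welding_category = resolve_with_collection(
    ["materials", "properties", "actions"],
    ["actions", "actions", "materials"],
)
```

Because each function returns a single value, a term can never end up in two categories, which is one simple way to enforce the univocal assignment described above.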

... adapted to the company's context

This step removed ambiguity problems and also reduced the number of terms. It was followed by the subdivision of these categories into smaller groups, to make them easier to understand, and the assignment of hierarchical links. While some groups, such as geographical locations or scientific disciplines, needed little hierarchy, others required more in-depth analysis. For this, Isabelle Gaillard and her colleagues could call on the CEA researchers who participated in the project. "The decision depends on each context, each organisation," insists Hélène Jacquenet.

For example, the term algorithm refers to a particular definition. In our case, we have defined it as 'Functionally autonomous' and 'Result of an assembly or model'.

Recently finalised, the tool is currently being used by indexers. "It is intended to evolve, and a working group will issue and validate proposals for additions, modifications or deletions of terms," says Isabelle Gaillard. It is currently being integrated into the search interfaces for all users.

The next step is to support users in the use of this new tool. "This is not the end of this human adventure but the beginning of a new one, in particular because the thesaurus will be a component of an electronic archiving system project," concludes Isabelle Gaillard.

About us

About Isabelle Gaillard

Isabelle Gaillard is a trained documentalist and has been a consultant in document engineering for 10 years. A specialist in documentary languages, she participated in the development of a natural language knowledge base for France Telecom. After having worked in the municipal archives of Bordeaux, she joined the team of the Central Archives Office of CEA DAM.

About Hélène Jacquenet

Hélène Jacquenet is a doctor and qualified lecturer in Information and Communication Sciences, and a graduate of the CNAM INTD, with the title of Project Manager in Documentary Engineering. With 15 years of experience in consulting, training and project management in information management, Hélène also led the Lyon regional delegation of the ADBS for several years. A specialist in digital information-communication devices and social interactions, she is responsible for the consulting activity and COO at ContentSide.

