Using large amounts of data in a GDPR-compliant manner

Automatic pseudonymization of court documents

Court decisions are of fundamental importance to all judges, prosecutors, lawyers, and their clients, because published rulings provide guidance for preparing future proceedings. Before the decisions are published in the legal information system of the Austrian federal government, personal data must be completely removed and replaced by pseudonyms. Until now, this process has been carried out manually, a considerable workload that weighs all the more heavily in times of staff shortages.

For our solution, the Austrian Federal Ministry of Justice (BMJ) and the Federal Computing Center (BRZ) received the eAward 2022 from Report Verlag in the category "Machine Learning and Artificial Intelligence." The jury spoke of the "perfect use of AI in repetitive applications" and "efficient personnel deployment in the justice system." We are particularly pleased with the "export quality to many other areas."

Legal and digitalization

The challenge

Pseudonyms instead of redaction

To ensure that the texts remain comprehensible, the personal data should not be blacked out, but pseudonymized: Identical elements such as names, dates of birth, places of residence, professions, etc. are given identical pseudonyms.
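The rule that identical elements receive identical pseudonyms can be illustrated with a small sketch. This is not the BMJ/BRZ implementation; the class name `Pseudonymizer`, the function `pseudonymize`, and the placeholder format `[PERSON_1]` are assumptions made purely for illustration. The sketch also assumes that the entity spans have already been detected and do not overlap.

```python
from collections import defaultdict
from itertools import count


class Pseudonymizer:
    """Hypothetical helper: one stable pseudonym per (entity type, text) pair."""

    def __init__(self):
        self._mapping = {}
        # One independent counter per entity type (PERSON, LOCATION, ...).
        self._counters = defaultdict(lambda: count(1))

    def pseudonym_for(self, entity_type, text):
        key = (entity_type, text)
        if key not in self._mapping:
            number = next(self._counters[entity_type])
            self._mapping[key] = f"[{entity_type}_{number}]"
        return self._mapping[key]


def pseudonymize(text, entities, pseudonymizer):
    """Replace each (start, end, entity_type) span with its pseudonym.

    Assumes the spans are non-overlapping character offsets into `text`.
    """
    parts = []
    cursor = 0
    for start, end, entity_type in sorted(entities):
        parts.append(text[cursor:start])
        parts.append(pseudonymizer.pseudonym_for(entity_type, text[start:end]))
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)
```

Because the mapping is kept in the `Pseudonymizer` object, a name that appears several times in a decision is replaced by the same placeholder each time, so the text stays readable.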

Pseudonymization covers all data that would allow conclusions to be drawn about a party to the proceedings, as well as information that would allow a person or thing to be identified. Some personal information, such as the names of judges or the addresses of courts, is excluded from pseudonymization and is therefore preserved in the text.

The solution

Staged AI processes lead to success using NER

By combining different algorithms with artificial intelligence, specifically Named Entity Recognition (NER) from the field of natural language processing (NLP), we automate the pseudonymization of court decisions as far as possible.
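How several detection stages might be combined can be sketched as follows. The stages shown here are assumptions for illustration only: a regular expression for dates in the form 01.02.1980 stands in for a rule-based stage, and a simple lookup of known strings stands in for a trained NER model. The names `Span`, `rule_stage`, `make_lookup_stage`, and `run_pipeline` are hypothetical and not taken from the actual solution.

```python
import re
from typing import Callable, Dict, List, NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str


Stage = Callable[[str], List[Span]]

# Assumed date format for the rule-based stage: day.month.year
DATE_PATTERN = re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")


def rule_stage(text: str) -> List[Span]:
    """Rule-based stage: find dates with a regular expression."""
    return [Span(m.start(), m.end(), "DATE") for m in DATE_PATTERN.finditer(text)]


def make_lookup_stage(names: Dict[str, str]) -> Stage:
    """Stand-in for a trained NER model: tags known strings with a label."""

    def stage(text: str) -> List[Span]:
        spans = []
        for name, label in names.items():
            for m in re.finditer(re.escape(name), text):
                spans.append(Span(m.start(), m.end(), label))
        return spans

    return stage


def run_pipeline(text: str, stages: List[Stage]) -> List[Span]:
    """Run the stages in order; earlier stages win when spans overlap."""
    accepted: List[Span] = []
    for stage in stages:
        for span in stage(text):
            overlaps = any(
                span.start < a.end and a.start < span.end for a in accepted
            )
            if not overlaps:
                accepted.append(span)
    return sorted(accepted)
```

In this sketch, the order of the stages decides which result is kept when two stages mark overlapping text. A real system would replace the lookup stage with a trained model and would likely use a more refined conflict rule.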

In this way, we help to ensure that the data protection and confidentiality rights of all parties to a proceeding are preserved, with little manual effort.

Our solution is future-oriented and scalable, as it can be used to pseudonymize not only historical documents, but also all content that arises in the future. The quality of the results is comparable to manual pseudonymization.

Further developments

Storage of structured data

In the future, documents can be tailored for different interest groups or parties, depending on which details they are allowed to view. This is possible because our pseudonymization solution recognizes entities such as judges, parties to proceedings, and addresses, and can redact them as needed depending on the release level.
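A release-level view of this kind could look roughly like the following sketch. The release levels (`public`, `party`, `court`), the entity labels, and the function `render_for_level` are invented for illustration and do not describe the actual release rules.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical release levels and the entity labels hidden at each level.
HIDDEN_LABELS: Dict[str, Set[str]] = {
    "public": {"PARTY", "ADDRESS"},
    "party": {"ADDRESS"},
    "court": set(),
}


def render_for_level(
    text: str, entities: List[Tuple[int, int, str]], level: str
) -> str:
    """Return the text with every entity hidden that the level may not see.

    `entities` holds non-overlapping (start, end, label) character spans.
    """
    hidden = HIDDEN_LABELS[level]
    parts = []
    cursor = 0
    for start, end, label in sorted(entities):
        parts.append(text[cursor:start])
        # Hidden entities are replaced by their label, others stay readable.
        parts.append(f"[{label}]" if label in hidden else text[start:end])
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)
```

The same recognized entities are reused for every audience; only the table of hidden labels changes per release level.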

This extracted data can also be stored in a structured way to show, for example, relationship networks, roles in proceedings, and mentions of specific individuals in different documents.
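As a rough illustration of such structured storage, the sketch below keeps one record per mention and derives two simple views from it. The record fields (`document_id`, `pseudonym`, `role`) and the function names are assumptions for this example, not the actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Mention:
    """Hypothetical record: one recognized entity in one document."""

    document_id: str
    pseudonym: str
    role: str


def build_index(mentions: Iterable[Mention]) -> Dict[str, Dict[str, Set[str]]]:
    """Map pseudonym -> role -> set of document ids in which it occurs."""
    index: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for m in mentions:
        index[m.pseudonym][m.role].add(m.document_id)
    return index


def co_occurring(mentions: List[Mention], pseudonym: str) -> List[str]:
    """List all other pseudonyms that share at least one document."""
    docs = {m.document_id for m in mentions if m.pseudonym == pseudonym}
    return sorted(
        {
            m.pseudonym
            for m in mentions
            if m.document_id in docs and m.pseudonym != pseudonym
        }
    )
```

The index answers in which documents and roles an entity appears, while `co_occurring` gives a first, very simple view of a relationship network.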

Federal Ministry of Justice Austria

The Austrian Federal Ministry of Justice (BMJ) is responsible for civil law (general civil law, commercial law, copyright law, insurance contract law, antitrust law, bankruptcy and compensation law), judicial criminal law, the organization of the judiciary, prosecutorial authorities, the administration of justice, the penal system and data protection.

The judiciary requires a fully digital and structured handling of the various procedural steps. The digital judicial workstation (DJA) forms the bridge between the analog and digital worlds and guides the user through all steps of the digitized court process.

Digitalization facilitates the work of judges, prosecutors and legal assistants: in hearing rooms, they can examine evidence themselves in the form of digital documents and present it to witnesses on the stand.

Would you like to use your inventory data in a GDPR-compliant way?

Contact our AI experts now
