The Experience of Creating the Database “Soviet Germans – Labor Army Members of Tagillag”

One of the rapidly developing areas of social history is prosopographical research, which involves studying specific social groups based on the individual biographical data of their members.

In 2000, under the leadership of Doctor of Historical Sciences V. M. Kirillov, the Problem Scientific Research Laboratory “Historical Informatics” at the Nizhny Tagil State Pedagogical Institute (NTSPI) created an electronic database, “Soviet Germans – Labor Army Members of Tagillag,” based on the personnel record cards of labor army members.

The cards, filled out during the existence of labor army formations composed of Soviet citizens of German nationality (1942–1946), contain data divided into two sections. The first section includes personal biographical and sociodemographic characteristics of the labor army member (full name, age, place of birth and residence before mobilization, etc.). The second section contains information about their mobilization and time in the labor column at an NKVD (People's Commissariat for Internal Affairs) facility (Tagillag or Bogoslovlag).

It is important to note that personnel record cards for labor army members began to be created in accordance with the temporary GULAG (Main Administration of Camps) instructions on prisoner record card forms only from May 19, 1942, that is, three months after the first train carrying mobilized Germans arrived at Tagillag. As a result, some personnel record cards of labor army members who had already left the camp by that time (due to escape or death) contain only their surnames, first names, and patronymics, as well as the year of birth or age. The other fields of the cards were evidently filled out based on statements made by the mobilized individuals and were not corroborated by appropriate documents. This conclusion is supported by the numerous discrepancies we discovered in geographical names, surnames, and even the names of military commissariats that conducted the mobilizations.

In the card indexes of corrective labor camps, the number of personnel record cards exceeds the total number of labor army members who passed through the labor columns of a given camp. When a labor army member returned to a labor column after leaving it earlier due to sentencing, a new personnel record card was often created. This new card frequently had a different personal file number and, in some cases, even a different surname from what was recorded on the original card.

The features of the source (consistency in document composition, continuity in content and form, and a high degree of structuring) allowed all the information from the personnel record cards to be incorporated into a standardized relational database (DB). Of the many database management systems (DBMS) available, we selected Access 97 (newer versions are now in use), a high-performance 32-bit system for managing relational databases. Access 97 is designed for developing both local databases and distributed databases (client-server architecture) and ran under Windows 95–2000 and Windows NT (more recent operating systems are now used).

The database is structured as follows. It consists of two tables and 37 fields, which hold key personal characteristics (full name, age, gender); information about place of birth, occupation, education level, social background, and social status; and details of the labor army member’s movements and types of work performed. To refine and automatically correct the input data, the DB must be linked to external reference databases (directories of names and of geographical locations have been created).

The principles of forming the described DB ensure the precise reproduction of the source content. Encoding qualitative characteristics allows for the aggregation and adjustment of input information, as well as data search and analysis.

Stages of Working with the Database

Input of information into the database. Information is entered into the database using a specialized graphical form. During this process, the spelling of names is corrected using an external database, "Names," and geographic names are verified against the external database "Region."
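
The lookup against a reference directory can be illustrated with a minimal sketch. The text does not describe how the "Names" database performs the correction, so the fuzzy matching below, the function name normalize_name, the similarity cutoff, and the sample directory entries are all assumptions made for illustration only.

```python
import difflib

# Hypothetical excerpt of a reference directory of canonical spellings
# (a stand-in for the external "Names" database; entries are illustrative).
REFERENCE_NAMES = ["Schmidt", "Schneider", "Müller", "Weber", "Wagner"]


def normalize_name(raw, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return (spelling, matched).

    If the raw spelling is close enough to a directory entry, the directory
    spelling is returned with matched=True. Otherwise the raw spelling is
    kept unchanged with matched=False, so that nothing from the source is
    silently overwritten.
    """
    candidate = raw.strip()
    matches = difflib.get_close_matches(candidate, reference, n=1, cutoff=cutoff)
    if matches:
        return matches[0], True
    return candidate, False


corrected = normalize_name("Shmidt")    # close to a directory entry
unmatched = normalize_name("Ivanov")    # no sufficiently close entry
```

Returning a flag alongside the spelling lets an operator review unmatched values instead of accepting an automatic guess, which matters for a source with as many discrepancies as these cards contain.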

Processing and analysis of results. The mathematical tools of the database and the procedures embedded within it provide various capabilities to facilitate data processing and reanalysis (these have been partially implemented to date). For example, data can be grouped and filtered according to specified conditions, allowing for the segmentation of the total dataset based on one or several criteria.
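
Filtering and grouping by one or several criteria can be sketched as follows. This is not the database's own procedure, which the text does not reproduce; the field names, the coded values, and the helper functions select and group_counts are hypothetical.

```python
from collections import Counter

# Hypothetical coded records; field names and values are illustrative only.
RECORDS = [
    {"gender": "m", "education": "primary", "departure": "demobilized"},
    {"gender": "m", "education": "secondary", "departure": "died"},
    {"gender": "f", "education": "primary", "departure": "demobilized"},
    {"gender": "m", "education": "primary", "departure": "demobilized"},
]


def select(records, **conditions):
    """Keep only the records that satisfy every field=value condition."""
    return [r for r in records
            if all(r.get(k) == v for k, v in conditions.items())]


def group_counts(records, *keys):
    """Count records for each combination of values of the given fields."""
    return Counter(tuple(r.get(k) for k in keys) for r in records)


men = select(RECORDS, gender="m")                      # filter on one criterion
by_education = group_counts(men, "education")          # group the filtered subset
by_two = group_counts(RECORDS, "gender", "departure")  # group on two criteria
```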

A module written in the Visual Basic programming language, using SQL (Structured Query Language for managing relational databases), allows the creation of diverse queries and corresponding data selections based on 22 attributes and their combinations, with subsequent graphical representation of the results. The query creation form is presented in

For example, an automatic query identified records pertaining to labor-mobilized Soviet Germans. It was found that of the 7,232 individuals listed in the Tagillag labor army member index, the majority were indeed Soviet Germans. However, the index and database also include interned Germans from Germany, repatriated Soviet citizens, and labor army members of other nationalities. The following simple query algorithm made it possible to "filter out" all other categories except Soviet Germans:

Field                   Condition
Nationality             = German
Personal File Number    <> Null
Citizenship             = USSR
Mobilized By            <> Null
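
The four conditions above can be expressed as a single SQL selection. The sketch below uses Python's built-in sqlite3 module rather than the Access and Visual Basic module described in the text. The table name cards, the column names, and the sample rows are hypothetical placeholders, not data from the index. In standard SQL the "<> Null" test is written IS NOT NULL.

```python
import sqlite3

# In-memory database with a hypothetical, simplified card table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE cards (
           surname          TEXT,
           nationality      TEXT,
           personal_file_no TEXT,
           citizenship      TEXT,
           mobilized_by     TEXT
       )"""
)

# Placeholder rows covering the categories mentioned in the text.
rows = [
    ("Person A", "German", "101", "USSR", "Commissariat X"),   # Soviet German
    ("Person B", "German", None, "Germany", None),             # interned German
    ("Person C", "Ukrainian", "103", "USSR", "Commissariat Y"),  # other nationality
    ("Person D", "German", "104", "USSR", None),               # no mobilizing body
]
conn.executemany("INSERT INTO cards VALUES (?, ?, ?, ?, ?)", rows)

# All four conditions must hold at once for a record to be selected.
QUERY = """
    SELECT surname
    FROM cards
    WHERE nationality = 'German'
      AND personal_file_no IS NOT NULL
      AND citizenship = 'USSR'
      AND mobilized_by IS NOT NULL
    ORDER BY surname
"""
soviet_germans = [row[0] for row in conn.execute(QUERY)]
```

Only the first placeholder row satisfies all four conditions, so the selection contains a single record.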

The database provides the capability for graphical representation of information (including query results) related to gender, age, social background, Party membership, nationality, education, profession, date of mobilization, work assignment, and the reason and date of departure of labor army members.

It is important to note that the greatest challenges arose when representing data on the professional activities of labor army members before their mobilization, which turned out to be highly inconsistent. In our case, the classification of professions from the 1939 census dictionary was used as a basis, with additional grouping applied.

The future prospects for the “Soviet Germans – Labor Army Members of Tagillag” database involve incorporating it into larger data banks on the history of repressions (primarily within the framework of the “Unified Electronic Data Bank of Victims of Political Repressions in the USSR”), which will require certain modifications to the database.

In addition to the personnel record card, which contains a relatively limited set of data, a large number of mass sources (personal files, record cards, questionnaires) are preserved in various state and departmental archives. These sources contain more extensive information about various aspects of an individual’s life during a specific time period, including family composition, professional and socio-political activities, criminal records, etc. (see Appendix 6). To convert the data from these sources into a machine-readable format, the creation of a more comprehensive database is required, for which there are at least two approaches.

A logical and simpler approach from a technical implementation perspective is the "person-centered approach," as the structure of the database in this case would be built around key aspects of a person's life activities—"family," "conviction," "education," and so on.

At the same time, it should be noted that there are several arguments in favor of incorporating elements of the "source-oriented approach" into the database structure.

In practice, data entry will be organized so that each category of users works with only one specific type of source, representing a single period in the individual’s life and containing a limited set of attributes. This eliminates the need (for this category of users) for a large number of "problem-oriented forms" in the database. Instead, for such users it would be optimal to enter data into a single "source-oriented" form that visually mirrors the structure of the source.

The insufficient qualification of users who enter data directly, particularly when they work with a complex set of sources (containing numerous discrepancies) and a wide range of "problem-oriented" input forms, can lead to source information being assigned to the wrong database fields.

Data from various sources may differ, and all discrepancies must be recorded with reference to the source.

Some data are dynamic, meaning they can change over time (e.g., marital status, the number and composition of relatives, Party membership, education, etc.). As a result, sources created at different times may record different values for the same attribute. Recording only one "correct" value (e.g., the most recent) in the database field while discarding others is an incorrect approach.

Considering the above, we propose the following approach to building the database. While adhering to a "problem-oriented" structure at the data storage level, we suggest implementing a "source-oriented" interface: a set of forms replicating the structure of the most common sources (alongside conventional "problem-oriented" forms not tied to specific sources, such as "Conviction," "Family Members," "Work Activity," etc.). The advantage of this scheme is the ability to "emulate" one or more sources with minimal loss of the contained information, including discrepancies.
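
One way to picture this scheme is a store in which every attribute value is kept together with the source it came from. The sketch below is an illustration under stated assumptions, not the authors' design: the class names Observation and PersonStore, the field names, and the sample values and dates are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One attribute value as recorded in one source (hypothetical model)."""
    person_id: int
    attribute: str     # problem-oriented key, e.g. "marital_status"
    value: str
    source: str        # type of source, e.g. "record card"
    source_date: str   # ISO date of the source, used for ordering


class PersonStore:
    """Problem-oriented storage that never discards a source's value."""

    def __init__(self):
        self._observations = []

    def add(self, observation):
        self._observations.append(observation)

    def history(self, person_id, attribute):
        """All recorded values of one attribute, oldest source first."""
        found = [o for o in self._observations
                 if o.person_id == person_id and o.attribute == attribute]
        return [o.value for o in sorted(found, key=lambda o: o.source_date)]

    def discrepancies(self, person_id):
        """Attributes for which the sources record more than one value."""
        values = defaultdict(set)
        for o in self._observations:
            if o.person_id == person_id:
                values[o.attribute].add(o.value)
        return {a: sorted(v) for a, v in values.items() if len(v) > 1}

    def emulate_source(self, person_id, source):
        """Rebuild the view of one source, as a source-oriented form would show it."""
        return {o.attribute: o.value for o in self._observations
                if o.person_id == person_id and o.source == source}


# Placeholder data: two sources of different dates describing one person.
store = PersonStore()
store.add(Observation(1, "surname", "Person A", "record card", "1942-06-01"))
store.add(Observation(1, "marital_status", "single", "record card", "1942-06-01"))
store.add(Observation(1, "surname", "Person A", "questionnaire", "1946-02-01"))
store.add(Observation(1, "marital_status", "married", "questionnaire", "1946-02-01"))
```

Because each value carries its source, a changing attribute such as marital status keeps its full history, discrepancies can be listed on demand, and any single source can be reconstructed from the stored observations.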

Thus, the described database, "Soviet Germans – Labor Army Members of Tagillag," can be used to characterize the social profile of a significant ethno-social group.

The authors have also outlined approaches that would allow the integration of the described database into larger data banks for studying the "life paths" of Soviet Germans.

To date, the staff of the Historical Informatics Laboratory have applied the "Tagillag Labor Army Members" input form template to create databases of individuals labor-mobilized to Tagillag, Bogoslovlag, Bakalstroi-Chelyabmetallurgstroi, Vosturallag, Sevurallag, and Ivdellag. Extending these databases on individual corrective labor camps (ITLs) in the Urals, two integrated data banks have been created: the "Electronic Memory Book of Russian Germans" (hosted on the RusDeutsch portal of the International Union of German Culture) and "Labor Army Members in ITLs of the Urals" (on the website of the Bavarian Cultural Center of Russian Germans).