Every survey collected in Survey Solutions is supplemented with a paradata file. Questionnaire designers and the headquarters team do not need to take any action to collect it; the Survey Solutions software produces it automatically.
The paradata files describe the process of data collection. They explain how the data was entered, detailing every edit, who made it, and when. These files may be large and are most conveniently processed with specialized statistical packages rather than general-purpose tools.
The paradata is supplied in a zip archive that contains a tab-delimited data file and supplementary metadata files.
Contents of paradata export archive
File | Description |
---|---|
paradata.tab | Paradata in tab-delimited format. Each line of this file corresponds to one recorded event. The columns of this file are described below. |
paradata.do | Script for the Stata statistical package to import the tab-delimited paradata. |
export__readme.txt | Human-readable description file (in text format). |
export__info.json | Machine-readable description/identification file (in JSON format). |
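For users who do not work in Stata, the archive can also be read directly. The following is a minimal Python sketch, not an official import routine. It assumes that paradata.tab starts with a header row naming the columns, is encoded as UTF-8, and does not use quoting; the function name `read_paradata` is illustrative only.

```python
import csv
import io
import zipfile


def read_paradata(archive, member="paradata.tab"):
    """Return the paradata events from an export archive as a list of dicts.

    `archive` may be a path to the zip file or a binary file-like object.
    Assumption: the first line of the member holds the column names.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # utf-8-sig silently drops a byte-order mark if one is present.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            return list(reader)
```

Because paradata files may be large, loading every event into a list may not be practical; iterating over the reader inside the `with` block instead of calling `list` keeps memory use low.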
Columns of paradata.tab file
The paradata events are recorded, along with their attributes, in the following columns:
Column | Description | Example |
---|---|---|
interview__id | 32-character hexadecimal ID (GUID) of the interview affected by the event. | 75efdc0456fb4b35be4690bd19eab870 |
order | Numeric sequential ID of the event (starts from 1 for every interview). | 1 |
event | Type of the event that has been recorded (see coding of event types below). | AnswerSet |
responsible | Login name of the person responsible for the event. | Enumerator25 |
role | Role of the person mentioned in the responsible column (see coding of roles below). | 1 |
timestamp_utc | Date and time when the event occurred, combined in a single timestamp (in UTC), using the following format: YYYY-MM-DDThh:mm:ss.msc | 2018-12-28T23:00:59.123 |
tz_offset | Time zone offset (relative to UTC). | -05:00:00 |
parameters | One or more parameters of the event, the interpretation of which depends on the type of event. | GPSLOC||16.73526463,75.93207878[13]27||2.0 |
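To illustrate how timestamp_utc and tz_offset fit together, here is a small Python sketch that converts an event's UTC timestamp to local time. The function names are illustrative and not part of Survey Solutions. The sketch assumes that the timestamp uses the format shown above with three fractional digits, and that tz_offset is the amount to add to UTC to obtain local time; verify that sign convention against your own data.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(tz_offset):
    """Convert an offset such as '-05:00:00' or '03:00:00' to a timedelta."""
    sign = -1 if tz_offset.startswith("-") else 1
    hours, minutes, seconds = (int(p) for p in tz_offset.lstrip("+-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


def local_time(timestamp_utc, tz_offset):
    """Return a timezone-aware datetime in the zone described by tz_offset.

    Assumption: local time = UTC + tz_offset.
    """
    utc = datetime.fromisoformat(timestamp_utc).replace(tzinfo=timezone.utc)
    return utc.astimezone(timezone(parse_offset(tz_offset)))


event_time = local_time("2018-12-28T23:00:59.123", "-05:00:00")
print(event_time.isoformat())
```

Keeping the result timezone-aware means that differences between events from devices in different time zones are still computed correctly.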
Paradata events and associated codes
The following table outlines the types of events tracked by Survey Solutions. You may encounter some of them (though not all) in the paradata. The table also explains how to interpret the parameters column for each type of event. Which event types are present depends on the version of Survey Solutions.
Event (alphabetical) | Numeric code | Description | Parameters |
---|---|---|---|
AnswerRemoved | 3 | Question's answer was removed (cleared). | varname||OptionalRosterAddress |
AnswerSet | 2 | Question was answered in the interview. | varname||value||OptionalRosterAddress. Values are mostly the same as in the tab-delimited export files, with a few exceptions where the value in the tab-delimited file is split among multiple columns. Values of multiselect questions are recorded as the codes of the selected items separated by commas: 323.0, 315.0, 147.0. Values of text list questions are recorded as the specified items separated by the |-character: Sergiy|Maryna|Natalia. Values of GPS questions are represented in the form latitude,longitude[accuracy]altitude, such as 16.73526463,75.93207878[13]27. |
ApproveByHeadquarter | 8 | Indicates when the interview was approved by an HQ user. | Comment entered by the HQ user during approval. |
ApproveBySupervisor | 7 | Indicates when the supervisor approved the interview. | Comment entered by the supervisor during approval. |
ClosedBySupervisor | 30 | Indicates when the supervisor closed the interview opened for a review. | NO PARAMETERS |
CommentSet | 4 | Occurs when a comment was written to a question in the interview. | varname||comment if the question is not in any roster; varname||comment||OptionalRosterAddress if the question is part of a roster. |
Completed | 5 | Indicates when the interview was marked as completed by the interviewer. | Comment entered by the interviewer at completion. |
Deleted | 11 | Reserved | NO PARAMETERS. |
GroupDisabled | 16 | ⚠ This event is not included in the exported paradata file. Event that corresponds to the group (section, subsection) being declared as disabled (to be skipped, not answered). | NO PARAMETERS. |
GroupEnabled | 15 | ⚠ This event is not included in the exported paradata file. Event that corresponds to the group (section, subsection) being declared as enabled (to be answered, not skipped). | NO PARAMETERS. |
InterviewCreated | 32 | Occurs when the interview is created. | NO PARAMETERS. |
InterviewerAssigned | 1 | Event that occurs when the interviewer becomes responsible for the interview (for example, when the interview is created from an assignment). | Name of the interviewer that became responsible for this interview. |
InterviewModeChanged | 31 | Event that occurs when the interview mode is set or changed (for example, when the interviewer switches from CAPI to CAWI). | New mode: CAPI or CAWI. |
KeyAssigned | 27 | A newly created interview is assigned an interview key. Also occurs when the key of the interview is modified due to a collision with an existing interview's key. The latest event reflects the current interview key. The event may occur only once or twice per interview. | Interview key in the form: NN-NN-NN-NN |
OpenedBySupervisor | 29 | Indicates when the supervisor opened the interview for a review. | NO PARAMETERS |
Paused | 25 | Indicates a prolonged pause during the interviewing process, such as when the tablet goes into the sleep mode to conserve power. | NO PARAMETERS |
QuestionDeclaredInvalid | 18 | Occurs when the value of the question is deemed to be invalid (not passing the specified validation). | varname||OptionalRosterAddress |
QuestionDeclaredValid | 17 | Occurs when the value of the question is deemed to be valid (passing the specified validation). | varname||OptionalRosterAddress |
QuestionDisabled | 14 | ⚠ This event is not included in the exported paradata file. Event corresponding to the question being set to disabled state (question is to be skipped, not answered). | varname||OptionalRosterAddress |
QuestionEnabled | 13 | ⚠ This event is not included in the exported paradata file. Event corresponding to the question being set to enabled state (question is to be answered, not skipped). | varname||OptionalRosterAddress |
ReceivedByInterviewer | 20 | Indicates reception of the rejected interview by the interviewer on the tablet. | NO PARAMETERS |
ReceivedBySupervisor | 21 | Indicates when the completed interview was received by the supervisor. | NO PARAMETERS |
RejectedByHeadquarter | 10 | Occurs when an interview is rejected by a headquarters user. | Comment written by the HQ user when the interview was rejected. |
RejectedBySupervisor | 9 | Occurs when an interview is rejected by the supervisor. | Comment written by the supervisor when the interview was rejected. |
Restarted | 6 | Occurs when an interview is restarted on a tablet (from a completed status). | NO PARAMETERS |
Restored | 12 | Reserved. Not to be confused with 'restarted' above. | NO PARAMETERS |
Resumed | 26 | Indicates resuming work on the interview, such as when the tablet wakes up after going to a sleep mode. | NO PARAMETERS |
SupervisorAssigned | 0 | A newly created interview is assigned to the team of the interviewer who started the interview. | NO PARAMETERS |
TranslationSwitched | 28 | Occurs when the language (translation) of the interview is switched. | Name of the selected language. |
UnapproveByHeadquarters | 19 | Occurs when the interview was unapproved by the HQ (or an admin) user. | Comment (if provided) when the interview was unapproved. Typically the automatic message "[Approved by Headquarters was revoked]". |
VariableDisabled | 24 | Occurs when a variable is disabled (when it is part of a section which gets disabled). | varname||value||OptionalRosterAddress |
VariableEnabled | 23 | Occurs when a variable is enabled (when it is part of a section which gets enabled). | varname||value||OptionalRosterAddress |
VariableSet | 22 | Occurs when a variable is recalculated. | varname||value||OptionalRosterAddress |
Varname is the name of the data variable corresponding to a question or a calculated variable (as specified in the Questionnaire Designer).
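The parameter layouts described in the table can be unpacked with a few string operations. The Python sketch below handles the AnswerSet layout (varname||value||OptionalRosterAddress) and the three value formats described for multiselect, text list, and GPS questions. All function names are illustrative. The paradata does not state the question type, so the caller is assumed to know it from the questionnaire; the sketch also assumes that the answer itself does not contain the || separator.

```python
import re

# latitude,longitude[accuracy]altitude, as described for GPS questions.
GPS_PATTERN = re.compile(
    r"^(?P<lat>-?\d+(?:\.\d+)?),(?P<lon>-?\d+(?:\.\d+)?)"
    r"\[(?P<acc>[^\]]*)\](?P<alt>.*)$"
)


def split_answer_set(parameters):
    """Split 'varname||value||OptionalRosterAddress' into its three parts."""
    parts = parameters.split("||")
    varname = parts[0]
    value = parts[1] if len(parts) > 1 else ""
    roster_address = parts[2] if len(parts) > 2 else None
    return varname, value, roster_address


def _optional_float(text):
    """Return float(text), or None when the component is empty."""
    text = text.strip()
    return float(text) if text else None


def parse_gps(value):
    """Parse a GPS answer into latitude, longitude, accuracy and altitude."""
    match = GPS_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a GPS value: {value!r}")
    return {
        "latitude": float(match["lat"]),
        "longitude": float(match["lon"]),
        "accuracy": _optional_float(match["acc"]),
        "altitude": _optional_float(match["alt"]),
    }


def parse_multiselect(value):
    """Parse comma-separated option codes such as '323.0, 315.0, 147.0'."""
    return [float(code) for code in value.split(",") if code.strip()]


def parse_text_list(value):
    """Parse text list items separated by the | character."""
    return value.split("|")
```

For example, `split_answer_set("GPSLOC||16.73526463,75.93207878[13]27||2.0")` yields the variable name, the raw GPS value, and the roster address as separate strings, and the raw value can then be passed to `parse_gps`.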
Coding of roles
Numeric codes used to encode the role in the paradata records have the following meaning:
Numeric code | Meaning |
---|---|
0 | <UNKNOWN ROLE> |
1 | Interviewer |
2 | Supervisor |
3 | Headquarter |
4 | Administrator |
5 | API User |
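When analyzing the paradata outside Stata, the role codes can be turned into labels with a simple lookup. The following Python sketch mirrors the table above; the names `ROLES` and `role_name` are illustrative, and returning `None` for a code that is not listed is a choice made for this example only.

```python
# Role codes and labels exactly as listed in the table above.
ROLES = {
    0: "<UNKNOWN ROLE>",
    1: "Interviewer",
    2: "Supervisor",
    3: "Headquarter",
    4: "Administrator",
    5: "API User",
}


def role_name(code):
    """Return the label for a role code read from the role column.

    Accepts the code as an int or as the string found in paradata.tab.
    Returns None for codes that are not documented in the table.
    """
    return ROLES.get(int(code))
```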