| 1 | +Advanced csv import |
| 2 | +============================================= |
| 3 | +This is the documentation of the advanced csv import command `advanced-csv-import`. For the documentation of the less complicated |
| 4 | +bulk-upload command `bulk-add-officers` see [`bulk_upload`](bulk_upload.md). `bulk-add-officers` accepts one csv containing information |
| 5 | +about the officer, including badge-number, jobs and salary and makes decisions on whether to update rows in the database |
| 6 | +or create new entries based on the existing data. |
| 7 | + |
| 8 | +The advanced csv upload is, for the most part, a way to copy data for one department into the database with as little added logic as possible. |
| 9 | +The tables provided in csv form therefore represent the data that will be in the sql tables after running the command |
| 10 | +(with a few exceptions for many-to-many relationships and auxiliary models like location and license plates). |
| 11 | + |
| 12 | +Before you start |
| 13 | +---------------- |
| 14 | +CSV uploads should always be tested locally or in another non-production environment, and it is strongly recommended |
| 15 | +to back up the database before running the command. The command is designed to fail early and will |
| 16 | +only commit the changes if it did not encounter any problems. However, the command is powerful |
| 17 | +and can lead to data loss and inconsistencies if the provided csv files are not prepared correctly. |
| 18 | + |
| 19 | +Explanation of the command |
| 20 | +-------------------------- |
| 21 | +```shell |
| 22 | + /usr/src/app/OpenOversight$ flask advanced-csv-import --help |
| 23 | + Usage: flask advanced-csv-import [OPTIONS] DEPARTMENT_NAME DEPARTMENT_STATE |
| 24 | + |
| 25 | + Add or update officers, assignments, salaries, links and incidents from |
| 26 | + csv files in the department using the DEPARTMENT_NAME and DEPARTMENT_STATE. |
| 27 | + |
| 28 | + The csv files are treated as the source of truth. Existing entries might |
| 29 | + be overwritten as a result, backing up the database and running the |
| 30 | + command locally first is highly recommended. |
| 31 | + |
| 32 | + See the documentation before running the command. |
| 33 | + |
| 34 | + Options: |
| 35 | + --officers-csv PATH |
| 36 | + --assignments-csv PATH |
| 37 | + --salaries-csv PATH |
| 38 | + --links-csv PATH |
| 39 | + --incidents-csv PATH |
| 40 | + --force-create Only for development/testing! |
| 41 | + --overwrite-assignments |
| 42 | + --help Show this message and exit. |
| 43 | +``` |
| 44 | + |
| 45 | + |
| 46 | +The command expects two mandatory arguments, the department name and department state. |
| 47 | +This is to reduce the chance of making changes to the wrong department by mixing up files. |
| 48 | +Five options take the paths to the officers, assignments, salaries, incidents and links csv files. |
| 49 | +The `--force-create` flag allows deleting and overwriting existing entries in the database. |
| 50 | +It is only meant for non-production environments, to replicate the data of another (in most cases production) |
| 51 | +instance in a local environment so the import command can be tested locally first. More details on that flag are in the section "Local development flag `--force-create`" at the end of this document. |
| 52 | +Finally, `--overwrite-assignments` simplifies updating assignments. Instead of updating them, |
| 53 | +all assignments for the relevant officers are deleted and recreated from the provided data. This flag is only |
| 54 | +considered if an assignments csv is provided and is ignored otherwise. See the "Special Flag" subsection of |
| 55 | +the section on the assignments csv for more details. |
| 56 | + |
| 57 | +General overview of the csv import |
| 58 | +----------------------------------- |
| 59 | +The following sections list the header fields that each csv can contain. If the csv includes any other fields, the command will fail. |
| 60 | +However, the field names are not case-sensitive and spaces are treated as `_`, so `Officer ID` can be used instead of `officer_id`. |
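
To illustrate the header rule above, here is a minimal Python sketch of how header names could be normalized and checked against the allowed fields. The function names `normalize_header` and `check_headers` are hypothetical and are not taken from the OpenOversight code base; the actual command may implement this differently.

```python
def normalize_header(name: str) -> str:
    """Lower-case a header name and replace spaces with underscores."""
    return name.lower().replace(" ", "_")


def check_headers(headers, required, optional):
    """Return the normalized headers.

    Raises ValueError if a header is not in the allowed set or if a
    required field is missing (hypothetical helper for illustration).
    """
    normalized = [normalize_header(h) for h in headers]
    allowed = set(required) | set(optional)
    unknown = [h for h in normalized if h not in allowed]
    missing = [h for h in required if h not in normalized]
    if unknown:
        raise ValueError(f"Unexpected fields: {unknown}")
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    return normalized


# Example: headers of a salaries csv written with spaces and capitals.
salary_headers = check_headers(
    ["ID", "Officer ID", "Salary", "Year"],
    required=["id", "officer_id", "salary", "year"],
    optional=["overtime_pay", "is_fiscal_year"],
)
```

Running a check like this on a csv before the import can catch misspelled column names early.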
| 61 | + |
| 62 | +*All optional fields can be left blank and will be inserted as* `NULL` *or empty string as appropriate.* |
| 63 | +**Warning:** When updating a record, a field that is left blank might overwrite an existing value. |
| 64 | +This can only be prevented by not including the column in the csv at all. |
| 65 | + |
| 66 | +.. _ref-aci-formats: |
| 67 | + |
| 68 | +Formats: |
| 69 | +- `date` - The date should be provided in `YYYY-MM-DD` format. |
| 70 | +- `time` - Time should be provided in `HH:MM:SS` 24h-format in the respective timezone. |
| 71 | +- `DEPARTMENT_STATE` - The department state should be provided in the [standard two-letter abbreviation](https://www.faa.gov/air_traffic/publications/atpubs/cnt_html/appendix_a.html) format. |
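
The date and time formats above can be checked before an import with a short Python sketch like the following. It uses only the standard library; the helper names `parse_date` and `parse_time` are assumptions for illustration, and returning `None` for blank values mirrors the rule that blank optional fields become `NULL`. The import command itself may parse these values differently.

```python
from datetime import date, datetime, time
from typing import Optional


def parse_date(value: str) -> Optional[date]:
    """Parse a YYYY-MM-DD string; blank input yields None."""
    if not value:
        return None
    return datetime.strptime(value, "%Y-%m-%d").date()


def parse_time(value: str) -> Optional[time]:
    """Parse an HH:MM:SS (24h) string; blank input yields None."""
    if not value:
        return None
    return datetime.strptime(value, "%H:%M:%S").time()
```

Both helpers raise `ValueError` for values in another format, such as `12/07/2019`.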
| 72 | + |
| 73 | + |
| 74 | +The `id` field |
| 75 | +-------------- |
| 76 | +Each csv corresponds to a table in the OpenOversight database, and each csv file has to include `id` as a field. |
| 77 | +That field has one primary purpose: if the field is blank, the row is assumed to be a new entry. |
| 78 | +If the field contains a number, however, it is assumed that a record with that particular id already exists in the database |
| 79 | +and the record will be updated according to the provided fields. Finally, in the case of officers and incidents |
| 80 | +there is a third option where the `id` field can contain a string that starts with `#`. This also indicates a new record, |
| 81 | +but that new record can be referenced in other provided tables (for example as the `officer_id` in the salaries csv). |
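
The three cases for the `id` field can be summarized in a small Python sketch. The names `IdKind` and `classify_id` are hypothetical and only illustrate the rules described above; they are not the command's actual implementation. Note that the `#` form is only accepted in the officers and incidents csvs, which this sketch does not enforce.

```python
from enum import Enum


class IdKind(Enum):
    NEW = "new"                  # blank: create a new record
    EXISTING = "existing"        # number: update the record with that id
    NEW_REFERENCED = "new_ref"   # "#..." string: new record other csvs can refer to


def classify_id(value: str) -> IdKind:
    """Classify the content of an `id` cell according to the rules above."""
    value = value.strip()
    if value == "":
        return IdKind.NEW
    if value.startswith("#"):
        return IdKind.NEW_REFERENCED
    if value.isdecimal():
        return IdKind.EXISTING
    raise ValueError(f"Unsupported id value: {value!r}")
```

For example, an officers csv row with `id` set to `#C1` creates a new officer that a salaries csv row can point to by using `#C1` as its `officer_id`.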
| 82 | + |
| 83 | + |
| 84 | + |
| 85 | +Officers csv |
| 86 | +------------ |
| 87 | +- Required: `id, department_name, department_state` |
| 88 | +- Optional: `last_name, first_name, middle_initial, suffix, race, gender, employment_date, birth_year, unique_internal_identifier` |
| 89 | +- Ignored: `badge_number, job_title, most_recent_salary, unique_identifier` (Unused but command will not fail when field is present) |
| 90 | + |
| 91 | +### Details |
| 92 | + |
| 93 | +- `department_name` - Name of department exactly as it is in the server database. |
| 94 | + This needs to match the department name provided with the command. |
| 95 | +- `department_state` - Name of department state exactly as it is in the server database, which will be the |
| 96 | + [standard two-letter abbreviation](https://www.faa.gov/air_traffic/publications/atpubs/cnt_html/appendix_a.html) for the department's respective location. |
| 97 | + This needs to match the department state provided with the command. |
| 98 | +- `unique_internal_identifier` - A string or number that can be used to |
| 99 | + uniquely identify the officer. In departments in which the badge |
| 100 | + number stays with the officer, using that number is fine. Can and should be left blank |
| 101 | + if no such number is available. |
| 102 | +- `first_name` & `last_name` - Will be inserted into the database as is. |
| 103 | +- `middle_initial` - Usually up to one character, but can be more. |
| 104 | +- `suffix` - Choice of the following values: `Jr, Sr, II, III, IV, V`. |
| 105 | +- `gender` - One of the following values: `M`, `F`, `Other`. |
| 106 | +- `race` - One of the following values: `BLACK`, `WHITE`, `ASIAN`, `HISPANIC`, `NATIVE AMERICAN`, `PACIFIC ISLANDER`, `Other`. |
| 107 | +- `employment_date` - [Date](https://help.highbond.com/helpdocs/analytics/13/user-guide/en-us/Content/table_definition/c_formats_of_date_and_time_source_data.htm) representing the start of employment with this department. |
| 108 | +- `birth_year` - Integer representing the birth year of the officer in a `yyyy` format. |
| 109 | + |
| 110 | +Assignments csv |
| 111 | +--------------- |
| 112 | +- Required: `id, officer_id, job_title` |
| 113 | +- Optional: `badge_number, unit_id, unit_name, start_date, resign_date` |
| 114 | + |
| 115 | +### Details |
| 116 | + |
| 117 | +- `officer_id` - Number referring to `id` of existing officer or string starting with `#` referring to a newly created officer in the provided officers csv. |
| 118 | +- `badge_number` - Any string that represents the star or badge number of the officer. In some departments this number changes with the assignment. |
| 119 | +- `job_title` - The job title, will be created if it does not exist. |
| 120 | +- `unit_id` - ID of existing unit within the department. |
| 121 | +- `unit_name` - Name of the unit, only used if the `unit_id` column is not provided. |
| 122 | +- `start_date` - [Start date](https://help.highbond.com/helpdocs/analytics/13/user-guide/en-us/Content/table_definition/c_formats_of_date_and_time_source_data.htm) of this assignment. |
| 123 | +- `resign_date` - [End date](https://help.highbond.com/helpdocs/analytics/13/user-guide/en-us/Content/table_definition/c_formats_of_date_and_time_source_data.htm) of this assignment. |
| 124 | + |
| 125 | +### Special Flag |
| 126 | + |
| 127 | +The `--overwrite-assignments` flag can be used to avoid merging new assignments with existing ones. |
| 128 | +Instead, all existing assignments belonging to officers named in the `officer_id` column are deleted first, |
| 129 | +before the new assignments contained in the provided csv are created in the database. |
| 130 | + |
| 131 | +This should only be used if the provided csv contains both the assignments currently in the database and the additional ones, |
| 132 | +or if it is based on a better and more complete dataset, for example after receiving a dataset of historic assignment data. |
| 133 | + |
| 134 | +Salaries csv |
| 135 | +------------ |
| 136 | +- Required: `id, officer_id, salary, year` |
| 137 | +- Optional: `overtime_pay, is_fiscal_year` |
| 138 | + |
| 139 | +### Details |
| 140 | + |
| 141 | +- `officer_id` - Integer referring to `id` of existing officer or string starting with `#` referring to a newly created officer in the provided officers csv. |
| 142 | +- `salary` - Number representing the officer's salary in the given year. |
| 143 | +- `year` - Integer, the year this salary information refers to. |
| 144 | +- `overtime_pay` - Number representing the amount of overtime pay for the officer in the given year. |
| 145 | +- `is_fiscal_year` - Boolean value, indicating whether the provided year refers to calendar year or fiscal year. |
| 146 | + The values `true`, `t`, `yes` and `y` are treated as "yes, the salary is for the fiscal year", all others (including blank) as "no". |
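
The rule for `is_fiscal_year` can be sketched in Python as follows. The function name `parse_is_fiscal_year` is hypothetical, and trimming whitespace and ignoring case are assumptions of this sketch that the text above does not state.

```python
# Values treated as "yes, the salary is for the fiscal year".
TRUE_VALUES = {"true", "t", "yes", "y"}


def parse_is_fiscal_year(value: str) -> bool:
    """Return True only for the listed values; everything else, including blank, is False.

    Assumption: surrounding whitespace and letter case are ignored.
    """
    return value.strip().lower() in TRUE_VALUES
```

Under this rule values such as `1` or `no` are both read as "no", so it is safest to stick to the listed spellings.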
| 147 | + |
| 148 | +Incidents csv |
| 149 | +------------- |
| 150 | +- Required: `id, department_name, department_state` |
| 151 | +- Optional: `date, time, report_number, description, street_name, cross_street1, cross_street2, city, state, zip_code, |
| 152 | + created_by, last_updated_by, officer_ids, license_plates` |
| 153 | + |
| 154 | +### Details |
| 155 | + |
| 156 | +- `department_name` - Name of department exactly as in the server database. |
| 157 | + This needs to match the department name provided with the command. |
| 158 | +- `department_state` - Name of department state exactly as it is in the server database, which will be the |
| 159 | + [standard two-letter abbreviation](https://www.faa.gov/air_traffic/publications/atpubs/cnt_html/appendix_a.html) for the department's respective location. |
| 160 | +- `date` - [Date](https://help.highbond.com/helpdocs/analytics/13/user-guide/en-us/Content/table_definition/c_formats_of_date_and_time_source_data.htm) of the incident |
| 161 | +- `time` - [Time](https://help.highbond.com/helpdocs/analytics/13/user-guide/en-us/Content/table_definition/c_formats_of_date_and_time_source_data.htm) of the incident |
| 162 | +- `report_number` - String representing any kind of number assigned to complaints or incidents by the police department. |
| 163 | +- `description` - Text description of the incident. |
| 164 | +- `street_name` - Name of the street where the incident occurred, without the street number. |
| 165 | +- `cross_street1`, `cross_street2` - The two closest intersecting streets. |
| 166 | +- `city`, `state`, `zip_code` - State needs to be in two-letter abbreviated notation. |
| 167 | +- `created_by`, `last_updated_by` - ID of existing user shown as responsible for adding this entry. |
| 168 | +- `officer_ids` - IDs of officers involved in the incident, separated by `|`. |
| 169 | + - Each individual id can either be an integer referring to an existing officer or a string starting with `#` referring to a newly created officer. |
| 170 | + - Example: `123|#C1|1627` for three officers, one with id 123, one with 1627 and one whose record was created in the officers csv |
| 171 | + and whose id-field was the string `#C1`. |
| 172 | + |
| 173 | +- `license_plates` - All license plates involved in the incident. If there is more than one, they can be separated with a `|`. |
| 174 | + - Each license plate consists of the license plate number and optionally a state in abbreviated form separated by an underscore `_`. |
| 175 | + - Example: `ABC123_IL|B991` for one license plate with number `ABC123` from Illinois and one with number `B991` and no associated state. |
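
The `|`-separated formats for `officer_ids` and `license_plates` can be read with a short Python sketch like the one below. The helper names `split_ids` and `parse_license_plates` are hypothetical. Splitting each plate at the first underscore is an assumption of this sketch; the import command may handle plate numbers that contain underscores differently.

```python
from typing import List, Optional, Tuple


def split_ids(value: str) -> List[str]:
    """Split a `|`-separated cell into its non-empty parts."""
    return [part for part in value.split("|") if part]


def parse_license_plates(value: str) -> List[Tuple[str, Optional[str]]]:
    """Return (number, state) pairs; state is None when no state is given."""
    plates = []
    for entry in split_ids(value):
        # Assumption: the first underscore separates number and state.
        number, _sep, state = entry.partition("_")
        plates.append((number, state or None))
    return plates


officer_refs = split_ids("123|#C1|1627")
plates = parse_license_plates("ABC123_IL|B991")
```

Here `officer_refs` holds the three references from the `officer_ids` example, and `plates` holds the plate `ABC123` with state `IL` followed by `B991` with no state.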
| 176 | + |
| 177 | + |
| 178 | +Links csv |
| 179 | +--------- |
| 180 | +- Required: `id, url` |
| 181 | +- Optional: `title, link_type, description, author, created_by, officer_ids, incident_ids` |
| 182 | + |
| 183 | +### Details |
| 184 | + |
| 185 | +- `url` - Full url of the link starting with `http://` or `https://`. |
| 186 | +- `title` - Text that will be displayed as the link. |
| 187 | +- `description` - A short description of the link. |
| 188 | +- `link_type` - Choice of `Link`, `YouTube Video` and `Other Video`. |
| 189 | +- `author` - The source or author of the linked article, report, video. |
| 190 | +- `created_by` - ID of existing user shown as responsible for adding this entry. |
| 191 | +- `officer_ids` - IDs of officer profiles this link should be visible on, separated by `|`. See same field in incidents above for more details. |
| 192 | +- `incident_ids` - IDs of incidents this link should be associated with, separated by `|`. Just like `officer_ids`, this can contain strings |
| 193 | + starting with `#` to refer to an incident created in the incidents csv. |
| 194 | + |
| 195 | +Examples |
| 196 | +--------- |
| 197 | +Example csvs can be found in the repository under `OpenOversight/tests/test_csvs`. |
| 198 | + |
| 199 | +Local development flag `--force-create` |
| 200 | +--------------------------------------- |
| 201 | +This flag changes the behavior when an integer is provided as `id`. Instead of updating an existing record, |
| 202 | +a new record will be created and assigned the given `id`. If a record with that `id` already exists in the |
| 203 | +database, it will be deleted before the new record is created. |
| 204 | + |
| 205 | +This functionality is intended for importing csv files downloaded from the [OpenOversight download page](/download/all) |
| 206 | +to get a local copy of the production data for one department in the local development database. |