We saw how to input a CSV file into a Python script in CSV. We saw how to create an SQLite database in Play. Now we will do both tasks in the same Python script.
The
rows
in
line
105
is a
list.
That’s why
line
105
calls
executemany
instead of
execute
.
Each item in
rows
is a list of 18 items.
That’s why the SQLite
insert
statement in
line
71
contains 18 question marks as placeholders.
The value of this string is
"insert into inspections values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
See “placeholder”
in the
sqlite3
documentation.
The output is in chronological order because of the
order by
INSPECTION_DATE
clause in the SQLite
select
statement in
lines
93–96
of
sqlinspection1.py
.
WO HOP 17 Thursday, September 4, 2014 Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist. WO HOP 17 Thursday, September 4, 2014 Evidence of mice or live mice present in facility's food and/or non-food areas. WO HOP 17 Friday, September 11, 2015 Food not protected from potential source of contamination during storage, preparation, transportation, display or service. WO HOP 17 Friday, September 11, 2015 Pesticide use not in accordance with label or applicable laws. Prohibited chemical used/stored. Open bait station used. WO HOP 17 Friday, September 11, 2015 Personal cleanliness inadequate. Outer garment soiled with possible contaminant. Effective hair restraint not worn in an area where food is prepared. WO HOP 17 Friday, September 11, 2015 Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. WO HOP 17 Friday, September 11, 2015 Evidence of mice or live mice present in facility's food and/or non-food areas. WO HOP 17 Friday, October 16, 2015 Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. WO HOP 17 Friday, October 16, 2015 Wiping cloths soiled or not stored in sanitizing solution. WO HOP 17 Friday, May 20, 2016 Food not protected from potential source of contamination during storage, preparation, transportation, display or service. WO HOP 17 Friday, May 20, 2016 Proper sanitization not provided for utensil ware washing operation. WO HOP 17 Friday, May 20, 2016 Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. WO HOP 17 Monday, June 19, 2017 Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit. WO HOP 17 Monday, June 19, 2017 Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods above 38 ºF) except during necessary preparation.
ls -l inspections.db -rw-r--r-- 1 myname mygroup 158588928 Aug 14 12:16 inspections.db sqlite3 inspections.db SQLite version 3.20.0 2017-08-01 13:24:15 Enter ".help" for usage hints. sqlite> .tables inspections sqlite> select count(*) from inspections; 401701 sqlite> .quit
inspections
contains a lot of redundant data.
For example, the CAMIS number, name (DBA: “doing business as”),
address, phone number, and cuisine description of each restaurant is
repeated in each row belonging to that restaurant.
To avoid this repetition, our database file
inspections.db
should have contained two tables.
The
CAMIS
key of the
inspections
table tells you what restaurant a given inspection belongs to.
This is called
normalizing
the database.
--A table containing exactly one row --for each of the approximately 26,000 restaurants. create table restaurants ( CAMIS integer primary key, DBA text, BORO text, BUILDING text, STREET text, ZIPCODE integer, PHONE text ); --A table containing exactly one row --for each of the approximately 413,000 inspections. create table inspections ( INSPECTION_DATE text, ACTION text, VIOLATION_CODE text, VIOLATION_DESCRIPTION text, CRITICAL_FLAG text, SCORE integer, GRADE text, GRADE_DATE text, RECORD_DATE text, INSPECTION_TYPE text, CAMIS integer, foreign key(CAMIS) references restaurants(CAMIS) );
1 megabyte
=
220 bytes
=
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 ×
2 bytes
=
1,048,576 bytes.
We saved
32,813,056 bytes ≅ 31.3 megabytes by storing the data into two tables
instead of one table.
See also
vacuum
.
ls -l inspections.db -rw-r--r-- 1 myname mygroup 125726720 Aug 15 18:18 inspections.db sqlite3 inspections.db SQLite version 3.20.0 2017-08-01 13:24:15 Enter ".help" for usage hints. sqlite> .tables inspections restaurants sqlite> select count(*) from restaurants; 26493 sqlite> select count(*) from inspections; 401701 sqlite> .quit bc -l scale = 5 158588928 - 125726720 32862208 32862208 / 2^20 31.33984 control-d