This is a project to practice database skills in Python. I will go through those most basic SQLite statements for creating a database and tables, and updating and querying into the database for useful information.
Firstly, create a table 'student' in a database 'student.db' file.
import sqlite3
con = sqlite3.connect('student.db')
cursor = con.cursor()
# CREATE a student table with columns id, name, age, major
cursor.execute('''
create table student (
id integer primary key autoincrement not null unique , # 'integer primary key autoincrement' fixed format
name text not null,
age integer not null,
major text not null);
''')
When creating id column, the standard way is 'id integer primary key autoincrement' and any other constraints come after. When I put 'autoincrement' in any other orders, the row can't be created.
Next, populate the table 'student' with some random data using INSERT. The id column is autoincrement, so no need to put in any value for that.
# INSERT records
cursor.execute('''
insert into student (name, age, major) values ('john smith', 22, 'cs'), ('hello kitty', 30, 'ee');
''')
for _ in range(10):
name = random.choice(['nancy', 'josh', 'peter', 'medline', 'kuma', 'ying'])
major = random.choice(['cs', 'nurse', 'education', 'economy', 'music'])
age = random.randint(16, 80)
cursor.execute('''
insert into student (name, age, major) values (?,?,?);
''', (name, age, major ))
Update the table, if a row meet the condition after WHERE clause, then make update defined by SET clause.
Also for DELETE statement, I deleted a specific row defined by WHERE clause.
# UPDATE records
cursor.execute('''
UPDATE student SET major = 'cs' WHERE name = 'ying' AND age = 24;
''')
# DELETE records
cursor.execute('''
DELETE from student where name= 'ying' and age = 24;
''')
To only show name column of the table, use SELECT statement.
# SELECT some records
cursor.execute('''
select name from student;
''')
students = cursor.fetchall()
# print(students) #[('john smith',), ('nancy',), ('ying',), ('peter',), ('peter',)....]
To only show DISTINCT names of the table 'student':
# SELECT DISTICT records
cursor.execute('''
select distinct name from student order by name;
''')
students = cursor.fetchall()
# print(students) #[('hello kitty',), ('john smith',), ('kuma',), ('medline',), ('nancy',), ('peter',), ('ying',)]
Now, after having created the table, I decided to add a new column 'country' using
ALTER TABLE :
# alter table - add column
cursor.execute('''
alter table student add column country;
''')
To populate the new column 'country' with some data, using UPDATE
# to populate a specific row with some data in its 'country' column
cursor.execute('update student set country = "India" where id = 15')
SQLite has 4 types of operators: arithmetic operators, comparison operators, logical operators and bitwise operators.
For detailed list of operators, check out
this page.
Now, let's do some operator practices:
# BETWEEN operator
cursor.execute('select name from student where age between 20 and 30 ')
print(cursor.fetchall())
# LIKE operator - find student with 'm' in the name, % is the wildcard
cursor.execute('select name from student where name like "%m%";')
print(cursor.fetchall())# [('john smith',), ('kuma',), ('medline',), ('john smith',)]
# IN operator - used when match multiple criteria
cursor.execute('select name,major from student where major in ("cs", "ee")')
print(cursor.fetchall()) #('john smith', 'cs'), ('medline', 'cs'), ('nancy', 'cs'), ('john smith', 'cs'), ('hello kitty', 'ee')]
So far, I only did some SQLite practice within one table. Now I am going to create another table 'major' in the same database that has relationship with the table 'student'. My current 'student' table has a 'major' column, I need to find out how many DISTINCT majors this column include.
# First check how many majors in the student table
cursor.execute('select distinct major from student')
print(cursor.fetchall()) #[('cs',), ('economy',), ('music',), ('education',), ('nurse',), ('ee',)]
Then create a 'major' table with two columns for the moment:
# Second create a major table (id, major )
cursor.executescript('''
create table major (id integer primary key autoincrement not null unique, majorName text);
insert into major (majorName) values ('cs'), ('economy'), ('music'), ('education'), ('nurse'), ('ee');
''')
Now, I need to update student major column from text datatype to integer datatype. A simple straight forward way to update the table is to go to
DB Browser -> database structure -> modify table
to update the datatype from TEXT to INTEGER. Another way to modify the major column in student is to create a new table and copy the original
table to the new one with modification, then drop the old table.
After done with the datatype update through DB Browser interface, I made the following UPDATEs in 'student' table so that the 'major' column has integers that referenced from 'id' column in 'major' table.
cursor.executescript('''
update student set major = 1 where major == "cs";
update student set major = 2 where major = "economy";
update student set major = 3 where major = "music";
update student set major = 4 where major = "education";
update student set major = 5 where major = "nurse";
update student set major = 6 where major = "ee";''')
In order to practice
JOIN operator later on, I deceided to add a 'professor' column in 'major' table using ALTER TABLE, then pupulate the 'professor' column as follows :
# add a new column 'professor' for table major
cursor.execute('alter table major add column professor text ')
# add professors name to major table
cursor.executescript('''
update major set professor = 'Mr. Johnson' where id ==1 ;
update major set professor = 'Mrs. Alice' where id ==2 ;
update major set professor = 'Mr. Chen' where id ==3 ;
update major set professor = 'Mr. Smith' where id ==4 ;
update major set professor = 'Mr. Doe' where id ==5 ;
update major set professor = 'Miss. Lopez' where id ==6 ;
''')
In order to query useful data from both 'student' and 'major' tables, we need to use
JOIN clause. JOIN operation is part of SELECT statement, its role is to link across tables. JOIN should be followed by
ON clause to define how you want the tables to be linked by which keys.
However, if use JOIN clause without a following ON clause will create table with all possible combinations of rows, which is to say, if 'student' has 10 rows, 'major' has 6 rows, then 10 x 6 rows of combinations will be created by JOIN with the ON as a match filter.
For JOIN multiple tables, use this format: FROM table1 JOIN table2 JOIN table3 ON table1.id = table2.id and table1.id = table3.id.
# join two tables
cursor.execute('''
select student.name, student.age, major.majorName, major.professor
from student join major
ON student.major == major.id;''')
print(len(cursor.fetchall()))
The print output is like this:
At this moment, although I've created two relational tables and successfully pulled useful data out of the two tables, without a foreign key constraint, actions of breaking links between the two tables can't be prevented. For example, for inserting a new row with a 'major' integer of 10 into the 'student' table, with a foreign key being created, this is not allowed, because no such 'majorName' with a reference of 10 can be found in the 'major' table.
So, let's add a foreign key in 'student' table that references to the 'major' table.
# add foreign key to an existing table
'''You can not use the ALTER TABLE statement to add a foreign key in SQLite.
Instead you will need to 1. rename the table, 2.create a new table with the foreign key,
and then 3. copy the data into the new table.'''
cursor.executescript('''
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
ALTER TABLE student RENAME TO _student_old;
CREATE TABLE student (
id integer primary key autoincrement not null unique,
name text not null,
age integer not null,
major integer not null,
country text ,
foreign key (major) references major (id)
);
INSERT INTO student SELECT * from _student_old;
COMMIT;
PRAGMA foreign_keys=ON;
''')
To add a foreign key to an existing table is a bit complicated. Instead of using ALTER TABLE directly, 3 steps should be taken to make this happen.
1. rename the table to '_student_old' ,
2. create a new table 'student' with the foreign key, and then
3. use INSERT INTO with SELECT * to copy the data from '_student_old' into the new table 'student'.
After foreign key being created, I want to test if it works for preventing links being broken.
# test foreign key constraits NOTE: each time the foreign key should be turned on first
cursor.executescript('''
PRAGMA foreign_keys=ON;
INSERT into student (name, age, major, country) values ('Fred', 25, 10, 'Italy');
''')
We can see the new row can't be inserted into the table due to the foreign key constraint.
Next, let's see how to use CONCAT and AS in statement and in table names. Actually, in SQLite, CONCAT is not allowed, instead it use concatenate operator (||) for joining two strings together. In the example below, I use student.name || '-' || student.age AS "NAME/AGE" for querying two columns from 'student' table and joining them as one data with the column name as 'NAME/AGE'. Since I used alias 's' for the table 'student', the 'student.name' is changed to 's.name'. Let's check out the code:
# use CONCAT and AS (alias) in statement, alias can also be used for table name
# SQLite does not support the CONCAT() function from SQL.
# Instead, it uses the concatenate operator (||) to join two strings into one.
cursor.execute('''
SELECT s.name || '-' || s.age AS "NAME/AGE", m.majorName as MAJOR, m.professor as PROF
FROM student as s
JOIN major as m
ON s.major = m.id;
''')
I executed the above SQLite command on DB Browser and get the table created like the following:
Next, let's talk about
FUNCTIONS in SQLite. There are actually many built-in
core scalar SQL functions that are available in SQLite. We'll only practice some aggregate functions here. SQLite has several built-in
aggregate functions, most common used are AVG, COUNT, MAX, MIN, SUM etc. :
# aggregate functions (AVG, MIN, MAX, SUM, COUNT, UPPER, LOWER)
cursor.execute('select avg(s.age) from student as s')
print(cursor.fetchall()) # output [(40.625, )]
cursor.execute('select min(s.age) from student as s')
print(cursor.fetchall()) # output: [(22,)]
SELECT statement has two optional clauses: GROUP BY and HAVING.
GROUP BY clause groups several rows into ONE summary row, for example, we can group data from a SELECT statement by 'age', then by 'last name', so you can see that following the GROUP BY clause should be a column name or a list of comma-separated columns' names, for specifying the group.
HAVING clause can only be used when GROUP BY clause is present. the HAVING clause filters group(s) results based on a specified condition, for the following example, I only need data if the condition 'count(age) >= 2' is met.
If there is a FROM or WHERE clause, they both should come before GROUP BY clause.
# combine aggregate functions with GROUP BY
cursor.execute('select age, count(age) from student group by age ')
print(cursor.fetchall())
# output: [(22, 2), (23, 1), (25, 1), (29, 1), (30, 1),
# (32, 2), (35, 2), (38, 1), (51, 1), (65, 1), (67, 1), (70, 1), (74, 1)]
# IF only need value of count to be more than 1, use HAVING for condition after group by
cursor.execute('select age, count(age) from student group by age Having count(age) >=2 ')
print(cursor.fetchall())
# output: [(22, 2), (32, 2), (35, 2)]
Before finish this practice, I deleted the old table with DROP TABLE for data cleaning.
# remove old table
cursor.execute('DROP Table if exists _student_old;')
con.commit()
con.close()
This is a very basic SQLite practice using Python Sqlite3 module. DB Browser is used for some handy executions and result display. For source code, please
go to github.