1 SQL and sqlite3#

1.1 Database#

Database Management System (DBMS): the software system that manages and interacts with databases, providing functionalities such as data storage, retrieval, manipulation, and security.

Database: the actual collection of data organized in a structured format.

Relational databases (or SQL databases) and nonrelational databases (or NoSQL databases) are two major types of database management systems (DBMS) that differ in their structure, storage models, and suitability for different types of applications.

Feature

Relational Databases

Non-Relational Databases

Schema

Fixed schema with predefined tables and columns

Flexible schema without fixed structure

Query Language

Structured Query Language (SQL)

Various (e.g., MongoDB Query Language)

Scalability

Vertical scaling (adding more power to a single server)

Horizontal scaling (adding more servers)

Data Model

Tabular, structured data

Various (document, key-value, graph)

Data Relationships

Enforced through foreign keys and joins

Managed within the application or inherent to the database type (e.g., graph)

Use Cases

Financial systems, Enterprise management applications, E-commerce

Large scale, Big data, Real-Time applications, Content management

Popular Databases

MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server, SQLite

MongoDB, Redis, Neo4j (Graph database)*

*Graph databases are not based on tables with rows and columns, so they are not SQL databases or traditional relational databases.

1.2 sqlite3#

SQLite is an embedded relational database management system (RDBMS). It’s a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine.

sqlite3 refers to both the C library used to interface with SQLite databases and the command-line tool provided with SQLite for interacting with SQLite databases. In addition, sqlite3 is a Python library that provides an interface for interacting with SQLite databases.

import sqlite3

1.2.1 Connect / create db#

# Connect/create to the database 
conn = sqlite3.connect('MCU.db')

1.2.2 Create a cursor#

A cursor is a database object used in various database management systems (DBMS) to retrieve, manipulate, and navigate through the rows of a result set obtained from a query.

cur = conn.cursor()
print(cur)
<sqlite3.Cursor object at 0x10b9e1140>

1.2.3 Execute a query#

# create table with multi-lines (use '''  ''')
cur.execute(
    '''
    CREATE TABLE MCU_Movies (
    movie_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    year INTEGER,
    date DATE,
    time TIME,
    score REAL)
    '''
    )
<sqlite3.Cursor at 0x10b9e1140>
# create data with a Python list
movie_data = [
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
]
# Insert data
# INSERT INTO: This is the standard SQL command to add new records (rows) into a specified table.
# OR IGNORE: This clause modifies the behaviour of the INSERT command. 
# If there is a conflict (such as a duplicate primary key or a unique constraint violation), 
# the OR IGNORE clause tells SQLite to ignore the insertion of that particular row instead of throwing an error. 
#This is useful for avoiding errors when you are not sure if the data might already exist in the table.
# The VALUES clause specifies the values to be inserted into the columns of the table.
# Each ? corresponds to a value for each column in the MCU_Movies table. 
# For example, if the table has four columns, the placeholders will be replaced with actual values when the query is executed.

cur.executemany("INSERT OR IGNORE INTO MCU_Movies VALUES(?,?,?,?,?,?)", movie_data)
<sqlite3.Cursor at 0x10b9e1140>

1.2.4. Result fetchall and fetchone#

.fetchall() is used to retrieve each row sequentially from the result set.

.fetchone() is a method used in database programming to retrieve the next row of a query result set.

#  The result of the execute
res = cur.execute('SELECT * FROM MCU_Movies')
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
 (1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
#  The result of the execute
res.execute('SELECT title FROM MCU_Movies')
res.fetchone()
('Iron Man',)
#  The result of the execute
res.execute('SELECT title FROM MCU_Movies')
res.fetchall()
[('Iron Man',),
 ('The Incredible Hulk',),
 ('Iron Man 2',),
 ('Thor',),
 ('Captain America: The First Avenger',),
 ('The Avengers',),
 ('Iron Man 3',),
 ('Thor: The Dark World',),
 ('Captain America: The Winter Soldier',),
 ('Guardians of the Galaxy',),
 ('Avengers: Age of Ultron',),
 ('Ant-Man',),
 ('Captain America: Civil War',),
 ('Doctor Strange',),
 ('Guardians of the Galaxy Vol. 2',),
 ('Spider-Man: Homecoming',),
 ('Thor: Ragnarok',),
 ('Black Panther',),
 ('Avengers: Infinity War',),
 ('Ant-Man and The Wasp',),
 ('Captain Marvel',),
 ('Avengers: Endgame',)]

1.2.5. Commit#

con.commit() is used in the context of database operations, specifically when working with SQL databases through a connection object (named con). This function is crucial for managing transactions.

# commit the connection object to the database (transaction from table MCU_movies to MCU.db now)
conn.commit()

1.2.6. Close the cursor#

# Close the cursor
cur.close()

1.2.7. Close the connection#

# Close the connection
conn.close()

1.3 SQL syntax (PART 1)#

In SQL, a clause is a component of a query that specifies a particular operation or condition. Each clause performs a specific function in the query and contributes to the overall task of retrieving, modifying, or managing data within a database.

In this section, we will practice primary clauses in SQL and compare them to Pandas.

1.3.1 SELECT FROM#

Identifies the columns and tables to be retrieved from the database.

# As the connection has been closed we need to connect to the database created in the last section
conn = sqlite3.connect('MCU.db')
# create cur 
cur = conn.cursor()
cur
<sqlite3.Cursor at 0x10b9e33c0>

Select all columns from the table

# select all columns from the table 
query = "SELECT * FROM MCU_Movies" 
res = cur.execute(query)
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
 (1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
# use pandas open the table
import pandas as pd
df = pd.read_sql_query(query, conn)
# To be notified, the DataFrame is saved in a different memory location with a different index compared to the cursor.
print(hex(id(df)))  
0x11f4f4350
df
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

Select one columns from the table

res = cur.execute("SELECT title FROM MCU_Movies")
res.fetchall()
[('Iron Man',),
 ('The Incredible Hulk',),
 ('Iron Man 2',),
 ('Thor',),
 ('Captain America: The First Avenger',),
 ('The Avengers',),
 ('Iron Man 3',),
 ('Thor: The Dark World',),
 ('Captain America: The Winter Soldier',),
 ('Guardians of the Galaxy',),
 ('Avengers: Age of Ultron',),
 ('Ant-Man',),
 ('Captain America: Civil War',),
 ('Doctor Strange',),
 ('Guardians of the Galaxy Vol. 2',),
 ('Spider-Man: Homecoming',),
 ('Thor: Ragnarok',),
 ('Black Panther',),
 ('Avengers: Infinity War',),
 ('Ant-Man and The Wasp',),
 ('Captain Marvel',),
 ('Avengers: Endgame',)]
df.title
0                                Iron Man
1                     The Incredible Hulk
2                              Iron Man 2
3                                    Thor
4      Captain America: The First Avenger
5                            The Avengers
6                              Iron Man 3
7                    Thor: The Dark World
8     Captain America: The Winter Soldier
9                 Guardians of the Galaxy
10                Avengers: Age of Ultron
11                                Ant-Man
12             Captain America: Civil War
13                         Doctor Strange
14         Guardians of the Galaxy Vol. 2
15                 Spider-Man: Homecoming
16                         Thor: Ragnarok
17                          Black Panther
18                 Avengers: Infinity War
19                   Ant-Man and The Wasp
20                         Captain Marvel
21                      Avengers: Endgame
Name: title, dtype: object
df['title']
0                                Iron Man
1                     The Incredible Hulk
2                              Iron Man 2
3                                    Thor
4      Captain America: The First Avenger
5                            The Avengers
6                              Iron Man 3
7                    Thor: The Dark World
8     Captain America: The Winter Soldier
9                 Guardians of the Galaxy
10                Avengers: Age of Ultron
11                                Ant-Man
12             Captain America: Civil War
13                         Doctor Strange
14         Guardians of the Galaxy Vol. 2
15                 Spider-Man: Homecoming
16                         Thor: Ragnarok
17                          Black Panther
18                 Avengers: Infinity War
19                   Ant-Man and The Wasp
20                         Captain Marvel
21                      Avengers: Endgame
Name: title, dtype: object

1.3.2 WHERE#

Filters the records based on specified conditions.

This section also introduces df.query(), which is a powerful data manipulation library in python. This method provides a more readable and concise way to filter rows based on column values, compared to using traditional indexing and boolean indexing.

score >= 8

res = cur.execute("SELECT * FROM MCU_Movies WHERE score >= 8")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.score >= 8]
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df.query('score >= 8')
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

year >= 2017

res = cur.execute("SELECT * FROM MCU_Movies WHERE year >= 2017")
res.fetchall()
[(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df.year.dtype 
dtype('int64')
df[df.year >= 2017] # 2017 is integer 
movie_id title year date time score
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

date < 2018-01-01

res = cur.execute("SELECT * FROM MCU_Movies WHERE date < '2018-01-01'")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
 (1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)]

Note:

When you read date or datetime columns from a SQL table into a pandas DataFrame and they are of object type (strings), you can still use them for filtering conditions directly, but it requires understanding how string comparisons work.

In pandas, if a date column is stored as an object type (string), you can still perform filtering operations because string comparisons in Python follow lexicographical order, which can be useful for date strings in the format ‘YYYY-MM-DD’ or ‘YYYY-MM-DD HH:MM:SS’.

This means you cannot use some datetime attribute like ‘date.dayofweek’ or ‘date.dayofyear’ unless the object type changed to pandas datetime type.

print(df.date.dtype) # Here, date and time column have not been changed to datetime.
object
type(df.date[1])
str
df[df.date < '2018-01-01']
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
df.query("date < '2018-01-01'")
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9

time < 12:20:01

res = cur.execute("SELECT * FROM MCU_Movies WHERE date < time < '12:20:01'")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
 (1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.time < '12:20:01']
movie_id title year date time score
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df.query("time < '12:20:01'")
movie_id title year date time score
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

title = Thor

res = cur.execute("SELECT * FROM MCU_Movies WHERE title = 'Thor'")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)]
df[df.title == 'Thor']
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
df.query("title == 'Thor'")
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0

1.3.3 AND, OR, NOT#

AND

res = cur.execute("SELECT title FROM MCU_Movies WHERE year > 2018 AND score > 7")
res.fetchall()
[('Avengers: Endgame',)]
df[(df.year > 2018) & (df.score > 7)].title
21    Avengers: Endgame
Name: title, dtype: object
df.query('year > 2018 & score > 7') .title
21    Avengers: Endgame
Name: title, dtype: object

OR

res = cur.execute("SELECT title FROM MCU_Movies WHERE date < '2012-01-01' OR score > 8")
res.fetchall()
[('Iron Man',),
 ('The Incredible Hulk',),
 ('Iron Man 2',),
 ('Thor',),
 ('Captain America: The First Avenger',),
 ('Avengers: Infinity War',),
 ('Avengers: Endgame',)]
df[(df.date < '2012-01-01') | (df.score > 8)].title
0                               Iron Man
1                    The Incredible Hulk
2                             Iron Man 2
3                                   Thor
4     Captain America: The First Avenger
18                Avengers: Infinity War
21                     Avengers: Endgame
Name: title, dtype: object
df.query('date < "2012-01-01" | score > 8').title # you can use '' or "" to quote the syntax. 
0                               Iron Man
1                    The Incredible Hulk
2                             Iron Man 2
3                                   Thor
4     Captain America: The First Avenger
18                Avengers: Infinity War
21                     Avengers: Endgame
Name: title, dtype: object

NOT

res = cur.execute("SELECT title FROM MCU_Movies WHERE NOT date < '2016-01-01' AND score != 6")
res.fetchall()
[('Captain America: Civil War',),
 ('Doctor Strange',),
 ('Guardians of the Galaxy Vol. 2',),
 ('Spider-Man: Homecoming',),
 ('Thor: Ragnarok',),
 ('Black Panther',),
 ('Avengers: Infinity War',),
 ('Ant-Man and The Wasp',),
 ('Captain Marvel',),
 ('Avengers: Endgame',)]
df[~(df.date < '2016-01-01') & (df.score != 6)].title
12        Captain America: Civil War
13                    Doctor Strange
14    Guardians of the Galaxy Vol. 2
15            Spider-Man: Homecoming
16                    Thor: Ragnarok
17                     Black Panther
18            Avengers: Infinity War
19              Ant-Man and The Wasp
20                    Captain Marvel
21                 Avengers: Endgame
Name: title, dtype: object
df.query('~ (date < "2016-01-01" &  score != 6)').title
12        Captain America: Civil War
13                    Doctor Strange
14    Guardians of the Galaxy Vol. 2
15            Spider-Man: Homecoming
16                    Thor: Ragnarok
17                     Black Panther
18            Avengers: Infinity War
19              Ant-Man and The Wasp
20                    Captain Marvel
21                 Avengers: Endgame
Name: title, dtype: object

1.3.4 COUNT and DISTINCT#

The COUNT function in SQL returns the number of rows.

COUNT(DISTINCT column) returns the number of unique, non-null values in a specified column.

res = cur.execute("SELECT COUNT(*) FROM MCU_Movies WHERE score > 8")
res.fetchall()
[(2,)]
len(df[df.score > 8])
2
res = cur.execute("SELECT  DISTINCT year FROM MCU_Movies")
res.fetchall()
[(2008,),
 (2010,),
 (2011,),
 (2012,),
 (2013,),
 (2014,),
 (2015,),
 (2016,),
 (2017,),
 (2018,),
 (2019,)]
df.year.unique()
array([2008, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019])
res = cur.execute("SELECT COUNT(DISTINCT year) FROM MCU_Movies")
res.fetchall()
[(11,)]
df.year.nunique()
11
len(df.year.unique())
11

1.3.5 MAX, MIN, AVG#

res = cur.execute("SELECT MIN(score) FROM MCU_Movies")
res.fetchall()
[(6.7,)]
df.score.min()
6.7
res = cur.execute("SELECT MAX(score) FROM MCU_Movies")
res.fetchall()
[(8.4,)]
df.score.max()
8.4
res = cur.execute("SELECT AVG(score) FROM MCU_Movies")
res.fetchall()
[(7.463636363636363,)]
 df.score.mean()
7.463636363636365

1.3.6 IN NOT IN#

IN operator checks if a value matches any value in a list or subquery.

NOT IN checks if a value does not match any value in the list or subquery.

IN

res = cur.execute("SELECT * FROM MCU_Movies WHERE year IN (2011, 2015, 2016)")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)]
select_years = [2011, 2015, 2016]
df[df.year.isin(select_years)]
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
df.query('year in [2011, 2015, 2016]')
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
df.query('year in @select_years') # the @ symbol is used to denote a variable name when constructing a query string with parameters.
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5

NOT IN

res = cur.execute("SELECT * FROM MCU_Movies WHERE year NOT IN (2008, 2011, 2012, 2013, 2015, 2016, 2017)")
res.fetchall()
[(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
select_years = [2008, 2011, 2012, 2013, 2015, 2016, 2017]
df[~df.year.isin(select_years)]
movie_id title year date time score
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df.query('year not in @select_years')
movie_id title year date time score
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

1.3.7 BETWEEN#

BETWEEN operator is used to filter the result set within a specified range, inclusive of the boundary values.

res = cur.execute("SELECT * FROM MCU_Movies WHERE score BETWEEN 8 AND 9")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.score.between(8, 9)]
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df[(df.score >= 8) & (df.score <= 9)]
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df.query('score >= 8 and score <= 9')
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4

1.3.8 LIKE#

LIKE operator is used to search for a specified pattern within a column.

res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE 'a%'") # upper and lower case is not sensitive in SQL
res.fetchall()
[(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.title.str.startswith('A')] # upper and lower case is sensitive in python 
movie_id title year date time score
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
df.query("title.str.startswith('A')")
movie_id title year date time score
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE '%k'")
res.fetchall()
[(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)]
df[df.title.str.endswith('k')]
movie_id title year date time score
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE 'T__r'")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)]

^: Start with

T: Followed by a literal ‘T’

..: Followed by any two characters

r: Followed by a literal ‘r’

$: End with

df[df.title.str.contains(r'^T..r$')]
movie_id title year date time score
3 1004 Thor 2011 2011-01-01 07:47:00 7.0

1.3.9 ORDER BY#

ORDER BY in SQL is used to sort the result set of a query based on specified columns in ascending or descending order.

res = cur.execute("SELECT * FROM MCU_Movies ORDER BY title DESC")
res.fetchall()
[(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
 (1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
 (1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
 (1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
 (1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
 (1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
 (1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
 (1009,
  'Captain America: The Winter Soldier',
  2014,
  '2014-11-12',
  '13:56:00',
  7.7),
 (1005,
  'Captain America: The First Avenger',
  2011,
  '2011-03-15',
  '06:33:00',
  6.9),
 (1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
 (1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
 (1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
 (1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4),
 (1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
 (1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
 (1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)]
df.sort_values(by='title', ascending=False)
movie_id title year date time score
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3

1.3.10 LIMIT / OFFSET#

LIMIT / OFFSET controls the number of rows returned by a query and to specify where to start returning rows from, respectively.

res = cur.execute("SELECT * FROM MCU_Movies LIMIT 3")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
 (1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
 (1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0)]
df.head(3)
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
df.iloc[0:3]
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
res = cur.execute("SELECT * FROM MCU_Movies LIMIT 3 OFFSET 5")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
 (1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
 (1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)]
df.iloc[5:8]
movie_id title year date time score
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9

1.4 SQL syntax (PART 2)#

PRAGMA

The PRAGMA statement in SQL is used to query or modify various internal parameters of the SQLite database engine.

res = cur.execute("PRAGMA table_info(MCU_movies)")
res.fetchall()
[(0, 'movie_id', 'INTEGER', 0, None, 1),
 (1, 'title', 'TEXT', 1, None, 0),
 (2, 'year', 'INTEGER', 0, None, 0),
 (3, 'date', 'DATE', 0, None, 0),
 (4, 'time', 'TIME', 0, None, 0),
 (5, 'score', 'REAL', 0, None, 0)]
# No primary key found in MCU_movies
res = cur.execute("PRAGMA primary_key_list(MCU_movies)")
res.fetchall()
[]
# No foreign key found in MCU_movies
res = cur.execute("PRAGMA foreign_key_list(MCU_movies)")
res.fetchall()
[]
# create table Actors 
res = cur.execute('''
    CREATE TABLE Actors (
    actor_id INTEGER PRIMARY KEY,
    actor_name TEXT,
    movie_id INTEGER,
    gender TEXT,
    nationality TEXT,
    date_birth DATE,
    FOREIGN KEY (movie_id) REFERENCES MCU_Movies(movie_id) 
)''')
res.fetchall()
[]

The sqlite_master table is a special system table that stores information about all the schema objects in the database, such as tables, indexes, triggers, and views.

# Here we have table Actors without any values
res = cur.execute("SELECT name FROM sqlite_master WHERE type='table'") 
res.fetchall()
[('MCU_Movies',), ('Actors',)]
actor_data = [
(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04'),
 (2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13'),
 (3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22'),
 (4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11'),
 (5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22'),
 (6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07'),
 (7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09'),
 (8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02'),
 (9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06'),
 (10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19'),
 (11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01'),
 (12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16'),
 (13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23'),
 (14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29'),
 (15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21'),
 (16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19'),
 (17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28'),
 (18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18'),
 (19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18'),
 (20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05'),
 (21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01'),
 (22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09'),
 (23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03'),
 (24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21'),
 (25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27'),
 (26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29'),
 (27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27'),
 (28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13'),
 (29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12'),
 (30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28'),
 (31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03'),
 (32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03'),
 (33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05'),
 (34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22'),
 (35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19'),
 (36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31'),
 (37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03'),
 (38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05'),
 (39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09'),
 (40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17'),
 (41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03'),
 (42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31'),
 (43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15'),
 (44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14'),
 (45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25'),
 (46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24'),
 (47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02'),
 (48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13'),
 (49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13'),
 (50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')]
cur.executemany("INSERT OR IGNORE INTO Actors VALUES(?,?,?,?,?,?)", actor_data) # Define the SQL statement with placeholders for multiple rows

conn.commit() # Commit the changes to the database
res = cur.execute("SELECT * FROM Actors")
res.fetchall()
[(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04'),
 (2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13'),
 (3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22'),
 (4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11'),
 (5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22'),
 (6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07'),
 (7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09'),
 (8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02'),
 (9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06'),
 (10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19'),
 (11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01'),
 (12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16'),
 (13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23'),
 (14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29'),
 (15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21'),
 (16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19'),
 (17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28'),
 (18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18'),
 (19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18'),
 (20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05'),
 (21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01'),
 (22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09'),
 (23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03'),
 (24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21'),
 (25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27'),
 (26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29'),
 (27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27'),
 (28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13'),
 (29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12'),
 (30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28'),
 (31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03'),
 (32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03'),
 (33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05'),
 (34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22'),
 (35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19'),
 (36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31'),
 (37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03'),
 (38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05'),
 (39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09'),
 (40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17'),
 (41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03'),
 (42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31'),
 (43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15'),
 (44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14'),
 (45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25'),
 (46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24'),
 (47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02'),
 (48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13'),
 (49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13'),
 (50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')]
# Check the foreign keys in Actors
res = cur.execute("PRAGMA foreign_key_list(Actors)")
print(res.fetchall())
[(0, 0, 'MCU_Movies', 'movie_id', 'movie_id', 'NO ACTION', 'NO ACTION', 'NONE')]

cur.description is a useful attribute for accessing metadata about query results, helping you to dynamically process and handle database query results based on column information.

# id: The ID of the foreign key constraint.
# seq: The sequence number of the column within the foreign key constraint.
# table: The name of the table that the foreign key references.
# from: The name of the column in the current table (in this case, Actors) that is part of the foreign key.
# to: The name of the column in the referenced table that the foreign key points to.
# on_update: The action that occurs if the referenced column is updated.
# on_delete: The action that occurs if the referenced column is deleted.
# match: The matching algorithm used for enforcing the foreign key constraint.
cur.description
(('id', None, None, None, None, None, None),
 ('seq', None, None, None, None, None, None),
 ('table', None, None, None, None, None, None),
 ('from', None, None, None, None, None, None),
 ('to', None, None, None, None, None, None),
 ('on_update', None, None, None, None, None, None),
 ('on_delete', None, None, None, None, None, None),
 ('match', None, None, None, None, None, None))
cur.close()
conn.close()

1.4.1 JOIN#

The JOIN in SQL is used to combine rows from two or more tables based on a related column between them.

# create connection with db
conn = sqlite3.connect('MCU.db')
# create cur
cur = conn.cursor()
# load movie table as df_m
df_m = pd.read_sql_query('SELECT * FROM MCU_movies', conn)
df_m.head()
movie_id title year date time score
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0
3 1004 Thor 2011 2011-01-01 07:47:00 7.0
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9
# load actor table as df_a
df_a = pd.read_sql_query('SELECT * FROM Actors', conn)
df_a.head()
actor_id actor_name movie_id gender nationality date_birth
0 1 Robert Downey Jr. 1001 Male American 1965-04-04
1 2 Chris Evans 1005 Male American 1981-06-13
2 3 Mark Ruffalo 1006 Male American 1967-11-22
3 4 Chris Hemsworth 1004 Male Australian 1983-08-11
4 5 Scarlett Johansson 1006 Female American 1984-11-22

INNER JOIN

res = cur.execute('''
    SELECT  *
    FROM MCU_movies
    INNER JOIN Actors
    ON MCU_movies.movie_id = Actors.movie_id
    ORDER BY MCU_movies.movie_id
 ''')
for i in res.fetchall():
    print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04')
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09')
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03')
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='inner', validate='m:m')
title year date time score actor_id actor_name gender nationality date_birth
movie_id
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. Male American 1965-04-04
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8 Clark Gregg Male American 1962-04-02
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23 Tessa Thompson Female American 1983-10-03
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31 Pom Klementieff Female French 1986-05-03
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45 Lee Pace Male American 1979-03-25
1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38 Hayley Atwell Female British 1982-04-05
1004 Thor 2011 2011-01-01 07:47:00 7.0 4 Chris Hemsworth Male Australian 1983-08-11
1004 Thor 2011 2011-01-01 07:47:00 7.0 7 Tom Hiddleston Male British 1981-02-09
1004 Thor 2011 2011-01-01 07:47:00 7.0 16 Zoe Saldana Female American 1978-06-19
1004 Thor 2011 2011-01-01 07:47:00 7.0 40 Rachel McAdams Female Canadian 1978-11-17
1004 Thor 2011 2011-01-01 07:47:00 7.0 42 Letitia Wright Female British 1993-10-31
1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2 Chris Evans Male American 1981-06-13
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3 Mark Ruffalo Male American 1967-11-22
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5 Scarlett Johansson Female American 1984-11-22
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6 Jeremy Renner Male American 1971-01-07
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20 Bradley Cooper Male American 1975-01-05
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47 Clark Gregg Male American 1962-04-02
1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28 Sebastian Stan Male Romanian 1982-08-13
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11 Tom Holland Male British 1996-06-01
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14 Chadwick Boseman Male American 1976-11-29
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17 Karen Gillan Female British 1987-11-28
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21 Brie Larson Female American 1989-10-01
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39 Natalie Portman Female Israeli 1981-06-09
1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41 Benedict Wong Male British 1971-07-03
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9 Paul Rudd Male American 1969-04-06
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24 Samuel L. Jackson Male American 1948-12-21
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18 Dave Bautista Male American 1969-01-18
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19 Vin Diesel Male American 1967-07-18
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25 Gwyneth Paltrow Female American 1972-09-27
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36 Ben Kingsley Male British 1943-12-31
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10 Benedict Cumberbatch Male British 1976-07-19
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26 Don Cheadle Male American 1964-11-29
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46 Djimon Hounsou Male Beninese 1964-04-24
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49 Stellan Skarsgård Male Swedish 1951-06-13
1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50 Idris Elba Male British 1972-09-06
1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44 Danai Gurira Female American 1978-02-14
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13 Anthony Mackie Male American 1978-09-23
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15 Chris Pratt Male American 1979-06-21
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35 Jon Favreau Male American 1966-10-19
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43 Winston Duke Male Tobagonian 1986-11-15
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48 Kat Dennings Female American 1986-06-13
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29 Josh Brolin Male American 1968-02-12
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30 Karen Gillan Female British 1987-11-28
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33 Tilda Swinton Female British 1960-11-05
1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32 Evangeline Lilly Female Canadian 1979-08-03
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27 Paul Bettany Male British 1971-05-27
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34 Jeff Goldblum Male American 1952-10-22
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12 Elizabeth Olsen Female American 1989-02-16
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22 Michael B. Jordan Male American 1987-02-09
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37 Cobie Smulders Female Canadian 1982-04-03
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='inner')
movie_id title year date time score actor_id actor_name gender nationality date_birth
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. Male American 1965-04-04
1 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8 Clark Gregg Male American 1962-04-02
2 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23 Tessa Thompson Female American 1983-10-03
3 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31 Pom Klementieff Female French 1986-05-03
4 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45 Lee Pace Male American 1979-03-25
5 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38 Hayley Atwell Female British 1982-04-05
6 1004 Thor 2011 2011-01-01 07:47:00 7.0 4 Chris Hemsworth Male Australian 1983-08-11
7 1004 Thor 2011 2011-01-01 07:47:00 7.0 7 Tom Hiddleston Male British 1981-02-09
8 1004 Thor 2011 2011-01-01 07:47:00 7.0 16 Zoe Saldana Female American 1978-06-19
9 1004 Thor 2011 2011-01-01 07:47:00 7.0 40 Rachel McAdams Female Canadian 1978-11-17
10 1004 Thor 2011 2011-01-01 07:47:00 7.0 42 Letitia Wright Female British 1993-10-31
11 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2 Chris Evans Male American 1981-06-13
12 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3 Mark Ruffalo Male American 1967-11-22
13 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5 Scarlett Johansson Female American 1984-11-22
14 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6 Jeremy Renner Male American 1971-01-07
15 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20 Bradley Cooper Male American 1975-01-05
16 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47 Clark Gregg Male American 1962-04-02
17 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28 Sebastian Stan Male Romanian 1982-08-13
18 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11 Tom Holland Male British 1996-06-01
19 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14 Chadwick Boseman Male American 1976-11-29
20 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17 Karen Gillan Female British 1987-11-28
21 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21 Brie Larson Female American 1989-10-01
22 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39 Natalie Portman Female Israeli 1981-06-09
23 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41 Benedict Wong Male British 1971-07-03
24 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9 Paul Rudd Male American 1969-04-06
25 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24 Samuel L. Jackson Male American 1948-12-21
26 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18 Dave Bautista Male American 1969-01-18
27 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19 Vin Diesel Male American 1967-07-18
28 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25 Gwyneth Paltrow Female American 1972-09-27
29 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36 Ben Kingsley Male British 1943-12-31
30 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10 Benedict Cumberbatch Male British 1976-07-19
31 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26 Don Cheadle Male American 1964-11-29
32 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46 Djimon Hounsou Male Beninese 1964-04-24
33 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49 Stellan Skarsgård Male Swedish 1951-06-13
34 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50 Idris Elba Male British 1972-09-06
35 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44 Danai Gurira Female American 1978-02-14
36 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13 Anthony Mackie Male American 1978-09-23
37 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15 Chris Pratt Male American 1979-06-21
38 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35 Jon Favreau Male American 1966-10-19
39 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43 Winston Duke Male Tobagonian 1986-11-15
40 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48 Kat Dennings Female American 1986-06-13
41 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29 Josh Brolin Male American 1968-02-12
42 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30 Karen Gillan Female British 1987-11-28
43 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33 Tilda Swinton Female British 1960-11-05
44 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32 Evangeline Lilly Female Canadian 1979-08-03
45 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27 Paul Bettany Male British 1971-05-27
46 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34 Jeff Goldblum Male American 1952-10-22
47 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12 Elizabeth Olsen Female American 1989-02-16
48 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22 Michael B. Jordan Male American 1987-02-09
49 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37 Cobie Smulders Female Canadian 1982-04-03

LEFT JOIN

res = cur.execute('''
    SELECT  *
    FROM MCU_movies
    LEFT JOIN Actors
    ON MCU_movies.movie_id = Actors.movie_id
    ORDER BY MCU_movies.movie_id
 ''')
for i in res.fetchall():
    print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02')
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01')
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7, None, None, None, None, None, None)
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0, None, None, None, None, None, None)
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09')
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='left', validate='m:m')
title year date time score actor_id actor_name gender nationality date_birth
movie_id
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1.0 Robert Downey Jr. Male American 1965-04-04
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8.0 Clark Gregg Male American 1962-04-02
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23.0 Tessa Thompson Female American 1983-10-03
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31.0 Pom Klementieff Female French 1986-05-03
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45.0 Lee Pace Male American 1979-03-25
1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38.0 Hayley Atwell Female British 1982-04-05
1004 Thor 2011 2011-01-01 07:47:00 7.0 4.0 Chris Hemsworth Male Australian 1983-08-11
1004 Thor 2011 2011-01-01 07:47:00 7.0 7.0 Tom Hiddleston Male British 1981-02-09
1004 Thor 2011 2011-01-01 07:47:00 7.0 16.0 Zoe Saldana Female American 1978-06-19
1004 Thor 2011 2011-01-01 07:47:00 7.0 40.0 Rachel McAdams Female Canadian 1978-11-17
1004 Thor 2011 2011-01-01 07:47:00 7.0 42.0 Letitia Wright Female British 1993-10-31
1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2.0 Chris Evans Male American 1981-06-13
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3.0 Mark Ruffalo Male American 1967-11-22
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5.0 Scarlett Johansson Female American 1984-11-22
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6.0 Jeremy Renner Male American 1971-01-07
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20.0 Bradley Cooper Male American 1975-01-05
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47.0 Clark Gregg Male American 1962-04-02
1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28.0 Sebastian Stan Male Romanian 1982-08-13
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11.0 Tom Holland Male British 1996-06-01
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14.0 Chadwick Boseman Male American 1976-11-29
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17.0 Karen Gillan Female British 1987-11-28
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21.0 Brie Larson Female American 1989-10-01
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39.0 Natalie Portman Female Israeli 1981-06-09
1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7 NaN NaN NaN NaN NaN
1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0 NaN NaN NaN NaN NaN
1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41.0 Benedict Wong Male British 1971-07-03
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9.0 Paul Rudd Male American 1969-04-06
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24.0 Samuel L. Jackson Male American 1948-12-21
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18.0 Dave Bautista Male American 1969-01-18
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19.0 Vin Diesel Male American 1967-07-18
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25.0 Gwyneth Paltrow Female American 1972-09-27
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36.0 Ben Kingsley Male British 1943-12-31
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10.0 Benedict Cumberbatch Male British 1976-07-19
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26.0 Don Cheadle Male American 1964-11-29
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46.0 Djimon Hounsou Male Beninese 1964-04-24
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49.0 Stellan Skarsgård Male Swedish 1951-06-13
1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50.0 Idris Elba Male British 1972-09-06
1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44.0 Danai Gurira Female American 1978-02-14
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13.0 Anthony Mackie Male American 1978-09-23
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15.0 Chris Pratt Male American 1979-06-21
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35.0 Jon Favreau Male American 1966-10-19
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43.0 Winston Duke Male Tobagonian 1986-11-15
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48.0 Kat Dennings Female American 1986-06-13
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29.0 Josh Brolin Male American 1968-02-12
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30.0 Karen Gillan Female British 1987-11-28
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33.0 Tilda Swinton Female British 1960-11-05
1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32.0 Evangeline Lilly Female Canadian 1979-08-03
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27.0 Paul Bettany Male British 1971-05-27
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34.0 Jeff Goldblum Male American 1952-10-22
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12.0 Elizabeth Olsen Female American 1989-02-16
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22.0 Michael B. Jordan Male American 1987-02-09
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37.0 Cobie Smulders Female Canadian 1982-04-03
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='left')
movie_id title year date time score actor_id actor_name gender nationality date_birth
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1.0 Robert Downey Jr. Male American 1965-04-04
1 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8.0 Clark Gregg Male American 1962-04-02
2 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23.0 Tessa Thompson Female American 1983-10-03
3 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31.0 Pom Klementieff Female French 1986-05-03
4 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45.0 Lee Pace Male American 1979-03-25
5 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38.0 Hayley Atwell Female British 1982-04-05
6 1004 Thor 2011 2011-01-01 07:47:00 7.0 4.0 Chris Hemsworth Male Australian 1983-08-11
7 1004 Thor 2011 2011-01-01 07:47:00 7.0 7.0 Tom Hiddleston Male British 1981-02-09
8 1004 Thor 2011 2011-01-01 07:47:00 7.0 16.0 Zoe Saldana Female American 1978-06-19
9 1004 Thor 2011 2011-01-01 07:47:00 7.0 40.0 Rachel McAdams Female Canadian 1978-11-17
10 1004 Thor 2011 2011-01-01 07:47:00 7.0 42.0 Letitia Wright Female British 1993-10-31
11 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2.0 Chris Evans Male American 1981-06-13
12 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3.0 Mark Ruffalo Male American 1967-11-22
13 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5.0 Scarlett Johansson Female American 1984-11-22
14 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6.0 Jeremy Renner Male American 1971-01-07
15 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20.0 Bradley Cooper Male American 1975-01-05
16 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47.0 Clark Gregg Male American 1962-04-02
17 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28.0 Sebastian Stan Male Romanian 1982-08-13
18 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11.0 Tom Holland Male British 1996-06-01
19 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14.0 Chadwick Boseman Male American 1976-11-29
20 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17.0 Karen Gillan Female British 1987-11-28
21 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21.0 Brie Larson Female American 1989-10-01
22 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39.0 Natalie Portman Female Israeli 1981-06-09
23 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7 NaN NaN NaN NaN NaN
24 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0 NaN NaN NaN NaN NaN
25 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41.0 Benedict Wong Male British 1971-07-03
26 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9.0 Paul Rudd Male American 1969-04-06
27 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24.0 Samuel L. Jackson Male American 1948-12-21
28 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18.0 Dave Bautista Male American 1969-01-18
29 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19.0 Vin Diesel Male American 1967-07-18
30 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25.0 Gwyneth Paltrow Female American 1972-09-27
31 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36.0 Ben Kingsley Male British 1943-12-31
32 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10.0 Benedict Cumberbatch Male British 1976-07-19
33 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26.0 Don Cheadle Male American 1964-11-29
34 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46.0 Djimon Hounsou Male Beninese 1964-04-24
35 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49.0 Stellan Skarsgård Male Swedish 1951-06-13
36 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50.0 Idris Elba Male British 1972-09-06
37 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44.0 Danai Gurira Female American 1978-02-14
38 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13.0 Anthony Mackie Male American 1978-09-23
39 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15.0 Chris Pratt Male American 1979-06-21
40 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35.0 Jon Favreau Male American 1966-10-19
41 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43.0 Winston Duke Male Tobagonian 1986-11-15
42 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48.0 Kat Dennings Female American 1986-06-13
43 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29.0 Josh Brolin Male American 1968-02-12
44 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30.0 Karen Gillan Female British 1987-11-28
45 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33.0 Tilda Swinton Female British 1960-11-05
46 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32.0 Evangeline Lilly Female Canadian 1979-08-03
47 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27.0 Paul Bettany Male British 1971-05-27
48 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34.0 Jeff Goldblum Male American 1952-10-22
49 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12.0 Elizabeth Olsen Female American 1989-02-16
50 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22.0 Michael B. Jordan Male American 1987-02-09
51 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37.0 Cobie Smulders Female Canadian 1982-04-03

RIGHT JOIN

you may get an error message in sqlite3: RIGHT and FULL OUTER JOINs are not currently supported.

By swapping the tables in the LEFT JOIN, you get the same result as a RIGHT JOIN.

# OperationalError: RIGHT and FULL OUTER JOINs are not currently supported
# res = cur.execute('''
#     SELECT  *
#     FROM MCU_movies
#     RIGHT JOIN Actors
#     ON MCU_movies.movie_id = Actors.movie_id
#     ORDER BY MCU_movies.movie_id
#  ''')
# for i in res.fetchall():
#     print(i)
# swapping left and ringht table
res = cur.execute('''
    SELECT  *
    FROM Actors
    LEFT JOIN MCU_movies
    ON MCU_movies.movie_id = Actors.movie_id
    ORDER BY Actors.actor_id
 ''')
for i in res.fetchall():
    print(i)
(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04', 1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9)
(2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13', 1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9)
(3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02', 1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9)
(9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06', 1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)
(10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23', 1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)
(14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21', 1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)
(16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21', 1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)
(25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27', 1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9)
(28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13', 1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2)
(29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03', 1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1)
(33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22', 1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9)
(35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05', 1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0)
(39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03', 1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3)
(42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14', 1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4)
(45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06', 1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6)
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='right', validate='m:m')
title year date time score actor_id actor_name gender nationality date_birth
movie_id
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. Male American 1965-04-04
1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2 Chris Evans Male American 1981-06-13
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3 Mark Ruffalo Male American 1967-11-22
1004 Thor 2011 2011-01-01 07:47:00 7.0 4 Chris Hemsworth Male Australian 1983-08-11
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5 Scarlett Johansson Female American 1984-11-22
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6 Jeremy Renner Male American 1971-01-07
1004 Thor 2011 2011-01-01 07:47:00 7.0 7 Tom Hiddleston Male British 1981-02-09
1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8 Clark Gregg Male American 1962-04-02
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9 Paul Rudd Male American 1969-04-06
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10 Benedict Cumberbatch Male British 1976-07-19
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11 Tom Holland Male British 1996-06-01
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12 Elizabeth Olsen Female American 1989-02-16
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13 Anthony Mackie Male American 1978-09-23
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14 Chadwick Boseman Male American 1976-11-29
1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15 Chris Pratt Male American 1979-06-21
1004 Thor 2011 2011-01-01 07:47:00 7.0 16 Zoe Saldana Female American 1978-06-19
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17 Karen Gillan Female British 1987-11-28
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18 Dave Bautista Male American 1969-01-18
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19 Vin Diesel Male American 1967-07-18
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20 Bradley Cooper Male American 1975-01-05
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21 Brie Larson Female American 1989-10-01
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22 Michael B. Jordan Male American 1987-02-09
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23 Tessa Thompson Female American 1983-10-03
1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24 Samuel L. Jackson Male American 1948-12-21
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25 Gwyneth Paltrow Female American 1972-09-27
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26 Don Cheadle Male American 1964-11-29
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27 Paul Bettany Male British 1971-05-27
1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28 Sebastian Stan Male Romanian 1982-08-13
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29 Josh Brolin Male American 1968-02-12
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30 Karen Gillan Female British 1987-11-28
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31 Pom Klementieff Female French 1986-05-03
1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32 Evangeline Lilly Female Canadian 1979-08-03
1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33 Tilda Swinton Female British 1960-11-05
1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34 Jeff Goldblum Male American 1952-10-22
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35 Jon Favreau Male American 1966-10-19
1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36 Ben Kingsley Male British 1943-12-31
1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37 Cobie Smulders Female Canadian 1982-04-03
1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38 Hayley Atwell Female British 1982-04-05
1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39 Natalie Portman Female Israeli 1981-06-09
1004 Thor 2011 2011-01-01 07:47:00 7.0 40 Rachel McAdams Female Canadian 1978-11-17
1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41 Benedict Wong Male British 1971-07-03
1004 Thor 2011 2011-01-01 07:47:00 7.0 42 Letitia Wright Female British 1993-10-31
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43 Winston Duke Male Tobagonian 1986-11-15
1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44 Danai Gurira Female American 1978-02-14
1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45 Lee Pace Male American 1979-03-25
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46 Djimon Hounsou Male Beninese 1964-04-24
1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47 Clark Gregg Male American 1962-04-02
1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48 Kat Dennings Female American 1986-06-13
1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49 Stellan Skarsgård Male Swedish 1951-06-13
1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50 Idris Elba Male British 1972-09-06
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='right')
movie_id title year date time score actor_id actor_name gender nationality date_birth
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. Male American 1965-04-04
1 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 2 Chris Evans Male American 1981-06-13
2 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 3 Mark Ruffalo Male American 1967-11-22
3 1004 Thor 2011 2011-01-01 07:47:00 7.0 4 Chris Hemsworth Male Australian 1983-08-11
4 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 5 Scarlett Johansson Female American 1984-11-22
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 6 Jeremy Renner Male American 1971-01-07
6 1004 Thor 2011 2011-01-01 07:47:00 7.0 7 Tom Hiddleston Male British 1981-02-09
7 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 8 Clark Gregg Male American 1962-04-02
8 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 9 Paul Rudd Male American 1969-04-06
9 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 10 Benedict Cumberbatch Male British 1976-07-19
10 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 11 Tom Holland Male British 1996-06-01
11 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 12 Elizabeth Olsen Female American 1989-02-16
12 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 13 Anthony Mackie Male American 1978-09-23
13 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 14 Chadwick Boseman Male American 1976-11-29
14 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 15 Chris Pratt Male American 1979-06-21
15 1004 Thor 2011 2011-01-01 07:47:00 7.0 16 Zoe Saldana Female American 1978-06-19
16 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 17 Karen Gillan Female British 1987-11-28
17 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 18 Dave Bautista Male American 1969-01-18
18 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 19 Vin Diesel Male American 1967-07-18
19 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 20 Bradley Cooper Male American 1975-01-05
20 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 21 Brie Larson Female American 1989-10-01
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 22 Michael B. Jordan Male American 1987-02-09
22 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 23 Tessa Thompson Female American 1983-10-03
23 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 24 Samuel L. Jackson Male American 1948-12-21
24 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 25 Gwyneth Paltrow Female American 1972-09-27
25 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 26 Don Cheadle Male American 1964-11-29
26 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 27 Paul Bettany Male British 1971-05-27
27 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 28 Sebastian Stan Male Romanian 1982-08-13
28 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 29 Josh Brolin Male American 1968-02-12
29 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 30 Karen Gillan Female British 1987-11-28
30 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 31 Pom Klementieff Female French 1986-05-03
31 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 32 Evangeline Lilly Female Canadian 1979-08-03
32 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 33 Tilda Swinton Female British 1960-11-05
33 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 34 Jeff Goldblum Male American 1952-10-22
34 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 35 Jon Favreau Male American 1966-10-19
35 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 36 Ben Kingsley Male British 1943-12-31
36 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 37 Cobie Smulders Female Canadian 1982-04-03
37 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 38 Hayley Atwell Female British 1982-04-05
38 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 39 Natalie Portman Female Israeli 1981-06-09
39 1004 Thor 2011 2011-01-01 07:47:00 7.0 40 Rachel McAdams Female Canadian 1978-11-17
40 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 41 Benedict Wong Male British 1971-07-03
41 1004 Thor 2011 2011-01-01 07:47:00 7.0 42 Letitia Wright Female British 1993-10-31
42 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 43 Winston Duke Male Tobagonian 1986-11-15
43 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 44 Danai Gurira Female American 1978-02-14
44 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 45 Lee Pace Male American 1979-03-25
45 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 46 Djimon Hounsou Male Beninese 1964-04-24
46 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 47 Clark Gregg Male American 1962-04-02
47 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 48 Kat Dennings Female American 1986-06-13
48 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 49 Stellan Skarsgård Male Swedish 1951-06-13
49 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 50 Idris Elba Male British 1972-09-06

OUTER JOIN

we use LEFT JOIN and UNION to create the OUTER JOIN in sqlite3

# OperationalError: RIGHT and FULL OUTER JOINs are not currently supported
res = cur.execute('''
    SELECT M.title, A.actor_name
    FROM (SELECT * FROM MCU_movies LIMIT 5) AS M 
    LEFT JOIN (SELECT * FROM Actors LIMIT 5) AS A 
    ON M.movie_id = A.movie_id
    UNION
    SELECT M.title, A.actor_name
    FROM (SELECT * FROM Actors LIMIT 5) AS A
    LEFT JOIN (SELECT * FROM MCU_movies LIMIT 5) AS M
    ON M.movie_id = A.movie_id
    ORDER BY A.actor_name ASC 
 ''')
for i in res.fetchall():
    print(i)
('Iron Man 2', None)
('The Incredible Hulk', None)
('Captain America: The First Avenger', 'Chris Evans')
('Thor', 'Chris Hemsworth')
(None, 'Mark Ruffalo')
('Iron Man', 'Robert Downey Jr.')
(None, 'Scarlett Johansson')
df_m.set_index('movie_id').iloc[0:5].join(df_a.set_index('movie_id').iloc[0:5], how='outer', validate='m:m').sort_values(['actor_name'], ascending=True)[['title','actor_name']]
title actor_name
movie_id
1005 Captain America: The First Avenger Chris Evans
1004 Thor Chris Hemsworth
1006 NaN Mark Ruffalo
1001 Iron Man Robert Downey Jr.
1006 NaN Scarlett Johansson
1002 The Incredible Hulk NaN
1003 Iron Man 2 NaN
pd.merge(df_m.iloc[0:5], df_a.iloc[0:5], left_on='movie_id', right_on='movie_id', how='outer').sort_values(['actor_name'], ascending=True)[['title','actor_name']]
title actor_name
4 Captain America: The First Avenger Chris Evans
3 Thor Chris Hemsworth
5 NaN Mark Ruffalo
0 Iron Man Robert Downey Jr.
6 NaN Scarlett Johansson
1 The Incredible Hulk NaN
2 Iron Man 2 NaN

CROSS

res = cur.execute('''
    SELECT M.title, A.actor_name
    FROM (SELECT * FROM MCU_movies LIMIT 3) AS M 
    CROSS JOIN (SELECT * FROM Actors LIMIT 3) AS A 
 ''')

for i in res.fetchall():
    print(i)
('Iron Man', 'Robert Downey Jr.')
('Iron Man', 'Chris Evans')
('Iron Man', 'Mark Ruffalo')
('The Incredible Hulk', 'Robert Downey Jr.')
('The Incredible Hulk', 'Chris Evans')
('The Incredible Hulk', 'Mark Ruffalo')
('Iron Man 2', 'Robert Downey Jr.')
('Iron Man 2', 'Chris Evans')
('Iron Man 2', 'Mark Ruffalo')
df_m.iloc[0:3].join(df_a.iloc[0:3], validate='m:m', how='cross', lsuffix='_l', rsuffix='_r')
movie_id_l title year date time score actor_id actor_name movie_id_r gender nationality date_birth
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. 1001 Male American 1965-04-04
1 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 2 Chris Evans 1005 Male American 1981-06-13
2 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 3 Mark Ruffalo 1006 Male American 1967-11-22
3 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 1 Robert Downey Jr. 1001 Male American 1965-04-04
4 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 2 Chris Evans 1005 Male American 1981-06-13
5 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 3 Mark Ruffalo 1006 Male American 1967-11-22
6 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 1 Robert Downey Jr. 1001 Male American 1965-04-04
7 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 2 Chris Evans 1005 Male American 1981-06-13
8 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 3 Mark Ruffalo 1006 Male American 1967-11-22
pd.merge(df_m[0:3], df_a[0:3], how='cross')
movie_id_x title year date time score actor_id actor_name movie_id_y gender nationality date_birth
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 1 Robert Downey Jr. 1001 Male American 1965-04-04
1 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 2 Chris Evans 1005 Male American 1981-06-13
2 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 3 Mark Ruffalo 1006 Male American 1967-11-22
3 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 1 Robert Downey Jr. 1001 Male American 1965-04-04
4 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 2 Chris Evans 1005 Male American 1981-06-13
5 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 3 Mark Ruffalo 1006 Male American 1967-11-22
6 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 1 Robert Downey Jr. 1001 Male American 1965-04-04
7 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 2 Chris Evans 1005 Male American 1981-06-13
8 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 3 Mark Ruffalo 1006 Male American 1967-11-22

1.4.2 UNION#

The UNION in SQL combines the results of two or more SELECT statements into a single result set.

res = cur.execute('''
    SELECT movie_id
    FROM (SELECT movie_id FROM MCU_movies LIMIT 5)
    UNION
    SELECT movie_id
    FROM (SELECT movie_id FROM Actors LIMIT 5);
    ''')

for i in res.fetchall():
    print(i)
(1001,)
(1002,)
(1003,)
(1004,)
(1005,)
(1006,)
pd.concat([df_m.iloc[0:5][['movie_id']], df_a.iloc[0:5][['movie_id']]]).movie_id.unique()
array([1001, 1002, 1003, 1004, 1005, 1006])

1.4.3 CASE#

The CASE statement is used to perform conditional logic within queries, allowing for conditional evaluation and result selection based on specified conditions.

res = cur.execute('''
SELECT *,
  CASE
  WHEN score >= 8 THEN 'High'
  WHEN score >= 7 THEN 'Medium'
  ELSE 'Low'
  END AS quality
  FROM MCU_movies
  ''')
for i in res.fetchall():
    print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 'Medium')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 'Low')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 'Medium')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 'Medium')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 'Low')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 'High')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 'Medium')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 'Low')
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7, 'Medium')
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0, 'High')
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 'Medium')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 'Medium')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 'Medium')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 'Medium')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 'Medium')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 'Medium')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 'Medium')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 'Medium')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 'High')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 'Medium')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 'Low')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 'High')
# use lambda in pandas
df_m['quality'] = df.score.apply(lambda x: 'High' if x >= 8 else ('Medium' if x >= 7 else 'Low'))
df_m
movie_id title year date time score quality
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 Medium
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 Low
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 Medium
3 1004 Thor 2011 2011-01-01 07:47:00 7.0 Medium
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 Low
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 High
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 Medium
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 Low
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7 Medium
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0 High
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 Medium
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 Medium
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 Medium
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 Medium
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 Medium
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 Medium
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 Medium
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 Medium
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 High
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 Medium
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 Low
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 High
# use a definition in python
def quality_level(x):
    if x >= 8:
        y ='High'
    elif x >= 7:
        y = 'Medium'
    else: 
        y = 'Low'
    return y
df_m['quality_1'] = df.score.apply(lambda x: quality_level(x))
df_m
movie_id title year date time score quality quality_1
0 1001 Iron Man 2008 2008-01-01 15:22:00 7.9 Medium Medium
1 1002 The Incredible Hulk 2008 2008-03-04 21:16:00 6.7 Low Low
2 1003 Iron Man 2 2010 2010-01-01 05:38:00 7.0 Medium Medium
3 1004 Thor 2011 2011-01-01 07:47:00 7.0 Medium Medium
4 1005 Captain America: The First Avenger 2011 2011-03-15 06:33:00 6.9 Low Low
5 1006 The Avengers 2012 2012-05-10 08:30:00 8.0 High High
6 1007 Iron Man 3 2013 2013-07-20 19:12:00 7.2 Medium Medium
7 1008 Thor: The Dark World 2013 2013-09-25 18:27:00 6.9 Low Low
8 1009 Captain America: The Winter Soldier 2014 2014-11-12 13:56:00 7.7 Medium Medium
9 1010 Guardians of the Galaxy 2014 2014-12-03 22:25:00 8.0 High High
10 1011 Avengers: Age of Ultron 2015 2015-02-21 20:50:00 7.3 Medium Medium
11 1012 Ant-Man 2015 2015-04-30 17:01:00 7.3 Medium Medium
12 1013 Captain America: Civil War 2016 2016-06-06 10:04:00 7.8 Medium Medium
13 1014 Doctor Strange 2016 2016-08-16 23:11:00 7.5 Medium Medium
14 1015 Guardians of the Galaxy Vol. 2 2017 2017-10-20 12:41:00 7.6 Medium Medium
15 1016 Spider-Man: Homecoming 2017 2017-12-14 02:39:00 7.4 Medium Medium
16 1017 Thor: Ragnarok 2017 2017-11-03 11:54:00 7.9 Medium Medium
17 1018 Black Panther 2018 2018-04-23 03:05:00 7.3 Medium Medium
18 1019 Avengers: Infinity War 2018 2018-04-27 14:09:00 8.4 High High
19 1020 Ant-Man and The Wasp 2018 2018-07-06 16:29:00 7.1 Medium Medium
20 1021 Captain Marvel 2019 2019-03-08 14:19:00 6.9 Low Low
21 1022 Avengers: Endgame 2019 2019-01-01 01:30:00 8.4 High High