A pd.Series object contains a one-dimensional np.ndarray.
When you print the pd.Series, the np.ndarray is the column on the right.

A pd.Series object also contains a parallel column, called the index of the pd.Series.
When you print the pd.Series, the index is the column on the left.
We didn’t ask for an index, but we get one automatically.
The index defaults to a pd.RangeIndex containing the consecutive np.int64s 0, 1, 2, ….
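As a quick check of the default index described above, the following sketch builds a small pd.Series without passing an index and inspects what pandas supplies. The variable names are illustrative only.

```python
import pandas as pd

#No index argument, so pandas supplies the default one.
series = pd.Series([0.0, 10.0, 20.0], name = "temperature")

index = series.index
print(f"{type(index).__name__ = }")
print(f"{index.start = }, {index.stop = }, {index.step = }")
print(f"{list(index) = }")
```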
"""
Create a pd.Series object and demonstrate that it contains (something
just like) an np.ndarray.
"""
import sys
import pandas as pd
data = [0.0, 10.0, 20.0, 30.0, 40.0] #a list of floats
series = pd.Series(data = data, name = "temperature") #don't need to say data =
print(series) #means print(repr(series))
print()
print(f"{type(series) = }")
print(f"{series.name = }")
print(f"{series.dtype.name = }")
print(f"{series.index.dtype.name = }")
print()
print(series.array) #series.array is recommended over series.values
print()
print(f"{series.array.dtype.name = }")
sys.exit(0)
A PandasArray is a subclass of class ExtensionArray, which “provides all the array-like functionality”.
You can pretend it’s an np.ndarray.
(In pandas 2.1 and later, the class is named NumpyExtensionArray.)
0 0.0
1 10.0
2 20.0
3 30.0
4 40.0
Name: temperature, dtype: float64
type(series) = <class 'pandas.core.series.Series'>
series.name = 'temperature'
series.dtype.name = 'float64'
series.index.dtype.name = 'int64'
<PandasArray>
[0.0, 10.0, 20.0, 30.0, 40.0]
Length: 5, dtype: float64
series.array.dtype.name = 'float64'
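If an actual np.ndarray is needed rather than the extension array, the to_numpy method returns one. A minimal sketch, reusing the temperature data from the program above:

```python
import numpy as np
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name = "temperature")

wrapped = series.array       #pandas extension array wrapping the data
plain = series.to_numpy()    #an np.ndarray holding the same values

print(f"{type(plain).__name__ = }")
print(f"{plain.dtype.name = }")
print(f"{plain.tolist() = }")
```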
Change data to one of the following.
The first one is straight Python; the second one is simpler and runs faster.

#Python list comprehension
data = [float(i) for i in range(0, 50, 10)] #data is a Python list

#remember to import numpy as np
data = np.arange(0.0, 50.0, 10.0) #data is a np.ndarray
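Either choice of data should give the same pd.Series. The sketch below builds one from each and compares them with the equals method; the names from_list and from_array are illustrative.

```python
import numpy as np
import pandas as pd

from_list = pd.Series([float(i) for i in range(0, 50, 10)], name = "temperature")
from_array = pd.Series(np.arange(0.0, 50.0, 10.0), name = "temperature")

print(f"{from_list.dtype.name = }")
print(f"{from_array.dtype.name = }")
print(f"{from_list.equals(from_array) = }")
```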
Print a pd.Series that is longer than pd.options.display.max_rows.
See Options and settings.
The pd.option_context function creates and returns a Python context manager.
See Context Manager Types.

"Print a pd.Series that is longer than pd.options.display.max_rows."

import sys
import numpy as np
import pandas as pd

print(f'{pd.get_option("display.min_rows") = }') #default values
print(f'{pd.get_option("display.max_rows") = }')
print()

series = pd.Series(data = np.arange(61.0), name = "temperature")
print(series)
print()

with pd.option_context("display.min_rows", 6): #first three and last three rows
    print(series)
print()

with pd.option_context("display.max_rows", None): #Print all the rows.
    print(series)

sys.exit(0)
pd.get_option("display.min_rows") = 10
pd.get_option("display.max_rows") = 60

0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
...
56 56.0
57 57.0
58 58.0
59 59.0
60 60.0
Name: temperature, Length: 61, dtype: float64

0 0.0
1 1.0
2 2.0
...
58 58.0
59 59.0
60 60.0
Name: temperature, Length: 61, dtype: float64

0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
6 6.0
7 7.0
8 8.0
9 9.0
10 10.0
11 11.0
12 12.0
13 13.0
14 14.0
15 15.0
16 16.0
17 17.0
18 18.0
19 19.0
20 20.0
21 21.0
22 22.0
23 23.0
24 24.0
25 25.0
26 26.0
27 27.0
28 28.0
29 29.0
30 30.0
31 31.0
32 32.0
33 33.0
34 34.0
35 35.0
36 36.0
37 37.0
38 38.0
39 39.0
40 40.0
41 41.0
42 42.0
43 43.0
44 44.0
45 45.0
46 46.0
47 47.0
48 48.0
49 49.0
50 50.0
51 51.0
52 52.0
53 53.0
54 54.0
55 55.0
56 56.0
57 57.0
58 58.0
59 59.0
60 60.0
Name: temperature, dtype: float64
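Because pd.option_context returns a context manager, the changed option applies only inside the with block, and the previous value is restored on exit. A small sketch to confirm this; the names before, inside, and after are illustrative.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None):
    inside = pd.get_option("display.max_rows") #None means no limit

after = pd.get_option("display.max_rows")

print(f"{before = }")
print(f"{inside = }")
print(f"{after = }")
```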
The head method creates and returns a new pd.Series containing the first five rows (by default) of the original pd.Series.
The tail method does the same for the last rows.

"Print the first (and last) few rows of a pd.Series that is too long to print in its entirety."

import sys
import numpy as np
import pandas as pd

series = pd.Series(data = np.arange(50.0), name = "temperature")
print(series.head()) #Just the first 5 rows.
print()
print(series.tail(3)) #Just the last 3 rows.
sys.exit(0)
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
Name: temperature, dtype: float64

47 47.0
48 48.0
49 49.0
Name: temperature, dtype: float64
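Both head and tail accept the number of rows to return, and the rows keep their original index labels. The sketch below assumes the same 50-row series and compares the results with positional slicing via iloc.

```python
import numpy as np
import pandas as pd

series = pd.Series(np.arange(50.0), name = "temperature")

first = series.head(2)    #same rows as series.iloc[:2]
last = series.tail(3)     #same rows as series.iloc[-3:]

print(f"{list(first.index) = }")
print(f"{list(last.index) = }")
print(f"{first.equals(series.iloc[:2]) = }")
print(f"{last.equals(series.iloc[-3:]) = }")
```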
The describe method creates and returns a new pd.Series of summary statistics, including the standard deviation and some percentiles.
anotherSeries = series.describe()
print(anotherSeries)
count 5.000000
mean 20.000000
std 15.811388
min 0.000000
25% 10.000000
50% 20.000000
75% 30.000000
max 40.000000
Name: temperature, dtype: float64
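The pd.Series returned by describe is indexed by the names of the statistics, so individual values can be looked up by label. A minimal sketch, using the five-element temperature series; note that std is the sample standard deviation (ddof = 1).

```python
import numpy as np
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name = "temperature")
summary = series.describe()

print(f"{list(summary.index) = }")
print(f"{summary['mean'] = }")
print(f"{summary['std'] = }")

#describe reports the sample standard deviation, i.e. ddof = 1.
matches = np.isclose(summary["std"], np.std(series.to_numpy(), ddof = 1))
print(f"{matches = }")
```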
What do you get when you convert a pd.Series to a Python list?
What happens to the numbers in the index?
listOfFloats = list(series)

for f in listOfFloats:
    print(f)
0.0
10.0
20.0
30.0
40.0

What do you get when you convert a pd.Series to a Python dict?
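One way to explore these two questions is to run both conversions and print the results. In this sketch, list keeps only the values, while the to_dict method keeps the index as the dictionary keys. The name dictOfFloats is illustrative.

```python
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name = "temperature")

listOfFloats = list(series)      #values only; the index is dropped
dictOfFloats = series.to_dict()  #index labels become the keys

print(listOfFloats)
print(dictOfFloats)
```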
Plot a pd.Series.

pip3 install matplotlib

"Plot a pd.Series with matplotlib.pyplot."

import pandas as pd
import matplotlib.pyplot as plt

data = [0.0, 10.0, 20.0, 30.0, 40.0]
series = pd.Series(data = data, name = "temperature")
series.index.name = "day"

#The Figure fills the window.
figure = plt.figure(figsize = [6.4, 4.8]) #width and height in inches
#figure.canvas.set_window_title was removed in Matplotlib 3.6; go through the manager.
figure.canvas.manager.set_window_title("matplotlib.pyplot Series.plot")

#A Figure can contain one or more Axes.
#In this program, the Figure contains only one Axes.
axes = series.plot(color = "#1f77b4", grid = True, linewidth = 1.5)
axes.set_title("Temperature on each day")
axes.set_ylabel(series.name)
plt.show() #event loop; blocks until the window is closed
pd.Series
.