Create a pd.Series object

A pd.Series object contains a one-dimensional np.ndarray. When you print the pd.Series, the np.ndarray is the column on the right. A pd.Series object also contains a parallel column, called the index of the pd.Series. When you print the pd.Series, the index is the column on the left.

The program below doesn’t ask for an index, but it gets one automatically. The index defaults to a pd.RangeIndex containing the consecutive np.int64s 0, 1, 2, ….

"""
Create a pd.Series object and demonstrate that it contains (something
just like) an np.ndarray.
"""

import sys
import pandas as pd

data = [0.0, 10.0, 20.0, 30.0, 40.0]                  #a list of floats
series = pd.Series(data = data, name = "temperature") #don't need to say data =
print(series)                                         #like print(series.to_string(name = True, dtype = True))
print()

print(f"{type(series) = }")
print(f"{series.name = }")
print(f"{series.dtype.name = }")
print(f"{series.index.dtype.name = }")
print()

print(series.array)     #recommended over series.values (which still works)
print()

print(f"{series.array.dtype.name = }")
sys.exit(0)

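series.array is a PandasArray (renamed NumpyExtensionArray in pandas 2.1), a subclass of class ExtensionArray, which “provides all the array-like functionality”. It is a thin wrapper around an np.ndarray, so you can pretend it’s an np.ndarray.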

0     0.0
1    10.0
2    20.0
3    30.0
4    40.0
Name: temperature, dtype: float64

type(series) = <class 'pandas.core.series.Series'>
series.name = 'temperature'
series.dtype.name = 'float64'
series.index.dtype.name = 'int64'

<PandasArray>
[0.0, 10.0, 20.0, 30.0, 40.0]
Length: 5, dtype: float64

series.array.dtype.name = 'float64'
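
The relationship between a pd.Series, its index, and the underlying np.ndarray can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas; the labels "mon", "tue", "wed" are made up for illustration.

```python
import numpy as np
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0], name="temperature")

# With no index argument, the index is a pd.RangeIndex.
print(type(series.index).__name__)
print(list(series.index))

# to_numpy() returns the values as a one-dimensional np.ndarray.
values = series.to_numpy()
print(type(values).__name__, values.ndim, values.dtype)

# An explicit index replaces the default one (these labels are hypothetical).
labeled = pd.Series([0.0, 10.0, 20.0], index=["mon", "tue", "wed"], name="temperature")
print(labeled["tue"])
```

If you need a real np.ndarray rather than something that behaves like one, to_numpy() is the call to reach for.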

Things to try

Change the above data to one of the following. The first one is straight Python; the second one is simpler and runs faster.

#Python list comprehension

data = [float(i) for i in range(0, 50, 10)] #data is a Python list

#remember to import numpy as np

data = np.arange(0.0, 50.0, 10.0)           #data is a np.ndarray
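
As a quick check that the two alternatives are interchangeable here, the sketch below builds a pd.Series from each and compares them. It assumes only that numpy and pandas are installed.

```python
import numpy as np
import pandas as pd

from_list = pd.Series([float(i) for i in range(0, 50, 10)], name="temperature")
from_array = pd.Series(np.arange(0.0, 50.0, 10.0), name="temperature")

# equals() tests whether the two Series hold the same elements.
print(from_list.equals(from_array))
print(from_list.dtype, from_array.dtype)
```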

Print a pd.Series that is longer than pd.options.display.max_rows. See Options and settings. The option_context function creates and returns a Python context manager. See Context Manager Types.

"Print a pd.Series that is longer than pd.options.display.max_rows."

import sys
import numpy as np
import pandas as pd

print(f'{pd.get_option("display.min_rows") = }')   #default values
print(f'{pd.get_option("display.max_rows") = }')
print()

series = pd.Series(data = np.arange(61.0), name = "temperature")
print(series)
print()

with pd.option_context("display.min_rows", 6):    #first three and last three rows
    print(series)
print()

with pd.option_context("display.max_rows", None): #Print all the rows.
    print(series)

sys.exit(0)

pd.get_option("display.min_rows") = 10
pd.get_option("display.max_rows") = 60

0      0.0
1      1.0
2      2.0
3      3.0
4      4.0
      ...
56    56.0
57    57.0
58    58.0
59    59.0
60    60.0
Name: temperature, Length: 61, dtype: float64

0      0.0
1      1.0
2      2.0
      ...
58    58.0
59    59.0
60    60.0
Name: temperature, Length: 61, dtype: float64

0      0.0
1      1.0
2      2.0
3      3.0
4      4.0
5      5.0
6      6.0
7      7.0
8      8.0
9      9.0
10    10.0
11    11.0
12    12.0
13    13.0
14    14.0
15    15.0
16    16.0
17    17.0
18    18.0
19    19.0
20    20.0
21    21.0
22    22.0
23    23.0
24    24.0
25    25.0
26    26.0
27    27.0
28    28.0
29    29.0
30    30.0
31    31.0
32    32.0
33    33.0
34    34.0
35    35.0
36    36.0
37    37.0
38    38.0
39    39.0
40    40.0
41    41.0
42    42.0
43    43.0
44    44.0
45    45.0
46    46.0
47    47.0
48    48.0
49    49.0
50    50.0
51    51.0
52    52.0
53    53.0
54    54.0
55    55.0
56    56.0
57    57.0
58    58.0
59    59.0
60    60.0
Name: temperature, dtype: float64
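
Because option_context returns a context manager, the changed option applies only inside the with block, and the previous value is restored on exit. A minimal sketch that checks this:

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None):
    inside = pd.get_option("display.max_rows")   # None means no limit

after = pd.get_option("display.max_rows")        # back to the earlier value

print(f"{before = }, {inside = }, {after = }")
```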

The head method creates and returns a new pd.Series containing the first n rows of the original pd.Series (five by default). The tail method does the same for the last n rows.

"Print the first (and last) few rows of a pd.Series that is too long to print in its entirety."

import sys
import numpy as np
import pandas as pd

series = pd.Series(data = np.arange(50.0), name = "temperature")

print(series.head())   #Just the first 5 rows.
print()

print(series.tail(3))  #Just the last 3 rows.
sys.exit(0)

0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
Name: temperature, dtype: float64

47    47.0
48    48.0
49    49.0
Name: temperature, dtype: float64
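
Both head and tail also accept a negative count, which means "everything except that many rows". A short sketch, rebuilding the same 50-element series so that it runs on its own:

```python
import numpy as np
import pandas as pd

series = pd.Series(np.arange(50.0), name="temperature")

all_but_last_3 = series.head(-3)    # rows 0 through 46
all_but_first_3 = series.tail(-3)   # rows 3 through 49

print(len(all_but_last_3), all_but_last_3.index[-1])
print(len(all_but_first_3), all_but_first_3.index[0])
```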

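The describe method creates and returns a new pd.Series of summary statistics: the count, mean, standard deviation, minimum, maximum, and the 25th, 50th, and 75th percentiles. Here series is the five-element pd.Series from the first program.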

anotherSeries = series.describe()
print(anotherSeries)

count     5.000000
mean     20.000000
std      15.811388
min       0.000000
25%      10.000000
50%      20.000000
75%      30.000000
max      40.000000
Name: temperature, dtype: float64
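
Since the result of describe is itself a pd.Series, individual statistics can be looked up by label. The sketch below rebuilds the five-element series so that it is self-contained. It also checks that std is the sample standard deviation (dividing by n - 1), which is the pandas default.

```python
import math

import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name="temperature")
stats = series.describe()

print(stats["mean"])            # look up one statistic by its index label
print(stats["std"])

# describe() reports the same value as series.std(), which uses ddof=1 by default.
print(math.isclose(stats["std"], series.std()))
print(series.std(ddof=0))       # population standard deviation, for comparison
```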

What do you get when you convert a pd.Series to a Python list? What happens to the numbers in the index?
```
listOfFloats = list(series)

for f in listOfFloats:
    print(f)
```
```
0.0
10.0
20.0
30.0
40.0
```
What do you get when you convert a pd.Series to a Python dict?
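
One way to explore this question is the to_dict method, sketched below. Unlike the list conversion, the numbers in the index are kept: they become the keys of the dict.

```python
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name="temperature")

dict_of_floats = series.to_dict()   # index values become keys

for key, value in dict_of_floats.items():
    print(key, value)
```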

Plot the pd.Series.

pip3 install matplotlib

"Plot a pd.Series with matplotlib.pyplot."

import pandas as pd
import matplotlib.pyplot as plt

data = [0.0, 10.0, 20.0, 30.0, 40.0]
series = pd.Series(data = data, name = "temperature")
series.index.name = "day"

#The Figure fills the window.

figure = plt.figure(figsize = [6.4, 4.8])   #width and height in inches
figure.canvas.manager.set_window_title("matplotlib.pyplot Series.plot")   #canvas.set_window_title was removed in matplotlib 3.6

#A Figure can contain one or more Axes.
#In this program, the Figure contains only one Axes.

axes = series.plot(color = "#1f77b4", grid = True, linewidth = 1.5)
axes.set_title("Temperature on each day")
axes.set_ylabel(series.name)

plt.show()   #blocks until you close the window
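
On a machine without a display, plt.show() has no window to open. A hedged alternative, assuming matplotlib's non-interactive Agg backend is acceptable, is to create the Figure and Axes explicitly and save the plot as a PNG. This sketch writes to an in-memory buffer; the filename mentioned in the comment is a made-up example.

```python
import io

import matplotlib
matplotlib.use("Agg")               # select a non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name="temperature")
series.index.name = "day"

figure, axes = plt.subplots(figsize=(6.4, 4.8))   # one Figure containing one Axes
series.plot(ax=axes, grid=True)                   # draw onto the Axes we created
axes.set_ylabel(series.name)

buffer = io.BytesIO()               # a filename such as "temperature.png" also works
figure.savefig(buffer, format="png")
plt.close(figure)
print(buffer.getbuffer().nbytes > 0)
```

Passing ax=axes makes it explicit which Axes the pd.Series is drawn on, instead of relying on whichever Figure happens to be current.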

For Space Cadets only: examine the source code for class pd.Series.