Create a pd.Series object

A pd.Series object contains a one-dimensional np.ndarray. When you print the pd.Series, the np.ndarray is the column on the right. A pd.Series object also contains a parallel column, called the index of the pd.Series. When you print the pd.Series, the index is the column on the left.

We didn’t ask for an index, but we get one automatically. The index defaults to a pd.RangeIndex containing the consecutive integers 0, 1, 2, …, whose dtype is np.int64.

"""
Create a pd.Series object and demonstrate that it contains (something
just like) an np.ndarray.
"""

import sys
import pandas as pd

data = [0.0, 10.0, 20.0, 30.0, 40.0]                  #a list of floats
series = pd.Series(data = data, name = "temperature") #don't need to say data =
print(series)                                         #means print(series.to_string())
print()

print(f"{type(series) = }")
print(f"{series.name = }")
print(f"{series.dtype.name = }")
print(f"{series.index.dtype.name = }")
print()

print(series.array)     #series.array or series.to_numpy() is preferred over series.values
print()

print(f"{series.array.dtype.name = }")
sys.exit(0)

A PandasArray (renamed NumpyExtensionArray in pandas 2.1) is a subclass of class ExtensionArray, which “provides all the array-like functionality”. You can pretend it’s an np.ndarray.

0     0.0
1    10.0
2    20.0
3    30.0
4    40.0
Name: temperature, dtype: float64

type(series) = <class 'pandas.core.series.Series'>
series.name = 'temperature'
series.dtype.name = 'float64'
series.index.dtype.name = 'int64'

<PandasArray>
[0.0, 10.0, 20.0, 30.0, 40.0]
Length: 5, dtype: float64

series.array.dtype.name = 'float64'
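
The claims above about the default index and the underlying array can be checked with a short sketch. It assumes a reasonably recent pandas; the labels "mon" and "tue" are made up for illustration and do not come from the program above.

```python
import numpy as np
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0, 30.0, 40.0], name="temperature")

# With no index argument, pandas supplies a RangeIndex starting at 0.
print(type(series.index).__name__)
print(series.index.tolist())

# to_numpy() hands back the values as a plain one-dimensional np.ndarray.
values = series.to_numpy()
print(type(values).__name__, values.ndim)

# An explicit index replaces the default one (hypothetical labels).
labeled = pd.Series([0.0, 10.0], index=["mon", "tue"], name="temperature")
print(labeled["tue"])
```

If you need a real np.ndarray rather than something that behaves like one, series.to_numpy() is probably the most direct route.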

Things to try

  1. Change the above data to one of the following. The first is plain Python; the second uses NumPy, is simpler, and runs faster.
    #Python list comprehension
    
    data = [float(i) for i in range(0, 50, 10)] #data is a Python list
    
    #remember to import numpy as np
    
    data = np.arange(0.0, 50.0, 10.0)           #data is a np.ndarray
    
  2. Print a pd.Series that is longer than pd.options.display.max_rows. When that happens, only pd.options.display.min_rows rows are printed. See Options and settings. The pd.option_context function creates and returns a Python context manager. See Context Manager Types.
    "Print a pd.Series that is longer than pd.options.display.max_rows."
    
    import sys
    import numpy as np
    import pandas as pd
    
    print(f'{pd.get_option("display.min_rows") = }')   #default values
    print(f'{pd.get_option("display.max_rows") = }')
    print()
    
    series = pd.Series(data = np.arange(61.0), name = "temperature")
    print(series)
    print()
    
    with pd.option_context("display.min_rows", 6):    #first three and last three rows
        print(series)
    print()
    
    with pd.option_context("display.max_rows", None): #Print all the rows.
        print(series)
    
    sys.exit(0)
    
    pd.get_option("display.min_rows") = 10
    pd.get_option("display.max_rows") = 60
    
    0      0.0
    1      1.0
    2      2.0
    3      3.0
    4      4.0
          ...
    56    56.0
    57    57.0
    58    58.0
    59    59.0
    60    60.0
    Name: temperature, Length: 61, dtype: float64
    
    0      0.0
    1      1.0
    2      2.0
          ...
    58    58.0
    59    59.0
    60    60.0
    Name: temperature, Length: 61, dtype: float64
    
    0      0.0
    1      1.0
    2      2.0
    3      3.0
    4      4.0
    5      5.0
    6      6.0
    7      7.0
    8      8.0
    9      9.0
    10    10.0
    11    11.0
    12    12.0
    13    13.0
    14    14.0
    15    15.0
    16    16.0
    17    17.0
    18    18.0
    19    19.0
    20    20.0
    21    21.0
    22    22.0
    23    23.0
    24    24.0
    25    25.0
    26    26.0
    27    27.0
    28    28.0
    29    29.0
    30    30.0
    31    31.0
    32    32.0
    33    33.0
    34    34.0
    35    35.0
    36    36.0
    37    37.0
    38    38.0
    39    39.0
    40    40.0
    41    41.0
    42    42.0
    43    43.0
    44    44.0
    45    45.0
    46    46.0
    47    47.0
    48    48.0
    49    49.0
    50    50.0
    51    51.0
    52    52.0
    53    53.0
    54    54.0
    55    55.0
    56    56.0
    57    57.0
    58    58.0
    59    59.0
    60    60.0
    Name: temperature, dtype: float64
    
  3. The head method creates and returns a new pd.Series containing the first rows (five by default) of the original pd.Series. The tail method does the same for the last rows.
    "Print the first (and last) few rows of a pd.Series that is too long to print in its entirety."
    
    import sys
    import numpy as np
    import pandas as pd
    
    series = pd.Series(data = np.arange(50.0), name = "temperature")
    
    print(series.head())   #Just the first 5 rows.
    print()
    
    print(series.tail(3))  #Just the last 3 rows.
    sys.exit(0)
    
    0    0.0
    1    1.0
    2    2.0
    3    3.0
    4    4.0
    Name: temperature, dtype: float64
    
    47    47.0
    48    48.0
    49    49.0
    Name: temperature, dtype: float64
    
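  4. The describe method creates and returns a new pd.Series of summary statistics: the count, mean, standard deviation, minimum, maximum, and some percentiles. Here series is the five-row pd.Series from the first program.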
    anotherSeries = series.describe()
    print(anotherSeries)
    
    count     5.000000
    mean     20.000000
    std      15.811388
    min       0.000000
    25%      10.000000
    50%      20.000000
    75%      30.000000
    max      40.000000
    Name: temperature, dtype: float64
    
  5. What do you get when you convert a pd.Series to a Python list? What happens to the numbers in the index?
    listOfFloats = list(series)
    
    for f in listOfFloats:
        print(f)
    
    0.0
    10.0
    20.0
    30.0
    40.0
    
    What do you get when you convert a pd.Series to a Python dict?
  6. Plot the pd.Series.
    pip3 install matplotlib
    
    "Plot a pd.Series with matplotlib.pyplot."
    
    import pandas as pd
    import matplotlib.pyplot as plt
    
    data = [0.0, 10.0, 20.0, 30.0, 40.0]
    series = pd.Series(data = data, name = "temperature")
    series.index.name = "day"
    
    #The Figure fills the window.
    
    figure = plt.figure(figsize = [6.4, 4.8])   #width and height in inches
    figure.canvas.manager.set_window_title("matplotlib.pyplot Series.plot")
    
    #A Figure can contain one or more Axes.
    #In this program, the Figure contains only one Axes.
    
    axes = series.plot(color = "#1f77b4", grid = True, linewidth = 1.5)
    axes.set_title("Temperature on each day")
    axes.set_ylabel(series.name)
    
    plt.show()   #blocks until the window is closed
    
  7. For Space Cadets only: examine the source code for class pd.Series.
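
Returning to item 5 above, here is a minimal sketch of one way to compare the list and dict conversions. It uses the tolist and to_dict methods rather than the list(series) call shown in item 5, and a shorter three-row pd.Series to keep the output small.

```python
import pandas as pd

series = pd.Series([0.0, 10.0, 20.0], name="temperature")

# Converting to a list keeps only the values; the index labels are dropped.
as_list = series.tolist()

# Converting to a dict keeps both: index labels become keys.
as_dict = series.to_dict()

print(as_list)
print(as_dict)
```

The name of the pd.Series ("temperature") is not carried into either result.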