Not counting the header line,
the input file has 33 lines.
Each of the three dates appears on 11 lines.
The only fields we will use are
field 0: Date
field 2: Long Exp (exposure)
field 3: Long Perf (performance)
Date,Sector,Long Exp,Long Perf,Short Exp,Short Perf,,,,, 1/1/04,Consumer Discretionary,299192447.68,-299192.45,653769973.85,-653769.97,,,,, 1/1/04,Consumer Staples,806817013.88,-806817.01,155938778.63,-155938.78,,,,, 1/1/04,Energy,495331847.14,-495331.85,331916571.89,331916.57,,,,, 1/1/04,Financials,438551829.34,438551.83,375651108.20,213241.20,,,,, 1/1/04,Health Care,353562123.03,-353562.12,434401773.16,434401.77,,,,, 1/1/04,Industrials,85710566.66,85710.57,620244575.40,127634.20,,,,, 1/1/04,Information Technology,731335638.19,731335.64,513751410.08,-513751.41,,,,, 1/1/04,Materials,606915176.25,606915.18,130279981.29,342723.20,,,,, 1/1/04,Real Estate,690597304.21,452367.23,836765311.59,-836765.31,,,,, 1/1/04,Telecommunication Services,432774295.12,234872.30,542978125.25,542978.13,,,,, 1/1/04,Utilities,577593316.82,577593.32,335109028.81,-335109.03,,,,, 1/2/14,Consumer Discretionary,298893255.23,-125425.22,654423743.83,-134607.22,,,,, 1/2/14,Consumer Staples,806010196.87,-588526.41,156094717.41,-127539.85,,,,, 1/2/14,Energy,494836515.30,-262532.72,331584655.32,280881.29,,,,, 1/2/14,Financials,438990381.17,57815.38,375437867.00,86311.46,,,,, 1/2/14,Health Care,353208560.91,-214415.23,433967371.39,193435.12,,,,, 1/2/14,Industrials,85796277.22,1955.27,620116941.20,126538.15,,,,, 1/2/14,Information Technology,732066973.82,631985.26,514265161.49,-105652.38,,,,, 1/2/14,Materials,607522091.43,563465.03,129937258.09,169649.01,,,,, 1/2/14,Real Estate,691049671.44,424365.40,837602076.91,-371225.04,,,,, 1/2/14,Telecommunication Services,433009167.42,101597.69,542435147.13,408266.07,,,,, 1/2/14,Utilities,578170910.14,515323.44,335444137.84,-327443.68,,,,, 1/3/14,Consumer Discretionary,298767830.02,-77320.87,654558351.05,-8764.87,,,,, 1/3/14,Consumer Staples,805421670.46,-154150.06,156222257.26,-9682.94,,,,, 1/3/14,Energy,494573982.58,-34813.51,331303774.02,25961.21,,,,, 1/3/14,Financials,439048196.55,38058.06,375351555.53,23288.89,,,,, 1/3/14,Health Care,352994145.67,-173723.40,433773936.27,117726.12,,,,, 1/3/14,Industrials,85798232.49,1367.79,619990403.05,10269.29,,,,, 1/3/14,Information Technology,732698959.08,627507.50,514370813.88,-6261.26,,,,, 1/3/14,Materials,608085556.46,376862.73,129767609.08,88234.13,,,,, 1/3/14,Real Estate,691474036.84,199717.67,837973301.95,-68413.24,,,,, 1/3/14,Telecommunication Services,433110765.11,50143.34,542026881.06,402088.95,,,,, 1/3/14,Utilities,578686233.57,209828.94,335771581.52,-174046.48,,,,,
A student rhapsodized about the power of the two yellow statements.
Lines
19–26
of
pandas.py
will do the work of the two yellow statements,
using only the machinery covered in this course.
First, install these three packages.
(Microsoft people have to say
py.exe -m pip
instead of
pip3
.)
pip3 install pandas pip3 install matplotlib pip3 install openpyxl pip3 list
The input file contains 33 lines of data,
not counting the header line.
That’s why the
list
lines
in
line
51
contains 33 items.
Each of these items is a
tuple
containing three items:
a
datetime.date
and two numbers.
(We saw class
datetime.date
in
Class Date.)
For example, the first three tuples in
lines
are
(2004-01-01, 299192447.68, -299192.45)
(2004-01-01, 806817013.88, -806817.01)
(2004-01-01, 495331847.14, -495331.85)
The input file contains only three different dates:
January 1, 2004
January 2, 2014
January 3, 2014
That’s why the
list
dates
in
line
57
contains only three items.
The
list
returned by the
sorted
in
line
64
contains 11 items,
but the
highestExps
list
in that line contains only the last
n
items because of the index
[-n:]
which we saw in
Indexing.
Ditto for
line
75.
1.538153e+09
means
1.538153 ×
10 ×
10 ×
10 ×
10 ×
10 ×
10 ×
10 ×
10 ×
10
=
1.538153 × 109
=
1,538,153,000
In other words, move the decimal point nine places to the right.
The
e
stands for
exponent.
top3LongExp = Date 2004-01-01 1.538153e+09 2014-01-02 1.538077e+09 2014-01-03 1.538121e+09 Name: Long Exp, dtype: float64 top3LongPerf = Date 2004-01-01 -75481.37 2014-01-02 43458.85 2014-01-03 473357.44 dtype: float64 my top3LongExp = 2004-01-01 1.538153e+09 2014-01-02 1.538077e+09 2014-01-03 1.538121e+09 my top3LongPerf = 2004-01-01 -75481.37 2014-01-02 43458.85 2014-01-03 473357.44