INFO1-CE9990 Student code Aug 3, 2017

if

It looks like we’re steering the computer in one of three possible directions:

    if avg >= n[0]:
        cityavgs[city_name] = "Everyone seems to want to live in "+city_name+" as prices are on the rise."
    elif avg < n[0]:
        cityavgs[city_name] = "Prices are low in "+city_name+". Is it a good deal? Or will it be the next Detroit?"
    else:
        cityavgs[city_name] = None
Lines 49–50 will never be executed, so we can delete them:
    if avg >= n[0]:
        cityavgs[city_name] = "Everyone seems to want to live in " + city_name + " as prices are on the rise."
    elif avg < n[0]:
        cityavgs[city_name] = "Prices are low in " + city_name + ". Is it a good deal? Or will it be the next Detroit?"
The comparison in line 47 is always true, so we can delete it:
    if avg >= n[0]:
        cityavgs[city_name] = "Everyone seems to want to live in " + city_name + " as prices are on the rise."
    else:
        cityavgs[city_name] = "Prices are low in " + city_name + ". Is it a good deal? Or will it be the next Detroit?"

Don’t write lines longer than 80 characters:

    if avg >= n[0]:
        cityavgs[city_name] = "Everyone seems to want to live in " + city_name \
            + " as prices are on the rise."
    else:
        cityavgs[city_name] = "Prices are low in " + city_name \
            + ". Is it a good deal? Or will it be the next Detroit?"
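
As an alternative to the backslashes, PEP 8 prefers wrapping a long expression in parentheses, which lets the line break without any continuation character. The following is only a sketch: the function describe and its parameter first_price (standing in for n[0]) are hypothetical names, not part of the student's script.

```python
def describe(city_name, avg, first_price):
    """Return the message for one city.

    Hypothetical helper for illustration; first_price plays the role of n[0].
    """
    if avg >= first_price:
        # The parentheses let the expression continue onto the next line.
        return ("Everyone seems to want to live in " + city_name
                + " as prices are on the rise.")
    else:
        return ("Prices are low in " + city_name
                + ". Is it a good deal? Or will it be the next Detroit?")
```

With this helper the body of the loop would shrink to a single assignment, cityavgs[city_name] = describe(city_name, avg, n[0]).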

In-depth study: if the user selects the city of Cayce, South Carolina, the URL will download this JSON file. In line 39,
Data is the list of 16 items created in lines 17 and 33;
Data[i] is a dictionary with a key named dataset (see the JSON file);
Data[i]['dataset'] is a dictionary with a key named data;
Data[i]['dataset']['data'] is a list. Each item on the list is a short list containing a date and a price. That’s what the d and p in line 40 stand for.

The n[0] in lines 45 and 47 is the first item in the list n created in lines 38 and 41. But the script uses only the first item on this list. Making the rest of the list is therefore a waste of time. Furthermore, whenever we use n[0] (which happens only in lines 45 and 47), the value of n[0] is always equal to the value of Data[i]['dataset']['data'][0][1]. (Remember that
Data[i]['dataset']['data'][0][0]
is a date, and
Data[i]['dataset']['data'][0][1]
is a price. They’re the date and price of the first sale in the list Data[i]['dataset']['data'].) The following changes will therefore simplify the program without changing its output at all, and make it run faster.

  1. In lines 45 and 47, change n[0] to Data[i]['dataset']['data'][0][1].
  2. Remove the list n. In line 38, remove the n = [] and the preceding semicolon. Remove line 41.
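
The equivalence behind these two changes can be checked on a small example. This is a sketch under stated assumptions: entry is a made-up stand-in for one Data[i], shaped like the JSON described above, and the dates and prices in it are invented. The list comprehension is only a guess at how the script builds n.

```python
# Made-up stand-in for one Data[i]; the real file has many more rows.
entry = {
    "dataset": {
        "data": [
            ["2017-05-31", 120000.0],   # [date, price] of the first sale in the list
            ["2017-04-30", 118000.0],
            ["2017-03-31", 117500.0],
        ]
    }
}

rows = entry["dataset"]["data"]

# Assumed original approach: build the whole list of prices, then use only n[0].
n = [p for d, p in rows]

# Simplified approach: read the first price directly, with no list at all.
first_price = rows[0][1]

print(n[0] == first_price)
```

Both expressions name the same number, so the comparison with avg gives the same result either way.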

if

                if guess == secretNumber:
                    print("You got it! My number was {}.".format(secretNumber))
                    break
                if guess in guesses:
                    print("You already guessed that number!")
                # gives users a hint.
                elif guess < secretNumber:
                    print("My number is higher than {}.".format(guess))
                elif guess > secretNumber:
                    print("My number is lower than {}.".format(guess))

                else:
                    print("That's not it!")

Lines 36–38 can never be executed, so we can remove them from the script without changing the behavior of the script.

                if guess == secretNumber:
                    print("You got it! My number was {}.".format(secretNumber))
                    break
                if guess in guesses:
                    print("You already guessed that number!")
                # gives users a hint.
                elif guess < secretNumber:
                    print("My number is higher than {}.".format(guess))
                elif guess > secretNumber:
                    print("My number is lower than {}.".format(guess))

guess is always greater than secretNumber whenever we arrive at line 34, so we can simplify line 34 to the single word else: without changing the behavior of the script.

                if guess == secretNumber:
                    print("You got it! My number was {}.".format(secretNumber))
                    break
                if guess in guesses:
                    print("You already guessed that number!")
                # gives users a hint.
                elif guess < secretNumber:
                    print("My number is higher than {}.".format(guess))
                else:
                    print("My number is lower than {}.".format(guess))

The four print statements are mutually exclusive: exactly one of them should be executed. The notation that everybody expects you to use for four mutually exclusive statements is if/elif/elif/else. I think it would be clearest if the <, >, == cases were consecutive.

                if guess in guesses:
                    print("You already guessed that number!")
                elif guess < secretNumber:   #Give users a hint.
                    print("My number is higher than {}.".format(guess))
                elif guess > secretNumber:
                    print("My number is lower than {}.".format(guess))
                else:
                    print("You got it! My number was {}.".format(secretNumber))
                    break
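
One way to convince yourself that exactly one branch runs is to move the chain into a function and call it with one guess per case. This is a sketch only: the function respond and its return value (the message plus a flag saying whether the game is over) are hypothetical and do not appear in the student's script, which prints and breaks inside a loop.

```python
def respond(guess, secretNumber, guesses):
    """Return (message, finished) for one guess.

    Hypothetical helper; guesses is assumed to hold the earlier guesses.
    """
    if guess in guesses:
        return "You already guessed that number!", False
    elif guess < secretNumber:   # Give users a hint.
        return "My number is higher than {}.".format(guess), False
    elif guess > secretNumber:
        return "My number is lower than {}.".format(guess), False
    else:
        return "You got it! My number was {}.".format(secretNumber), True


# One call per branch, with a secret number of 7.
print(respond(3, 7, [3]))
print(respond(5, 7, [3]))
print(respond(9, 7, [3]))
print(respond(7, 7, [3]))
```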

float(dept[data[10]][0]) vs. float(dept[data[10]][1])

Here’s the very interesting output:

ADMIN HEARNG
Minimum annual salary:  40392
Maximum annual salary:  156420

ANIMAL CONTRL
Minimum annual salary:  39528
Maximum annual salary:  130008

AVIATION
Minimum annual salary:  45696
Maximum annual salary:  300000

etc.

TRANSPORTN
Minimum annual salary:  36840
Maximum annual salary:  169500

TREASURER
Minimum annual salary:  46356
Maximum annual salary:  137700

WATER MGMNT
Minimum annual salary:  43632
Maximum annual salary:  169512

Here’s the JSON file.

In line 38,
dictionary is a dictionary holding the entire JSON file;
dictionary["data"] is a list of people. (The script therefore would be easier to read if the variable data in line 38 had been named person or employee.) Each of these people is a list of 17 items. Item number 10 (i.e., the item whose index is 10) is the name of their department; item number 14 (the data[14] in the code below) is their salary.

Line 36 begins to create a dictionary. Each key in this dictionary is the name of a department. The corresponding value is a list of two numbers, the minimum and maximum salary for that department. Lines 43–46 check if each incoming person sets a new record for lowest or highest salary. Before these lines can do that, however, lines 39–40 have to make sure that the department, with its minimum and maximum known salaries, is already in the dictionary dept.

On Thursday, August 27, we saw in class how to eliminate the need for lines 39–40 by using the collections.defaultdict in Popular. That lets us change lines 36–48 from

dept = {}

for data in dictionary["data"]:
    if data[10] not in dept and data[14]!=None:
        dept[data[10]] = [data[14], data[14]]
    else:
        try:
            if float(data[14]) < float(dept[data[10]][0]):
                dept[data[10]][0] = data[14]
            if float(data[14]) > float(dept[data[10]][1]):
                dept[data[10]][1] = data[14]
        except TypeError:
            continue
to the following. Note that the salaries have decimal points, so they’re floats.
import collections
def defaultValue():
    return [1000000000.00, 0.00]   #minimum salary, maximum salary

dept = collections.defaultdict(defaultValue)

for data in dictionary["data"]:
    try:
        if float(data[14]) < float(dept[data[10]][0]):
            dept[data[10]][0] = data[14]
        if float(data[14]) > float(dept[data[10]][1]):
            dept[data[10]][1] = data[14]
    except TypeError:
        continue
or to
import collections
dept = collections.defaultdict(lambda: [1000000000.00, 0.00])

for data in dictionary["data"]:
    try:
        if float(data[14]) < float(dept[data[10]][0]):
            dept[data[10]][0] = data[14]
        if float(data[14]) > float(dept[data[10]][1]):
            dept[data[10]][1] = data[14]
    except TypeError:
        continue
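
The reason the "not in dept" test can disappear is that a defaultdict calls its factory the first time a missing key is looked up, and stores the result under that key. A minimal sketch, using a department name and a salary taken from the output above:

```python
import collections

dept = collections.defaultdict(lambda: [1000000000.00, 0.00])

print("AVIATION" in dept)    # the key does not exist yet

# Looking up the missing key calls the lambda and stores its list in dept.
record = dept["AVIATION"]
print(record)
print("AVIATION" in dept)    # the lookup itself created the entry

# record is the very list stored in dept, so changing it changes dept too.
record[0] = 45696.0
print(dept["AVIATION"])
```

Each missing key gets its own fresh list, because the lambda builds a new one on every call.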

Make the code easier to read by giving names to the departmentName, department, and salary. Since data[14] is now mentioned in only one place, I had to write float in only one place. That ensures that every value entering the system is a float, so no further conversions are necessary. It will also make it easier to print each salary with exactly two digits to the right of the decimal point, but we won’t do that now. I changed the second if to elif because, once a department holds real salaries, the same person cannot set a new record for both lowest salary and highest salary. (The first person seen in a department is the exception: that salary beats both starting values, but the elif lets it update only the minimum.) The elif avoids unnecessary work.

dept = collections.defaultdict(lambda: [1000000000.00, 0.00])

for data in dictionary["data"]:
    try:
        salary = float(data[14])
    except TypeError:
        continue

    departmentName = data[10]
    department = dept[departmentName]

    if salary < department[0]:
        department[0] = salary   #sets a new record for lowest salary
    elif salary > department[1]:
        department[1] = salary   #sets a new record for highest salary

Each value in the data variable represents a person, so let’s rename this variable person. And let’s rename the dept dictionary to departments, to agree with the variable department.

departments = collections.defaultdict(lambda: [1000000000.00, 0.00])

for person in dictionary["data"]:
    try:
        salary = float(person[14])
    except TypeError:
        continue

    departmentName = person[10]
    department = departments[departmentName]

    if salary < department[0]:
        department[0] = salary #This person set a new record for lowest salary.
    elif salary > department[1]:
        department[1] = salary #This person set a new record for highest salary.
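
One caveat about the elif: the first person seen in a department beats both starting values (1000000000.00 and 0.00), so with elif that person updates only the minimum. If that person is the only one in the department, or has its highest salary, the reported maximum comes out wrong. A minimal sketch of the safer form with two independent ifs follows. The function salary_ranges and the (department name, salary) pairs are hypothetical simplifications of the real 17-item records.

```python
import collections


def salary_ranges(people):
    """Return {department name: [minimum, maximum]}.

    people is assumed to be an iterable of (departmentName, salary) pairs.
    """
    departments = collections.defaultdict(lambda: [1000000000.00, 0.00])

    for departmentName, salary in people:
        department = departments[departmentName]
        if salary < department[0]:
            department[0] = salary   # new record for lowest salary
        if salary > department[1]:   # plain if, so the first person sets both
            department[1] = salary   # new record for highest salary

    return dict(departments)


# A department with a single person: minimum and maximum must be equal.
ranges = salary_ranges([("TREASURER", 46356.0)])
print(ranges["TREASURER"])
```

With elif in place of the second if, the same input would leave the maximum at its starting value of 0.00.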

In-depth study: instead of saying department[0] and department[1] for the minimum and maximum salaries, is there any way we could use attributes to say department.minimum and department.maximum? We’ve already come a long way towards making the script easier to read: originally these two numbers were called float(dept[data[10]][0]) and float(dept[data[10]][1]), with nested square brackets.
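
One possible answer, offered as a sketch rather than as the only way: give each department a small object instead of a two-item list. The class Department below is hypothetical, it relies on the dataclasses module (Python 3.7 and later), and the (department name, salary) pairs are a simplified stand-in for the real records.

```python
import collections
from dataclasses import dataclass


@dataclass
class Department:
    """Lowest and highest salary seen so far in one department."""
    minimum: float = 1000000000.00
    maximum: float = 0.00


# defaultdict calls Department() to create the entry for each new department.
departments = collections.defaultdict(Department)

# Simplified stand-in for the people in dictionary["data"].
people = [("AVIATION", 45696.0), ("AVIATION", 300000.0), ("TREASURER", 46356.0)]

for departmentName, salary in people:
    department = departments[departmentName]
    if salary < department.minimum:
        department.minimum = salary
    if salary > department.maximum:   # plain if: the first person sets both
        department.maximum = salary

for departmentName, department in sorted(departments.items()):
    print(departmentName, department.minimum, department.maximum)
```

The attribute names replace the indexes 0 and 1, so the comments explaining which number is which are no longer needed.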

Was it worth it?