Encode as String

Sometimes it is very important to handle different input object types differently in a function. This problem will exercise your understanding of types, control-flow, dictionaries, and more.

We want to encode a sequence of Python objects as a single string. The following describes the encoding method that we want to use for each type of object. Each object’s transcription in should be separated by " | ", and the result should be one large string.

  • If the object is an integer, convert it into a string by spelling out each digit in base-10 in this format: 142 \(\rightarrow\) one-four-two; -12 \(\rightarrow\) neg-one-two.

  • If the object is a float, just append its integer part (obtained by rounding down) the same way and the string "and float": 12.324 \(\rightarrow\) one-two and float.

  • If the object is a string, keep it as is.

  • If the object is of any other type, return '<OTHER>'.

# example behavior
>>> s = concat_to_str([12,-14.23,"hello", True,
...                    "Aha", 10.1, None, 5])
>>> s
'one-two | neg-one-four and float | hello | <OTHER> | Aha | one-zero and float | <OTHER> | five'

Tips: check out the isinstance function introduced here for handling different types. Also, consider creating a helper function for the conversion from integer to our special-format string, since we have to do it twice. It’s always good to extrapolate repeated tasks into functions. You’ll also need to hard-code the conversion from each digit to its English spell-out.

Solution

Our solution is broken down into three simple functions. int_to_str is used to map signed integers to English words. item_to_transcript is capable of mapping an object of any type to its string representation, in accordance with the prescription made by the problem statement. Finally, concat_to_str orchestrates these two helper functions, looping over each object in our input list, mapping each object to its string representation, and joining these strings with ' | '.

def int_to_str(n):
    """
    Takes an integer and formats it into a special string
        e.g. 142 -> "one-four-two"
             -12 -> "neg-one-two"
    """
    mapping = {"0": "zero", "1": "one", "2": "two", "3": "three",
               "4": "four", "5": "five", "6": "six", "7": "seven",
               "8": "eight", "9": "nine", "-": "neg"}
    return "-".join(mapping[digit] for digit in str(n))

def item_to_transcript(item):
    """ Any -> str """
    if isinstance(item, bool): return '<OTHER>'
    if isinstance(item, int): return int_to_str(item)
    if isinstance(item, float): return int_to_str(int(item)) + " and float"
    if isinstance(item, str): return item
    return '<OTHER>'

def concat_to_str(l):
    """
    Maps a list of objects to their string
    representations concatenated together.

    Parameters
    ----------
    l: List[Any]
        Input list of objects

    Returns
    -------
    str

    Examples
    --------
    >>> concat_to_str([1, None, 'hi', 2.0])
    one | <OTHER> | hi | two and float
    """
    return " | ".join(item_to_transcript(item) for item in l)

We use the str.join function along with a generator comprehensions in a couple places in our solution. Recall that

"<hi>".join(x for x in some_iterable_of_strings)

is equivalent to the long-form code:

out = ""
for x in some_iterable_of_strings:
    out += "<hi>" + x

out = out.lstrip("<hi>")  # get rid of the extra leading "<hi>"

int_to_str plays a clever trick to convert each integer, digit-by-digit, into its string form - it calls str on the integer. This converts the integer into a string, which is a sequence. This permits us to access each digit of the integer and even iterate over them:

# casting an integer to a string makes its
# sign and digits accessible via indexing/iteration
>>> x = str(-123)
>>> x
'-123'
>>> x[0]
'-'
>>> x[-1]
'3'

Thus, in total "-".join(mapping[digit] for digit in str(n)) is responsible for casting an integer to a string, iterating over each of its digits and mapping them to their corresponding word using the dictionary that we defined in the function.

item_to_transcript it an especially slick function. First, let’s make clear the fancy use of the inline syntax here. This function:

def item_to_transcript(item):
    """ Any -> str """
    if isinstance(item, bool): return '<OTHER>'
    if isinstance(item, int): return int_to_str(item)
    if isinstance(item, float): return int_to_str(int(item)) + " and float"
    if isinstance(item, str): return item
    return '<OTHER>'

is entirely equivalent to this function:

def item_to_transcript_alt(item):
    """ Any -> str """
    if isinstance(item, bool):
        return '<OTHER>'
    elif isinstance(item, int):
        return int_to_str(item)
    elif isinstance(item, float):
        return int_to_str(int(item)) + " and float"
    elif isinstance(item, str):
        return item
    else:
        return '<OTHER>'

The latter uses the familiar pattern of if-elif-else statements and makes for a completely satisfactory version of the function. See, however, that each of the multiple return statements in item_to_transcript guarantees the same logic, in that if a condition is meant a value will be returned by the function and none of its subsequent code can be visited. That is, if item is an integer the second if-condition will evaluate to True and int_to_str(item) will be returned, immediately expelling the point of execution from the body of the function.

Ultimately, the preference of one function over the other is merely a matter of stylistic preference. You also have likely noted the peculiar in-line if-return expressions. These too are only stylistic choices;

if isinstance(item, int): return int_to_str(item)

is no different from

if isinstance(item, int):
    return int_to_str(item)

The use of in-line if-return expressions in item_to_transcript does a nice job emphasizing the dictionary-like mapping behavior of the function: the form of the code suits its functionality nicely. That being said, these should generally be used sparingly. Some may call this a “cute” trick. And it is. This code is cute. I write cute code.

Finally, you may have noticed what looks like a redundancy in our code: our first if statement returns '<OTHER>' if item is True or False, and our final line of code returns '<OTHER>' if none of the preceding conditions were met (i.e. item is not a bool, int, float, or str type object). Why then did we not just merge our first if clause with this ultimate catch-all? The reason is that True and False are not only instances of the boolean type, they are also integers! True behaves like the integer 1 and False like 0:

>>> isinstance(True, int) and isinstance(True, bool)
True

>>> 3*True + True - False
4

Thus, had we not taken care to check for booleans up front, True and False would have been mapped to 'one' and 'zero', respectively, rather than '<OTHER>'. This is a relatively subtle edge case to catch.