The Easiest Way to Adjust Currency in Python
I’m working on a machine learning project that involves modeling art auction prices. The data that I’ve collected includes auction results from nearly 40 years. In general, I want to leave realized prices in nominal terms, since the model will predict sales in nominal terms: A major component of how much a work fetches on the secondary market is how long ago the auction occurred. However, as part of my preliminary exploratory data analysis work, it would be useful to have an adjusted currency feature in constant dollars so that I can get a better sense for how certain features correlate with realized price.
For instance, let’s say we’re interested in seeing whether there’s a relationship between size (i.e., painting area) and price realized. Without constant dollars, this would be complicated, since the realized price is a function of when the auction was, as well. But if we’re dealing with constant dollars, then we can compare apples with apples.
To accomplish this currency adjustment, I discovered the cpi module. Let’s take a look at how easy it is to use.
CPI basics
The first time you load cpi
, you may get a warning that the CPI data needs to be updated. That’s easily solved by calling cpi.update()
.
To adjust nominal currency amounts to a constant currency, the fundamental method we need is cpi.inflate()
.
# Inflate $150 in 1950 to present (i.e., 2022)
cpi.inflate(150, 1950)
1821.504149377593
There you have it: \$150 in 1950 is approximately identical to \$1,821.50 today. It’s as easy as that.
You can also inflate currencies to other years besides the present.
# Inflate $150 in 1950 to 2000
cpi.inflate(150, 1950, to=2000)
1071.7842323651453
CPI and Pandas
For my purposes, I needed to take auction_year
and price_realized
features from a DataFrame and inflate those values to the present. This is also pretty darn easy.
# Load data
df = pd.read_csv('../../art_auction/data/auction_data_cleaned.csv', index_col=0)
# Select columns of interest
cols = ['artist_name', 'title', 'date', 'auction_year', 'price_realized_USD']
# Preview
df[cols].sample(10)
artist_name | title | date | auction_year | price_realized_USD | |
---|---|---|---|---|---|
22892 | George Condo | NUDE WITH PURPLE HAIR | NaN | 2017 | 535772.0 |
38131 | Sam Francis | Untitled | 1990 | 1997 | 16335.0 |
21527 | Chu Teh-Chun | NaN | NaN | 2016 | 474497.0 |
28172 | Richard Prince | Song 2120 South Michigan Avenue | 1989 | 2005 | 140000.0 |
52689 | Leonora Carrington | TUESDAY | 1946 | 2020 | 1100000.0 |
8103 | Gerhard Richter | Porträt Schniewind | 1964 | 2010 | 1746520.0 |
38665 | Zhang Xiaogang | Hills And Spirits | NaN | 2018 | 863251.0 |
2396 | Andy Warhol | KEY SERVICE (NEGATIVE) | 1985 | 2019 | 73149.0 |
4884 | Andy Warhol | Children Paintings, Parrot | 1983 | 2004 | NaN |
27498 | Zhou Chunya | Red mountain rock series - nude | 1992 | 2011 | 158882.0 |
As you can see, we have auctions from a variety of years. For each of these works, we’ll need to call cpi.inflate()
for the auction_year
and price_realized_USD
. The best way to do this is with a lambda function. But because the cpi
library only supports dates through 2022, we’ll have to make sure we only apply this function to dates prior to then, otherwise it’ll throw an error.
# Create mask for pre-2023 auctions
pre2023 = (df['auction_year'] < 2023)
# Inflate pre-2023 prices
df['adjusted_price'] = (
df.loc[pre2023].apply(lambda x: cpi.inflate(x['price_realized_USD'], x['auction_year']), axis=1)
)
# Assign 2023 prices directly
df.loc[~pre2023, 'adjusted_price'] = df.loc[~pre2023, 'price_realized_USD']
# Select columns
cols = ['artist_name', 'title', 'date', 'auction_year', 'price_realized_USD', 'adjusted_price']
# Preview
df[cols].sample(25)
artist_name | title | date | auction_year | price_realized_USD | adjusted_price | |
---|---|---|---|---|---|---|
40696 | Kazuo Shiraga | Composition, | 1970 | 2012 | NaN | NaN |
15276 | Jean Dubuffet | Riant ete | 1954 | 2002 | NaN | NaN |
14711 | Yoshitomo Nara | Guitar girl | 2003 | 2005 | 1884.0 | 2.823154e+03 |
7215 | Zao Wou-Ki | Untitled | 1958 | 1998 | 90771.0 | 1.629729e+05 |
9661 | Yayoi Kusama | Fallen Blossoms | NaN | 2018 | 225079.0 | 2.623204e+05 |
16962 | Joan Mitchell | After April, Bernie | 1925-1992 | 2006 | 688000.0 | 9.987433e+05 |
6621 | Zao Wou-Ki | 16.9.91 | 1991 | 2015 | 2609047.0 | 3.221502e+06 |
22988 | George Condo | Blue Sky Portrait | 1997 | 2013 | 365000.0 | 4.585356e+05 |
43109 | Jean-Paul Riopelle | Sans titre | 1923-2002 | 2017 | 1491281.0 | 1.780478e+06 |
29026 | Rudolf Stingel | UNTITLED | NaN | 2009 | 15677.0 | 2.138537e+04 |
34294 | Bernard Buffet | Pont dans un village | 1928-1999 | 2018 | 79099.0 | 9.218667e+04 |
40137 | Pierre Bonnard | Femme au chapeau | 1867-1947 | 2007 | 360000.0 | 5.081257e+05 |
34808 | Bernard Buffet | Roses | 1982 | 2013 | 97432.0 | 1.224001e+05 |
15081 | Jean Dubuffet | Réchaud-Four à gaz V | 1966 | 2011 | 636000.0 | 8.274625e+05 |
7798 | Gerhard Richter | Souvenir (no 48) | 1995 | 2014 | 80000.0 | 9.889666e+04 |
46506 | Anselm Kiefer | LILITH | NaN | 2011 | 494500.0 | 6.433651e+05 |
12174 | Marc Chagall | EVOCATION DU VILLAGE NATAL OU ETUDE POUR LA FE... | NaN | 2018 | 259384.0 | 3.023015e+05 |
7161 | Zao Wou-Ki | Sans titre | 1983 | 2003 | 55513.0 | 8.829433e+04 |
9871 | Yayoi Kusama | Infinity Nets OQWWS | 2006 | 2016 | 909052.0 | 1.108462e+06 |
17386 | Henri Matisse | Odalisque, brasero et coupe de fruits | 1929 | 1991 | NaN | NaN |
44705 | Tom Wesselmann | Study for Helen nude | 1981 | 2006 | 230000.0 | 3.338822e+05 |
50151 | Robert Ryman | Whitney revision painting 2 - diptych | 1969 | 1998 | 120000.0 | 2.154515e+05 |
38974 | Donald Judd | Untitled (1968-76) | NaN | 2012 | 86500.0 | 1.102584e+05 |
35428 | Bernard Buffet | Vase de fleurs | 1928-1999 | 2006 | 84000.0 | 1.219396e+05 |
45905 | Georges Braque | Les balances | NaN | 1987 | NaN | NaN |
That’s it! We can see that the adjusted_price
feature now has inflated each price_realized_USD
feature based on the auction_year
.